Wire protocol¶
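The TCP protocol is line-based UTF-8. Every request is exactly three newline-terminated lines: the command, the key, and an argument line.
Every response is one newline-terminated line: a status word, optionally followed by space-separated fields.
\r\n is accepted as a line ending; the trailing \r is stripped.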
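The framing can be sketched from the client side as follows. This is a minimal illustration, not dflockd's own client code: the helper names `encodeRequest`, `parseResponseLine`, and `readResponse` are hypothetical, and the key `jobs/nightly` is made up for the demo.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// encodeRequest renders one request as three newline-terminated lines:
// command, key, argument line.
func encodeRequest(cmd, key, arg string) string {
	return cmd + "\n" + key + "\n" + arg + "\n"
}

// parseResponseLine strips the line ending (including a trailing \r) and
// splits the status word from the rest of the line. The payload is returned
// unsplit so a JSON payload containing spaces stays intact.
func parseResponseLine(line string) (status, payload string) {
	line = strings.TrimSuffix(line, "\n")
	line = strings.TrimSuffix(line, "\r")
	status, payload, _ = strings.Cut(line, " ")
	return status, payload
}

// readResponse reads exactly one response line from the connection.
func readResponse(r *bufio.Reader) (status, payload string, err error) {
	line, err := r.ReadString('\n')
	if err != nil {
		return "", "", err
	}
	status, payload = parseResponseLine(line)
	return status, payload, nil
}

func main() {
	fmt.Printf("%q\n", encodeRequest("l", "jobs/nightly", "10 30"))

	// A canned response stands in for the server here.
	r := bufio.NewReader(strings.NewReader("ok 0001a3f217b3c4d87f3c1f2b3e9a8d6e 30\r\n"))
	status, payload, err := readResponse(r)
	fmt.Println(status, payload, err)
}
```

Returning the payload as a single string leaves per-command parsing to the caller, which matters for responses that carry JSON.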
Caps¶
| Constant | Value | Notes |
|---|---|---|
| MaxLineBytes | 256 | Cap on cmd, key, and most arg lines. |
| MaxAuthTokenBytes | 64 KiB | Larger cap for the auth token line. |
Lines longer than the cap return error with code 12. The
oversized line is drained to newline so the framing stays in sync.
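The drain-to-newline behaviour can be illustrated with a small bounded line reader. This is a sketch of the general technique under stated assumptions, not the server's implementation: `readCapped`, `errLineTooLong`, and `demo` are hypothetical names, and the sketch assumes the cap excludes the line ending.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

var errLineTooLong = errors.New("line exceeds cap")

// readCapped reads one line of at most limit bytes, not counting the line
// ending. When the line is too long it keeps consuming bytes up to the
// newline without storing them, so the next call starts on a line boundary.
func readCapped(r *bufio.Reader, limit int) (string, error) {
	buf := make([]byte, 0, limit+1) // one spare byte for a possible trailing \r
	overflow := false
	for {
		b, err := r.ReadByte()
		if err != nil {
			return "", err
		}
		if b == '\n' {
			break
		}
		if len(buf) > limit {
			overflow = true // drain: keep reading, stop storing
			continue
		}
		buf = append(buf, b)
	}
	line := strings.TrimSuffix(string(buf), "\r")
	if overflow || len(line) > limit {
		return "", errLineTooLong
	}
	return line, nil
}

// demo feeds input through readCapped and records the outcome of each line.
func demo(input string, limit int) []string {
	r := bufio.NewReader(strings.NewReader(input))
	var out []string
	for {
		line, err := readCapped(r, limit)
		if errors.Is(err, errLineTooLong) {
			out = append(out, "<too long>")
			continue
		}
		if err != nil {
			return out // io.EOF ends the demo
		}
		out = append(out, line)
	}
}

func main() {
	// The second line exceeds the 8-byte cap; the third still parses cleanly.
	fmt.Printf("%q\n", demo("l\nthis-key-is-too-long\n0\r\n", 8))
}
```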
Authentication¶
When --auth-token is set, the first command on a fresh connection
must be auth, with the token on the argument line.
Anything else returns error_auth. On auth failure the server waits
100 ms before closing the connection.
The key for auth is ignored; _ is the convention. The token is
compared in constant time.
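A constant-time comparison is typically written with `crypto/subtle`. The sketch below is an assumption about how such a check could look, not the server's code: `tokenMatches` and `authRequest` are hypothetical helpers, and hashing both sides first is an extra step added here so the comparison does not depend on the token's length.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// authRequest builds the three-line auth request: command, the conventional
// "_" key, and the token on the argument line.
func authRequest(token string) string {
	return "auth\n_\n" + token + "\n"
}

// tokenMatches compares two tokens without an early exit on the first
// differing byte. Both inputs are hashed to a fixed 32 bytes first, because
// subtle.ConstantTimeCompare returns immediately when lengths differ.
func tokenMatches(presented, expected string) bool {
	p := sha256.Sum256([]byte(presented))
	e := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(p[:], e[:]) == 1
}

func main() {
	fmt.Printf("%q\n", authRequest("s3cret"))
	fmt.Println(tokenMatches("s3cret", "s3cret"))
	fmt.Println(tokenMatches("s3cret", "s3creT"))
}
```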
Commands¶
l — lock acquire (single-phase)¶
l
<key>
<timeout> [<lease_ttl>] # both in seconds
ok <token> <lease_ttl> # granted
timeout # acquire timeout fired
error_max_locks # unique-key cap reached
error_max_waiters # per-key waiter cap reached
error_lease_expired # rare; granted slot's lease expired before observation
error_limit_mismatch # this key was already created as a sem with limit != 1
timeout=0 is non-blocking (try-lock).
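A client might build the `l` request and interpret the response along these lines. This is a hedged sketch: `acquireRequest`, `parseAcquire`, `Grant`, and `ErrTimeout` are hypothetical names, the sketch assumes integer seconds are accepted on the wire, and the lease value is kept as text because its exact numeric format is not specified here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Grant is what a successful acquire yields.
type Grant struct {
	Token    string
	LeaseTTL string // kept as text; numeric format is an open assumption
}

// ErrTimeout signals that the acquire timeout fired without a grant.
var ErrTimeout = errors.New("acquire timed out")

// acquireRequest builds an "l" request. timeoutS = 0 is a try-lock.
// A leaseTTLS of 0 or less omits the optional lease field.
func acquireRequest(key string, timeoutS, leaseTTLS int) string {
	arg := fmt.Sprintf("%d", timeoutS)
	if leaseTTLS > 0 {
		arg += fmt.Sprintf(" %d", leaseTTLS)
	}
	return "l\n" + key + "\n" + arg + "\n"
}

// parseAcquire maps one response line to a Grant or an error.
func parseAcquire(line string) (Grant, error) {
	f := strings.Fields(line) // also discards the trailing \r\n
	if len(f) == 0 {
		return Grant{}, errors.New("empty response")
	}
	switch f[0] {
	case "ok":
		if len(f) != 3 {
			return Grant{}, fmt.Errorf("malformed ok line: %q", line)
		}
		return Grant{Token: f[1], LeaseTTL: f[2]}, nil
	case "timeout":
		return Grant{}, ErrTimeout
	default:
		// error_max_locks, error_max_waiters, error_lease_expired, ...
		return Grant{}, fmt.Errorf("acquire failed: %s", f[0])
	}
}

func main() {
	fmt.Printf("%q\n", acquireRequest("jobs/nightly", 0, 30))
	g, err := parseAcquire("ok 0001a3f217b3c4d87f3c1f2b3e9a8d6e 30\n")
	fmt.Println(g.Token, g.LeaseTTL, err)
}
```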
r — lock release¶
n — lock renew¶
n
<key>
<token> [<lease_ttl>]
ok <remaining_seconds>
error # token not held, or lease already expired
If lease_ttl is omitted, the server uses --default-lease-ttl.
e — lock enqueue (two-phase, phase 1)¶
e
<key>
[<lease_ttl>]
acquired <token> <lease_ttl> # capacity available; you hold the lock
queued # waiter registered; call `w` next
error_already_enqueued # this connection already has phase-1 state for this key
error_max_locks
error_max_waiters
error_limit_mismatch
w — lock wait (two-phase, phase 2)¶
w
<key>
<timeout>
ok <token> <lease_ttl> # grant arrived
timeout # wait timeout fired; the waiter is removed from the queue
error_not_enqueued # no matching `e` on this connection
error_lease_expired
Must be called from the same connection that issued the matching
e. The connection is what binds the two phases.
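The two phases can be driven from a client roughly as shown below. This is a sketch under assumptions: `twoPhase`, `scriptedConn`, and `runScript` are hypothetical, integer seconds are assumed on the wire, and a scripted stand-in replaces the real TCP connection so the example runs without a server. With a real server, `conn` would be the single `net.Conn` used for both commands.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// twoPhase sends e and, if the server answers queued, follows with w on the
// same connection. It returns the granted token.
func twoPhase(conn io.ReadWriter, key string, leaseTTLS, timeoutS int) (string, error) {
	r := bufio.NewReader(conn)
	roundTrip := func(cmd string, arg int) ([]string, error) {
		if _, err := fmt.Fprintf(conn, "%s\n%s\n%d\n", cmd, key, arg); err != nil {
			return nil, err
		}
		line, err := r.ReadString('\n')
		if err != nil {
			return nil, err
		}
		f := strings.Fields(line)
		if len(f) == 0 {
			return nil, fmt.Errorf("empty response to %q", cmd)
		}
		return f, nil
	}

	// Phase 1: enqueue. The fast path grants the lock immediately.
	f, err := roundTrip("e", leaseTTLS)
	if err != nil {
		return "", err
	}
	switch {
	case f[0] == "acquired" && len(f) >= 2:
		return f[1], nil
	case f[0] != "queued":
		return "", fmt.Errorf("enqueue failed: %s", f[0])
	}

	// Phase 2: wait for the grant on the same connection.
	f, err = roundTrip("w", timeoutS)
	if err != nil {
		return "", err
	}
	if f[0] == "ok" && len(f) >= 2 {
		return f[1], nil
	}
	return "", fmt.Errorf("wait failed: %s", f[0])
}

// scriptedConn is a test double: it records writes and replays canned replies.
type scriptedConn struct {
	sent  strings.Builder
	reply *strings.Reader
}

func (c *scriptedConn) Write(p []byte) (int, error) { return c.sent.Write(p) }
func (c *scriptedConn) Read(p []byte) (int, error)  { return c.reply.Read(p) }

// runScript runs twoPhase against canned replies and returns what was sent.
func runScript(replies string) (token, sent string, err error) {
	c := &scriptedConn{reply: strings.NewReader(replies)}
	token, err = twoPhase(c, "jobs", 30, 10)
	return token, c.sent.String(), err
}

func main() {
	token, sent, err := runScript("queued\nok abc123 30\n")
	fmt.Printf("token=%s err=%v sent=%q\n", token, err, sent)
}
```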
Semaphores: sl, sr, sn, se, sw¶
Identical to l/r/n/e/w but for semaphores. The differences
are the arg shapes:
| Command | Key line | Arg line |
|---|---|---|
| sl | <key> | <timeout> <limit> [<lease>] |
| se | <key> | <limit> [<lease>] |
| sr | <key> | <token> |
| sn | <key> | <token> [<lease>] |
| sw | <key> | <timeout> |
The same key cannot be used as both a lock and a semaphore;
mismatching limit returns error_limit_mismatch. A semaphore
with limit=1 is wire-equivalent to a lock but the namespaces are
separate (lock: vs sem: prefix internally).
ping — keepalive¶
The two trailing fields are unused. Convention: _.
stats — server snapshot¶
The <json> body has the same shape as the GET /v1/stats response:
{
"connections": 14,
"locks": [{"key":"...","owner_conn_id":42,"lease_expires_in_s":18.5,"waiters":0}],
"semaphores": [...],
"idle_locks": [...],
"idle_semaphores": [...]
}
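A client could decode the snapshot with types like the following. This is a sketch, not the server's own types: `Stats`, `LockInfo`, and `decodeStats` are hypothetical names, the field types are inferred from the sample values, and the entries whose shape is elided above are left as raw JSON rather than guessed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LockInfo mirrors one entry of the "locks" array in the sample.
type LockInfo struct {
	Key             string  `json:"key"`
	OwnerConnID     int64   `json:"owner_conn_id"`
	LeaseExpiresInS float64 `json:"lease_expires_in_s"`
	Waiters         int     `json:"waiters"`
}

// Stats mirrors the top-level snapshot. Arrays whose element shape is not
// shown in the sample are kept undecoded.
type Stats struct {
	Connections    int               `json:"connections"`
	Locks          []LockInfo        `json:"locks"`
	Semaphores     []json.RawMessage `json:"semaphores"`
	IdleLocks      []json.RawMessage `json:"idle_locks"`
	IdleSemaphores []json.RawMessage `json:"idle_semaphores"`
}

// decodeStats parses the JSON payload of a stats response.
func decodeStats(body string) (Stats, error) {
	var s Stats
	err := json.Unmarshal([]byte(body), &s)
	return s, err
}

func main() {
	// Made-up payload for the demo; empty arrays stand in for elided fields.
	body := `{"connections":14,"locks":[{"key":"jobs","owner_conn_id":42,` +
		`"lease_expires_in_s":18.5,"waiters":0}],"semaphores":[],` +
		`"idle_locks":[],"idle_semaphores":[]}`
	s, err := decodeStats(body)
	fmt.Println(s.Connections, len(s.Locks), err)
}
```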
Status table¶
| Status | Meaning |
|---|---|
| ok | Generic success. May carry a token + lease, or stats JSON. |
| acquired | Two-phase fast path: enqueue immediately granted you the slot. |
| queued | Two-phase: waiter registered; call w/sw next. |
| timeout | Acquire/wait timeout fired without a grant. |
| error | Generic error. Token not held, or unknown protocol error. |
| error_auth | Auth handshake failed. |
| error_max_locks | --max-locks cap reached. |
| error_max_waiters | --max-waiters cap reached for this key. |
| error_limit_mismatch | Sem limit doesn't match the existing key's limit. |
| error_not_enqueued | wait without a matching enqueue on this connection. |
| error_already_enqueued | enqueue while this connection already has phase-1 state for this key. |
| error_lease_expired | Granted slot's lease expired before we could observe it. |
| error_draining | Server is in graceful shutdown. |
Token format¶
Every grant returns a 32-char lowercase-hex token:
<16 hex: fence prefix (uint64, big-endian)><16 hex: random salt>
example: 0001a3f217b3c4d87f3c1f2b3e9a8d6e
         └ fence prefix ┘└──── salt ────┘
The fence prefix is a server-monotonic uint64 seeded at
startup from time.Now().UnixNano() and incremented atomically on
every grant, so the prefix strictly increases across all grants
from one server instance, regardless of key. With --fence-state-file,
the prefix is also strictly monotonic across restarts and crashes;
without it, cross-restart monotonicity depends on a non-regressing
wall clock.
Lex-comparing two tokens for the same key reflects the order
their grants were issued, which is the property a downstream
resource needs to use the token as a fencing token: store the most
recent token seen for a key, reject any write whose token compares
less.
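On the downstream resource, the check described above might look like this. It is a minimal in-memory sketch with hypothetical names (`FenceGuard`, `Admit`); a real resource would persist the last-seen token alongside the data it protects. Plain string comparison is valid here only because every token is 32 lowercase-hex characters, so the sketch rejects anything of a different length.

```go
package main

import (
	"fmt"
	"sync"
)

// FenceGuard remembers the most recent token seen per key.
type FenceGuard struct {
	mu   sync.Mutex
	last map[string]string
}

func NewFenceGuard() *FenceGuard {
	return &FenceGuard{last: make(map[string]string)}
}

// Admit reports whether a write carrying token may proceed for key.
// A token that compares less than the last one seen is stale and rejected;
// an equal token (the current holder writing again) is allowed.
func (g *FenceGuard) Admit(key, token string) bool {
	if len(token) != 32 {
		return false // lexicographic order is only meaningful at fixed length
	}
	g.mu.Lock()
	defer g.mu.Unlock()
	if prev, ok := g.last[key]; ok && token < prev {
		return false
	}
	g.last[key] = token
	return true
}

func main() {
	g := NewFenceGuard()
	older := "0001a3f217b3c4d87f3c1f2b3e9a8d6e"
	newer := "0001a3f217b3c4d9aaaaaaaaaaaaaaaa"
	fmt.Println(g.Admit("jobs", older)) // first writer
	fmt.Println(g.Admit("jobs", newer)) // later grant takes over
	fmt.Println(g.Admit("jobs", older)) // stale holder wakes up late
}
```

The guard is keyed per resource on purpose: fences from different keys share one counter but say nothing about each other's resources.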
The salt is 8 bytes from crypto/rand. It preserves ~64 bits
of unguessability so a third party who saw one token cannot trivially
forge another for the same key.
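For inspection or logging, a token can be split back into its two halves. The helpers below (`splitToken`, `joinToken`) are hypothetical and only illustrate the layout described above; they are not how the server mints tokens.

```go
package main

import (
	"fmt"
	"strconv"
)

// splitToken validates a 32-char lowercase-hex token and returns the fence
// prefix as a uint64 plus the salt as its 16 hex characters.
func splitToken(tok string) (fence uint64, salt string, err error) {
	if len(tok) != 32 {
		return 0, "", fmt.Errorf("token must be 32 chars, got %d", len(tok))
	}
	for _, c := range tok {
		if !(c >= '0' && c <= '9' || c >= 'a' && c <= 'f') {
			return 0, "", fmt.Errorf("token is not lowercase hex")
		}
	}
	// The first 16 hex digits are the big-endian uint64 fence prefix.
	fence, err = strconv.ParseUint(tok[:16], 16, 64)
	if err != nil {
		return 0, "", err
	}
	return fence, tok[16:], nil
}

// joinToken is the inverse: zero-padded 16-digit hex prefix, then the salt.
func joinToken(fence uint64, salt string) string {
	return fmt.Sprintf("%016x%s", fence, salt)
}

func main() {
	fence, salt, err := splitToken("0001a3f217b3c4d87f3c1f2b3e9a8d6e")
	fmt.Println(fence, salt, err)
	fmt.Println(joinToken(fence, salt))
}
```

Because the prefix is zero-padded to a fixed width, comparing the hex strings and comparing the decoded uint64 values give the same order.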
Caveats:
- Fences from different keys aren't meaningfully comparable: the global counter increments across all keys, so prefix order reflects when grants happened, not anything about the resource.
- A Limit>1 semaphore issues a distinct fence per grant; fencing orders the grants, not the resource. The classic single-writer fencing pattern doesn't directly apply.
- Cross-restart monotonicity by default depends on the wall clock not regressing across the restart (NTP step, VM snapshot, manual change). For unconditional cross-restart monotonicity even through crashes and clock regressions, run with --fence-state-file=/path: dflockd pre-allocates fence ranges to a checksummed two-slot journal (one fsync per ~1M grants, ~3 ns/op extra in the token-mint path). The next instance reads the persisted ceiling and seeds above it, so the first fence it issues is strictly greater than any fence the prior incarnation could ever have issued. Allocation failures (disk full, EIO) surface as a generic protocol error; HTTP gets 503 fence_persistence.
- Tokens are not cryptographically signed. Treat the auth + TLS layer as the boundary that protects token confidentiality.
Disconnect semantics¶
Closing the TCP connection triggers LockManager.CleanupConnection:
- Pending waiters (l/sl blocking, or e/se not yet followed by w/sw) are cancelled. The disconnected client can never observe a future grant.
- Held tokens are released only when --auto-release-on-disconnect=true (the default). With it set to false, held slots stay held until their lease expires.
Read errors and framing¶
If --read-timeout fires mid-frame, or a single line exceeds
MaxLineBytes, the server writes error and disconnects. The
framing can't be safely resumed after either condition.