Wire protocol
Protocol version: 2. v2 adds a per-frame HMAC envelope plus replay protection; v1 is not accepted on the wire. The canonical definitions live in z4j_core.transport.frames; see the WebSocket protocol reference for the full schema.
Transport
Section titled “Transport”- WebSocket, TLS in production (
wss://), plaintext for local dev (ws://). - One JSON object per WebSocket frame, UTF-8. There is no WS-layer batching; the
event_batchframe carries the application-level batch. - Authentication:
Authorization: Bearer <agent-token>on the WebSocket handshake. - HTTP longpoll fallback for networks that block WebSockets:
POST /api/v1/agent/eventsto upload signed frames andGET /api/v1/agent/commandsto poll for inbound commands. Samez4j_coreframe types, same HMAC envelope.
Frame envelope
Section titled “Frame envelope”Every frame carries the version, an id, and a type. Stateful frames also carry the HMAC envelope (nonce, seq, hmac); handshake frames (hello / hello_ack) are unsigned because the agent and brain are still negotiating which key to use.
{ "v": 2, "type": "event_batch", "id": "<agent-generated, 1..64 chars>", "ts": "2026-04-16T12:34:56.789Z", "nonce": "<urlsafe random>", "seq": 4281, "hmac": "<base64 HMAC over the canonical envelope>", "payload": { "events": [] }}Agent to brain
Section titled “Agent to brain”First frame the agent sends. Declares the agent’s protocol version, framework, engines, schedulers, and host info. The brain validates compatibility before accepting the connection.
event_batch
Section titled “event_batch”The hot path. payload.events is capped at 5000 entries; the agent’s batcher caps itself at 500. Signed in v2 so a stolen bearer token alone cannot forge events.
heartbeat
Section titled “heartbeat”Default cadence is 10 seconds; the brain returns its preferred heartbeat_interval_seconds in hello_ack and the agent honours that.
command_result
Section titled “command_result”Reply to a brain-initiated command. Correlated by command_id.
registry_delta
Section titled “registry_delta”Engine / scheduler registry updates after hello (e.g. a new queue appeared, a worker capability changed). Treated as additive state on the brain.
Brain to agent
Section titled “Brain to agent”hello_ack
Section titled “hello_ack”Carries agent_id, project_id, session_id, plus tuning parameters (heartbeat interval, max frame size).
event_batch_ack
Section titled “event_batch_ack”Round-trip ack so the agent knows which buffered batch it can drop. Includes the original frame id (acked_id) so the agent’s in-flight map can match precisely.
command
Section titled “command”Dispatched in response to a REST call against /api/v1/projects/{slug}/commands/.... The verb set matches the routes there: retry_task, cancel_task, bulk_retry, purge_queue, restart_worker, pool_resize, add_consumer, cancel_consumer, rate_limit. There are no schedule-CRUD commands on the wire — schedules are stored in the brain and the scheduler adapters read them directly.
command_ack
Section titled “command_ack”Brain confirms receipt of a command_result before the agent drops its in-flight record.
Fatal protocol error; the brain will close the socket immediately after.
agent_status
Section titled “agent_status”Brain pushes status changes (e.g. another worker joined or left under the same agent_id).
Sequence numbers and delivery
Section titled “Sequence numbers and delivery”- Agent to brain: the agent’s
seqis monotonic per signed-frame stream. Replay protection rejects out-of-order or repeated nonces. - The brain dedupes accepted events by
(agent_id, seq, event_id)so a retried batch after a brain-side restart is idempotent. - The
event_batch_ackcarriesreceived,accepted,rejectedso the agent can confirm exactly which rows landed. - Brain to agent:
command_idis a UUID. The agent replies with the samecommand_idincommand_result. Commands time out server-side and surface ascommand_timeouterrors (504) on the REST side.
Version negotiation
Section titled “Version negotiation”hello.payload.protocol_version must be "2". A mismatch closes the WebSocket with code 4426 (Upgrade Required). The agent logs and does not reconnect until upgraded.
Forward-compat: agents and brain ignore unknown fields. New command verbs added in a minor are rejected by older agents with ok: false, error: "unsupported_command"; the brain surfaces that as a command failure rather than a protocol error.