Audit webhook forwarding

The brain can optionally forward every audit-log row to an external webhook as it is written. The primary use cases:

SIEM ingest: Splunk HEC, Datadog Logs, Sumo Logic, Elastic. The receiver gets a JSON-per-row stream identical in shape to what the brain stores.
Out-of-band tamper detection: the receiver keeps an append-only copy on a separate trust boundary, so an attacker who compromises the brain’s database cannot also rewrite the receiver’s history.
Compliance evidence: SOC 2 / ISO 27001 auditors often want audit data in a logging stack they already control.

The brain’s primary HMAC-chained audit log remains the source of truth. The forwarder is a best-effort mirror — if the receiver is down, the row is dropped and a metric increments, but the brain’s own write succeeds.

Enabling

# In your env file or systemd unit
Z4J_AUDIT_WEBHOOK_URL=https://siem.internal.example.com/ingest
Z4J_AUDIT_WEBHOOK_HMAC_SECRET=<32+ byte random string>

Restart the brain. On boot you should see:

INFO z4j.brain.domain.audit_forwarder: forwarder started (url=https://siem.internal.example.com/ingest, buffer=1000)

If the URL fails the SSRF pre-flight (loopback, RFC1918, plaintext http on a brain that has not opted into Z4J_NOTIFICATIONS_WEBHOOK_ALLOW_HTTP), a startup WARNING fires and every forwarded row is then dropped at dispatch. Fix the URL and restart.

Settings

Variable	Default	Notes
`Z4J_AUDIT_WEBHOOK_URL`	unset	Receiver URL. Empty / unset disables the forwarder entirely. SecretStr at the Pydantic layer so a path-embedded token does not land in startup logs.
`Z4J_AUDIT_WEBHOOK_HMAC_SECRET`	unset	REQUIRED when the URL is set. At least 32 bytes. The brain refuses to start if the URL is set without an HMAC secret — an unauthenticated mirror is worse than no mirror because downstream parsers may trust it implicitly. Mint with `python -c "import secrets; print(secrets.token_urlsafe(48))"`.
`Z4J_AUDIT_WEBHOOK_TIMEOUT_SECONDS`	`10.0`	Per-row POST timeout, range 1.0..120.0. A slow receiver does NOT block the brain’s audit write path; the forwarder runs in a background drain task.
`Z4J_AUDIT_WEBHOOK_BUFFER_SIZE`	`1000`	In-memory queue size between the audit-write hook and the drain task. Spikes above this drop rows with a WARNING + a swallowed-exception metric bump. Raise on high-volume brains.
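
The constraints in the table can be checked before a deploy. The following is a minimal sketch, not part of z4j: `check_audit_webhook_env` is a hypothetical helper that mirrors only the rules stated above (secret required when the URL is set, at least 32 bytes, timeout within 1.0..120.0). The brain's own validation may be stricter.

```python
def check_audit_webhook_env(env: dict) -> list:
    """Return a list of problems; an empty list means the settings look consistent.

    Hypothetical pre-deploy check, not the brain's own validator.
    """
    problems = []
    url = env.get("Z4J_AUDIT_WEBHOOK_URL", "")
    if not url:
        # Empty / unset URL disables the forwarder, so nothing else applies.
        return problems

    secret = env.get("Z4J_AUDIT_WEBHOOK_HMAC_SECRET", "")
    if len(secret.encode("utf-8")) < 32:
        problems.append("Z4J_AUDIT_WEBHOOK_HMAC_SECRET must be set and at least 32 bytes")

    raw_timeout = env.get("Z4J_AUDIT_WEBHOOK_TIMEOUT_SECONDS", "10.0")
    try:
        timeout = float(raw_timeout)
    except ValueError:
        problems.append("Z4J_AUDIT_WEBHOOK_TIMEOUT_SECONDS is not a number")
    else:
        # NaN fails both comparisons, so it is reported as out of range too.
        if not 1.0 <= timeout <= 120.0:
            problems.append("Z4J_AUDIT_WEBHOOK_TIMEOUT_SECONDS must be within 1.0..120.0")
    return problems
```

Calling it with `dict(os.environ)` in a deploy script would surface a missing secret before the brain refuses to start.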

Wire format

The receiver gets a POST request with the row as canonical JSON:

POST /your/path HTTP/1.1
Host: siem.internal.example.com
Content-Type: application/json
X-Z4J-Audit-Signature: sha256=<hex>
X-Z4J-Audit-Timestamp: 1715515200
X-Z4J-Audit-Schema: 1

{"action":"user.password_changed","api_key_id":null,"event_id":null,"id":"...","metadata":{"key":"val"},"occurred_at":"2026-05-12T12:00:00.000000+00:00","outcome":"allow","prev_row_hmac":"...","project_id":null,"result":"success","row_hmac":"...","source_ip":"192.0.2.10","target_id":"user-1","target_type":"user","user_agent":"z4j-cli/1","user_id":"..."}

Fields are emitted with their keys sorted, so the serialized bytes (and therefore the signature) are reproducible. The body matches the brain’s internal audit row, plus a row_hmac field so a receiver that has cached the brain’s Z4J_SECRET can re-verify the HMAC chain in its own pipeline.
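
To illustrate why sorted keys make the bytes reproducible, here is a small sketch. The compact separators are an assumption inferred from the sample body above; the brain's exact serializer options (for example, how non-ASCII characters are escaped) are not documented here, and `canonical_body` is a hypothetical name.

```python
import json


def canonical_body(row: dict) -> bytes:
    # Assumption: sorted keys, no whitespace between tokens, UTF-8 bytes.
    return json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Two dicts with the same content but different insertion order
# serialize to identical bytes.
a = canonical_body({"user_id": "u-1", "action": "user.password_changed", "api_key_id": None})
b = canonical_body({"api_key_id": None, "action": "user.password_changed", "user_id": "u-1"})

print(a == b)      # True
print(a.decode())  # {"action":"user.password_changed","api_key_id":null,"user_id":"u-1"}
```

A receiver should still verify the signature over the raw bytes it received rather than over a re-serialized copy, since any serializer difference would change the digest.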

Verifying the signature

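Each POST carries two headers that the receiver needs for verification (X-Z4J-Audit-Schema, shown in the wire format above, is not part of the signature input):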

X-Z4J-Audit-Signature: sha256=<hex> — the HMAC digest
X-Z4J-Audit-Timestamp: <unix_seconds> — when the signature was minted

The signature is computed over the bytes <timestamp>.<body>. Folding the timestamp into the HMAC input gives replay-resistance: a captured POST replayed later still has its original signature, but the timestamp is stale so a receiver enforcing a skew window rejects it.
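
For building test vectors against your own receiver, the sender side of this scheme can be sketched as follows. This is not the brain's implementation; `sign` is a hypothetical helper that follows only the rule stated above (HMAC-SHA256 over `<timestamp>.<body>`, hex-encoded, prefixed with `sha256=`).

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(body: bytes, secret: bytes, timestamp: Optional[int] = None) -> dict:
    """Return the two verification headers for a request body (test helper)."""
    ts = str(int(time.time()) if timestamp is None else timestamp)
    # The timestamp is part of the HMAC input, so changing it without the
    # secret invalidates the signature.
    digest_input = ts.encode("utf-8") + b"." + body
    digest = hmac.new(secret, digest_input, hashlib.sha256).hexdigest()
    return {
        "X-Z4J-Audit-Signature": "sha256=" + digest,
        "X-Z4J-Audit-Timestamp": ts,
    }
```

Signing the same body with two different timestamps yields two different signatures, which is what makes a stale capture distinguishable from a fresh delivery.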

Python receiver (with replay-defence dedupe):

import hmac, hashlib, json, time

SKEW_SECONDS = 300  # 5 minute window; tune to your fleet

# Production: replace this in-memory set with a Redis SETNX or a DB
# unique-index insert. Audit row IDs are UUIDs (~10^-37 collision
# probability), so a permanent dedupe table is bounded by your audit
# retention window.
_seen_ids: set[str] = set()

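def verify(body: bytes, headers: dict, secret: bytes) -> bool:
    # HTTP header names are case-insensitive; this lookup assumes `headers`
    # preserves the casing shown here. Normalise first if your framework
    # lower-cases names.
    sig = headers.get("X-Z4J-Audit-Signature", "")
    ts = headers.get("X-Z4J-Audit-Timestamp", "")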
    if not sig or not ts:
        return False
    # Reject stale / future timestamps before the constant-time compare
    # so an attacker cannot use the verify call itself as a clock oracle.
    try:
        ts_int = int(ts)
    except ValueError:
        return False
    if abs(int(time.time()) - ts_int) > SKEW_SECONDS:
        return False
    digest_input = ts.encode("utf-8") + b"." + body
    expected = "sha256=" + hmac.new(secret, digest_input, hashlib.sha256).hexdigest()
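    # Compare as bytes: compare_digest raises TypeError on non-ASCII str
    # input, which a crafted header could otherwise trigger.
    if not hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8")):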
        return False
    # Replay-defence: once the signature verifies, decode the body
    # and reject duplicates on row id. The brain's audit-row id is
    # a UUID, mint-once-per-row. A signed POST replayed inside the
    # 5-minute window has the SAME id; reject it here so the
    # downstream pipeline never inserts the row twice.
    try:
        row_id = json.loads(body).get("id")
    except Exception:
        return False
    if not row_id or row_id in _seen_ids:
        return False
    _seen_ids.add(row_id)
    return True

If verification fails for any reason, reject the request with 401 Unauthorized. Do NOT parse the body before verifying the signature.
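
The in-memory `_seen_ids` set above grows without bound. Because `verify` rejects timestamps outside ±SKEW_SECONDS, a given signed POST can only pass the timestamp check during a window of at most 2 × SKEW_SECONDS, so a single-process receiver could expire ids after that long. The sketch below shows one way to do this; `SeenIds` is a hypothetical helper, and it assumes the brain never re-signs the same row with a fresh timestamp. If that assumption does not hold for your deployment, prefer the permanent unique-index approach mentioned in the code comment.

```python
import time
from collections import OrderedDict
from typing import Optional

SKEW_SECONDS = 300


class SeenIds:
    """Remember row ids for a bounded time (hypothetical helper)."""

    def __init__(self, ttl_seconds: float = 2 * SKEW_SECONDS) -> None:
        self._ttl = ttl_seconds
        # Maps row id -> time first seen; insertion order is time order
        # as long as `now` never goes backwards.
        self._first_seen = OrderedDict()

    def check_and_add(self, row_id: str, now: Optional[float] = None) -> bool:
        """Return True if row_id is new (and record it), False if duplicate."""
        now = time.monotonic() if now is None else now
        # Evict expired entries from the front (oldest first).
        while self._first_seen:
            _, seen_at = next(iter(self._first_seen.items()))
            if now - seen_at <= self._ttl:
                break
            self._first_seen.popitem(last=False)
        if row_id in self._first_seen:
            return False
        self._first_seen[row_id] = now
        return True
```

In `verify`, the `_seen_ids` membership test and `add` would become a single `seen.check_and_add(row_id)` call on a module-level `seen = SeenIds()`.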

One HMAC secret per brain

Set a distinct Z4J_AUDIT_WEBHOOK_HMAC_SECRET per brain instance. The signature covers <timestamp>.<body> only; it does NOT bind the brain identity. If two brains share the same secret, an attacker who captures a signed POST from brain A can replay it against brain B’s receiver (within the skew window) and it will verify. Give each brain replica its own secret, and use a SIEM tag (e.g., a source_brain header your reverse proxy injects) if you need to attribute rows to a specific brain.
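
When several brains post to one shared receiver, distinct secrets also let the receiver work out which brain signed a request. The sketch below is one possible approach, not something z4j ships: `identify_brain` and the `secrets_by_brain` registry are hypothetical, and the timestamp and dedupe checks from the receiver above still need to run separately.

```python
import hashlib
import hmac
from typing import Optional


def identify_brain(body: bytes, ts: str, sig: str, secrets_by_brain: dict) -> Optional[str]:
    """Return the name of the brain whose secret verifies `sig`, or None."""
    digest_input = ts.encode("utf-8") + b"." + body
    sig_bytes = sig.encode("utf-8")
    matched = None
    # Every secret is tried, with no early exit, so the work done does not
    # depend on which entry matched.
    for brain, secret in secrets_by_brain.items():
        expected = "sha256=" + hmac.new(secret, digest_input, hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected.encode("utf-8"), sig_bytes):
            matched = brain
    return matched
```

A request signed with brain A's secret resolves to brain A only, so a capture from brain A cannot be passed off as a row from brain B.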

SSRF protection

Every dispatch runs through the same DNS-pin protection as the generic webhook notification channel:

Scheme must be https:// (or http:// if Z4J_NOTIFICATIONS_WEBHOOK_ALLOW_HTTP=true)
Hostname resolved to one or more IP addresses
Each IP checked against the blocked set (loopback, RFC1918, link-local, cloud metadata, CGNAT, IPv4-mapped IPv6, 6to4, NAT64, benchmark)
The validated IP is pinned for the TCP connect; Host header + TLS SNI extension stay set to the original hostname so vhost routing and TLS certificate validation still work

A configured URL that resolves to a blocked IP is rejected at dispatch time and the row is dropped with a swallowed-exception metric bump under module=audit_forwarder, site=ssrf_or_dns.
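
The blocked-set check can be approximated with the standard `ipaddress` module, for example to pre-validate a receiver address before configuring it. This is a sketch under assumptions: the document names categories, not CIDR ranges, so the ranges below are the commonly published ones for those categories and the brain's actual list may differ (cloud metadata is covered here only through the link-local range that contains 169.254.169.254). `is_blocked` is a hypothetical name.

```python
import ipaddress

# Assumed CIDR ranges for the categories named above.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8", "::1/128",                          # loopback
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",   # RFC1918
        "169.254.0.0/16", "fe80::/10",                     # link-local (incl. 169.254.169.254)
        "100.64.0.0/10",                                   # CGNAT
        "::ffff:0:0/96",                                   # IPv4-mapped IPv6
        "2002::/16",                                       # 6to4
        "64:ff9b::/96",                                    # NAT64
        "198.18.0.0/15",                                   # benchmark
    )
]


def is_blocked(ip: str) -> bool:
    """Return True if `ip` falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr.version == net.version and addr in net
        for net in BLOCKED_NETWORKS
    )


def all_allowed(resolved_ips: list) -> bool:
    """Every resolved address must pass; one blocked address rejects the URL."""
    return not any(is_blocked(ip) for ip in resolved_ips)
```

Checking every resolved address, rather than only the first, matters because a hostname can resolve to a mix of public and internal addresses.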

Backpressure

The forwarder owns an asyncio.Queue of size Z4J_AUDIT_WEBHOOK_BUFFER_SIZE. The audit-write hook calls enqueue(row) which is non-blocking:

Queue has space: row is queued, hook returns True.
Queue is full: row is dropped, hook returns False, audit_forwarder.dropped_count increments, a WARNING is logged (rate-limited to one per 100 drops).

This is by design. The brain’s primary audit-log write must NEVER be slowed down by a misbehaving mirror; the source of truth is in the database. If you see steady-state drops, either raise the buffer size or unblock the receiver.
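
The non-blocking enqueue described above can be sketched with `asyncio.Queue.put_nowait`. This is an illustration, not the brain's code: the class name `ForwarderQueue` is hypothetical, and logging on the 1st, 101st, 201st... drop is one assumed reading of the WARNING rate limit.

```python
import asyncio
import logging

log = logging.getLogger("audit_forwarder_sketch")


class ForwarderQueue:
    """Bounded queue whose enqueue never blocks the caller (sketch)."""

    def __init__(self, maxsize: int = 1000) -> None:
        self._queue = asyncio.Queue(maxsize=maxsize)
        self.dropped_count = 0

    def enqueue(self, row: dict) -> bool:
        """Queue the row if there is space; otherwise drop it and return False."""
        try:
            self._queue.put_nowait(row)
            return True
        except asyncio.QueueFull:
            self.dropped_count += 1
            # Assumed rate limit: one WARNING per 100 drops.
            if self.dropped_count % 100 == 1:
                log.warning("audit forwarder queue full, %d rows dropped so far",
                            self.dropped_count)
            return False

    async def next_row(self) -> dict:
        """Called by the background drain task; waits until a row is available."""
        return await self._queue.get()
```

Because `put_nowait` either succeeds immediately or raises, the audit write path pays a constant cost regardless of how slow the receiver is.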

Observability

The forwarder reports dropped rows through the z4j_swallowed_exceptions_total metric under module=audit_forwarder, with three site values:

site=queue_full — rows dropped at enqueue because the queue was saturated
site=ssrf_or_dns — rows dropped at dispatch because the URL failed the SSRF pre-flight (host changed DNS records, IP now blocked)
site=send_one — rows dropped at dispatch because the receiver returned a non-2xx OR the underlying HTTP call raised

A panel in z4j-notifications.json renders these alongside the notification dispatch counters.

Threat model

The forwarder is an authenticated, append-only mirror. It does not replace the brain’s primary audit log; it complements it. Specifically:

The mirror can lag (the receiver is not consulted on the audit write path).
Individual rows can be dropped (queue saturation, receiver down, DNS change).
An attacker who compromises the brain’s HMAC secret can forge rows to the mirror, so the receiver should treat the brain as one of several sources, not as a trusted oracle.

Operators wanting cryptographic non-repudiation should pair the forwarder with a receiver that re-signs every row under its own secret on receipt, so the chain extends beyond the brain’s trust boundary.
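
One way a receiver might re-sign rows on receipt is to keep its own HMAC chain, where each tag covers the previous tag plus the received body. The sketch below is an assumption about how such a receiver could work, not a z4j feature; `resign`, `verify_chain`, and the all-zero `GENESIS` value are hypothetical.

```python
import hashlib
import hmac

# Assumed starting value for the receiver's chain (fixed 32 bytes).
GENESIS = b"\x00" * 32


def resign(prev_tag: bytes, body: bytes, receiver_secret: bytes) -> bytes:
    """Tag a received row under the receiver's own secret, chained to prev_tag."""
    # prev_tag is always 32 bytes, so the concatenation is unambiguous.
    return hmac.new(receiver_secret, prev_tag + body, hashlib.sha256).digest()


def verify_chain(entries: list, receiver_secret: bytes) -> bool:
    """Check a list of (body, tag) pairs in order; any edit or gap fails."""
    prev = GENESIS
    for body, tag in entries:
        if not hmac.compare_digest(resign(prev, body, receiver_secret), tag):
            return False
        prev = tag
    return True
```

Since each tag depends on the one before it, altering or removing a stored row invalidates every later tag, and the receiver's secret never has to leave the receiver's trust boundary.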

Disabling

Unset Z4J_AUDIT_WEBHOOK_URL and restart. The forwarder is not constructed when no URL is set; no background task starts, no queue is allocated.