Backups

What’s in Postgres

Everything that’s not a secret:

Tasks + events (volume depends on traffic).
Agents (names, token hashes, capabilities).
Users, memberships, sessions (sessions are ephemeral).
Audit log (unlimited retention by default - most important to back up).
Schedules.

What’s in env / secrets

Z4J_SECRET, Z4J_SESSION_SECRET, Z4J_AUDIT_SECRET.
Agent plaintext tokens (only the operator who minted them holds these - the brain stores only hashes).

Backup strategy

Postgres - pg_dump or a managed-service snapshot, at least daily. Use point-in-time recovery if you need an RPO under 1 hour.
Secrets - keep them versioned in your secret manager (Vault, AWS Secrets Manager, GCP Secret Manager).
Agent tokens - re-mintable. If lost, revoke the old agent and mint a new token. No need to back up plaintext.
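
As a rough illustration of the daily pg_dump option, a nightly job could look like the sketch below. DATABASE_URL, BACKUP_DIR, the z4j-*.dump file naming, and the 14-day pruning window are all assumptions made for this example, not z4j settings; adapt them to your environment.

```shell
#!/bin/sh
# Sketch of a nightly logical backup. DATABASE_URL and BACKUP_DIR are
# illustrative names set by the operator, not documented z4j settings.
set -eu

# Build a dated file name such as /var/backups/z4j-2026-01-31.dump.
backup_path() {
  printf '%s/z4j-%s.dump\n' "$1" "$2"
}

run_backup() {
  out="$(backup_path "${BACKUP_DIR:?}" "$(date -u +%Y-%m-%d)")"
  # Custom format is compressed and can be restored with pg_restore.
  pg_dump --format=custom --no-owner --file="$out.partial" "${DATABASE_URL:?}"
  # Rename only after pg_dump succeeds, so a partial file never looks complete.
  mv "$out.partial" "$out"
  # Prune dumps older than 14 days (the window is an assumption).
  find "$BACKUP_DIR" -name 'z4j-*.dump' -mtime +14 -exec rm -f {} +
}

# Pass "run" to take a backup; with no argument the file only defines functions.
if [ "${1:-}" = "run" ]; then
  run_backup
fi
```

Invoked from cron with the run argument, this leaves one dump per day. A connection URI on the command line can show up in process listings, so a .pgpass file may be a better place for the password.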

Restore drill

Restore Postgres from backup.
Point a new brain container at the restored DB.
Verify the audit chain: z4j-brain audit verify.
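Confirm that agents reconnect on their own (their tokens are still valid - token hashes are in the restored DB). Agents minted after the backup was taken are not in the restored DB and need new tokens.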
Spot-check: log in, list tasks, check agent status.

Run this drill at least quarterly in staging.
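
The first and third steps of the drill (restore, then verify the audit chain) can be scripted so the quarterly run is repeatable. The sketch below assumes a custom-format pg_dump file and an empty scratch database. Z4J_DATABASE_URL is a hypothetical variable name used here to point z4j-brain at that database; check how your deployment actually passes the database URL. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the restore and verify steps of the drill.
# Z4J_DATABASE_URL is an assumed name, not a documented z4j setting.
set -eu

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

restore_drill() {
  dump="$1"
  db_url="$2"
  # Load the dump into the empty scratch database; stop on the first error.
  run pg_restore --no-owner --exit-on-error --dbname="$db_url" "$dump"
  # Verify the audit chain against the restored data.
  run env Z4J_DATABASE_URL="$db_url" z4j-brain audit verify
}
```

For example, after sourcing the file, DRY_RUN=1 restore_drill /backups/z4j.dump "$SCRATCH_DB_URL" shows what would run. The remaining steps (starting a brain container, checking agents, spot-checking the UI) stay manual in this sketch.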

Audit log export

For compliance, export the audit log separately:

curl --fail -H "Authorization: Bearer $TOKEN" \
  "https://z4j.example.com/api/v1/audit/export?format=csv&from=2026-01-01" \
  > audit-2026-q1.csv

The CSV includes the row_hmac and prev_row_hmac columns, so external systems can verify the chain independently.
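
One check an external system can run without holding any secret is a linkage check: each row's prev_row_hmac should equal the row_hmac of the row before it. The sketch below does only that. It assumes the export has a header row, lists rows in chain order, and contains no quoted commas inside fields (use a real CSV parser otherwise). It does not recompute the HMACs, since the exact bytes that go into each one are not described here.

```shell
#!/bin/sh
# Linkage-only check of an exported audit CSV. Assumes a header row, rows in
# chain order, and no quoted commas inside fields.
set -eu

check_chain_links() {
  awk -F',' '
    { sub(/\r$/, "") }   # tolerate CRLF line endings
    NR == 1 {
      # Locate both columns by header name.
      for (i = 1; i <= NF; i++) {
        if ($i == "row_hmac") cur = i
        if ($i == "prev_row_hmac") prev = i
      }
      if (!cur || !prev) {
        print "missing hmac columns" > "/dev/stderr"
        rc = 2
        exit
      }
      next
    }
    # The first exported row has no predecessor in the file, so start at NR > 2.
    # Appending "" forces a string comparison rather than a numeric one.
    NR > 2 && ($prev "") != (last "") {
      printf "chain break at line %d\n", NR
      rc = 1
    }
    { last = $cur }
    END { exit rc + 0 }
  ' "$1"
}
```

Calling check_chain_links audit-2026-q1.csv exits 0 when every link matches, 1 when at least one break is reported, and 2 when the expected columns are missing.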

Retention

Tasks / events - tune Z4J_TASK_RETENTION_DAYS (default 90). Older rows are deleted by a daily sweep.
Audit log - never auto-deleted. To prune it manually, run z4j-brain audit export --older-than 1y --delete (it writes a final audit entry before deleting).

Secrets rotation vs. restore

If you rotate Z4J_AUDIT_SECRET, the chain from that point onward uses the new secret, and rows written before the rotation still verify only with the old one. If you restore Postgres from a backup taken before the rotation, you need the old secret, so keep both for as long as rows signed with the old secret exist.