Scaling

v1.0 runs as a single brain process. The potential bottlenecks:

  1. Postgres - plenty of headroom; sizing guidance is in the self-hosting guide.
  2. WebSocket connections - one socket per agent at ~10 KiB of steady-state RAM, so 1000 agents ≈ 10 MiB.
  3. Event persistence - events are batched before being written to Postgres; roughly 10k events/sec on modest hardware.
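The WebSocket figure in the list above is simple arithmetic; a quick sanity check (the per-socket number is the estimate from the list, not a measured constant):

```python
# Back-of-envelope memory estimate for agent WebSocket connections.
KIB = 1024
MIB = 1024 * KIB

def websocket_memory_mib(agents: int, kib_per_socket: int = 10) -> float:
    """Estimated steady-state RAM, in MiB, for `agents` open sockets."""
    return agents * kib_per_socket * KIB / MIB

print(websocket_memory_mib(1000))  # prints 9.765625, i.e. "1000 agents ≈ 10 MiB"
```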

For most deployments one replica is plenty.

Planned for v1.1:

  • N brain replicas behind a load balancer.
  • Sticky session routing on /ws (each agent pins to one replica).
  • Inter-replica fan-out via Postgres LISTEN/NOTIFY for dashboard-side SSE.

Until then, do not run multiple replicas - each would open separate WebSocket channels, and dashboards connected to one replica wouldn’t see events captured by another.
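The sticky routing planned for v1.1 boils down to a stable mapping from agent to replica: the same agent must always reach the same brain. A minimal sketch of the idea, assuming a string agent ID and a static replica list (both hypothetical - this is not the actual v1.1 design):

```python
import hashlib

def pick_replica(agent_id: str, replicas: list[str]) -> str:
    """Pin an agent to one replica: the same agent_id always maps to the
    same replica as long as the replica list is unchanged."""
    digest = hashlib.sha256(agent_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]

replicas = ["brain-0", "brain-1", "brain-2"]
# Every /ws connection for a given agent lands on the same replica:
assert pick_replica("web-42", replicas) == pick_replica("web-42", replicas)
```

Note that plain modulo hashing re-pins most agents whenever the replica list changes; a consistent-hash ring would reduce that churn, at the cost of more code.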

On the Postgres side:

  • Read replicas help dashboard reads but not the hot event-persist path.
  • pg_partman / native partitioning on events(ts) speeds up retention sweeps - dropping a partition is far cheaper than a bulk DELETE.
  • Set statement_timeout to prevent runaway queries.
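The last two bullets can be applied directly in Postgres. A sketch, assuming the brain connects as a role named `brain` and that `events` was created with `PARTITION BY RANGE (ts)` - the role name, timeout value, and partition bounds are illustrative, not recommendations:

```sql
-- Cap runaway queries for the brain's role (illustrative value).
ALTER ROLE brain SET statement_timeout = '30s';

-- With events partitioned by range on ts, retention is a cheap
-- partition drop instead of a bulk DELETE.
CREATE TABLE events_2025_01 PARTITION OF events
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
-- Later, when the retention window passes:
DROP TABLE events_2025_01;
```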

Agents scale with your app: one agent per app process, with no coordination between agents. There is nothing to configure - deploy more app replicas and more agents register.

If you’re running > 500 agents or > 100M events/day, file an issue. We want the feedback.