| title | Event Plane |
|---|
The event plane provides Dynamo with a pub/sub layer for near real-time event exchange between components. It delivers KV cache updates, worker load metrics, and sequence tracking events, enabling features like KV-aware routing and disaggregated serving.
Key use cases:
- KV cache events -- Workers publish cache state so the router can make cache-aware scheduling decisions.
- Worker load metrics -- Workers report utilization so the router can balance load.
- Sequence tracking -- Coordinates active sequences across router replicas for fault-tolerant routing.
The event plane supports two transports:
| | NATS (default) | ZMQ |
|---|---|---|
| External infrastructure | Requires a NATS server | None (peer-to-peer) |
| Setup complexity | Simple -- point at a NATS server | Automatic -- workers bind sockets and register via discovery |
| Best for | Large-scale deployments | Low operational overhead |
Set the DYN_EVENT_PLANE environment variable to choose a transport:
# Use NATS (default -- no need to set explicitly)
export DYN_EVENT_PLANE=nats
# Use ZMQ
export DYN_EVENT_PLANE=zmq
Python components also accept this as a CLI flag:
# SGLang backend
python3 -m dynamo.sglang --event-plane zmq --model Qwen/Qwen3-0.6B
# vLLM backend
python3 -m dynamo.vllm --event-plane zmq --model Qwen/Qwen3-0.6B
| Variable | Description | Default |
|---|---|---|
| DYN_EVENT_PLANE | Transport: nats or zmq | nats |
| NATS_SERVER | NATS server URL (NATS transport only) | nats://localhost:4222 |
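The defaults in the table can be mirrored in a small helper. This is a minimal sketch, not Dynamo's actual configuration code: the names `resolve_event_plane` and `EventPlaneConfig` are hypothetical, and rejecting unknown values with an error is an assumption. Only the variable names and their defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_TRANSPORTS = ("nats", "zmq")


@dataclass(frozen=True)
class EventPlaneConfig:
    """Hypothetical container for the resolved settings."""
    transport: str
    nats_server: Optional[str]  # None when the transport is zmq


def resolve_event_plane(env: Mapping[str, str] = os.environ) -> EventPlaneConfig:
    """Apply the documented defaults to an environment mapping (illustrative only)."""
    transport = env.get("DYN_EVENT_PLANE", "nats")
    if transport not in VALID_TRANSPORTS:
        # Assumption: an unknown transport is treated as a configuration error.
        raise ValueError(
            f"DYN_EVENT_PLANE must be one of {VALID_TRANSPORTS}, got {transport!r}"
        )
    # NATS_SERVER only matters for the NATS transport.
    nats_server = (
        env.get("NATS_SERVER", "nats://localhost:4222") if transport == "nats" else None
    )
    return EventPlaneConfig(transport, nats_server)
```

Calling `resolve_event_plane({})` yields the NATS transport pointed at `nats://localhost:4222`, which matches the behavior described for an unset `DYN_EVENT_PLANE`.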
When using NATS (DYN_EVENT_PLANE=nats or unset):
- Requires a running NATS server. Set NATS_SERVER if it is not on localhost:4222.
- Events are published to NATS subjects scoped by namespace and component.
- Built-in reconnection and message buffering during brief disconnections.
Example setup:
export NATS_SERVER=nats://nats-server:4222
export DYN_EVENT_PLANE=nats
# Start workers -- explicitly enable KV event publishing
python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B \
--kv-events-config '{"publisher":"nats","topic":"kv-events","enable_kv_cache_events":true}'
# Start frontend -- it subscribes to events from NATS automatically
python3 -m dynamo.frontend --router-mode kv
When using ZMQ (DYN_EVENT_PLANE=zmq):
- No external server required. Each worker binds a ZMQ PUB socket and advertises its address through the discovery system.
- Subscribers automatically discover and connect to all active publishers.
- When publishers come and go (e.g., workers scaling up/down), subscribers dynamically adjust their connections.
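The connection bookkeeping described in these bullets amounts to a set difference between the publishers a subscriber is connected to and the publishers that discovery currently reports. The sketch below illustrates that idea in plain Python. It is not Dynamo's implementation: the `Subscriber` class, the `reconcile` function, and the example addresses are hypothetical, and the actual ZMQ connect and disconnect calls are left as comments.

```python
from typing import Iterable, Set, Tuple


def reconcile(connected: Set[str], discovered: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_connect, to_disconnect) for one discovery update."""
    active = set(discovered)
    return active - connected, connected - active


class Subscriber:
    """Hypothetical subscriber that tracks publisher endpoints by address."""

    def __init__(self) -> None:
        self.connected: Set[str] = set()

    def on_discovery_update(self, discovered: Iterable[str]) -> None:
        to_connect, to_disconnect = reconcile(self.connected, discovered)
        for endpoint in sorted(to_disconnect):
            # A real subscriber would disconnect its SUB socket from this endpoint here.
            self.connected.discard(endpoint)
        for endpoint in sorted(to_connect):
            # A real subscriber would connect its SUB socket to this endpoint here.
            self.connected.add(endpoint)


sub = Subscriber()
sub.on_discovery_update(["tcp://10.0.0.1:20080", "tcp://10.0.0.2:20080"])
# Worker 1 scales down and worker 3 scales up.
sub.on_discovery_update(["tcp://10.0.0.2:20080", "tcp://10.0.0.3:20080"])
print(sorted(sub.connected))
```

After the second update the subscriber holds connections only to the two publishers that are still registered.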
Example setup:
export DYN_EVENT_PLANE=zmq
# Start workers -- each binds a ZMQ socket, registers with discovery
python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B \
--kv-events-config '{"publisher":"zmq","endpoint":"tcp://*:20080","enable_kv_cache_events":true}'
# Start frontend -- discovers workers and connects directly
python3 -m dynamo.frontend --router-mode kv
If the router does not need KV events from workers, you can disable the event plane entirely:
python3 -m dynamo.frontend --router-mode kv --no-router-kv-events
With --no-router-kv-events:
- The router falls back to prediction-based cache-aware routing (estimates cache state from routing decisions).
- No NATS server or ZMQ sockets are needed.
- TTL-based expiration and LRU pruning keep predicted state from growing stale.
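To make the combination of TTL expiration and LRU pruning concrete, here is a minimal sketch of a predicted-state table. It assumes the router records each routing decision as "this block is now probably cached on that worker". The `PredictedCache` class, the keying by block hash, and the parameter names are all hypothetical; the router's actual data structures are not described here.

```python
import time
from collections import OrderedDict
from typing import Callable, Hashable, Optional, Tuple


class PredictedCache:
    """Hypothetical table of predicted cache placements with TTL and LRU limits."""

    def __init__(self, ttl_seconds: float, max_entries: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        self.clock = clock
        # Maps block hash -> (worker id, expiry time), ordered from least
        # to most recently used.
        self._entries: "OrderedDict[Hashable, Tuple[str, float]]" = OrderedDict()

    def record(self, block_hash: Hashable, worker_id: str) -> None:
        """Record a routing decision and prune the least recently used entries."""
        self._entries[block_hash] = (worker_id, self.clock() + self.ttl)
        self._entries.move_to_end(block_hash)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # LRU pruning

    def lookup(self, block_hash: Hashable) -> Optional[str]:
        """Return the predicted worker, or None if unknown or expired."""
        entry = self._entries.get(block_hash)
        if entry is None:
            return None
        worker_id, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[block_hash]  # TTL expiration
            return None
        self._entries.move_to_end(block_hash)  # mark as recently used
        return worker_id
```

The injectable `clock` is a design choice for testability: passing a fake clock lets expiration be exercised without sleeping.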
Both transports work out of the box:
# NATS (requires nats-server running)
export NATS_SERVER=nats://localhost:4222
# OR ZMQ (no extra infrastructure)
export DYN_EVENT_PLANE=zmq
The operator can inject DYN_EVENT_PLANE into pods. The same transport options apply. If using NATS, deploy a NATS server in the cluster and set NATS_SERVER accordingly.
- Discovery Plane -- Service discovery and coordination (etcd, Kubernetes)
- Distributed Runtime -- Runtime architecture
- Request Plane -- Request transport configuration
- Fault Tolerance -- Failure handling