Bridle is a runtime control plane for production AI agents. The architecture has three planes (control / data / telemetry) and one identity model that links them.
The control plane (CP) is the central FastAPI service. It owns policy authoring artifacts (bundles in Postgres), gateway registration, audit ingestion, mode-flip operations, and the two reports (shadow and trace).
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/bundles | publish + sign |
| GET | /v1/bundles/{id} | fetch by id |
| GET | /v1/bundles/active | latest for gateway |
| POST | /v1/gateways/register | register + model_list |
| GET | /v1/gateways/{id}/status | active bundle + last-seen |
| POST | /v1/gateways/{id}/heartbeat | gateway pings on activate |
| POST | /v1/audit | batch audit ingest |
| POST | /v1/policies/{id}/mode | shadow ↔ enforce flip |
| GET | /v1/reports/shadow | aggregated would-have-action |
| GET | /v1/reports/trace/{trace_id} | ordered obs/decision/outcome |
| GET | /v1/public_key | bootstrap the signature verifier |

The CP signs bundles with an ed25519 key. Gateways are bootstrapped with the matching public key and verify every bundle they activate.
The data plane has two enforcement surfaces that share one GatewayInterceptor instance:
- LLM gateway: LiteLLM Proxy + BridleLogger(CustomLogger) in callback position 0. The logger's async_pre_call_hook calls the interceptor, which evaluates policy and returns allow / modified dict / block-string / raise.
- Tool surface: the @bridle.tool("issue_refund") decorator wraps any async tool function. Identity (session_id, agent_id, trace_id, ...) flows via contextvars set by session_context(...). The decorator calls the same interceptor before invoking the tool.

Both surfaces share _pending, state_service, audit_ledger,
policy_engine, classifier. This is what makes Bridle "one
session, two enforcement surfaces, one policy engine, one audit
ledger" — the unifying invariant.
The data plane connects to Postgres for audit + session state, and
to the CP via HTTPBundleLoader for signed bundle distribution.
Every decision lands as one append-only row in audit_rows:
tenant_id, agent_id, actor_id, session_id, trace_id, request_id, observation_type, matched_policy_ids, mode_at_evaluation, final_action, would_have_action, final_outcome, cost_at_decision_usd, record_hash, previous_record_hash.
Rows are chained by record_hash for tamper evidence. Queries:
- Shadow report (/v1/reports/shadow): group by policy across a tenant + window, sum cost_at_decision_usd as a v0 proxy for "prevented spend."
- Trace report (/v1/reports/trace/{trace_id}): ordered events for one agent turn. This is the incident-review primitive.

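
To make the tamper-evidence claim concrete, here is a minimal sketch of hash chaining over append-only rows. The hashing scheme is an assumption chosen for illustration (SHA-256 over the previous hash plus a canonical JSON encoding of the row, with an all-zero genesis value); the source does not specify how record_hash is actually computed.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed previous_record_hash for the first row

CHAIN_FIELDS = ("record_hash", "previous_record_hash")


def compute_hash(row, previous_hash):
    """Hash the row's content together with the previous row's hash."""
    body = {k: v for k, v in row.items() if k not in CHAIN_FIELDS}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + payload).encode("utf-8")).hexdigest()


def append_row(ledger, row):
    """Append a row, linking it to the hash of the row before it."""
    previous_hash = ledger[-1]["record_hash"] if ledger else GENESIS
    chained = dict(row, previous_record_hash=previous_hash)
    chained["record_hash"] = compute_hash(row, previous_hash)
    ledger.append(chained)
    return chained


def verify_chain(ledger):
    """Return the index of the first broken row, or None if the chain is intact."""
    previous_hash = GENESIS
    for index, row in enumerate(ledger):
        if row["previous_record_hash"] != previous_hash:
            return index
        if row["record_hash"] != compute_hash(row, previous_hash):
            return index
        previous_hash = row["record_hash"]
    return None
```

Editing any field of an earlier row changes its recomputed hash, so verification fails at that row; rewriting its record_hash to match would instead break the link from the next row.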
Every observation, decision, outcome, and audit row carries the same envelope. The fields that link surfaces together are:
| Field | Purpose |
|---|---|
| tenant_id | Customer-level isolation |
| session_id | Per-product session; joins LLM + tool events for the same agent run |
| trace_id | Per-call trace; can be set by the agent to link a turn across surfaces |
| agent_id | Which agent made the call |
| actor_id | The end-user/service the agent is acting on behalf of |
| request_id | Joins one observation + decision + outcome triple |

session_id is the v0.6 grouping primitive. trace_id was added in
v0.5.1 hardening — set X-Trace-Id on an LLM call and pass the same
value to session_context(trace_id=...) for the tool call, and one
trace report links the whole turn.
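
As a rough illustration of why a shared trace_id is enough to link a turn, the sketch below builds a trace report from a mixed list of LLM and tool events. The `seq` and `kind` fields and the ordering rule are assumptions made for the example; the real report is served by the CP and its row layout may differ.

```python
# Assumed ordering of the three record kinds that share one request_id.
KIND_ORDER = {"observation": 0, "decision": 1, "outcome": 2}


def trace_report(events, trace_id):
    """Return one trace's events ordered by request sequence, then record kind."""
    selected = [e for e in events if e["trace_id"] == trace_id]
    return sorted(selected, key=lambda e: (e["seq"], KIND_ORDER[e["kind"]]))


# Events arrive interleaved and out of order; `seq` is a hypothetical
# per-request counter used only to make the ordering deterministic.
events = [
    {"trace_id": "t-1", "request_id": "r-2", "surface": "tool", "seq": 2, "kind": "decision"},
    {"trace_id": "t-1", "request_id": "r-1", "surface": "llm", "seq": 1, "kind": "outcome"},
    {"trace_id": "t-2", "request_id": "r-9", "surface": "llm", "seq": 3, "kind": "observation"},
    {"trace_id": "t-1", "request_id": "r-1", "surface": "llm", "seq": 1, "kind": "observation"},
    {"trace_id": "t-1", "request_id": "r-2", "surface": "tool", "seq": 2, "kind": "observation"},
    {"trace_id": "t-1", "request_id": "r-1", "surface": "llm", "seq": 1, "kind": "decision"},
]

report = trace_report(events, "t-1")
```

The LLM call and the tool call appear in one ordered list only because both carried the same trace_id; the unrelated trace is filtered out.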
operator control plane
───────── ─────────────
edit policy.yaml
│
bridle policy publish *.yaml
│
│ POST /v1/bundles
▼
(CP validates + signs + persists)
│
│
▼
gateway HTTPBundleLoader polls
│
│ GET /v1/bundles/active
▼
(verify signature with cached public key)
(run bundle_validator)
(check expires_at — refuse if past)
(engine.set_active_bundle(bundle))
│
▼
runtime: next request evaluates against the new bundle
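
The gateway-side half of this flow can be sketched as a small activation routine that keeps the cached bundle whenever any check fails. Everything here is an assumption for illustration: the `BundleLoader` class, the bundle dictionary layout, and the injected `verify_signature` and `validate_bundle` callables are hypothetical, and the demo verifier uses an HMAC stand-in rather than the ed25519 verification the real loader performs.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone


class BundleLoader:
    """Hypothetical loader: activate a bundle only if every check passes."""

    def __init__(self, verify_signature, validate_bundle):
        self._verify = verify_signature
        self._validate = validate_bundle
        self.active = None  # last bundle that passed all checks

    def try_activate(self, bundle, now=None):
        """Return True if activated; on any failure keep the cached bundle."""
        now = now or datetime.now(timezone.utc)
        if bundle is None:  # e.g. the CP was unreachable
            return False
        if not self._verify(bundle["payload"], bundle["signature"]):
            return False
        if not self._validate(bundle["payload"]):
            return False
        if bundle["expires_at"] <= now:  # refuse expired bundles
            return False
        self.active = bundle
        return True


# Demo-only signature scheme (HMAC), standing in for ed25519 verification.
DEMO_KEY = b"demo-only-key"


def demo_sign(payload):
    return hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()


def demo_verify(payload, signature):
    return hmac.compare_digest(demo_sign(payload), signature)


def make_bundle(payload, expires_at, signature=None):
    return {
        "payload": payload,
        "signature": signature if signature is not None else demo_sign(payload),
        "expires_at": expires_at,
    }
```

Returning a boolean instead of raising keeps the polling loop simple: a rejected or missing bundle is a no-op, and the next request still evaluates against whatever bundle was last activated.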
YAML compiles to the existing signed PolicyBundle. No runtime
changes — only an authoring layer. The six supported type:
values map to canonical kebab-case policy IDs the engine already
recognizes.
| Failure | Behavior | Test |
|---|---|---|
| Bundle signature invalid | Loader refuses; cached stays | test_http_bundle_loader.test_loader_rejects_bundle_with_bad_signature |
| Bundle expired | Loader refuses; cached stays | test_failure_modes.test_loader_rejects_expired_bundle_and_keeps_cached |
| CP unreachable | Loader returns None; cached stays | test_failure_modes.test_cp_* |
| Audit shipper unavailable | Re-buffers in memory | test_audit_shipper.* |
| Policy engine raises | Synthetic policy-engine-error decision via worst-severity fail_modes.on_engine_error; raw exception never propagates | test_failure_modes.test_engine_error_* |
| Postgres restart | All five durable tables survive | test_postgres_durability.* |
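
The "policy engine raises" row can be illustrated with a small wrapper that converts any engine exception into a synthetic decision. This is a sketch under stated assumptions: the severity ordering, the action names, the shape of the decision dictionary, and the default of blocking when no fail modes are configured are all hypothetical; only the policy-engine-error ID and the worst-severity rule come from the table above.

```python
# Assumed severity ordering; higher means more restrictive.
SEVERITY = {"allow": 0, "modify": 1, "block": 2}


def evaluate_safely(engine_evaluate, observation, fail_modes):
    """Run the engine; on any exception return a synthetic decision instead.

    `fail_modes` is a hypothetical list of each policy's on_engine_error
    action. The most restrictive one wins, and the raw exception is never
    re-raised to the caller.
    """
    try:
        return engine_evaluate(observation)
    except Exception as exc:
        worst = max(fail_modes, key=lambda action: SEVERITY[action], default="block")
        return {
            "matched_policy_ids": ["policy-engine-error"],
            "final_action": worst,
            "error_type": type(exc).__name__,
        }


def broken_engine(observation):
    raise RuntimeError("engine bug")


decision = evaluate_safely(broken_engine, {"tool": "issue_refund"}, ["allow", "block", "modify"])
```

Recording only the exception type, not the exception itself, is one way to keep the failure visible in the audit trail without leaking internals to the agent.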
All five operational stores are Postgres-backed via asyncpg:
| Table | Holds |
|---|---|
| audit_rows | Every decision |
| sessions | Per-session cost + counters |
| tool_intents | Loop-detector window |
| policy_bundles | Signed bundles + signature blob |
| gateway_registry | Gateway model_list + last-seen + active bundle |

Migrations are flat SQL under bridle/migrations/, mounted into the
Postgres container as init scripts via docker-compose.yml.
The original LiteLLM Path-A spike that picked the architecture lives
at tests/spikes/litellm_enforcement/. It pins
litellm==1.86.0 and re-runs 16 tests against a live LiteLLM Proxy +
mock OpenAI upstream to verify the async_pre_call_hook contract
the rest of the product depends on. Run it before any LiteLLM bump:

    bash tests/spikes/litellm_enforcement/run_spike.sh

See ADR-006 §"What v0.5.1 deliberately does NOT do" and ADR-005 §"What v0.5 deliberately does NOT do":
- No web UI
- No RBAC
- No billing
- No arbitrary policy logic / full DSL
- No additional providers beyond LiteLLM
- No per-rule targeting in the runtime (bundle-level only)
- No auto-rollback / canary on bundle activation (mode-flip endpoint is the rollout mechanism)
These wait for design-partner pain.