Tag: v0.3-control-loop-demo
Commit: c70e3c8 week3-control-plane-loop-green
Date: 2026-05-25
The first runnable version of Bridle that demonstrates the company thesis end-to-end: a central control plane, two enforcement surfaces, signed policy distribution, and an audit-backed shadow report that proves a central mode flip changes runtime behavior without redeploy.
This is a demo release. It is intentionally not pilot-ready yet — see "What must become durable in Week 4" below.
                   +--------------------------+
                   | Bridle Control Plane     |
                   | (FastAPI, in-memory)     |
                   | - signed bundles         |
                   | - gateway registry       |
                   | - audit ingest           |
                   | - shadow report          |
                   | - mode-flip endpoint     |
                   +-----------+--------------+
                               | signed bundles (poll)
                               | audit rows (push, batched)
             +-----------------+---------------------+
             |                                       |
    +--------+------------+                +---------+---------+
    | LiteLLM Proxy +     |                | Tool middleware   |
    | BridleLogger (#0)   |                | @bridle.tool      |
    +---------------------+                +-------------------+
             |                                       |
             v                                       v
     upstream provider                        business logic
Both surfaces share:

- one `GatewayInterceptor`
- one `LocalDemoPolicyEngine` (bundle-driven config)
- one `InMemorySessionStateService`
- one `InMemoryAuditLedger`
- and emit identity envelopes with the same `trace_id` / `session_id` / `agent_id` / `actor_id` / `tenant_id`.

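As a rough illustration of the shared identity envelope, the sketch below attaches one envelope to audit rows from both surfaces. The field names come from the list above; the `IdentityEnvelope` class, the `audit_row` helper, and the row layout are hypothetical and are not taken from the Bridle codebase.

```python
from dataclasses import asdict, dataclass
import uuid


@dataclass(frozen=True)
class IdentityEnvelope:
    """Hypothetical shape; only the field names come from the release note."""
    trace_id: str
    session_id: str
    agent_id: str
    actor_id: str
    tenant_id: str


def audit_row(surface: str, decision: str, envelope: IdentityEnvelope) -> dict:
    """Build one audit row; both surfaces attach the same envelope fields."""
    return {"surface": surface, "decision": decision, **asdict(envelope)}


envelope = IdentityEnvelope(
    trace_id=uuid.uuid4().hex,
    session_id="sess-1",
    agent_id="refund-agent",
    actor_id="user-42",
    tenant_id="tenant-a",
)

# One LLM decision and one tool decision in the same agent turn.
llm_row = audit_row("llm", "allow", envelope)
tool_row = audit_row("tool", "block", envelope)
```

Because the envelope is frozen and shared, rows from the LLM path and the tool path can be joined on `trace_id` without any surface-specific mapping.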
| Suite | Tests | Result |
|---|---|---|
| Contract models | 4 | ✅ |
| Bundle validator | 4 | ✅ |
| Interceptor unit | 4 | ✅ |
| Week 1 budget lifecycle | 2 | ✅ |
| Week 1 e2e (real LiteLLM proxy) | 5 | ✅ |
| Week 2 tool-call acceptance | 9 | ✅ |
| Week 3 CP server | 10 | ✅ |
| Week 3 HTTP bundle loader | 3 | ✅ |
| Week 3 audit shipper | 5 | ✅ |
| Week 3 central loop + trace linkage | 3 | ✅ |
| Spike regression (separate `run_spike.sh`) | 16 | ✅ |
| Total | 65 | 65/65 |
| Metric | Measured | Target |
|---|---|---|
| LLM e2e p95 (Week 1, real LiteLLM) | 10.63 ms | < 25 ms |
| Tool decision p95 (Week 2) | 0.08 ms | < 25 ms |
| CP publish + signature roundtrip | < 5 ms | n/a |
| Loader fetch + verify + swap | ~2 ms | n/a |
Week 1: LLM gateway enforcement with allow / mutate / block / fail-closed and audit row per request. ✅
Week 2: Tool-call enforcement via @bridle.tool decorator + contextvars.
Refund-agent demo proves one session, two enforcement surfaces, one
policy engine, one audit ledger. ✅
Week 3: From the central control plane, flip a policy from shadow to enforce, publish a signed bundle, gateway picks it up, next request changes behavior, audit/report proves it. ✅
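The Week 3 loop can be sketched in a few lines: the control plane signs a bundle, the gateway verifies and swaps it, and the next decision changes. This is a minimal sketch under stated assumptions, not the Bridle implementation. The release note does not say which signature scheme is used, so HMAC-SHA256 with a shared key stands in for it here; the `Gateway` class, the `blocked_tools` field, and the `shadow_block` label are all hypothetical.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-only-key"  # assumption: a shared HMAC key stands in for the CP signing key


def sign_bundle(bundle: dict, key: bytes = SIGNING_KEY) -> dict:
    """Control-plane side: serialize canonically and attach a signature."""
    payload = json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode(), "signature": signature}


def verify_bundle(signed: dict, key: bytes = SIGNING_KEY) -> dict:
    """Gateway side: reject the bundle unless the signature matches."""
    expected = hmac.new(key, signed["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signed["signature"]):
        raise ValueError("bad bundle signature")
    return json.loads(signed["payload"])


class Gateway:
    """Holds the active bundle; a poller would call load() after each fetch."""

    def __init__(self) -> None:
        self.bundle = {"version": 0, "mode": "shadow", "blocked_tools": []}

    def load(self, signed: dict) -> None:
        candidate = verify_bundle(signed)
        if candidate["version"] > self.bundle["version"]:
            self.bundle = candidate  # swap in place, no redeploy

    def decide(self, tool: str) -> str:
        if tool not in self.bundle["blocked_tools"]:
            return "allow"
        # Shadow mode only records what would have happened; enforce mode blocks.
        return "block" if self.bundle["mode"] == "enforce" else "shadow_block"


gateway = Gateway()
gateway.load(sign_bundle({"version": 1, "mode": "shadow", "blocked_tools": ["issue_refund"]}))
before = gateway.decide("issue_refund")

# Central mode flip: same policy, new mode, new signed bundle.
gateway.load(sign_bundle({"version": 2, "mode": "enforce", "blocked_tools": ["issue_refund"]}))
after = gateway.decide("issue_refund")
```

In this sketch the only thing that changes between `before` and `after` is the bundle the gateway holds, which is the property the shadow report is meant to prove.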
One command runs the full Week 3 DoD demo end-to-end (CP + interceptor + poller + audit ship + report + mode flip):

    python -m bridle.demo.control_loop

A Week 2 refund-agent demo also runs standalone:

    python -m bridle.demo.refund_agent

For the LLM gateway path through a real LiteLLM proxy (Week 1 demo), see `tests/test_week1_e2e.py`; the proxy command is documented in that file's header.
These are deliberate v0 trade-offs documented in ADR-001 / ADR-002 / ADR-003. They do not block Week 4.
- LiteLLM 1.86.0 pin. The spike regression suite must pass before any LiteLLM version bump: `tests/spikes/litellm_enforcement/run_spike.sh`.
- Single CustomLogger position #0. Observability loggers register after enforcement; only callback `[0]` receives `async_pre_call_hook`.
- Mutation targets must be in `model_list`. Bundle validator enforces this at publish and mode-flip time.
- Callback module must live next to the LiteLLM config YAML. LiteLLM resolves dotted paths relative to the config file's directory.
- Bundle validator runs against the union of all registered model lists for a tenant at publish time. The gateway-side validator is the binding check.
- `prevented_spend_usd` in the shadow report is a v0 proxy. It uses `cost_at_decision_usd` as the proxy for what the policy would have prevented. Exact per-action calculation is a v1 task.
- Trace propagation requires the agent to set `X-Trace-Id` and pass the same `trace_id` to `session_context`. Framework adapters (OpenAI Agents SDK, LangGraph) will bridge this automatically when written; v0 requires the agent author to do it.
These are the surfaces that must become durable before a pilot:
| Surface | Today | Week 4 target |
|---|---|---|
| Audit ledger | `InMemoryAuditLedger`, lost on gateway restart | Postgres |
| Audit store (CP) | `InMemoryAuditStore`, lost on CP restart | Postgres |
| Session state | `InMemorySessionStateService`, lost on gateway restart | Postgres |
| Policy bundles | `InMemoryBundleStore`, lost on CP restart | Postgres |
| Gateway registry | `InMemoryGatewayRegistry`, lost on CP restart | Postgres |
| CP signing key | `KeyManager.from_env` or generated ephemeral | Persist in CP filesystem or KMS |
| Auth | Static bearer token (`BRIDLE_CP_MASTER_KEY`) | Tracked but not in Week 4 |
| RBAC, web UI, billing | Absent | Not in Week 4 |
Per the Week 4 brief, in order:
- Postgres durability for `audit_rows`, `events`, `sessions`, `policy_bundles`, `gateway_registry`. No ClickHouse, no warehouse.
- Failure-mode tests that demonstrate trust: bad signature, bundle expired, CP unreachable, audit shipper unreachable, policy engine error → fail-by-severity.
- One-command demo: `make demo-control-loop` (or equivalent) spins up CP + LiteLLM proxy + mock + runs the loop + tears down.
- Trace report helper: query `trace_id` → ordered LLM/tool observation/decision/outcome rows. This is the incident-review primitive.
- Design-partner walkthrough at `docs/demo/design_partner_walkthrough.md` in buyer language.
- Pilot-readiness review: Week 4 ADR + updated release note.
Week 4 DoD: a fresh machine can run one command and observe the full loop, with durable audit rows surviving a CP and gateway restart, plus a trace report that links LLM and tool decisions for one agent turn.
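The trace report in the DoD amounts to a single ordered query. The sketch below assumes a hypothetical `audit_rows` schema (the column names `surface`, `kind`, and `detail` are illustrative, not from the Bridle schema) and uses an in-memory SQLite database purely as a stand-in for Postgres so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for Postgres in this sketch
conn.execute(
    """CREATE TABLE audit_rows (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           trace_id TEXT NOT NULL,
           surface TEXT NOT NULL,   -- 'llm' or 'tool'
           kind TEXT NOT NULL,      -- 'observation', 'decision', or 'outcome'
           detail TEXT NOT NULL
       )"""
)
rows = [
    ("t-1", "llm", "observation", "chat completion requested"),
    ("t-1", "llm", "decision", "allow"),
    ("t-2", "llm", "decision", "allow"),  # a different trace, must not appear
    ("t-1", "tool", "observation", "issue_refund called"),
    ("t-1", "tool", "decision", "block"),
    ("t-1", "tool", "outcome", "tool call rejected"),
]
conn.executemany(
    "INSERT INTO audit_rows (trace_id, surface, kind, detail) VALUES (?, ?, ?, ?)", rows
)


def trace_report(conn: sqlite3.Connection, trace_id: str) -> list:
    """Return every row for one trace, in insertion order."""
    cur = conn.execute(
        "SELECT surface, kind, detail FROM audit_rows WHERE trace_id = ? ORDER BY id",
        (trace_id,),
    )
    return cur.fetchall()


report = trace_report(conn, "t-1")
for surface, kind, detail in report:
    print(f"{surface:4} {kind:11} {detail}")
```

Ordering by an insertion-ordered key is the simplest choice for a sketch; a durable version would more likely order by a timestamp or sequence number recorded at the gateway.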
c70e3c8 week3-control-plane-loop-green
8aa4b02 w3-s1: bridle control plane HTTP server + bundle publish/sign
779626b rename: billion-baby/controlplane → Bridle
4398a45 week2-tool-call-enforcement-green
0fcd9e5 week1-litellm-gateway-enforcement-green