Commit 439b5c0

claude, devonartis authored and committed
docs: merge develop → main — rewritten README, demo splash pages, guide fixes
Refs devonartis/agentwrit#31
Generated with Claude Code Harness Agent
Co-Authored-By: Claude <claude@anthropic.com>
2 parents 75c25e6 + 4b296b9 commit 439b5c0

11 files changed

Lines changed: 445 additions & 246 deletions

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ So reviewers can tell the change was actually verified:
7272
- **Never** paste client secrets, admin tokens, or other credentials.
7373
- If you cannot run integration tests (no broker, blocked network), say so **explicitly** in the PR and describe what you did verify. Maintainers may still ask for a re-run or a broker-backed check before merge.
7474

75-
Demo work under [`demo/`](demo/) should follow the same rule: run against a real broker and describe how you tested.
75+
Demo work under [`demo/`](demo/README.md) (MedAssist) or [`demo2/`](demo2/README.md) (Support Tickets) should follow the same rule: run against a real broker and describe how you tested.
7676

7777
## Pull requests
7878

README.md

Lines changed: 95 additions & 232 deletions
Large diffs are not rendered by default.

demo/README.md

Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
1+
<h1 align="center">MedAssist AI — the healthcare walkthrough</h1>
2+
3+
<p align="center">
4+
A working FastAPI app that shows every AgentWrit capability against a live broker —<br>
5+
dynamic agents, per-patient scope isolation, cross-patient denial, delegation, renewal, release, and a tamper-evident audit trail.
6+
</p>
7+
8+
<p align="center">
9+
<a href="#what-it-is">What it is</a> ·
10+
<a href="#why-it-exists">Why it exists</a> ·
11+
<a href="#what-youll-see">What you'll see</a> ·
12+
<a href="#run-it">Run it</a> ·
13+
<a href="#how-it-works">How it works</a> ·
14+
<a href="#where-the-code-lives">Code map</a> ·
15+
<a href="#further-reading">More</a>
16+
</p>
17+
18+
---
19+
20+
## What it is
21+
22+
MedAssist AI is a small clinical-assistant app. You type a patient ID and a plain-language question. An LLM decides which tools to call (records, labs, billing, prescriptions). The app spawns broker-backed agents on demand, each scoped to **one patient and one category of work**, and every step shows up in a live execution trace — scope checks, denials, delegations, renewals, release.
23+
24+
If you've ever wondered *"what does short-lived, task-scoped, per-user credentialing actually look like in a real app?"* — this is that app.
25+
26+
## Why it exists
27+
28+
Reading about ephemeral credentials is one thing. Watching three agents get spawned, one of them get denied mid-request because it asked about the wrong patient, and then seeing the whole chain die when the encounter ends — that's what makes the pattern stick.
29+
30+
We built MedAssist AI because:
31+
32+
- **Beginners need a story.** "Scoped JWTs" is abstract. "The clinical agent can only read Patient 1042's records, and when it tries Patient 2187 the broker says no" is concrete.
33+
- **Reviewers need evidence.** The audit tab shows a hash-chained ledger of every broker event, which is what a security reviewer wants to see before approving production use.
34+
- **Contributors need a reference.** Every SDK feature — `create_agent`, `validate`, `delegate`, `renew`, `release`, `scope_is_subset` — is wired in here, used the way it's meant to be used.
35+
36+
## What you'll see
37+
38+
| Capability | What the demo does |
|-----------|--------------------|
| **Dynamic agent creation** | Agents spawn as the LLM picks tools. No pre-allocated pool. |
| **Per-patient scope isolation** | Each agent's scope contains one patient ID and nothing else. |
| **Cross-patient denial** | Ask about another patient mid-encounter. The scope check fails. The trace shows `scope_denied`. |
| **Delegation with attenuation** | The clinical agent delegates `write:prescriptions:{patient}` to the prescription agent. The broker refuses to widen. |
| **Token lifecycle** | `renew()` issues a fresh token under the same SPIFFE identity. `release()` kills the token immediately. |
| **Audit trail** | A dedicated tab shows every broker event in a hash chain that can't be retroactively altered. |
46+
47+
The trace panel in the UI is the point. Every capability surfaces as a line in the trace so you can read the whole story of one request.
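
The "delegation with attenuation" row can be illustrated with a small sketch. This is not the broker's implementation: the `covers` and `attenuate` helpers, the `*` wildcard rule, and the `read:records:1042` scope string are assumptions made for illustration. Only the `write:prescriptions:{patient}` shape comes from the table above.

```python
def covers(held: str, wanted: str) -> bool:
    """True if one held scope string covers one wanted scope string.

    Assumed rule: scopes are colon-separated, segment counts must match,
    and a held segment of "*" matches any wanted segment.
    """
    held_parts = held.split(":")
    wanted_parts = wanted.split(":")
    if len(held_parts) != len(wanted_parts):
        return False
    return all(h == "*" or h == w for h, w in zip(held_parts, wanted_parts))


def attenuate(parent_scope: list[str], requested: list[str]) -> list[str]:
    """Return the delegated scope, or raise if it would widen the parent's."""
    widened = [s for s in requested if not any(covers(p, s) for p in parent_scope)]
    if widened:
        raise PermissionError(f"delegation would widen scope: {widened}")
    return list(requested)


# Hypothetical clinical agent scoped to patient 1042.
clinical = ["read:records:1042", "write:prescriptions:1042"]
delegated = attenuate(clinical, ["write:prescriptions:1042"])
```

Asking for `write:prescriptions:2187` from the same parent raises `PermissionError` in this sketch, which mirrors the "refuses to widen" behavior described in the table.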
48+
49+
## Run it
50+
51+
### Option A — Docker (recommended)
52+
53+
One command, no Python setup:
54+
55+
```bash
AGENTWRIT_ADMIN_SECRET="your-secret" \
LLM_API_KEY="your-llm-key" \
docker compose up -d broker medassist
```
60+
61+
Open [http://localhost:5000](http://localhost:5000). The demo auto-registers itself with the broker on startup.
62+
63+
You need an OpenAI-compatible LLM endpoint. If you're not using OpenAI, set `LLM_BASE_URL` and `LLM_MODEL` in your shell before `docker compose up` — e.g. a local vLLM or llama.cpp server.
64+
65+
### Option B — From source
66+
67+
For when you want to edit the code:
68+
69+
```bash
# 1. Start the broker
docker compose up -d broker

# 2. Register the demo app (one time; writes client_id/client_secret)
export AGENTWRIT_ADMIN_SECRET="your-admin-secret"
uv run python demo/setup.py

# 3. Configure demo/.env
cp demo/.env.example demo/.env
# then fill in AGENTWRIT_CLIENT_ID, AGENTWRIT_CLIENT_SECRET, LLM_BASE_URL, LLM_API_KEY, LLM_MODEL

# 4. Run it
uv run uvicorn demo.app:app --reload --port 5000
```
84+
85+
### What to try first
86+
87+
1. Pick a patient from the dropdown.
88+
2. Ask something simple: *"What are this patient's recent labs?"* Watch agents spawn, watch each tool check scope, watch the final response render.
89+
3. Ask a crossing question: *"And show me Patient 2187's records too."* Watch the scope check fail. Read the `scope_denied` line in the trace.
90+
4. Open the Audit tab. Every event is there, hash-chained.
91+
92+
## How it works
93+
94+
The demo is built on one rule: **the app never trusts the LLM for security.** The LLM picks tools. The broker decides what credentials exist. The app enforces tool access against those credentials with `scope_is_subset()` before every call.
95+
96+
```
User types a request

FastAPI receives it

LLM chooses a tool (records / labs / billing / prescription)

App asks: "Do I have an agent for this category yet?"
    ↓ no                        ↓ yes
Broker creates one,             Reuse it
scoped to this patient

App checks: scope_is_subset(tool-requires, agent-holds)?
    ↓ yes                       ↓ no
Run the tool                    Emit scope_denied, tell LLM "access denied"

Return result to LLM. Repeat until LLM is done.

App releases every agent. Tokens are dead.
```
116+
117+
Every branch of this flow appears in the execution trace. The trace is the documentation.
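
The flow above can be sketched in a few lines of Python. This is a toy model, not the code in `routes/api.py`: the `Encounter` and `Agent` classes, the trace strings other than `scope_denied`, and the `read:{category}:{patient}` scope shape are all hypothetical, and an exact-match lookup stands in for the SDK's `scope_is_subset()`.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    category: str
    scope: list[str]


@dataclass
class Encounter:
    patient_id: str
    agents: dict[str, Agent] = field(default_factory=dict)
    trace: list[str] = field(default_factory=list)

    def agent_for(self, category: str) -> Agent:
        # Spawn on first use, reuse afterwards (one agent per category).
        if category not in self.agents:
            scope = [f"read:{category}:{self.patient_id}"]
            self.agents[category] = Agent(category, scope)
            self.trace.append(f"agent_created:{category}")
        return self.agents[category]

    def call_tool(self, category: str, patient_id: str) -> str:
        agent = self.agent_for(category)
        required = f"read:{category}:{patient_id}"
        # Stand-in for scope_is_subset(tool-requires, agent-holds).
        if required not in agent.scope:
            self.trace.append("scope_denied")
            return "access denied"
        self.trace.append(f"tool_ok:{category}")
        return f"{category} for {patient_id}"

    def release_all(self) -> None:
        # End of encounter: drop every agent.
        self.agents.clear()
        self.trace.append("released")


enc = Encounter("1042")
first = enc.call_tool("labs", "1042")   # in scope
second = enc.call_tool("labs", "2187")  # wrong patient, denied
enc.release_all()
```

The denial is decided by the app before the tool body runs, so the LLM only ever sees the string "access denied" for the second call.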
118+
119+
For the full walkthrough — sequence diagrams, how delegation flows from the clinical agent to the prescription agent, and what each UI panel shows — read the [Beginner's Guide](BEGINNERS_GUIDE.md). For a scripted live presentation, read the [Presenter's Guide](PRESENTERS_GUIDE.md).
120+
121+
## Where the code lives
122+
123+
| Piece | File |
|-------|------|
| FastAPI entry point | [`app.py`](app.py) |
| Env config (broker + LLM) | [`config.py`](config.py) |
| Main API loop (LLM, agent spawning, trace) | [`routes/api.py`](routes/api.py) |
| Page routes (encounter, audit, operator) | [`routes/pages.py`](routes/pages.py) |
| Tool definitions + scope templates | [`pipeline/tools.py`](pipeline/tools.py) |
| Mock patient and formulary data | [`data/`](data/) |
| Frontend (trace, markdown render) | [`static/app.js`](static/app.js), [`static/style.css`](static/style.css) |
| One-shot app registration helper | [`setup.py`](setup.py) |
133+
134+
Read `routes/api.py` first. That's where the agent-creation-and-scope-check loop lives, and everything else supports it.
135+
136+
## Further reading
137+
138+
| Go here for | Link |
|-------------|------|
| Step-by-step beginner walkthrough with diagrams | [BEGINNERS_GUIDE.md](BEGINNERS_GUIDE.md) |
| Live presentation script (timing, transitions) | [PRESENTERS_GUIDE.md](PRESENTERS_GUIDE.md) |
| SDK concepts (roles, scopes, delegation) | [../docs/concepts.md](../docs/concepts.md) |
| Building real apps with the SDK | [../docs/developer-guide.md](../docs/developer-guide.md) |
| Broker API (source of truth) | [AgentWrit broker docs](https://github.com/devonartis/agentwrit/tree/main/docs) |

demo2/README.md

Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
1+
<h1 align="center">AgentWrit Live — the support-ticket pipeline</h1>
2+
3+
<p align="center">
4+
A zero-trust support desk where three LLM-driven agents — triage, knowledge, response — process customer tickets<br>
5+
under broker-issued credentials that are scoped to one verified customer and die the moment the work ends.
6+
</p>
7+
8+
<p align="center">
9+
<a href="#what-it-is">What it is</a> ·
10+
<a href="#why-it-exists">Why it exists</a> ·
11+
<a href="#what-youll-see">What you'll see</a> ·
12+
<a href="#run-it">Run it</a> ·
13+
<a href="#how-it-works">How it works</a> ·
14+
<a href="#scenarios-to-try">Scenarios</a> ·
15+
<a href="#where-the-code-lives">Code map</a>
16+
</p>
17+
18+
---
19+
20+
## What it is
21+
22+
A Flask app with HTMX and server-sent events. You submit a customer-support ticket in plain English. Three agents run in sequence:
23+
24+
1. **Triage** reads the ticket, extracts who the customer is, classifies priority and category.
25+
2. **Knowledge** searches the internal KB for the policies that apply.
26+
3. **Response** drafts a reply and calls whatever tools it needs to resolve the ticket — pulling balances, writing case notes, issuing refunds.
27+
28+
Every agent holds its own broker-issued JWT, scoped to exactly one customer and the actions that agent legitimately needs. When the agent is done, its token is released and dead. When an LLM asks for something outside scope — another customer's data, a dangerous tool — the scope check blocks it before the call ever runs.
29+
30+
## Why it exists
31+
32+
MedAssist (in [`demo/`](../demo/README.md)) shows what one request looks like end-to-end. This demo shows something different: **a real multi-step pipeline where identity gating and tool-level enforcement both matter.**
33+
34+
Three things are hard to see in a simpler demo:
35+
36+
- **Identity gating.** If triage can't verify the customer, the pipeline halts. No customer-scoped credentials are ever minted for an anonymous request. This is the pattern that prevents "please delete my account" from going through when the system doesn't know who "my" is.
37+
- **Tool-level enforcement beyond data.** The response agent has tools it can pick from (`delete_account`, `send_external_email`) that aren't in its scope. The scope check denies them at the app, before the tool runs. The broker never sees them.
38+
- **Natural expiry.** One scenario deliberately skips `release()`. The credential dies on its own, because TTLs mean it has to.
39+
40+
## What you'll see
41+
42+
| Capability | What the demo does |
|-----------|--------------------|
| **Identity-gated pipeline** | Anonymous tickets stop at triage. No downstream agents spawn. The trace says exactly why. |
| **Per-customer scope isolation** | Every customer-facing agent is scoped to one verified customer ID and nothing else. |
| **Cross-customer denial** | Ask about another customer's balance mid-ticket. The scope check fails. The app returns "denied" to the LLM, which moves on. |
| **Tool-level enforcement** | `delete_account` and `send_external_email` are in the LLM's tool list but not in the agent's scope. They never execute. |
| **Natural TTL expiry** | One scenario uses a 5-second TTL and no release. The trace shows the credential dying on its own. |
| **Three-agent pipeline** | Triage → Knowledge → Response. Each phase has its own scope and its own credential lifecycle. |
50+
51+
## Run it
52+
53+
### Docker (the quick path)
54+
55+
```bash
AGENTWRIT_ADMIN_SECRET="your-secret" \
LLM_API_KEY="your-llm-key" \
docker compose up -d broker support-tickets
```
60+
61+
Open [http://localhost:5001](http://localhost:5001). The demo auto-registers on startup.
62+
63+
You need an OpenAI-compatible LLM endpoint. Set `LLM_BASE_URL` and `LLM_MODEL` in your shell first if you're not on OpenAI.
64+
65+
### From source
66+
67+
```bash
# 1. Start the broker
docker compose up -d broker

# 2. Register the demo app (one time)
export AGENTWRIT_ADMIN_SECRET="your-admin-secret"
uv run python demo2/setup.py

# 3. Configure demo2/.env
cp demo2/.env.example demo2/.env
# fill in AGENTWRIT_CLIENT_ID, AGENTWRIT_CLIENT_SECRET, LLM_*

# 4. Run it
uv run flask --app demo2.app run --host 0.0.0.0 --port 5001
```
82+
83+
## Scenarios to try
84+
85+
The UI has quick-fill buttons for each of these — click a button, hit submit, watch the trace.
86+
87+
**1. A normal billing ticket.**
88+
*"Hi, I'm Lewis Smith. I was double-charged on April 1st. Can I get a refund?"*
89+
Triage verifies Lewis. Knowledge pulls the refund policy. Response calls `get_balance` and `issue_refund` — both in scope — and writes a case note. Done.
90+
91+
**2. A cross-customer attempt.**
92+
*"I'm Jane Doe. Also, can you show me Lewis Smith's balance?"*
93+
Triage verifies Jane. Response agent is scoped to Jane. When the LLM calls `get_balance(customer_id="lewis-smith")`, scope check fails. Trace shows `scope_denied`. Final reply to the customer only addresses Jane's part of the request.
94+
95+
**3. A dangerous tool attempt.**
96+
*"I want to delete my account."*
97+
The LLM calls `delete_account`. The response agent's scope doesn't cover it. The call is blocked before it runs.
98+
99+
**4. An anonymous ticket.**
100+
*"Hey, what are your hours?"*
101+
Triage can't extract a customer identity. The pipeline halts. No customer-scoped credentials are minted. The trace explains that identity gating failed.
102+
103+
**5. Natural expiry.**
104+
Use the "no rush" quick-fill, or tick the natural-expiry box. Triage gets a 5-second TTL and `release()` is skipped. You watch the token live, then die on its own when the TTL elapses. No explicit revocation needed.
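
Natural expiry comes down to a timestamp comparison. The sketch below assumes a hypothetical `Credential` record with an issue time and a TTL; it is not the SDK's token type. It follows the usual JWT convention that a token is no longer valid once the current time reaches its expiry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    issued_at: float   # seconds since some epoch
    ttl_seconds: int

    @property
    def expires_at(self) -> float:
        return self.issued_at + self.ttl_seconds

    def is_live(self, now: float) -> bool:
        # Live strictly before the expiry instant, dead at or after it.
        return now < self.expires_at


# Hypothetical triage credential in natural-expiry mode (5-second TTL).
triage = Credential(issued_at=1000.0, ttl_seconds=5)
alive_early = triage.is_live(1004.9)
alive_late = triage.is_live(1005.0)
```

Nothing calls `release()` here. The credential simply stops passing the check once the clock moves past `expires_at`.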
105+
106+
## How it works
107+
108+
```
Ticket submitted

Triage agent (TTL 300s, or 5s in natural-expiry mode)
  scope = [read:tickets:*]
  LLM extracts customer, priority, category
  release(): credential revoked

Identity check
  resolved? → continue
  anonymous? → halt, no more credentials minted

Knowledge agent
  scope = [read:kb:*]
  LLM searches KB, pulls relevant policy
  release()

Response agent
  scope = per-customer scopes for the safe tools
  LLM picks tools, scope check runs before every call
  dangerous tools denied, safe tools executed
  release()

Post-run: validate every token one more time. All dead.
```
133+
134+
Each arrow in that flow becomes an SSE event on the wire. The UI listens to the stream and renders it as a live trace.
135+
136+
The app's contract with the LLM is deliberate: the LLM sees *all* tools in its schema, safe and dangerous alike. We don't hide the dangerous ones. We let the LLM try — and the scope check is what stops it. That's the point of zero-trust enforcement: you don't rely on the LLM behaving. You rely on the credential.
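
One way to picture that contract: every tool is registered and visible, and a lookup against the agent's scope decides whether it runs. The scope strings and the `TOOL_SCOPES` and `dispatch` names below are hypothetical, and an exact-match set lookup stands in for the real scope check. Only the four tool names come from this README.

```python
# Every tool the LLM can see, mapped to the scope it would require.
# The scope templates are invented for this sketch.
TOOL_SCOPES = {
    "get_balance": "read:billing:{customer_id}",
    "issue_refund": "write:refunds:{customer_id}",
    "delete_account": "delete:account:{customer_id}",
    "send_external_email": "send:email:external",
}


def dispatch(tool: str, customer_id: str, agent_scope: set[str]) -> str:
    """Run a tool only if the agent's scope contains what the tool requires."""
    required = TOOL_SCOPES[tool].format(customer_id=customer_id)
    if required not in agent_scope:
        return "denied"  # the tool body never executes
    return f"ran {tool}"


# Hypothetical response agent scoped to one verified customer.
jane_scope = {"read:billing:jane-doe", "write:refunds:jane-doe"}

own_balance = dispatch("get_balance", "jane-doe", jane_scope)
other_balance = dispatch("get_balance", "lewis-smith", jane_scope)
dangerous = dispatch("delete_account", "jane-doe", jane_scope)
```

The dangerous tools stay in the registry on purpose. What blocks them is the missing scope, not the LLM's restraint.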
137+
138+
## Where the code lives
139+
140+
| Piece | File |
|-------|------|
| Flask entry point | [`app.py`](app.py) |
| Env config + scope ceiling | [`config.py`](config.py) |
| Three-agent pipeline + SSE | [`pipeline.py`](pipeline.py) |
| Tools + scope templates | [`tools.py`](tools.py) |
| Customers, tickets, KB articles | [`data.py`](data.py) |
| Quick-fill scenarios | [`data.py`](data.py) (bottom) |
| HTMX frontend | [`templates/index.html`](templates/index.html), [`static/style.css`](static/style.css) |
| One-shot app registration | [`setup.py`](setup.py) |
150+
151+
Read `pipeline.py` first. The three-phase flow — triage, knowledge, response — is one top-to-bottom function, and every SSE event you see in the UI is a `yield` statement in that file.
152+
153+
## Further reading
154+
155+
| Go here for | Link |
|-------------|------|
| The other demo (clinical / per-patient, single-request) | [`../demo/README.md`](../demo/README.md) |
| SDK concepts (roles, scopes, delegation) | [`../docs/concepts.md`](../docs/concepts.md) |
| Real-world patterns for your own apps | [`../docs/developer-guide.md`](../docs/developer-guide.md) |
| Broker API | [AgentWrit broker docs](https://github.com/devonartis/agentwrit/tree/main/docs) |

docs/api-reference.md

Lines changed: 3 additions & 1 deletion
@@ -129,11 +129,13 @@ An ephemeral agent created by `AgentWritApp.create_agent()`. Holds the agent JWT
129129
| `agent_id` | `str` | SPIFFE URI (e.g., `spiffe://agentwrit.local/agent/orch/task/instance`) |
130130
| `access_token` | `str` | JWT string (EdDSA-signed) |
131131
| `expires_in` | `int` | Token TTL in seconds (snapshot from creation or last renewal) |
132-
| `scope` | `list[str]` | Granted scope list |
132+
| `scope` | `list[str]` | Scope the agent *requested* at creation. See note below. |
133133
| `orch_id` | `str` | Orchestrator identifier |
134134
| `task_id` | `str` | Task identifier |
135135
| `bearer_header` | `dict[str, str]` | `{"Authorization": "Bearer <token>"}` for HTTP requests |
136136

137+
> **`agent.scope` is the requested scope, not the broker's signed answer.** The broker only accepts a registration whose scope is covered by the launch token, so in practice the two match. But when making a security-critical decision in a downstream service, don't trust a client-side field — call `validate(app.broker_url, agent.access_token)` and read `result.claims.scope`.
138+
137139
### renew()
138140

139141
```python
