Commit 9f961a0

feat(0.10.1): G1-G4 + C1 + H1-H4 follow-up sweep (#94)
* G1 `agents add` diversity guard - ProfileStore.diversity_warnings warns when the reviewer / architect shares a provider family with the coder, plus a new PROVIDER_FAMILIES table and provider_family() helper. The CLI prints yellow warnings (non-fatal); --json output now includes diversity_warnings.
* G2 capability filter - ProfileStore.filter_by_capability + a new specsmith agents list --capability flag.
* G3 phase next auto-routes - advancing the AEE phase now pins phase:active to the new phase's preferred profile (and seeds the canonical phase:<key> entry on first advance).
* G4 TraceVault seal on /agent - in-chat /agent <id> writes a decision seal chained into .specsmith/trace.jsonl so every per-turn profile pin is auditable. Best-effort: read-only fs etc. never breaks the chat loop.
* C1 token threading - each provider driver now returns (text, _UsageDelta) and surfaces real token counts: Ollama prompt_eval_count + eval_count, Anthropic final_message.usage, OpenAI stream_options.include_usage, Gemini usage_metadata. A 4-chars/token heuristic fills in when the SDK omits usage. Counts flow through ChatRunResult.tokens_in/out/cost_usd into AgentState.credit() and the per-profile by_profile bucket.
* H1 docs/site/agents.md - preset -> route -> per-session -> BYOE walkthrough.
* H3 README elevator pitch - multi-agent + BYOE up top.
* H4 docs/site/quickstart.md - reproduction script + GIF placeholder.
1 parent c9a2ee4 commit 9f961a0

9 files changed

Lines changed: 683 additions & 49 deletions

README.md

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,25 @@ specsmith treats belief systems like code: codable, testable, and deployable. It
1616
epistemically-governed projects, stress-tests requirements as BeliefArtifacts, runs
1717
cryptographically-sealed trace vaults, and orchestrates AI agents under formal AEE governance.
1818

19+
**0.10.0 — Multi-Agent + BYOE.** A `/plan` goes to the architect, `/fix`
20+
goes to the coder, `/review` goes to a reviewer that runs on a different
21+
model family. Each *profile* is a `(provider, model, endpoint?, fallback_chain)`
22+
bundle stored in `~/.specsmith/agents.json`; an *activity routing table*
23+
maps slash commands and AEE phases to profiles; **BYOE endpoints**
24+
(`~/.specsmith/endpoints.json`) let you point a profile at any
25+
OpenAI-v1-compatible backend you self-host (vLLM, llama.cpp `server`,
26+
LM Studio, TGI, ...). Cross-family **diversity guard**, capability
27+
filtering, transient-failure fallback chains, and TraceVault decision
28+
seals on every `/agent` pin are wired in by default. See
29+
[`docs/site/agents.md`](docs/site/agents.md) for the five-minute walkthrough.
30+
31+
```bash
32+
specsmith agents preset apply default # frontier coder + cross-family reviewer
33+
specsmith endpoints add --id home-vllm \
34+
--base-url http://10.0.0.4:8000/v1 --auth bearer-keyring
35+
specsmith run --agent opus-reviewer # one-shot per-session pin
36+
```
37+
1938
It also co-installs the standalone `epistemic` Python library for direct use in any project:
2039

2140
```python

docs/site/agents.md

Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
1+
# Multi-Agent Profiles & Activity Routing
2+
3+
`specsmith agents` (REQ-146) lets you bind activities — a slash command, an
4+
AEE phase, an MCP tool category — to a named **profile**: a
5+
`(provider, model, endpoint_id?, prompt_prefix, capabilities, fallback_chain)`
6+
bundle. The runner consults the routing table on every turn so a `/plan`
7+
goes to the architect, `/fix` goes to the coder, and `/review` goes to a
8+
reviewer that runs on a *different* model family.
9+
10+
This page walks you from **install → preset → custom profile → per-session
11+
override → BYOE endpoint** in five minutes.
12+
13+
---
14+
15+
## 1. Install a preset
16+
17+
Profiles are stored in `~/.specsmith/agents.json`. The fastest way to seed
18+
the file is to apply one of the four built-in presets:
19+
20+
```bash
21+
specsmith agents preset list
22+
specsmith agents preset apply default # frontier + local fallback (recommended)
23+
specsmith agents preset apply local-only # 100% Ollama
24+
specsmith agents preset apply frontier-only # Claude Opus everywhere
25+
specsmith agents preset apply cost-conscious # Haiku coder, Sonnet architect
26+
```
27+
28+
After applying:
29+
30+
```bash
31+
specsmith agents list
32+
* coder role=coder anthropic/claude-sonnet-4-5
33+
fallback: mistral/codestral-latest → ollama/qwen2.5-coder:32b
34+
architect role=architect anthropic/claude-opus-4
35+
fallback: openai/gpt-5 → ollama/qwen2.5:32b
36+
reviewer role=reviewer openai/gpt-5-codex ← different family!
37+
38+
```
39+
40+
The `*` marks the **default profile**, used when no route matches.
41+
42+
---
43+
44+
## 2. Inspect & customise the routing table
45+
46+
```bash
47+
specsmith agents route show
48+
* chat → coder
49+
/plan → architect
50+
/fix → coder
51+
/review → reviewer
52+
phase:requirements → researcher
53+
54+
```
55+
56+
Re-bind any activity:
57+
58+
```bash
59+
specsmith agents route set /review opus-reviewer
60+
specsmith agents route clear /audit
61+
```
62+
63+
The `phase:<key>` routes are auto-maintained: `specsmith phase next` (G3)
64+
also pins a `phase:active` route to the new phase's preferred profile so
65+
the runner can flip the whole session by listening for one activity.
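
To make the lookup and the `phase:active` pin concrete, here is a minimal Python sketch. It assumes the routing table is a plain `activity -> profile_id` mapping; `resolve_profile` and `advance_phase` are hypothetical helper names for illustration, not functions confirmed by the specsmith source.

```python
from typing import Dict


def resolve_profile(activity: str, routes: Dict[str, str], default_profile: str) -> str:
    """Return the profile bound to an activity, or the default when no route matches."""
    return routes.get(activity, default_profile)


def advance_phase(routes: Dict[str, str], new_phase: str, preferred_profile: str) -> Dict[str, str]:
    """Return a copy of the table with phase:active pinned for the new phase.

    Assumption: the canonical phase:<key> entry is seeded only when it is
    missing, so an existing user binding for that phase is left alone.
    """
    updated = dict(routes)
    key = f"phase:{new_phase}"
    updated.setdefault(key, preferred_profile)
    updated["phase:active"] = updated[key]
    return updated


routes = {"/plan": "architect", "/fix": "coder", "/review": "reviewer"}
print(resolve_profile("/plan", routes, default_profile="coder"))
print(advance_phase(routes, "requirements", "researcher")["phase:active"])
```

Returning a new mapping instead of mutating in place keeps the sketch easy to test; the real store presumably persists the change to `~/.specsmith/agents.json`.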
66+
67+
---
68+
69+
## 3. Add your own profile
70+
71+
```bash
72+
specsmith agents add \
73+
--id sonnet-coder \
74+
--role coder \
75+
--provider anthropic \
76+
--model claude-sonnet-4-5 \
77+
--capability code \
78+
--capability function-calling \
79+
--fallback ollama/qwen2.5-coder:32b
80+
```
81+
82+
If your new coder shares a provider family with the existing reviewer,
83+
the **diversity guard** (G1) prints a warning so the cross-check the
84+
reviewer is supposed to provide doesn't degenerate:
85+
86+
```
87+
✓ saved profile sonnet-coder
88+
⚠ reviewer (reviewer, anthropic/claude-opus-4) shares the 'anthropic'
89+
family with sonnet-coder (coder, anthropic/claude-sonnet-4-5);
90+
diversity is recommended so the reviewer can catch the coder's blind spots.
91+
```
92+
93+
The warning is non-fatal — the profile still saves — but you should
94+
either pick a reviewer in a different family or accept the trade-off
95+
deliberately.
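
The check behind that warning can be sketched in a few lines of Python. The names `PROVIDER_FAMILIES`, `provider_family`, and `diversity_warnings` come from the commit message, but the table contents, the `Profile` fields, and the message wording below are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical table: the real PROVIDER_FAMILIES mapping may group providers differently.
PROVIDER_FAMILIES: Dict[str, str] = {
    "anthropic": "anthropic",
    "openai": "openai",
    "ollama": "ollama",
}


def provider_family(provider: str) -> str:
    """Map a provider name to its family; unknown providers form their own family."""
    name = provider.lower()
    return PROVIDER_FAMILIES.get(name, name)


@dataclass(frozen=True)
class Profile:
    id: str
    role: str
    provider: str
    model: str


def diversity_warnings(profiles: List[Profile]) -> List[str]:
    """Warn for each reviewer/architect that shares a provider family with a coder."""
    coders = [p for p in profiles if p.role == "coder"]
    checkers = [p for p in profiles if p.role in ("reviewer", "architect")]
    warnings: List[str] = []
    for checker in checkers:
        for coder in coders:
            family = provider_family(checker.provider)
            if family == provider_family(coder.provider):
                warnings.append(
                    f"{checker.id} ({checker.role}, {checker.provider}/{checker.model}) "
                    f"shares the '{family}' family with "
                    f"{coder.id} ({coder.role}, {coder.provider}/{coder.model})"
                )
    return warnings
```

Because the function only returns strings, the caller decides how to surface them, which fits the non-fatal behaviour described above.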
96+
97+
### Filter by capability
98+
99+
```bash
100+
specsmith agents list --capability code-review
101+
specsmith agents list --capability mcp --json
102+
```
103+
104+
`--capability` is the easiest way to find every profile that advertises
105+
a given strength so the right `route set` command writes itself.
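
As a rough illustration of what the flag does, the sketch below filters an in-memory profile list by one capability. `filter_by_capability` is named in the commit message, but this signature, the `Profile` shape, and the case-insensitive match are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Profile:
    id: str
    capabilities: List[str] = field(default_factory=list)


def filter_by_capability(profiles: List[Profile], capability: str) -> List[Profile]:
    """Keep profiles that advertise the capability (case-insensitive, assumed)."""
    wanted = capability.strip().lower()
    return [p for p in profiles if wanted in {c.lower() for c in p.capabilities}]


profiles = [
    Profile("sonnet-coder", ["code", "function-calling"]),
    Profile("reviewer", ["code-review"]),
]
print([p.id for p in filter_by_capability(profiles, "code-review")])
```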
106+
107+
---
108+
109+
## 4. Per-session overrides
110+
111+
Three knobs override the routing table for one session:
112+
113+
```bash
114+
specsmith run --agent opus-reviewer # pin a profile
115+
specsmith chat --agent haiku-coder # one-shot
116+
specsmith run --endpoint home-vllm # pin a BYOE endpoint
117+
```
118+
119+
Inside a running session, the slash command `/agent <id>` flips the
120+
profile mid-session:
121+
122+
```
123+
nexus> /agent opus-reviewer
124+
ℹ profile = opus-reviewer
125+
```
126+
127+
Pinning a profile via `/agent` writes a **TraceVault decision seal**
128+
(G4) into `.specsmith/trace.jsonl`, so every "I switched to model X for
129+
this turn" choice is cryptographically chained into the audit trail.
130+
You can confirm with `specsmith trace log --type decision`.
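
What "cryptographically chained" means here can be illustrated with a small hash-chain sketch over a JSONL file. This is not the TraceVault format: the record fields (`type`, `detail`, `prev`, `hash`), the SHA-256 choice, and the function names are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path

GENESIS = "0" * 64  # assumed placeholder for the first record's predecessor


def _digest(body: dict) -> str:
    """Hash a record body using a canonical JSON encoding."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_seal(path: Path, seal_type: str, detail: dict) -> dict:
    """Append a record whose hash covers its content and the previous record's hash."""
    prev = GENESIS
    if path.exists():
        lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
        if lines:
            prev = json.loads(lines[-1])["hash"]
    body = {"type": seal_type, "detail": detail, "prev": prev}
    record = {**body, "hash": _digest(body)}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def verify_chain(path: Path) -> bool:
    """Recompute every hash and check each record points at its predecessor."""
    prev = GENESIS
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev"] != prev or _digest(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True
```

Editing any earlier line changes its digest, so every later `prev` link stops matching; that property is what makes a per-turn profile pin auditable after the fact.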
131+
132+
### Token accounting (C1)
133+
134+
The runner now reports real `tokens_in` / `tokens_out` for every turn
135+
on every provider that exposes them (Ollama via `prompt_eval_count` +
136+
`eval_count`, Anthropic via `final_message.usage`, OpenAI via
137+
`stream_options.include_usage`, Gemini via `usage_metadata`). When the
138+
SDK omits usage, a 4-chars/token fallback gives the TokenMeter chip a
139+
non-zero value to show. Per-profile totals show up in
140+
`AgentState.by_profile` and the VS Code TokenMeter splits accordingly.
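
The fallback is simple enough to sketch. The snippet below prefers provider-reported counts and otherwise estimates from character length at 4 characters per token. `UsageDelta`, `estimate_tokens`, and `usage_or_estimate` are hypothetical names, and rounding up is an assumption; the source only states the 4-chars/token ratio.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageDelta:
    """Hypothetical stand-in for the driver's per-turn usage record."""
    tokens_in: int
    tokens_out: int
    estimated: bool  # True when the heuristic supplied the counts


def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Approximate a token count from character length, rounding up."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def usage_or_estimate(
    prompt: str,
    reply: str,
    reported_in: Optional[int] = None,
    reported_out: Optional[int] = None,
) -> UsageDelta:
    """Prefer provider-reported counts; fall back to the heuristic otherwise."""
    if reported_in is not None and reported_out is not None:
        return UsageDelta(reported_in, reported_out, estimated=False)
    return UsageDelta(estimate_tokens(prompt), estimate_tokens(reply), estimated=True)
```

Carrying an `estimated` flag is a design choice of this sketch: it would let a UI distinguish measured counts from approximations.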
141+
142+
---
143+
144+
## 5. Bring-Your-Own-Endpoint (BYOE)
145+
146+
A **profile** can bind to a registered OpenAI-v1-compatible endpoint
147+
instead of a built-in provider:
148+
149+
```bash
150+
# Register the endpoint once
151+
specsmith endpoints add \
152+
--id home-vllm \
153+
--base-url http://10.0.0.4:8000/v1 \
154+
--default-model qwen2.5-coder \
155+
--auth bearer-keyring # token prompted, stored in OS keyring
156+
157+
# Bind a profile to it
158+
specsmith agents add \
159+
--id local-coder \
160+
--role coder \
161+
--provider openai-compat \
162+
--endpoint home-vllm \
163+
--fallback ollama/qwen2.5-coder:7b
164+
165+
specsmith agents route set /code local-coder
166+
```
167+
168+
The runner now routes `/code` through `home-vllm`. If the box is
169+
unreachable, the fallback chain walks `ollama/qwen2.5-coder:7b` next
170+
(see `tests/test_fallback_chain.py` for the full retry policy: 408,
171+
429, and 5xx fall through to the next target; any other 4xx surfaces immediately).
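
The retry policy can be sketched as follows. `run_with_fallback` and `parse_target` are names exported by `specsmith.agent.fallback`, but the signatures, the `ProviderError` class, and the treatment of connection errors below are assumptions rather than the project's actual implementation.

```python
from typing import Callable, List, Optional, Tuple


class ProviderError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message or f"HTTP {status}")
        self.status = status


def is_transient(status: int) -> bool:
    """408, 429, and any 5xx are treated as retryable on the next target."""
    return status in (408, 429) or 500 <= status <= 599


def parse_target(target: str) -> Tuple[str, str]:
    """Split 'provider/model' at the first slash."""
    provider, _, model = target.partition("/")
    return provider, model


def run_with_fallback(targets: List[str], call: Callable[[str, str], str]) -> str:
    """Try each target in order; only transient failures move on to the next one."""
    last_error: Optional[Exception] = None
    for target in targets:
        provider, model = parse_target(target)
        try:
            return call(provider, model)
        except ProviderError as err:
            if not is_transient(err.status):
                raise  # e.g. 401 or 404: surface immediately
            last_error = err
        except ConnectionError as err:
            last_error = err  # assumed: an unreachable host also falls through
    if last_error is None:
        raise ValueError("no targets configured")
    raise last_error
```

Surfacing non-transient 4xx errors right away avoids masking a bad API key or a wrong model name behind a silent switch to the fallback model.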
172+
173+
---
174+
175+
## Reference
176+
177+
- [REQ-146 — Agent profiles + activity routing](../REQUIREMENTS.md)
178+
- [`specsmith.agent.profiles`](../../src/specsmith/agent/profiles.py): `Profile`, `ProfileStore`, `apply_preset`, `provider_family`
179+
- [`specsmith.agent.fallback`](../../src/specsmith/agent/fallback.py): `run_with_fallback`, `parse_target`
180+
- [`docs/site/api-stability.md`](api-stability.md) — public surface contract

docs/site/quickstart.md

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
1+
# Five-Minute Quickstart
2+
This page is the **reproducible** version of the README's elevator pitch:
3+
copy the commands top-to-bottom and you'll end up with a fresh project,
4+
a multi-agent profile set, a routed `/plan` → architect → coder pipeline,
5+
and a TraceVault-sealed audit chain you can verify after the fact.
6+
7+
> **GIF placeholder.** A 30-second screen recording showing the same
8+
> commands running end-to-end will live at
9+
> `docs/site/_static/quickstart.gif`. Until that lands, the script in
10+
> [scripts/quickstart.sh](#reproduction-script) is the source of truth.
11+
12+
---
13+
14+
## Prerequisites
15+
- Python 3.10+ (`pipx install specsmith` or `pip install specsmith`)
16+
- One LLM provider configured (any of):
17+
- `ANTHROPIC_API_KEY=sk-…` for Claude
18+
- `OPENAI_API_KEY=sk-…` for GPT/O-series
19+
- `GOOGLE_API_KEY=…` for Gemini
20+
- Ollama running locally (`ollama serve`) — no key needed
21+
22+
The reproduction script intentionally has *no* timing-sensitive steps so
23+
it's safe to run unattended in CI.
24+
25+
---
26+
27+
## Reproduction script
28+
```bash
29+
#!/usr/bin/env bash
30+
# scripts/quickstart.sh — five-minute walkthrough, idempotent.
31+
set -euo pipefail
32+
export SPECSMITH_NO_AUTO_UPDATE=1
33+
export SPECSMITH_PYPI_CHECKED=1
34+
35+
# 1. Scaffold a fresh project.
36+
specsmith init --output-dir /tmp \
37+
--config <(cat <<'YAML'
38+
name: quickstart-demo
39+
type: cli-python
40+
language: python
41+
description: "specsmith multi-agent quickstart demo"
42+
YAML
43+
)
44+
cd /tmp/quickstart-demo
45+
46+
# 2. Install the recommended profile preset.
47+
specsmith agents preset apply default
48+
specsmith agents list
49+
specsmith agents route show
50+
51+
# 3. Add a custom local-coder profile (diversity guard fires).
52+
specsmith agents add \
53+
--id local-coder \
54+
--role coder \
55+
--provider ollama \
56+
--model qwen2.5-coder:32b \
57+
--capability code \
58+
--fallback ollama/qwen2.5-coder:7b
59+
60+
# 4. Filter by capability — handy for finding "what can do X".
61+
specsmith agents list --capability code --json
62+
63+
# 5. Optional: register a self-hosted endpoint (BYOE).
64+
# specsmith endpoints add \
65+
# --id home-vllm \
66+
# --base-url http://10.0.0.4:8000/v1 \
67+
# --default-model qwen2.5-coder \
68+
# --auth bearer-keyring
69+
70+
# 6. Drive a single turn through the routing table.
71+
echo "/plan add a hello-world handler" | \
72+
specsmith run --json-events --task "/plan add a hello-world handler"
73+
74+
# 7. Pin a profile mid-session — emits a TraceVault decision seal.
75+
echo "/agent opus-reviewer" | specsmith run --json-events
76+
specsmith trace log --type decision
77+
78+
# 8. Advance the AEE phase — auto-routes phase:active to the new phase.
79+
specsmith phase next --force
80+
specsmith agents route show | grep phase:active
81+
```
82+
83+
Save the script anywhere on your machine and run it; the only side
84+
effects are inside `/tmp/quickstart-demo`, `~/.specsmith/agents.json`,
85+
and (if you uncomment step 5) `~/.specsmith/endpoints.json`.
86+
87+
---
88+
89+
## What you should see
90+
| Step | Expected output |
91+
|------|---------------------------------------------------------------------------------|
92+
| 1 | `Done. N files created in /tmp/quickstart-demo` |
93+
| 2 | `✓ applied preset default — 7 profiles, 22 routes` |
94+
| 3 | `✓ saved profile local-coder` *plus* a yellow `⚠ … shares the 'ollama' family…` diversity warning if a same-family reviewer exists. |
95+
| 4 | A JSON document with one entry whose `id` is `local-coder`. |
96+
| 6 | A JSONL stream beginning with `{"type": "ready", …}` followed by `block_start`, `token`, `block_complete`, `task_complete`. |
97+
| 7 | `✓ Sealed as SEAL-0001` (or whichever sequence number is next). |
98+
| 8 | A `phase:active` line in the routing table pointing at the new phase's profile. |
99+
100+
If any step fails, run `specsmith doctor --onboarding` to surface what's
101+
missing and re-run from that step.
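
For unattended runs, the step 6 expectation can be checked mechanically. The sketch below assumes only what the table states: each line is a JSON object with a `type` field, the stream starts with `ready`, and the listed event types appear in that order. The helper names are hypothetical.

```python
import json
from typing import Iterable, List, Sequence

EXPECTED_ORDER = ["ready", "block_start", "token", "block_complete", "task_complete"]


def event_types(lines: Iterable[str]) -> List[str]:
    """Extract the `type` field from each non-empty JSONL line."""
    return [json.loads(line)["type"] for line in lines if line.strip()]


def follows_expected_order(types: Sequence[str], expected: Sequence[str] = EXPECTED_ORDER) -> bool:
    """True when the stream starts with `ready` and contains `expected` as an ordered subsequence."""
    if not types or types[0] != expected[0]:
        return False
    remaining = iter(types)
    # Each `any` consumes the iterator up to the next match, so order is enforced.
    return all(any(t == want for t in remaining) for want in expected)


sample = [
    '{"type": "ready"}',
    '{"type": "block_start"}',
    '{"type": "token", "text": "hi"}',
    '{"type": "token", "text": "!"}',
    '{"type": "block_complete"}',
    '{"type": "task_complete"}',
]
print(follows_expected_order(event_types(sample)))
```

Checking for a subsequence rather than an exact match tolerates repeated `token` events and any additional event types the runner may emit.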
102+
103+
---
104+
105+
## Next steps
106+
- [`docs/site/agents.md`](agents.md) — the full multi-agent walkthrough
107+
- [`docs/site/api-stability.md`](api-stability.md) — the public surface contract
108+
- [`docs/site/vscode-extension.md`](vscode-extension.md) — VS Code Workbench surfaces

docs/site/vscode-extension.md

Lines changed: 28 additions & 0 deletions
@@ -233,6 +233,34 @@ installed model list before spawning the session.
233233

234234
---
235235

236+
## Multi-Agent + BYOE Surfaces (0.10.0)
237+
The extension exposes the CLI's `agents` (REQ-146) and `endpoints` (REQ-142)
238+
stores as two sidebar trees plus nine Command Palette entries. Each
239+
command shells out to `specsmith <subcommand> --json` so the on-disk
240+
schema lives in exactly one place.
241+
### Sidebar trees
242+
- **BYOE Endpoints** (`specsmith.endpoints` view) — every entry from
243+
`~/.specsmith/endpoints.json`; the entry marked `` is the default.
244+
- **Agent Profiles** (`specsmith.agents` view) — grouped under *Profiles*
245+
(with `` on the default) and *Routes* (`activity → profile_id`).
246+
### Commands
247+
| Command palette | Action |
248+
|--------------------------------------------------|------------------------------------------------------------------------|
249+
| `specsmith: BYOE Endpoints…` | Quick Pick over endpoints with copy-id / set-default / test actions. |
250+
| `specsmith: Test BYOE Endpoint` | Probes `/v1/models`; toast shows latency + model count. |
251+
| `specsmith: Refresh BYOE Endpoints` | Re-runs `specsmith endpoints list --json` and refreshes the tree. |
252+
| `specsmith: Agent Profiles…` | Quick Pick over profiles; copy id, set default, route to activity. |
253+
| `specsmith: Test Agent Profile` | Probes the resolved provider / endpoint and shows reachability. |
254+
| `specsmith: Refresh Agent Profiles` | Re-runs `specsmith agents list --json` and refreshes the tree. |
255+
| `specsmith: Apply Agent Preset (default / local-only / frontier-only / cost-conscious)` | Runs `specsmith agents preset apply <name>`. |
256+
| `specsmith: Route Activity to Agent Profile` | Picks an activity (`/plan`, `/fix`, `phase:requirements`, …) and a profile, then runs `specsmith agents route set`. |
257+
| `specsmith: Pick Session Profile` | Per-session pin for the active SessionPanel; appends `--agent <id>` to the bridge invocation. |
258+
The SessionPanel header chip surfaces the resolved profile + endpoint for
259+
the current turn; click it to open the picker without leaving the chat.
260+
### `/agent <id>` from chat
261+
Typing `/agent opus-reviewer` in the chat input flips the active session
262+
to the named profile and writes a TraceVault decision seal so the change
263+
is chained into `.specsmith/trace.jsonl`.
236264
## Keyboard Shortcuts
237265

238266
| Shortcut | Action |
