Commit 8236302

susiejojo and claude authored
feat(wiki): post-campaign knowledge extraction and visualization (#271)
* feat(wiki): add post-campaign knowledge extraction and visualization

  Add skills and scripts for extracting structured knowledge from completed campaigns and rendering interactive HTML visualizations.

  Closes #270

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(viz): align Python/JS slug algorithms, extract chip-rendering helper

  - _make_kg_id now uses the same [^a-z0-9]+ regex as the JS side, fixing silent panel lookup failures for names with underscores or special chars
  - Extract renderRelationshipChips() to deduplicate ~92 lines of identical chip-rendering between Knowledge and Iterations tab branches

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 265018a commit 8236302

5 files changed

Lines changed: 3089 additions & 0 deletions

.claude/commands/post-campaign.md

Lines changed: 305 additions & 0 deletions
@@ -0,0 +1,305 @@
1+
Index a completed Nous campaign into the shared wiki and generate a visualization.
2+
3+
## Steps
4+
5+
1. **Find the campaign**: If `$ARGUMENTS` is provided, use it as the path to the `.nous/<campaign>/` directory. Otherwise, search for directories containing both `ledger.json` and `principles.json` under `.nous/` paths in the project or `~/Downloads/`, and ask the user which to index.
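
The fallback search in this step can be sketched in Python. This is a minimal illustration, not part of the committed scripts: the function name `find_campaign_dirs` and the choice of search roots are assumptions, and it only matches directories that sit directly under a `.nous/` directory and contain both required files.

```python
from pathlib import Path

# Files that must both be present for a directory to count as a campaign.
REQUIRED = ("ledger.json", "principles.json")


def find_campaign_dirs(roots):
    """Return sorted `.nous/<campaign>/` directories holding both required files."""
    found = set()
    for root in roots:
        root = Path(root).expanduser()
        if not root.is_dir():
            continue  # silently skip roots that do not exist
        for ledger in root.rglob("ledger.json"):
            campaign = ledger.parent
            # Only accept directories that are direct children of a `.nous` directory.
            if campaign.parent.name != ".nous":
                continue
            if all((campaign / name).is_file() for name in REQUIRED):
                found.add(campaign)
    return sorted(found)
```

A caller might pass the project root and `~/Downloads` as `roots`, then present the returned list to the user when more than one candidate is found.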
6+
7+
2. **Read campaign artifacts**: Read `ledger.json`, `principles.json`, and `campaign.yaml` from the campaign directory. Extract:
8+
- Campaign name (directory name)
9+
- Campaign date (earliest non-baseline timestamp from ledger iterations)
10+
- **Campaign context** from `campaign.yaml`:
11+
- `research_question` — the overarching question being investigated
12+
- `target_system.name` — the system under test
13+
- `target_system.description` — what the system does
14+
- `target_system.repo_path` — path to the target repository
15+
- **Runtime metadata** from `campaign.yaml` (under `runtime:` block, if present):
16+
- `runtime.target_commit` — git SHA of target repo at campaign start
17+
- `runtime.target_repo` — org/repo identifier
18+
- `runtime.nous_version` — Nous version used
19+
- `runtime.started_at` — ISO timestamp of campaign initialization
20+
- All iterations with their outcomes (`h_main_result`), families, and prediction accuracy
21+
- All principles with full fields (statement, confidence, regime, mechanism, applicability_bounds, contradicts, superseded_by, status)
22+
23+
If `campaign.yaml` doesn't exist in the campaign directory, check for `report.md` and extract the research question from its opening section. If neither exists, ask the user for the campaign context.
24+
25+
The campaign context will be embedded in JSON metadata for each output file.
26+
27+
3. **Check idempotency**: Check if `~/.nous/wiki/campaigns/<campaign-name>/concepts.json` exists. If it does, report "Campaign already indexed — skipping to visualization" and jump to step 11.
28+
29+
4. **Write dead-ends.json**: Write a JSON array to `~/.nous/wiki/campaigns/<campaign-name>/dead-ends.json`.
30+
31+
Dead-ends are approaches that were tested and conclusively don't work. Each entry must be **self-contained**: another agent reading this should understand what was tried, why it failed, and when to avoid it — without looking at any other file.
32+
33+
For each iteration where `h_main_result == "REFUTED"`:
34+
- Find the principles extracted in that iteration
35+
- Synthesize a self-contained explanation of the failure
36+
37+
```json
38+
[
39+
{
40+
"id": "DE-1",
41+
"title": "<descriptive title of what was attempted>",
42+
"iteration": "iter-N",
43+
"what_was_tried": "<1-2 sentences: the specific approach/configuration, with concrete values>",
44+
"why_it_failed": "<1-2 sentences: the causal mechanism>",
45+
"avoid_when": "<specific conditions under which this fails>"
46+
}
47+
]
48+
```
49+
50+
Number IDs sequentially starting from DE-1 within this campaign.
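
As a rough sketch of the selection and numbering rule above: the helper below picks the REFUTED iterations and assigns sequential `DE-N` IDs, leaving the prose fields empty for the agent to synthesize. The ledger layout is an assumption here (a list of dicts with an `iteration` key next to `h_main_result`), and `dead_end_skeletons` is a hypothetical name.

```python
def dead_end_skeletons(iterations):
    """Build empty dead-end entries for every REFUTED iteration, numbered from DE-1.

    `iterations` is assumed to be a list of dicts carrying at least
    "iteration" (e.g. "iter-3") and "h_main_result".
    """
    refuted = [it for it in iterations if it.get("h_main_result") == "REFUTED"]
    return [
        {
            "id": f"DE-{n}",
            "title": "",
            "iteration": it["iteration"],
            "what_was_tried": "",
            "why_it_failed": "",
            "avoid_when": "",
        }
        for n, it in enumerate(refuted, start=1)
    ]
```

Numbering follows ledger order, so IDs stay stable as long as the ledger itself is not reordered.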
51+
52+
5. **Write frontiers.json**: Write a JSON array to `~/.nous/wiki/campaigns/<campaign-name>/frontiers.json`.
53+
54+
Frontiers are the edges of what this campaign explored — where knowledge ends and the next experiment begins. Each frontier must be **self-contained**: a reader should understand it without looking at principles.json or the Principles tab.
55+
56+
Identify frontiers by looking for **high-confidence** principles only:
57+
- High-confidence principles from PARTIALLY_CONFIRMED iterations (boundary was actively hit)
58+
- High-confidence principles whose `applicability_bounds` explicitly mentions untested territory
59+
- High-confidence confirmed principles that were tested under narrow conditions (specific rates, cluster sizes, durations) where adjacent conditions remain unexplored
60+
61+
**Skip all medium and low confidence principles** — they aren't established enough to define meaningful boundaries.
62+
63+
```json
64+
[
65+
{
66+
"id": "F-1",
67+
"title": "<descriptive title of the frontier — what's at the edge>",
68+
"what_was_tried": "<1-2 sentences: the specific experiment/configuration that was run, using concrete values>",
69+
"what_was_left_untried": "<1-2 sentences: the adjacent territory not explored>",
70+
"what_to_try_next": "<1 sentence: a concrete, actionable experiment>",
71+
"related_principles": ["RP-5", "RP-18"]
72+
}
73+
]
74+
```
75+
76+
Write 5-10 frontiers per campaign. Prioritize frontiers where the next experiment is clearly actionable. **Only include frontiers based on high-confidence principles.**
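
The two hard constraints on frontiers (5-10 entries, high-confidence principles only) lend themselves to a quick self-check before writing the file. The sketch below assumes each principle dict has an `id` and a `confidence` whose high value is the string `"high"`; both are assumptions about `principles.json`, and `check_frontiers` is a hypothetical helper.

```python
def check_frontiers(frontiers, principles):
    """Return a list of human-readable problems; an empty list means the checks pass."""
    # Assumption: confidence is stored as a string such as "high" / "medium" / "low".
    high = {
        p["id"]
        for p in principles
        if str(p.get("confidence", "")).lower() == "high"
    }
    problems = []
    if not 5 <= len(frontiers) <= 10:
        problems.append(f"expected 5-10 frontiers, found {len(frontiers)}")
    for frontier in frontiers:
        weak = [rp for rp in frontier.get("related_principles", []) if rp not in high]
        if weak:
            problems.append(f"{frontier['id']} cites non-high-confidence principles: {weak}")
    return problems
```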
77+
78+
6. **Write interactions.json**: Write a JSON array to `~/.nous/wiki/campaigns/<campaign-name>/interactions.json`.
79+
80+
Interactions are untested combinations of independently-validated approaches that might compound, conflict, or reveal new behavior when used together. Each entry must be **self-contained**: another agent should understand what the two approaches do individually, why combining them is interesting, and what experiment to run — without looking at any other file.
81+
82+
Identify interactions by:
83+
- Looking for pairs of CONFIRMED principles that address different mechanisms and were never validated together
84+
- Focusing on approaches that operate in adjacent or overlapping conditions
85+
- Limit to 3-5 most interesting interactions to avoid noise
86+
87+
```json
88+
[
89+
{
90+
"id": "I-1",
91+
"title": "<descriptive title of the combination>",
92+
"approach_a": "<1-2 sentences: what the first approach does, under what conditions, what it achieves>",
93+
"approach_b": "<1-2 sentences: what the second approach does, under what conditions, what it achieves>",
94+
"why_combine": "<1-2 sentences: why these together might produce better results>",
95+
"experiment_to_run": "<1 sentence: a concrete, actionable experiment configuration>",
96+
"related_principles": ["RP-8", "RP-17", "RP-18"]
97+
}
98+
]
99+
```
100+
101+
7. **Write campaign summary**: Write `~/.nous/wiki/campaigns/<campaign-name>/summary.md` (create directory if needed). Skip if the file already exists.
102+
103+
Read the campaign's `report.md` if it exists for additional context. Generate:
104+
105+
```
106+
# <campaign-name>
107+
108+
**Date:** <date>
109+
**Iterations:** <count of non-baseline iterations>
110+
**Key question:** <from report.md opening or inferred from iteration families>
111+
112+
## Outcome
113+
<2-3 sentence answer based on the pattern of confirmations/refutations>
114+
115+
## Iteration arc
116+
<Brief narrative: what families were explored, which confirmed/refuted, key pivots>
117+
118+
## Key principles
119+
<Bulleted list of 5-10 most important high-confidence principles with IDs>
120+
121+
## Open questions
122+
<Bulleted list of frontiers and untested territory>
123+
```
124+
125+
8. **Copy principles.json**: Copy the source campaign's `principles.json` to `~/.nous/wiki/campaigns/<campaign-name>/principles.json`.
126+
127+
9. **Copy llm_metrics.jsonl**: If `llm_metrics.jsonl` exists in the campaign directory, copy it to `~/.nous/wiki/campaigns/<campaign-name>/llm_metrics.jsonl`. This preserves per-iteration LLM cost data (model, cost, duration, turns) for the visualization's cost chart.
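
Steps 8 and 9 amount to one required copy and one optional copy. A minimal sketch, assuming plain file copies are sufficient (`copy_campaign_artifacts` is a hypothetical helper, not one of the committed scripts):

```python
import shutil
from pathlib import Path


def copy_campaign_artifacts(campaign_dir, wiki_dir):
    """Copy principles.json (required) and llm_metrics.jsonl (optional) into the wiki.

    Returns the list of file names that were actually copied.
    """
    campaign_dir, wiki_dir = Path(campaign_dir), Path(wiki_dir)
    wiki_dir.mkdir(parents=True, exist_ok=True)

    # principles.json must exist; let a missing file raise FileNotFoundError.
    shutil.copy2(campaign_dir / "principles.json", wiki_dir / "principles.json")
    copied = ["principles.json"]

    # llm_metrics.jsonl is optional and is skipped when absent.
    metrics = campaign_dir / "llm_metrics.jsonl"
    if metrics.is_file():
        shutil.copy2(metrics, wiki_dir / metrics.name)
        copied.append(metrics.name)
    return copied
```

The source files are only read, which keeps the campaign directory untouched as required by the read-only rule.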
128+
129+
10. **Generate visualization data files**: Generate JSON files that feed the interactive graph. Save to `~/.nous/wiki/campaigns/<campaign-name>/` (create directory if needed).
130+
131+
**a) `concepts.json`** — structured JSON for the Knowledge tab and Iterations sub-nodes.
132+
133+
Index the campaign's vocabulary into three categories with strict ownership semantics:
134+
135+
**The directed ownership graph**: `Entity ←(operates_on)← Concept →(owns)→ Parameter`
136+
- Entities are leaf nodes (no outgoing ownership edges)
137+
- Concepts are the central nodes connecting entities to parameters
138+
- Parameters are leaf nodes owned by exactly one concept
139+
- Every concept MUST point to ≥1 entity and 0+ parameters
140+
- Every parameter MUST point back to exactly 1 concept
141+
- **A parameter appears in exactly ONE concept's `parameters` array** — the concept that INTRODUCED and OWNS the knob. Other concepts that merely USE or are AFFECTED BY the parameter do NOT list it. "Uses" ≠ "owns."
142+
143+
**Category definitions:**
144+
- **Concept**: A reusable algorithm, theory, or technique that Nous discovered and validated during this campaign. Must be self-contained — understandable and applicable without campaign-specific context. Concepts are transferable across campaigns (e.g., "Slope-Based Saturation Detection" is a concept; "iter-3 config" is not). A concept operates on one or more entities and owns zero or more parameters as its tweakable knobs. Did NOT exist before Nous ran.
145+
- **Parameter**: A numeric knob or threshold belonging to exactly ONE concept that was actively tuned during experimentation. The parameter's meaning derives entirely from its parent concept — it cannot exist independently. If you can't name which concept owns it, either the concept is missing or the parameter is misclassified.
146+
- **Entity**: A component that ALREADY EXISTED in the project's source code BEFORE this campaign ran. Entities are the pre-existing infrastructure that concepts operate ON — e.g., a scheduler, dispatcher, router, queue, or gateway that was already in the codebase. If the campaign INTRODUCED or CREATED a component (like a new detector, a new algorithm, a new module), that is a **Concept**, NOT an entity. The test: "Did this exist in the codebase before the campaign started?" If yes → Entity. If no → Concept. NOT model profiles, workload configurations, benchmark inputs, hardware specs, or experiment design choices. Entities do NOT own parameters — only concepts do.
147+
148+
**Include metadata at top level:**
149+
```json
150+
{
151+
"campaign_name": "<campaign-name>",
152+
"date": "<campaign date>",
153+
"repo_path": "<target_system.repo_path from campaign.yaml>",
154+
"system_name": "<target_system.name> — <target_system.description>",
155+
"research_question": "<research_question from campaign.yaml>",
156+
"target_commit": "<runtime.target_commit from campaign.yaml, or null>",
157+
"target_repo": "<runtime.target_repo from campaign.yaml, or null>",
158+
"nous_version": "<runtime.nous_version from campaign.yaml, or null>",
159+
"started_at": "<runtime.started_at from campaign.yaml, or null>",
160+
"concepts": [...],
161+
"parameters": [...],
162+
"entities": [...]
163+
}
164+
```
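
To illustrate how the top-level metadata maps onto `campaign.yaml`, here is a sketch that takes the already-parsed YAML as a plain dict (parsing itself is left out to keep the example dependency-free). `build_metadata` is a hypothetical name; missing `runtime` fields become `None`, which serializes to JSON `null`.

```python
def build_metadata(campaign_name, date, cfg):
    """Assemble the top-level concepts.json object from a parsed campaign.yaml dict."""
    target = cfg.get("target_system") or {}
    runtime = cfg.get("runtime") or {}  # the runtime block is optional
    name = target.get("name")
    description = target.get("description")
    return {
        "campaign_name": campaign_name,
        "date": date,
        "repo_path": target.get("repo_path"),
        # "\u2014" is the dash separator used in the template above.
        "system_name": f"{name} \u2014 {description}",
        "research_question": cfg.get("research_question"),
        "target_commit": runtime.get("target_commit"),
        "target_repo": runtime.get("target_repo"),
        "nous_version": runtime.get("nous_version"),
        "started_at": runtime.get("started_at"),
        # Filled in afterwards by the extraction described below.
        "concepts": [],
        "parameters": [],
        "entities": [],
    }
```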
165+
166+
**Item schemas (MUST match exactly — the visualization script reads these field names):**
167+
168+
```json
169+
// Concept item:
170+
{
171+
"name": "Descriptive Name",
172+
"definition": "1-3 sentence explanation of what this is and how it works.",
173+
"principles": ["RP-1", "RP-7", "RP-10"],
174+
"operates_on": ["EntityName1", "EntityName2"],
175+
"parameters": ["paramName1", "paramName2"]
176+
}
177+
178+
// Parameter item:
179+
{
180+
"name": "parameterName",
181+
"definition": "What this knob controls and its effect.",
182+
"principles": ["RP-7", "RP-10"],
183+
"parent_concept": "Concept Name That Owns This Parameter",
184+
"evolution": [
185+
{"iter": "iter-3", "value": "0.1", "outcome": "confirmed", "note": "Eliminated false positives at rate=20"},
186+
{"iter": "iter-10", "value": "0.05", "outcome": "confirmed", "note": "2.6% incremental critical gain"}
187+
]
188+
}
189+
190+
// Entity item:
191+
{
192+
"name": "ComponentName",
193+
"source": "path/to/file.go::TypeName",
194+
"definition": "What this pre-existing component does in the system.",
195+
"principles": ["RP-2", "RP-15"]
196+
}
197+
```
198+
199+
**Relationship field requirements (explicit edges — authoritative for knowledge graph):**
200+
- Every concept MUST have `operates_on` (array of ≥1 entity name) and `parameters` (array of 0+ parameter names)
201+
- Every parameter MUST have `parent_concept` (string — exactly 1 concept name that owns this parameter)
202+
- Names in `operates_on` MUST exactly match names in this file's `entities` array
203+
- Names in `parameters` MUST exactly match names in this file's `parameters` array
204+
- The `parent_concept` value MUST exactly match a name in this file's `concepts` array
205+
- These relationship fields are the authoritative graph edges — `principles` arrays are supplementary (used for iteration-linking and cross-campaign principle queries)
206+
207+
**Relationship integrity checklist (run mentally before writing the file):**
208+
1. For each parameter P: can you name exactly one concept that P belongs to? If not, add the missing concept.
209+
2. For each concept C: does C.parameters list every parameter in the file whose parent_concept == C.name? (Bidirectional consistency)
210+
3. **Does any parameter name appear in MORE THAN ONE concept's `parameters` array?** This is always wrong. A parameter has one owner. If concept A introduced the knob and concept B merely uses it, only A lists it.
211+
4. For each concept C: does every name in C.operates_on appear in the entities array? If not, add the missing entity or fix the name.
212+
5. Is any parameter orphaned (not listed in ANY concept's `parameters` array)? Fix by adding it to its parent concept.
213+
6. Is any entity unreachable (not referenced by ANY concept's `operates_on`)? Either add a concept that operates on it, or remove the entity.
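
The checklist can also be expressed as code. The sketch below is illustrative only and is not the repository's `scripts/validate_concepts.py`, whose actual checks and messages may differ; `check_graph` is a hypothetical function that returns a list of error strings for a parsed `concepts.json` object.

```python
def check_graph(data):
    """Check the ownership graph; return a list of error strings (empty if consistent)."""
    concepts = {c["name"]: c for c in data.get("concepts", [])}
    params = {p["name"]: p for p in data.get("parameters", [])}
    entities = {e["name"] for e in data.get("entities", [])}
    errors = []
    owners = {}          # parameter name -> concepts that list it
    used_entities = set()

    for cname, concept in concepts.items():
        operates_on = concept.get("operates_on") or []
        if not operates_on:
            errors.append(f"concept {cname!r} has no operates_on entity")
        for ename in operates_on:
            if ename not in entities:
                errors.append(f"concept {cname!r}: unknown entity {ename!r}")
            used_entities.add(ename)
        for pname in concept.get("parameters") or []:
            if pname not in params:
                errors.append(f"concept {cname!r}: unknown parameter {pname!r}")
            owners.setdefault(pname, []).append(cname)

    for pname, param in params.items():
        parent = param.get("parent_concept")
        listed = owners.get(pname, [])
        if parent not in concepts:
            errors.append(f"parameter {pname!r}: unknown concept {parent!r}")
        if len(listed) > 1:
            errors.append(f"parameter {pname!r} owned by multiple concepts: {listed}")
        elif not listed:
            errors.append(f"orphaned parameter {pname!r}")
        elif listed[0] != parent:
            errors.append(f"parameter {pname!r}: listed by {listed[0]!r} but parent_concept is {parent!r}")

    for ename in sorted(entities - used_entities):
        errors.append(f"unreachable entity {ename!r}")
    return errors
```

Returning all errors at once, rather than stopping at the first, makes it easier to fix a file in a single pass.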
214+
215+
**Field name requirements (visualization contract):**
216+
- Use `definition` (NOT `description`) — displayed in tooltip and detail panel
217+
- Use `principles` (NOT `related_principles`) — array of RP-IDs that reference this item; used to compute graph edges between items and to connect items to iterations
218+
- For `evolution`: use `iter` (NOT `iteration`), `value`, `outcome` (lowercase status: "confirmed"/"refuted"/"partially_confirmed"/"baseline"), `note` (explanation text)
219+
- Every concept, parameter, AND entity MUST have a `principles` array — this is how the graph determines which items share connections and which iterations they belong to
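
A small lint pass can catch the most likely field-name slips (`description` instead of `definition`, `related_principles` instead of `principles`, `iteration` instead of `iter`). This is a sketch under the naming rules listed above; `check_field_names` is a hypothetical helper and not part of the visualization script.

```python
ALLOWED_OUTCOMES = {"confirmed", "refuted", "partially_confirmed", "baseline"}


def check_field_names(data):
    """Return a list of field-name problems found in a parsed concepts.json object."""
    problems = []
    for kind in ("concepts", "parameters", "entities"):
        for item in data.get(kind, []):
            label = f"{kind}/{item.get('name', '?')}"
            if "definition" not in item:
                hint = " (found 'description')" if "description" in item else ""
                problems.append(f"{label}: missing 'definition'{hint}")
            if not isinstance(item.get("principles"), list):
                hint = " (found 'related_principles')" if "related_principles" in item else ""
                problems.append(f"{label}: missing 'principles' array{hint}")
            # Only parameters are expected to carry an evolution array.
            for step in item.get("evolution", []):
                if "iter" not in step:
                    problems.append(f"{label}: evolution entry missing 'iter'")
                if step.get("outcome") not in ALLOWED_OUTCOMES:
                    problems.append(f"{label}: unexpected outcome {step.get('outcome')!r}")
    return problems
```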
220+
221+
Guidelines:
222+
- Extract 5-15 concepts, 3-10 parameters, and 5-10 entities per campaign.
223+
- Only include parameters that were actively varied during the campaign.
224+
- Skip common industry terms (TTFT, LLM, GPU, p99, etc.). Focus on campaign-specific vocabulary.
225+
- Every active principle should be referenced by at least one concept, parameter, or entity.
226+
- The `evolution` array for parameters should include every iteration where the parameter's value was meaningfully varied.
227+
228+
**Entity validation (mandatory):** After drafting the entity list, verify each entity is truly pre-existing — NOT something the campaign introduced. You cannot rely on git history (campaign code may not be committed). Instead, use these sources of truth:
229+
230+
1. **campaign.yaml is the ground truth for what pre-existed.** The `target_system.description` and any `reference_code_paths` describe the system AS IT WAS before the campaign. Components mentioned there are entities.
231+
2. **Principles describe what the campaign CREATED.** Read principles.json — any algorithm, detector, technique, or module described as something the campaign implemented, discovered, or introduced is a Concept, never an Entity. If a principle says "we built X" or "X was introduced to improve Y", then X is a Concept.
232+
3. **The litmus test:** Could you describe this component in a sentence that makes sense WITHOUT mentioning this campaign? "The gateway queue dispatches requests to instances" → Entity (it's infrastructure). "The TTFT slope detector fires when latency slope exceeds a threshold" → Concept (the campaign created it to test a hypothesis).
233+
234+
Remove or reclassify as Concept anything that fails this check. For each validated entity, note which part of `target_system` in campaign.yaml references it.
235+
236+
**Entity name grounding (mandatory):** Entity names MUST come from the actual source code, not from human-readable paraphrasing. For each validated entity, do a **single targeted search** in `<repo_path>` to find the primary type/class/struct name. Detect the language from file extensions in the repo and search accordingly:
237+
238+
```bash
239+
# Go:
240+
grep -r "type <YourGuess>" <repo_path> --include="*.go" -l
241+
# Python:
242+
grep -r "class <YourGuess>" <repo_path> --include="*.py" -l
243+
# Rust:
244+
grep -r "struct <YourGuess>\|impl <YourGuess>" <repo_path> --include="*.rs" -l
245+
# TypeScript/JavaScript:
246+
grep -r "class <YourGuess>\|interface <YourGuess>" <repo_path> --include="*.ts" --include="*.js" -l
247+
```
248+
249+
- Use the **actual type name** from source as the entity `name` (e.g., `FlowControlFilter`, not "Gateway Queue")
250+
- Set `source` to `<relative-path>::<TypeName>` where `<relative-path>` is relative to `repo_path` (e.g., `pkg/epp/handlers/flowcontrol.go::FlowControlFilter`). The validator will check that `<repo_path>/<relative-path>` exists on disk — so the path must resolve to a real file.
251+
- If multiple types compose one logical entity, pick the primary orchestrating type
252+
- If you cannot find a matching type after 2-3 searches, use the best name from `campaign.yaml`'s `target_system` description and leave `source` as `null`
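
The `source` format can be sanity-checked before running the validator. The sketch below mirrors the rule described above (the file part must exist under `repo_path`, and `null` is allowed); it is an illustration with a hypothetical name, `source_resolves`, and does not reproduce the validator's actual logic.

```python
from pathlib import Path


def source_resolves(repo_path, source):
    """Return True if `source` is None or its file part exists under repo_path."""
    if source is None:
        return True  # null is the documented fallback when no type was found
    rel_path, sep, type_name = source.rpartition("::")
    if not sep or not rel_path or not type_name:
        return False  # not in the "<relative-path>::<TypeName>" form
    return (Path(repo_path).expanduser() / rel_path).is_file()
```

This checks only that the file exists; it does not confirm that the named type is declared in it.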
253+
254+
**Scope guard:** This is a naming step, not a research step. Do NOT read function bodies, trace call graphs, or explore the codebase beyond finding the type declaration. Spend at most 2-3 searches per entity, matching the fallback rule above.
255+
256+
**Graph validation (mandatory — must pass before proceeding):** After writing concepts.json, run:
257+
```bash
258+
python scripts/validate_concepts.py ~/.nous/wiki/campaigns/<campaign-name>/concepts.json
259+
```
260+
If the script exits with errors, fix concepts.json and re-run until it passes. Common fixes:
261+
- "owned by multiple concepts" → remove the parameter from all but its true owner's `parameters` array
262+
- "orphaned parameter" → add it to the owning concept's `parameters` array
263+
- "unreachable entity" → either add an `operates_on` reference from a concept, or remove the entity
264+
- "unknown entity/parameter/concept" → fix the spelling to match exactly
265+
266+
Do NOT proceed to step 10b until `validate_concepts.py` exits 0.
267+
268+
**b) `summaries.json`** — iteration summaries for the detail panel:
269+
```json
270+
{
271+
"iter-0": {
272+
"what_was_tried": "<1-2 sentences: experimental setup>",
273+
"what_was_found": "<1-2 sentences: key result, include CONFIRMED/REFUTED/PARTIALLY_CONFIRMED>",
274+
"why_it_matters": "<1 sentence: significance for the campaign's evolution>"
275+
},
276+
"iter-1": { ... },
277+
...
278+
}
279+
```
280+
Write a summary for EVERY iteration (including baseline). These appear in the side panel when a user clicks an iteration node. Keep concise but informative.
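
Because every iteration needs a complete summary, a coverage check is cheap to run before moving on. A minimal sketch, assuming the iteration IDs have already been collected from the ledger (`summary_gaps` is a hypothetical helper):

```python
REQUIRED_KEYS = ("what_was_tried", "what_was_found", "why_it_matters")


def summary_gaps(iteration_ids, summaries):
    """Map each iteration ID to the summary keys that are missing or blank."""
    gaps = {}
    for iteration_id in iteration_ids:
        entry = summaries.get(iteration_id)
        if not isinstance(entry, dict):
            gaps[iteration_id] = list(REQUIRED_KEYS)  # no summary at all
            continue
        missing = [
            key
            for key in REQUIRED_KEYS
            if not (isinstance(entry.get(key), str) and entry[key].strip())
        ]
        if missing:
            gaps[iteration_id] = missing
    return gaps
```

An empty result means every iteration, baseline included, has all three fields filled in.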
281+
282+
11. **Generate visualization and open**: Only after ALL indexing steps (4-10) are complete, run the visualization script. The script reads insights from per-campaign JSON files.
283+
```bash
284+
python scripts/visualize_campaign.py "<campaign_path>" \
285+
--summaries ~/.nous/wiki/campaigns/<campaign-name>/summaries.json \
286+
--concepts ~/.nous/wiki/campaigns/<campaign-name>/concepts.json
287+
```
288+
The script generates `~/.nous/wiki/viz/<campaign-name>.html` and opens it in the browser.
289+
290+
12. **Report**: Print all output paths and confirm the visualization opened:
291+
- `~/.nous/wiki/campaigns/<name>/dead-ends.json`
292+
- `~/.nous/wiki/campaigns/<name>/frontiers.json`
293+
- `~/.nous/wiki/campaigns/<name>/interactions.json`
294+
- `~/.nous/wiki/campaigns/<name>/principles.json`
295+
- `~/.nous/wiki/campaigns/<name>/llm_metrics.jsonl` (only if it was copied in step 9)
296+
- `~/.nous/wiki/campaigns/<name>/summary.md`
297+
- `~/.nous/wiki/campaigns/<name>/concepts.json`
298+
- `~/.nous/wiki/campaigns/<name>/summaries.json`
299+
- `~/.nous/wiki/viz/<name>.html`
300+
301+
## Important Rules
302+
303+
- **Read-only inputs**: Never modify the campaign's own files (ledger.json, principles.json, etc.).
304+
- **Per-campaign isolation**: Each campaign's structured data lives in `~/.nous/wiki/campaigns/<name>/`. No shared markdown files.
305+
- **Idempotent**: If the campaign is already indexed (step 3 check), skip indexing and only regenerate the visualization (steps 11-12).
