| 1 | +Index a completed Nous campaign into the shared wiki and generate a visualization. |
| 2 | + |
| 3 | +## Steps |
| 4 | + |
| 5 | +1. **Find the campaign**: If `$ARGUMENTS` is provided, use it as the path to the `.nous/<campaign>/` directory. Otherwise, search for directories containing both `ledger.json` and `principles.json` under `.nous/` paths in the project or `~/Downloads/`, and ask the user which to index. |
| 6 | + |
| 7 | +2. **Read campaign artifacts**: Read `ledger.json`, `principles.json`, and `campaign.yaml` from the campaign directory. Extract: |
| 8 | + - Campaign name (directory name) |
| 9 | + - Campaign date (earliest non-baseline timestamp from ledger iterations) |
| 10 | + - **Campaign context** from `campaign.yaml`: |
| 11 | + - `research_question` — the overarching question being investigated |
| 12 | + - `target_system.name` — the system under test |
| 13 | + - `target_system.description` — what the system does |
| 14 | + - `target_system.repo_path` — path to the target repository |
| 15 | + - **Runtime metadata** from `campaign.yaml` (under `runtime:` block, if present): |
| 16 | + - `runtime.target_commit` — git SHA of target repo at campaign start |
| 17 | + - `runtime.target_repo` — org/repo identifier |
| 18 | + - `runtime.nous_version` — Nous version used |
| 19 | + - `runtime.started_at` — ISO timestamp of campaign initialization |
| 20 | + - All iterations with their outcomes (`h_main_result`), families, and prediction accuracy |
| 21 | + - All principles with full fields (statement, confidence, regime, mechanism, applicability_bounds, contradicts, superseded_by, status) |
| 22 | + |
| 23 | + If `campaign.yaml` doesn't exist in the campaign directory, check for `report.md` and extract the research question from its opening section. If neither exists, ask the user for the campaign context. |
| 24 | + |
| 25 | + The campaign context will be embedded in JSON metadata for each output file. |
| 26 | + |
| 27 | +3. **Check idempotency**: Check if `~/.nous/wiki/campaigns/<campaign-name>/concepts.json` exists. If it does, report "Campaign already indexed — skipping to visualization" and jump to step 11. |
| 28 | + |
| 29 | +4. **Write dead-ends.json**: Write a JSON array to `~/.nous/wiki/campaigns/<campaign-name>/dead-ends.json` (create the directory if needed). |
| 30 | + |
| 31 | + Dead-ends are approaches that were tested and conclusively don't work. Each entry must be **self-contained**: another agent reading this should understand what was tried, why it failed, and when to avoid it — without looking at any other file. |
| 32 | + |
| 33 | + For each iteration where `h_main_result == "REFUTED"`: |
| 34 | + - Find the principles extracted in that iteration |
| 35 | + - Synthesize a self-contained explanation of the failure |
| 36 | + |
| 37 | + ```json |
| 38 | + [ |
| 39 | + { |
| 40 | + "id": "DE-1", |
| 41 | + "title": "<descriptive title of what was attempted>", |
| 42 | + "iteration": "iter-N", |
| 43 | + "what_was_tried": "<1-2 sentences: the specific approach/configuration, with concrete values>", |
| 44 | + "why_it_failed": "<1-2 sentences: the causal mechanism>", |
| 45 | + "avoid_when": "<specific conditions under which this fails>" |
| 46 | + } |
| 47 | + ] |
| 48 | + ``` |
| 49 | + |
| 50 | + Number IDs sequentially starting from DE-1 within this campaign. |
| 51 | + |
| 52 | +5. **Write frontiers.json**: Write a JSON array to `~/.nous/wiki/campaigns/<campaign-name>/frontiers.json`. |
| 53 | + |
| 54 | + Frontiers are the edges of what this campaign explored — where knowledge ends and the next experiment begins. Each frontier must be **self-contained**: a reader should understand it without looking at principles.json or the Principles tab. |
| 55 | + |
| 56 | + Identify frontiers by looking for **high-confidence** principles only: |
| 57 | + - High-confidence principles from PARTIALLY_CONFIRMED iterations (boundary was actively hit) |
| 58 | + - High-confidence principles whose `applicability_bounds` explicitly mentions untested territory |
| 59 | + - High-confidence confirmed principles that were tested under narrow conditions (specific rates, cluster sizes, durations) where adjacent conditions remain unexplored |
| 60 | + |
| 61 | + **Skip all medium and low confidence principles** — they aren't established enough to define meaningful boundaries. |
| 62 | + |
| 63 | + ```json |
| 64 | + [ |
| 65 | + { |
| 66 | + "id": "F-1", |
| 67 | + "title": "<descriptive title of the frontier — what's at the edge>", |
| 68 | + "what_was_tried": "<1-2 sentences: the specific experiment/configuration that was run, using concrete values>", |
| 69 | + "what_was_left_untried": "<1-2 sentences: the adjacent territory not explored>", |
| 70 | + "what_to_try_next": "<1 sentence: a concrete, actionable experiment>", |
| 71 | + "related_principles": ["RP-5", "RP-18"] |
| 72 | + } |
| 73 | + ] |
| 74 | + ``` |
| 75 | + |
| 76 | + Write 5-10 frontiers per campaign. Prioritize frontiers where the next experiment is clearly actionable. **Only include frontiers based on high-confidence principles.** |
| 77 | + |
| 78 | +6. **Write interactions.json**: Write a JSON array to `~/.nous/wiki/campaigns/<campaign-name>/interactions.json`. |
| 79 | + |
| 80 | + Interactions are untested combinations of independently-validated approaches that might compound, conflict, or reveal new behavior when used together. Each entry must be **self-contained**: another agent should understand what the two approaches do individually, why combining them is interesting, and what experiment to run — without looking at any other file. |
| 81 | + |
| 82 | + Identify interactions by: |
| 83 | + - Looking for pairs of CONFIRMED principles that address different mechanisms and were never validated together |
| 84 | + - Focusing on approaches that operate in adjacent or overlapping conditions |
| 85 | + - Limiting the list to the 3-5 most interesting interactions to avoid noise |
| 86 | + |
| 87 | + ```json |
| 88 | + [ |
| 89 | + { |
| 90 | + "id": "I-1", |
| 91 | + "title": "<descriptive title of the combination>", |
| 92 | + "approach_a": "<1-2 sentences: what the first approach does, under what conditions, what it achieves>", |
| 93 | + "approach_b": "<1-2 sentences: what the second approach does, under what conditions, what it achieves>", |
| 94 | + "why_combine": "<1-2 sentences: why these together might produce better results>", |
| 95 | + "experiment_to_run": "<1 sentence: a concrete, actionable experiment configuration>", |
| 96 | + "related_principles": ["RP-8", "RP-17", "RP-18"] |
| 97 | + } |
| 98 | + ] |
| 99 | + ``` |
| 100 | + |
| 101 | +7. **Write campaign summary**: Write `~/.nous/wiki/campaigns/<campaign-name>/summary.md` (create directory if needed). Skip if the file already exists. |
| 102 | + |
| 103 | + Read the campaign's `report.md` if it exists for additional context. Generate: |
| 104 | + |
| 105 | + ``` |
| 106 | + # <campaign-name> |
| 107 | + |
| 108 | + **Date:** <date> |
| 109 | + **Iterations:** <count of non-baseline iterations> |
| 110 | + **Key question:** <from report.md opening or inferred from iteration families> |
| 111 | + |
| 112 | + ## Outcome |
| 113 | + <2-3 sentence answer based on the pattern of confirmations/refutations> |
| 114 | + |
| 115 | + ## Iteration arc |
| 116 | + <Brief narrative: what families were explored, which confirmed/refuted, key pivots> |
| 117 | + |
| 118 | + ## Key principles |
| 119 | + <Bulleted list of 5-10 most important high-confidence principles with IDs> |
| 120 | + |
| 121 | + ## Open questions |
| 122 | + <Bulleted list of frontiers and untested territory> |
| 123 | + ``` |
| 124 | + |
| 125 | +8. **Copy principles.json**: Copy the source campaign's `principles.json` to `~/.nous/wiki/campaigns/<campaign-name>/principles.json`. |
| 126 | + |
| 127 | +9. **Copy llm_metrics.jsonl**: If `llm_metrics.jsonl` exists in the campaign directory, copy it to `~/.nous/wiki/campaigns/<campaign-name>/llm_metrics.jsonl`. This preserves per-iteration LLM cost data (model, cost, duration, turns) for the visualization's cost chart. |
| 128 | + |
| 129 | +10. **Generate visualization data files**: Generate JSON files that feed the interactive graph. Save to `~/.nous/wiki/campaigns/<campaign-name>/` (create directory if needed). |
| 130 | + |
| 131 | + **a) `concepts.json`** — structured JSON for the Knowledge tab and Iterations sub-nodes. |
| 132 | + |
| 133 | + Index the campaign's vocabulary into three categories with strict ownership semantics: |
| 134 | + |
| 135 | + **The directed ownership graph**: `Entity ←(operates_on)← Concept →(owns)→ Parameter` |
| 136 | + - Entities are leaf nodes (no outgoing ownership edges) |
| 137 | + - Concepts are the central nodes connecting entities to parameters |
| 138 | + - Parameters are leaf nodes owned by exactly one concept |
| 139 | + - Every concept MUST point to ≥1 entity and 0+ parameters |
| 140 | + - Every parameter MUST point back to exactly 1 concept |
| 141 | + - **A parameter appears in exactly ONE concept's `parameters` array** — the concept that INTRODUCED and OWNS the knob. Other concepts that merely USE or are AFFECTED BY the parameter do NOT list it. "Uses" ≠ "owns." |
| 142 | + |
| 143 | + **Category definitions:** |
| 144 | + - **Concept**: A reusable algorithm, theory, or technique that Nous discovered and validated during this campaign. It must be self-contained — understandable and applicable without campaign-specific context. Concepts are transferable across campaigns (e.g., "Slope-Based Saturation Detection" is a concept; "iter-3 config" is not). A concept operates on one or more entities and owns zero or more parameters as its tweakable knobs. A concept did NOT exist before Nous ran. |
| 145 | + - **Parameter**: A numeric knob or threshold belonging to exactly ONE concept that was actively tuned during experimentation. The parameter's meaning derives entirely from its parent concept — it cannot exist independently. If you can't name which concept owns it, either the concept is missing or the parameter is misclassified. |
| 146 | + - **Entity**: A component that ALREADY EXISTED in the project's source code BEFORE this campaign ran. Entities are the pre-existing infrastructure that concepts operate ON — e.g., a scheduler, dispatcher, router, queue, or gateway that was already in the codebase. If the campaign INTRODUCED or CREATED a component (like a new detector, a new algorithm, a new module), that is a **Concept**, NOT an entity. The test: "Did this exist in the codebase before the campaign started?" If yes → Entity. If no → Concept. Entities are NOT model profiles, workload configurations, benchmark inputs, hardware specs, or experiment design choices. Entities do NOT own parameters — only concepts do. |
| 147 | + |
| 148 | + **Include metadata at top level:** |
| 149 | + ```json |
| 150 | + { |
| 151 | + "campaign_name": "<campaign-name>", |
| 152 | + "date": "<campaign date>", |
| 153 | + "repo_path": "<target_system.repo_path from campaign.yaml>", |
| 154 | + "system_name": "<target_system.name> — <target_system.description>", |
| 155 | + "research_question": "<research_question from campaign.yaml>", |
| 156 | + "target_commit": "<runtime.target_commit from campaign.yaml, or null>", |
| 157 | + "target_repo": "<runtime.target_repo from campaign.yaml, or null>", |
| 158 | + "nous_version": "<runtime.nous_version from campaign.yaml, or null>", |
| 159 | + "started_at": "<runtime.started_at from campaign.yaml, or null>", |
| 160 | + "concepts": [...], |
| 161 | + "parameters": [...], |
| 162 | + "entities": [...] |
| 163 | + } |
| 164 | + ``` |
| 165 | + |
| 166 | + **Item schemas (MUST match exactly — the visualization script reads these field names):** |
| 167 | + |
| 168 | + ```json |
| 169 | + // Concept item: |
| 170 | + { |
| 171 | + "name": "Descriptive Name", |
| 172 | + "definition": "1-3 sentence explanation of what this is and how it works.", |
| 173 | + "principles": ["RP-1", "RP-7", "RP-10"], |
| 174 | + "operates_on": ["EntityName1", "EntityName2"], |
| 175 | + "parameters": ["paramName1", "paramName2"] |
| 176 | + } |
| 177 | + |
| 178 | + // Parameter item: |
| 179 | + { |
| 180 | + "name": "parameterName", |
| 181 | + "definition": "What this knob controls and its effect.", |
| 182 | + "principles": ["RP-7", "RP-10"], |
| 183 | + "parent_concept": "Concept Name That Owns This Parameter", |
| 184 | + "evolution": [ |
| 185 | + {"iter": "iter-3", "value": "0.1", "outcome": "confirmed", "note": "Eliminated false positives at rate=20"}, |
| 186 | + {"iter": "iter-10", "value": "0.05", "outcome": "confirmed", "note": "2.6% incremental critical gain"} |
| 187 | + ] |
| 188 | + } |
| 189 | + |
| 190 | + // Entity item: |
| 191 | + { |
| 192 | + "name": "ComponentName", |
| 193 | + "source": "path/to/file.go::TypeName", |
| 194 | + "definition": "What this pre-existing component does in the system.", |
| 195 | + "principles": ["RP-2", "RP-15"] |
| 196 | + } |
| 197 | + ``` |
| 198 | + |
| 199 | + **Relationship field requirements (explicit edges — authoritative for knowledge graph):** |
| 200 | + - Every concept MUST have `operates_on` (array of ≥1 entity name) and `parameters` (array of 0+ parameter names) |
| 201 | + - Every parameter MUST have `parent_concept` (string — exactly 1 concept name that owns this parameter) |
| 202 | + - Names in `operates_on` MUST exactly match names in this file's `entities` array |
| 203 | + - Names in `parameters` MUST exactly match names in this file's `parameters` array |
| 204 | + - The `parent_concept` value MUST exactly match a name in this file's `concepts` array |
| 205 | + - These relationship fields are the authoritative graph edges — `principles` arrays are supplementary (used for iteration-linking and cross-campaign principle queries) |
| 206 | + |
| 207 | + **Relationship integrity checklist (run mentally before writing the file):** |
| 208 | + 1. For each parameter P: can you name exactly one concept that P belongs to? If not, add the missing concept. |
| 209 | + 2. For each concept C: does C.parameters list every parameter in the file whose parent_concept == C.name? (Bidirectional consistency) |
| 210 | + 3. **Does any parameter name appear in MORE THAN ONE concept's `parameters` array?** This is always wrong. A parameter has one owner. If concept A introduced the knob and concept B merely uses it, only A lists it. |
| 211 | + 4. For each concept C: does every name in C.operates_on appear in the entities array? If not, add the missing entity or fix the name. |
| 212 | + 5. Is any parameter orphaned (not listed in ANY concept's `parameters` array)? Fix by adding it to its parent concept. |
| 213 | + 6. Is any entity unreachable (not referenced by ANY concept's `operates_on`)? Either add a concept that operates on it, or remove the entity. |
| 214 | + |
| 215 | + **Field name requirements (visualization contract):** |
| 216 | + - Use `definition` (NOT `description`) — displayed in tooltip and detail panel |
| 217 | + - Use `principles` (NOT `related_principles`) — array of RP-IDs that reference this item; used to compute graph edges between items and to connect items to iterations |
| 218 | + - For `evolution`: use `iter` (NOT `iteration`), `value`, `outcome` (lowercase status: "confirmed"/"refuted"/"partially_confirmed"/"baseline"), `note` (explanation text) |
| 219 | + - Every concept, parameter, AND entity MUST have a `principles` array — this is how the graph determines which items share connections and which iterations they belong to |
| 220 | + |
| 221 | + Guidelines: |
| 222 | + - Extract 5-15 concepts, 3-10 parameters, and 5-10 entities per campaign. |
| 223 | + - Only include parameters that were actively varied during the campaign. |
| 224 | + - Skip common industry terms (TTFT, LLM, GPU, p99, etc.). Focus on campaign-specific vocabulary. |
| 225 | + - Every active principle should be referenced by at least one concept, parameter, or entity. |
| 226 | + - The `evolution` array for parameters should include every iteration where the parameter's value was meaningfully varied. |
| 227 | + |
| 228 | + **Entity validation (mandatory):** After drafting the entity list, verify each entity is truly pre-existing — NOT something the campaign introduced. You cannot rely on git history (campaign code may not be committed). Instead, use these sources of truth: |
| 229 | + |
| 230 | + 1. **campaign.yaml is the ground truth for what pre-existed.** The `target_system.description` and any `reference_code_paths` describe the system AS IT WAS before the campaign. Components mentioned there are entities. |
| 231 | + 2. **Principles describe what the campaign CREATED.** Read principles.json — any algorithm, detector, technique, or module described as something the campaign implemented, discovered, or introduced is a Concept, never an Entity. If a principle says "we built X" or "X was introduced to improve Y", then X is a Concept. |
| 232 | + 3. **The litmus test:** Could you describe this component in a sentence that makes sense WITHOUT mentioning this campaign? "The gateway queue dispatches requests to instances" → Entity (it's infrastructure). "The TTFT slope detector fires when latency slope exceeds a threshold" → Concept (the campaign created it to test a hypothesis). |
| 233 | + |
| 234 | + Remove or reclassify as Concept anything that fails this check. For each validated entity, note which part of `target_system` in campaign.yaml references it. |
| 235 | + |
| 236 | + **Entity name grounding (mandatory):** Entity names MUST come from the actual source code, not from human-readable paraphrasing. For each validated entity, do a **single targeted search** in `<repo_path>` to find the primary type/class/struct name. Detect the language from file extensions in the repo and search accordingly: |
| 237 | + |
| 238 | + ```bash |
| 239 | + # Go: |
| 240 | + grep -r "type <YourGuess>" <repo_path> --include="*.go" -l |
| 241 | + # Python: |
| 242 | + grep -r "class <YourGuess>" <repo_path> --include="*.py" -l |
| 243 | + # Rust: |
| 244 | + grep -r "struct <YourGuess>\|impl <YourGuess>" <repo_path> --include="*.rs" -l |
| 245 | + # TypeScript/JavaScript: |
| 246 | + grep -r "class <YourGuess>\|interface <YourGuess>" <repo_path> --include="*.ts" --include="*.js" -l |
| 247 | + ``` |
| 248 | + |
| 249 | + - Use the **actual type name** from source as the entity `name` (e.g., `FlowControlFilter`, not "Gateway Queue") |
| 250 | + - Set `source` to `<relative-path>::<TypeName>` where `<relative-path>` is relative to `repo_path` (e.g., `pkg/epp/handlers/flowcontrol.go::FlowControlFilter`). The validator will check that `<repo_path>/<relative-path>` exists on disk — so the path must resolve to a real file. |
| 251 | + - If multiple types compose one logical entity, pick the primary orchestrating type |
| 252 | + - If you cannot find a matching type after 1-2 searches, use the best name from `campaign.yaml`'s `target_system` description and leave `source` as `null` |
| 253 | + |
| 254 | + **Scope guard:** This is a naming step, not a research step. Do NOT read function bodies, trace call graphs, or explore the codebase beyond finding the type declaration. Spend at most 1-2 searches per entity. |
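
The `source` convention above (`<relative-path>::<TypeName>`, or `null`) can be spot-checked before running the validator. The following is a minimal sketch only: `check_entity_source` is a hypothetical helper, not part of `validate_concepts.py`, and the final substring check for the type name is an extra assumption beyond the file-exists check described above.

```python
from pathlib import Path
from typing import Optional


def check_entity_source(repo_path: str, source: Optional[str]) -> bool:
    """Return True if `source` is null or resolves to a real file mentioning the type.

    Hypothetical helper for illustration; the real validator may behave differently.
    """
    if source is None:
        # A null source is allowed when no matching type declaration was found.
        return True
    rel_path, sep, type_name = source.partition("::")
    if not sep or not rel_path or not type_name:
        # Malformed value: the "::" separator or one of its sides is missing.
        return False
    file_path = Path(repo_path) / rel_path
    if not file_path.is_file():
        # The path must resolve to a real file relative to repo_path.
        return False
    # Assumption: also require the type name to appear somewhere in the file.
    text = file_path.read_text(encoding="utf-8", errors="ignore")
    return type_name in text
```

A call such as `check_entity_source(repo_path, "pkg/epp/handlers/flowcontrol.go::FlowControlFilter")` would return `False` if the path does not resolve under `repo_path`, which is a cheap signal to fix the entity before validation.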
| 255 | + |
| 256 | + **Graph validation (mandatory — must pass before proceeding):** After writing concepts.json, run: |
| 257 | + ```bash |
| 258 | + python scripts/validate_concepts.py ~/.nous/wiki/campaigns/<campaign-name>/concepts.json |
| 259 | + ``` |
| 260 | + If the script exits with errors, fix concepts.json and re-run until it passes. Common fixes: |
| 261 | + - "owned by multiple concepts" → remove the parameter from all but its true owner's `parameters` array |
| 262 | + - "orphaned parameter" → add it to the owning concept's `parameters` array |
| 263 | + - "unreachable entity" → either add an `operates_on` reference from a concept, or remove the entity |
| 264 | + - "unknown entity/parameter/concept" → fix the spelling to match exactly |
| 265 | + |
| 266 | + Do NOT proceed to step 10b until `validate_concepts.py` exits 0. |
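
For orientation, the ownership rules in the relationship integrity checklist can be expressed as a small standalone check. This is a sketch under stated assumptions: `check_ownership_graph` is a hypothetical function written for illustration, its messages only loosely mirror the error names listed above, and it is not the actual contents of `scripts/validate_concepts.py`.

```python
from collections import Counter


def check_ownership_graph(data: dict) -> list:
    """Return human-readable problems; an empty list means the graph is consistent."""
    problems = []
    concepts = {c["name"]: c for c in data.get("concepts", [])}
    parameters = {p["name"]: p for p in data.get("parameters", [])}
    entities = {e["name"] for e in data.get("entities", [])}

    # How many distinct concepts list each parameter in their `parameters` array.
    owners = Counter(
        p for c in concepts.values() for p in set(c.get("parameters", []))
    )
    referenced_entities = set()

    for name, concept in concepts.items():
        operates_on = concept.get("operates_on", [])
        if not operates_on:
            problems.append(f"concept {name!r} operates on no entity")
        for entity in operates_on:
            referenced_entities.add(entity)
            if entity not in entities:
                problems.append(f"unknown entity {entity!r} in concept {name!r}")
        for param in concept.get("parameters", []):
            if param not in parameters:
                problems.append(f"unknown parameter {param!r} in concept {name!r}")

    for name, param in parameters.items():
        parent = param.get("parent_concept")
        if parent not in concepts:
            problems.append(f"unknown concept {parent!r} for parameter {name!r}")
        elif name not in concepts[parent].get("parameters", []):
            problems.append(f"parameter {name!r} not listed by its parent {parent!r}")
        if owners[name] == 0:
            problems.append(f"orphaned parameter {name!r}")
        elif owners[name] > 1:
            problems.append(f"parameter {name!r} owned by multiple concepts")

    for entity in sorted(entities - referenced_entities):
        problems.append(f"unreachable entity {entity!r}")
    return problems
```

Running this on the parsed `concepts.json` (for example via `json.load`) before invoking the real validator gives a quick preview of likely failures; the validator script remains the authoritative gate.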
| 267 | + |
| 268 | + **b) `summaries.json`** — iteration summaries for the detail panel: |
| 269 | + ```json |
| 270 | + { |
| 271 | + "iter-0": { |
| 272 | + "what_was_tried": "<1-2 sentences: experimental setup>", |
| 273 | + "what_was_found": "<1-2 sentences: key result, include CONFIRMED/REFUTED/PARTIALLY_CONFIRMED>", |
| 274 | + "why_it_matters": "<1 sentence: significance for the campaign's evolution>" |
| 275 | + }, |
| 276 | + "iter-1": { ... }, |
| 277 | + ... |
| 278 | + } |
| 279 | + ``` |
| 280 | + Write a summary for EVERY iteration (including baseline). These appear in the side panel when a user clicks an iteration node. Keep concise but informative. |
| 281 | + |
| 282 | +11. **Generate visualization and open**: Only after ALL indexing steps (4-10) are complete, run the visualization script. The script reads insights from per-campaign JSON files. |
| 283 | + ```bash |
| 284 | + python scripts/visualize_campaign.py "<campaign_path>" \ |
| 285 | + --summaries ~/.nous/wiki/campaigns/<campaign-name>/summaries.json \ |
| 286 | + --concepts ~/.nous/wiki/campaigns/<campaign-name>/concepts.json |
| 287 | + ``` |
| 288 | + The script generates `~/.nous/wiki/viz/<campaign-name>.html` and opens it in the browser. |
| 289 | + |
| 290 | +12. **Report**: Print all output paths and confirm the visualization opened: |
| 291 | + - `~/.nous/wiki/campaigns/<name>/dead-ends.json` |
| 292 | + - `~/.nous/wiki/campaigns/<name>/frontiers.json` |
| 293 | + - `~/.nous/wiki/campaigns/<name>/interactions.json` |
| 294 | + - `~/.nous/wiki/campaigns/<name>/principles.json` |
| 295 | + - `~/.nous/wiki/campaigns/<name>/llm_metrics.jsonl` (only if it was copied in step 9) |
| 296 | + - `~/.nous/wiki/campaigns/<name>/summary.md` |
| 297 | + - `~/.nous/wiki/campaigns/<name>/concepts.json` |
| 298 | + - `~/.nous/wiki/campaigns/<name>/summaries.json` |
| 299 | + - `~/.nous/wiki/viz/<name>.html` |
| 300 | + |
| 301 | +## Important Rules |
| 302 | + |
| 303 | +- **Read-only inputs**: Never modify the campaign's own files (ledger.json, principles.json, etc.). |
| 304 | +- **Per-campaign isolation**: Each campaign's structured data lives in `~/.nous/wiki/campaigns/<name>/`. No shared markdown files. |
| 305 | +- **Idempotent**: If the campaign is already indexed (step 3 check), skip indexing and only regenerate the visualization (steps 11-12). |
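
The campaign discovery in step 1 and the idempotency check in step 3 could be sketched as follows. Both function names and the `wiki_root` parameter are hypothetical, introduced only for illustration; the directory layout (`.nous/<campaign>/` with `ledger.json` and `principles.json`, and `~/.nous/wiki/campaigns/<campaign-name>/concepts.json`) is taken from the steps above.

```python
from pathlib import Path
from typing import Optional


def find_campaign_dirs(*roots: Path) -> list:
    """Find directories under a `.nous/` path holding both ledger.json and principles.json."""
    found = set()
    for root in roots:
        if not root.is_dir():
            continue
        # pathlib's rglob also descends into dot-directories such as `.nous`.
        for ledger in root.rglob("ledger.json"):
            campaign = ledger.parent
            if ".nous" in campaign.parts and (campaign / "principles.json").is_file():
                found.add(campaign)
    return sorted(found)


def is_indexed(campaign_name: str, wiki_root: Optional[Path] = None) -> bool:
    """Step 3 check: a campaign counts as indexed once its concepts.json exists."""
    if wiki_root is None:
        wiki_root = Path.home() / ".nous" / "wiki"
    return (wiki_root / "campaigns" / campaign_name / "concepts.json").is_file()
```

For example, `find_campaign_dirs(Path.cwd(), Path.home() / "Downloads")` would list candidate campaigns to offer the user, and `is_indexed(campaign.name)` would decide whether to skip straight to the visualization step.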