Commit 6f22464

feat(nexus): TaskResult, preflight exit codes, ledger event, /why post-run, smoke evidence (WI-NEXUS-011..015) (#73)
Squash-merge of WI-NEXUS-011..015
1 parent 8b15ce3 commit 6f22464

13 files changed

Lines changed: 1253 additions & 20 deletions

.specsmith/ledger-chain.txt

Lines changed: 5 additions & 0 deletions
@@ -13,3 +13,8 @@ cdc3fb4815489052b6602f8965a51e1676816578587a2d56f2234021f3f045db
 a39dcae4c5b4338d2e0dc9f56418885cdf1efc6b582c09ede9d80ecd25368647
 d0e80ec48ee0a854345237c2fcb8f2ad112ff4f66dc7a6732926017501d85fb4
 8d94f66001aed55c2aef12c9e0c033cf9d76e373e25addb366659e08b54c8319
+414225ca221b7eb7d6cc52948accf3d77521e6d68616abb925094694d5e01971
+7f161e088d82fcde651a18df860bb2eb01f351c393e47a476da7db6017ed6668
+7125182a6d402b2e8022fee66cc10950e5734d21e4c40cd1410e9aaca303f5a2
+16eb05be2f953074e4e4c47efb0fd37e9000777db8ddedceee2eca165cf4d925
+01f963eb2078b1815be7cccff4d92dedf3b242c302ceba840f0eb489fe7a4f4a

.specsmith/requirements.json

Lines changed: 35 additions & 0 deletions
@@ -628,5 +628,40 @@
 "description": "`ARCHITECTURE.md`, `README.md`, and `docs/` must describe the natural-language broker (REQ-084), the `specsmith preflight` CLI (REQ-085), the REPL execution gate (REQ-086), and the bounded-retry harness (REQ-087), including the `/why` toggle and an end-to-end example flow. Documentation must not surface REQ/TEST/WI tokens to the user except inside the explicit `/why` block.",
 "source": "ARCHITECTURE.md",
 "status": "defined"
+},
+{
+"id": "REQ-091",
+"title": "Orchestrator Must Return a Structured TaskResult",
+"description": "`orchestrator.run_task` must return a `TaskResult` dataclass with at least the fields `equilibrium: bool`, `confidence: float`, `summary: str`, `files_changed: list[str]`, and `test_results: dict`. The Nexus REPL's broker branch must consume this dataclass directly when feeding `execute_with_governance` (REQ-087); the broker must not synthesize `equilibrium` from a boolean cast of the summary string.",
+"source": "ARCHITECTURE.md",
+"status": "defined"
+},
+{
+"id": "REQ-092",
+"title": "specsmith preflight CLI Must Use Decision-Specific Exit Codes",
+"description": "The `specsmith preflight` CLI must exit `0` for `accepted`, `2` for `needs_clarification`, and `3` for `blocked` or `rejected` decisions, so CI pipelines and shell wrappers can branch on intent without parsing the JSON payload. The JSON payload must continue to print on stdout for both success and non-zero exits.",
+"source": "ARCHITECTURE.md",
+"status": "defined"
+},
+{
+"id": "REQ-093",
+"title": "Accepted preflight Must Record a Ledger Event",
+"description": "When `specsmith preflight` produces an `accepted` decision and `LEDGER.md` exists in the project root, the CLI must append a `preflight` ledger event tagged with `REQ-085` plus the resolved `requirement_ids`. The event must record the utterance, the assigned `work_item_id`, and the `confidence_target`, so every accepted preflight is traceable end-to-end.",
+"source": "ARCHITECTURE.md",
+"status": "defined"
+},
+{
+"id": "REQ-094",
+"title": "/why Must Surface Post-Run Governance in the REPL",
+"description": "When `verbose_governance` is on (toggled by `/why` or `/show-governance`), after the REPL drives `execute_with_governance` for an accepted utterance it must print a single `[/why]` block summarizing the assigned `work_item_id`, the matched `requirement_ids` and `test_case_ids`, the post-run confidence, and whether the bounded-retry harness reached equilibrium. When verbose mode is off, the post-run governance block must not be emitted.",
+"source": "ARCHITECTURE.md",
+"status": "defined"
+},
+{
+"id": "REQ-095",
+"title": "Nexus Live Smoke Run Must Be Reproducible Evidence",
+"description": "A live or honestly-skipped invocation of `scripts/nexus_smoke.py` must be captured under `.specsmith/runs/WI-NEXUS-011/logs.txt` so the project ledger preserves at least one reproducible record of the broker -> preflight -> orchestrator -> vLLM end-to-end path (or a documented reason the live container could not be reached in the current environment).",
+"source": "ARCHITECTURE.md",
+"status": "defined"
 }
 ]
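
REQ-091 above names the fields of `TaskResult`, but the rendered diff does not show its definition. A minimal sketch, assuming a plain dataclass with empty defaults for the two collection fields; `run_task` here is a hypothetical stand-in that returns a fixed result, not the project's orchestrator:

```python
from dataclasses import dataclass, field


@dataclass
class TaskResult:
    """Structured outcome of one orchestrator run (field names from REQ-091)."""

    equilibrium: bool
    confidence: float
    summary: str
    # Defaults are an assumption of this sketch, not stated in the requirement.
    files_changed: list[str] = field(default_factory=list)
    test_results: dict = field(default_factory=dict)


def run_task(utterance: str) -> TaskResult:
    """Hypothetical stand-in for orchestrator.run_task."""
    return TaskResult(
        equilibrium=False,
        confidence=0.4,
        summary=f"attempted: {utterance}",
    )


result = run_task("add a health check endpoint")

# The pattern REQ-091 forbids: any non-empty summary casts to True,
# even though this run did not reach equilibrium.
wrong_equilibrium = bool(result.summary)

# The required pattern: read the field the orchestrator actually set.
right_equilibrium = result.equilibrium
```

The two final assignments disagree for this run, which is the failure mode the requirement is written to rule out.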
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+{
+"ok": false,
+"content": "",
+"latency_ms": 4078,
+"error": "transport: <urlopen error [WinError 10061] No connection could be made because the target machine actively refused it>"
+}
+
+# WI-NEXUS-011 evidence note
+# Captured 2026-04-27 on Windows pwsh (Docker 29.1.3 available, but the vLLM
+# l1-nexus container was not running). The smoke script (REQ-089) returned
+# the structured offline failure shown above. To produce a green live result,
+# run: docker compose up -d l1-nexus && py scripts/nexus_smoke.py.
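
The captured payload has four keys (`ok`, `content`, `latency_ms`, `error`) and an error string that prefixes a urllib message with `transport:`. `scripts/nexus_smoke.py` itself is not part of this rendered diff, so the following is only a sketch of how a probe could produce that shape. The function name `probe`, the JSON request body, and the presence of an empty `error` key on success are all assumptions:

```python
import json
import time
import urllib.error
import urllib.request


def probe(url: str, body: dict, timeout: float = 5.0) -> dict:
    """POST `body` as JSON to `url` and report the outcome as a flat dict."""
    started = time.monotonic()
    ok, content, error = False, "", ""
    try:
        request = urllib.request.Request(
            url,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            content = response.read().decode("utf-8")
        ok = True
    except (urllib.error.URLError, OSError) as exc:
        # Covers refused connections, DNS failures, timeouts, and HTTP errors.
        error = f"transport: {exc}"
    latency_ms = int((time.monotonic() - started) * 1000)
    return {"ok": ok, "content": content, "latency_ms": latency_ms, "error": error}
```

Catching the transport error and returning a dict, rather than letting the exception propagate, is what allows an offline run to be recorded as evidence instead of a traceback.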

.specsmith/runs/WI-NEXUS-015/diff.patch

Lines changed: 669 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+........................................................................ [ 29%]
+........................................................................ [ 58%]
+.................s...................................................... [ 87%]
+................................ [100%]
+247 passed, 1 skipped in 16.61s

.specsmith/testcases.json

Lines changed: 55 additions & 0 deletions
@@ -988,5 +988,60 @@
 "input": {},
 "expected_behavior": {},
 "confidence": 1.0
+},
+{
+"id": "TEST-091",
+"title": "Orchestrator.run_task Returns a Structured TaskResult",
+"description": "`orchestrator.run_task` must return a `TaskResult` instance whose attributes include `equilibrium`, `confidence`, `summary`, `files_changed`, and `test_results`. The REPL source must consume that result inside the executor closure (`result.equilibrium`, `result.confidence`) rather than computing equilibrium from `bool(summary)`.",
+"requirement_id": "REQ-091",
+"type": "unit",
+"verification_method": "pytest",
+"input": {},
+"expected_behavior": {},
+"confidence": 1.0
+},
+{
+"id": "TEST-092",
+"title": "specsmith preflight CLI Returns Decision-Specific Exit Codes",
+"description": "Invoking `specsmith preflight` over a tmp project must exit 0 for `accepted` decisions, 2 for `needs_clarification`, and 3 for `blocked`/`rejected`. The JSON payload must continue to be emitted on stdout regardless of exit code.",
+"requirement_id": "REQ-092",
+"type": "unit",
+"verification_method": "pytest",
+"input": {},
+"expected_behavior": {},
+"confidence": 1.0
+},
+{
+"id": "TEST-093",
+"title": "Accepted preflight Records a Ledger Event",
+"description": "When the preflight decision is `accepted` and `LEDGER.md` exists in the tmp project root, invoking the CLI must append a new ledger entry tagged with `REQ-085` and the matched `requirement_ids`. When the decision is `needs_clarification`, the ledger must not gain an entry.",
+"requirement_id": "REQ-093",
+"type": "unit",
+"verification_method": "pytest",
+"input": {},
+"expected_behavior": {},
+"confidence": 1.0
+},
+{
+"id": "TEST-094",
+"title": "/why Surfaces Post-Run Governance Block in REPL",
+"description": "Inspecting the REPL source must show a `[/why]` post-run block guarded by `verbose_governance` after `execute_with_governance` returns; the block must reference `work_item_id`, `requirement_ids`, `test_case_ids`, `confidence`, and `equilibrium`.",
+"requirement_id": "REQ-094",
+"type": "unit",
+"verification_method": "pytest",
+"input": {},
+"expected_behavior": {},
+"confidence": 1.0
+},
+{
+"id": "TEST-095",
+"title": "Nexus Live Smoke Evidence Captured",
+"description": "`.specsmith/runs/WI-NEXUS-011/logs.txt` must exist and document either a successful live smoke (`ok: true`) or an honest reason the live container could not be reached.",
+"requirement_id": "REQ-095",
+"type": "unit",
+"verification_method": "pytest",
+"input": {},
+"expected_behavior": {},
+"confidence": 1.0
 }
 ]
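
TEST-092 above pins three exit codes to four decisions. The CLI source is not shown on this page, so the mapping below is a sketch of the rule only. The `decision` payload key, the helper names, and the choice to raise on an unknown decision are assumptions:

```python
import json
import sys

# Decision -> process exit code, as stated in REQ-092 and TEST-092.
EXIT_CODES = {
    "accepted": 0,
    "needs_clarification": 2,
    "blocked": 3,
    "rejected": 3,
}


def exit_code_for(decision: str) -> int:
    """Return the exit code for a preflight decision.

    Raising on an unknown decision is an assumption of this sketch;
    the requirement does not say what should happen in that case.
    """
    try:
        return EXIT_CODES[decision]
    except KeyError:
        raise ValueError(f"unknown preflight decision: {decision!r}") from None


def emit(payload: dict) -> int:
    """Print the JSON payload to stdout, then hand back the exit code.

    Printing before exiting keeps the payload available for every decision,
    including the non-zero ones.
    """
    print(json.dumps(payload), file=sys.stdout)
    return exit_code_for(payload["decision"])
```

A command entry point would finish with something like `sys.exit(emit(payload))`, so a shell wrapper can branch on `$?` while still being able to parse stdout.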

LEDGER.md

Lines changed: 35 additions & 0 deletions
@@ -472,3 +472,38 @@ Phase 4: feature flags, instinct/learning, eval harness, agent memory, multi-age
 - **REQs affected**: REQ-090
 - **Status**: complete
 - **Chain hash**: `8d94f66001aed55c...`
+
+## 2026-04-27T19:42 — WI-NEXUS-011: Live l1-nexus smoke evidence captured (REQ-095)
+- **Author**: agent
+- **Type**: evidence
+- **REQs affected**: REQ-095
+- **Status**: complete
+- **Chain hash**: `414225ca221b7eb7...`
+
+## 2026-04-27T19:42 — WI-NEXUS-012: Structured TaskResult dataclass returned by orchestrator.run_task (REQ-091)
+- **Author**: agent
+- **Type**: feature
+- **REQs affected**: REQ-091
+- **Status**: complete
+- **Chain hash**: `7f161e088d82fcde...`
+
+## 2026-04-27T19:42 — WI-NEXUS-013: /why post-run governance block in REPL (REQ-094)
+- **Author**: agent
+- **Type**: feature
+- **REQs affected**: REQ-094
+- **Status**: complete
+- **Chain hash**: `7125182a6d402b2e...`
+
+## 2026-04-27T19:42 — WI-NEXUS-014: specsmith preflight CLI decision-specific exit codes (REQ-092)
+- **Author**: agent
+- **Type**: feature
+- **REQs affected**: REQ-092
+- **Status**: complete
+- **Chain hash**: `16eb05be2f953074...`
+
+## 2026-04-27T19:42 — WI-NEXUS-015: Accepted preflight records preflight ledger event (REQ-093)
+- **Author**: agent
+- **Type**: feature
+- **REQs affected**: REQ-093
+- **Status**: complete
+- **Chain hash**: `01f963eb2078b181...`

REQUIREMENTS.md

Lines changed: 35 additions & 0 deletions
@@ -630,3 +630,38 @@
 - **Source:** ARCHITECTURE.md
 - **Status:** defined

+## 91. Orchestrator Must Return a Structured TaskResult
+- **ID:** REQ-091
+- **Title:** Orchestrator Must Return a Structured TaskResult
+- **Description:** `orchestrator.run_task` must return a `TaskResult` dataclass with at least the fields `equilibrium: bool`, `confidence: float`, `summary: str`, `files_changed: list[str]`, and `test_results: dict`. The Nexus REPL's broker branch must consume this dataclass directly when feeding `execute_with_governance` (REQ-087); the broker must not synthesize `equilibrium` from a boolean cast of the summary string.
+- **Source:** ARCHITECTURE.md
+- **Status:** defined
+
+## 92. specsmith preflight CLI Must Use Decision-Specific Exit Codes
+- **ID:** REQ-092
+- **Title:** specsmith preflight CLI Must Use Decision-Specific Exit Codes
+- **Description:** The `specsmith preflight` CLI must exit `0` for `accepted`, `2` for `needs_clarification`, and `3` for `blocked` or `rejected` decisions, so CI pipelines and shell wrappers can branch on intent without parsing the JSON payload. The JSON payload must continue to print on stdout for both success and non-zero exits.
+- **Source:** ARCHITECTURE.md
+- **Status:** defined
+
+## 93. Accepted preflight Must Record a Ledger Event
+- **ID:** REQ-093
+- **Title:** Accepted preflight Must Record a Ledger Event
+- **Description:** When `specsmith preflight` produces an `accepted` decision and `LEDGER.md` exists in the project root, the CLI must append a `preflight` ledger event tagged with `REQ-085` plus the resolved `requirement_ids`. The event must record the utterance, the assigned `work_item_id`, and the `confidence_target`, so every accepted preflight is traceable end-to-end.
+- **Source:** ARCHITECTURE.md
+- **Status:** defined
+
+## 94. /why Must Surface Post-Run Governance in the REPL
+- **ID:** REQ-094
+- **Title:** /why Must Surface Post-Run Governance in the REPL
+- **Description:** When `verbose_governance` is on (toggled by `/why` or `/show-governance`), after the REPL drives `execute_with_governance` for an accepted utterance it must print a single `[/why]` block summarizing the assigned `work_item_id`, the matched `requirement_ids` and `test_case_ids`, the post-run confidence, and whether the bounded-retry harness reached equilibrium. When verbose mode is off, the post-run governance block must not be emitted.
+- **Source:** ARCHITECTURE.md
+- **Status:** defined
+
+## 95. Nexus Live Smoke Run Must Be Reproducible Evidence
+- **ID:** REQ-095
+- **Title:** Nexus Live Smoke Run Must Be Reproducible Evidence
+- **Description:** A live or honestly-skipped invocation of `scripts/nexus_smoke.py` must be captured under `.specsmith/runs/WI-NEXUS-011/logs.txt` so the project ledger preserves at least one reproducible record of the broker -> preflight -> orchestrator -> vLLM end-to-end path (or a documented reason the live container could not be reached in the current environment).
+- **Source:** ARCHITECTURE.md
+- **Status:** defined
+
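
REQ-094 above describes a post-run block that appears only when `verbose_governance` is on. The REPL source is not rendered on this page, so the following is a sketch of the gating and the five required items. The function names, the dict-shaped decision, and the exact text layout are assumptions:

```python
from typing import Optional


def render_why_block(decision: dict, confidence: float, equilibrium: bool) -> str:
    """Build one `[/why]` block from the preflight decision and the run outcome."""
    lines = [
        "[/why]",
        f"  work_item_id:    {decision['work_item_id']}",
        f"  requirement_ids: {', '.join(decision['requirement_ids'])}",
        f"  test_case_ids:   {', '.join(decision['test_case_ids'])}",
        f"  confidence:      {confidence:.2f}",
        f"  equilibrium:     {equilibrium}",
    ]
    return "\n".join(lines)


def post_run_governance(
    verbose_governance: bool,
    decision: dict,
    confidence: float,
    equilibrium: bool,
) -> Optional[str]:
    """Print the block when verbose mode is on; emit nothing otherwise."""
    if not verbose_governance:
        return None
    block = render_why_block(decision, confidence, equilibrium)
    print(block)
    return block
```

Keeping the rendering separate from the gate makes the "off means silent" half of the requirement a one-line check, and lets a test compare the rendered string without capturing stdout.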

TESTS.md

Lines changed: 55 additions & 0 deletions
@@ -990,3 +990,58 @@
 - **Expected Behavior:** Each file mentions the broker concept, the preflight CLI, the gate, and the `/why` toggle.
 - **Confidence:** 1.0

+## TEST-091. Orchestrator.run_task Returns a Structured TaskResult
+- **ID:** TEST-091
+- **Title:** Orchestrator.run_task Returns a Structured TaskResult
+- **Description:** `orchestrator.run_task` must return a `TaskResult` instance whose attributes include `equilibrium`, `confidence`, `summary`, `files_changed`, and `test_results`. The REPL source must consume that result inside the executor closure (`result.equilibrium`, `result.confidence`) rather than computing equilibrium from `bool(summary)`.
+- **Requirement ID:** REQ-091
+- **Type:** unit
+- **Verification Method:** pytest
+- **Input:** orchestrator module + REPL source
+- **Expected Behavior:** TaskResult dataclass exposes the required fields; REPL source references `result.equilibrium` and `result.confidence`.
+- **Confidence:** 1.0
+
+## TEST-092. specsmith preflight CLI Returns Decision-Specific Exit Codes
+- **ID:** TEST-092
+- **Title:** specsmith preflight CLI Returns Decision-Specific Exit Codes
+- **Description:** Invoking `specsmith preflight` over a tmp project must exit 0 for `accepted` decisions, 2 for `needs_clarification`, and 3 for `blocked`/`rejected`. The JSON payload must continue to be emitted on stdout regardless of exit code.
+- **Requirement ID:** REQ-092
+- **Type:** unit
+- **Verification Method:** pytest
+- **Input:** click.testing.CliRunner over isolated tmp_path
+- **Expected Behavior:** Exit code matches the decision; JSON parses on stdout for both 0 and 2 exits.
+- **Confidence:** 1.0
+
+## TEST-093. Accepted preflight Records a Ledger Event
+- **ID:** TEST-093
+- **Title:** Accepted preflight Records a Ledger Event
+- **Description:** When the preflight decision is `accepted` and `LEDGER.md` exists in the tmp project root, invoking the CLI must append a new ledger entry tagged with `REQ-085` and the matched `requirement_ids`. When the decision is `needs_clarification`, the ledger must not gain an entry.
+- **Requirement ID:** REQ-093
+- **Type:** unit
+- **Verification Method:** pytest
+- **Input:** click.testing.CliRunner with seeded LEDGER.md
+- **Expected Behavior:** LEDGER.md grows on accept; LEDGER.md unchanged on needs_clarification.
+- **Confidence:** 1.0
+
+## TEST-094. /why Surfaces Post-Run Governance Block in REPL
+- **ID:** TEST-094
+- **Title:** /why Surfaces Post-Run Governance Block in REPL
+- **Description:** Inspecting the REPL source must show a `[/why]` post-run block guarded by `verbose_governance` after `execute_with_governance` returns; the block must reference `work_item_id`, `requirement_ids`, `test_case_ids`, `confidence`, and `equilibrium`.
+- **Requirement ID:** REQ-094
+- **Type:** unit
+- **Verification Method:** pytest
+- **Input:** repl module source
+- **Expected Behavior:** Source contains a `[/why]` block guarded by verbose_governance and referencing the required keys.
+- **Confidence:** 1.0
+
+## TEST-095. Nexus Live Smoke Evidence Captured
+- **ID:** TEST-095
+- **Title:** Nexus Live Smoke Evidence Captured
+- **Description:** `.specsmith/runs/WI-NEXUS-011/logs.txt` must exist and document either a successful live smoke (`ok: true`) or an honest reason the live container could not be reached.
+- **Requirement ID:** REQ-095
+- **Type:** unit
+- **Verification Method:** pytest
+- **Input:** .specsmith/runs/WI-NEXUS-011/logs.txt
+- **Expected Behavior:** Log file present and non-empty; mentions either ok=true/false from the smoke script.
+- **Confidence:** 1.0
+
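
TEST-095 above reduces to three checks on the evidence file: it exists, it is non-empty, and it mentions an `ok` value of true or false. The actual pytest module is not shown on this page; the helper below is a sketch of those checks, and both its name and the regular expression used to spot the `ok` flag are assumptions:

```python
import re
from pathlib import Path

# Matches `"ok": false` (JSON), `ok: true`, or `ok=true`, case-insensitively.
OK_FLAG = re.compile(r'"?ok"?\s*[:=]\s*(true|false)', re.IGNORECASE)


def evidence_is_valid(log_path: Path) -> bool:
    """Return True if the smoke evidence file is present, non-empty, and states ok."""
    if not log_path.is_file():
        return False
    text = log_path.read_text(encoding="utf-8")
    if not text.strip():
        return False
    return OK_FLAG.search(text) is not None
```

In a test this would be called as `evidence_is_valid(Path(".specsmith/runs/WI-NEXUS-011/logs.txt"))` from the project root. Note that the check accepts an honest offline failure (`"ok": false`) just as readily as a green run, which is what the requirement intends.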
