
Commit 591bf14

release specx v0.4.0 execution results
1 parent c8e65e7 commit 591bf14

18 files changed

Lines changed: 963 additions & 20 deletions

.codex-plugin/plugin.json

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
 {
   "name": "specx-codex-plugin",
-  "version": "0.3.0",
+  "version": "0.4.0",
   "description": "Universal spec-driven agent governance and execution contract runtime for Codex.",
   "license": "MIT",
   "author": {
@@ -29,8 +29,8 @@
     "websiteURL": "https://github.com/BTCNAI/specx-codex-plugin",
     "defaultPrompt": [
       "Use SpecX init to create a governed v0.1 contract.",
-      "Use SpecX verify to fail closed on missing gates, evidence, or failure semantics.",
-      "Use SpecX to explain the execution contract before workflow execution."
+      "Use SpecX verify-result to check execution evidence against a contract.",
+      "Use SpecX verify to fail closed on missing gates, evidence, or failure semantics."
     ],
     "brandColor": "#2563EB"
   },

.github/workflows/test.yml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+name: test
+
+on:
+  push:
+  pull_request:
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+      - name: Install dependencies
+        run: python3 -m pip install -r requirements.txt pytest
+      - name: Run tests
+        run: python3 -m pytest
+      - name: Validate contract
+        run: python3 scripts/specx_cli.py validate templates/research.contract.json
+      - name: Compile contract
+        run: python3 scripts/specx_cli.py compile templates/research.contract.json
+      - name: Verify contract
+        run: python3 scripts/specx_cli.py verify templates/research.contract.json
+      - name: Verify passed execution result
+        run: python3 scripts/specx_cli.py verify-result examples/sample_execution_result_passed.json --contract templates/research.contract.json
+      - name: Verify failed execution result semantics
+        run: |
+          python3 scripts/specx_cli.py verify-result examples/sample_execution_result_failed.json --contract templates/research.contract.json > /tmp/specx_failed_result.json
+          python3 - <<'PY'
+          import json
+          from pathlib import Path
+          payload = json.loads(Path("/tmp/specx_failed_result.json").read_text())
+          assert payload["ok"] is True
+          assert payload["result"]["execution_status"] == "failed"
+          PY
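
The last workflow step separates two questions: whether the result record verifies (`ok`) and how the workflow actually ended (`result.execution_status`). The sketch below shows one way a caller might keep those apart. It assumes only the payload shape implied by the two assertions in the workflow; the function name `classify_verification` and its return labels are hypothetical and not part of SpecX.

```python
import json

KNOWN_STATUSES = ("passed", "failed", "blocked")


def classify_verification(raw_output: str) -> str:
    """Summarize a verify-result payload (hypothetical helper).

    Assumes a top-level ``ok`` flag and ``result.execution_status``,
    as implied by the CI assertions. All other fields are ignored.
    """
    payload = json.loads(raw_output)
    if payload.get("ok") is not True:
        # Protocol verification failed: the record is incomplete or
        # does not line up with its contract.
        return "invalid_result"
    status = payload.get("result", {}).get("execution_status")
    if status not in KNOWN_STATUSES:
        return "invalid_result"
    # A well-formed failed or blocked record verifies cleanly, but it
    # must never be reported as a successful execution.
    return f"verified_{status}"


sample = json.dumps({"ok": True, "result": {"execution_status": "failed"}})
print(classify_verification(sample))  # verified_failed
```

Keeping the two signals separate is what lets CI accept the failed sample as a valid fixture without treating the underlying workflow as a success.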

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -1,5 +1,14 @@
 # Changelog

+## 0.4.0
+
+- Added Execution Result Schema v0.1 at `schemas/specx_execution_result_v0_1.schema.json`.
+- Added `specx verify-result` to verify execution results against their governing contract.
+- Added MCP `specx.verify_result` with the same implementation as CLI `verify-result`.
+- Added passed, failed, and blocked execution-result samples.
+- Added GitHub Actions CI for pytest, contract validation, compilation, verification, and result verification.
+- Added tests for execution-result schema and fail-closed result verification.
+
 ## 0.3.0

 - Added Contract Schema v0.1 at `schemas/specx_contract_v0_1.schema.json`.

README.md

Lines changed: 17 additions & 1 deletion
@@ -17,7 +17,7 @@ codex plugin marketplace add BTCNAI/specx-codex-marketplace
 Or pin the stable release:

 ```bash
-codex plugin marketplace add https://github.com/BTCNAI/specx-codex-marketplace.git --ref v0.3.0
+codex plugin marketplace add https://github.com/BTCNAI/specx-codex-marketplace.git --ref v0.4.0
 ```

 ## What It Provides
@@ -46,6 +46,7 @@ python3 scripts/specx_cli.py init --template research --output ./specx.contract.
 python3 scripts/specx_cli.py init --template software_refactor --output ./specx.contract.json
 python3 scripts/specx_cli.py init --template content_pipeline --output ./specx.contract.json
 python3 scripts/specx_cli.py verify ./specx.contract.json
+python3 scripts/specx_cli.py verify-result ./specx.execution_result.json --contract ./specx.contract.json
 python3 scripts/specx_cli.py validate examples/demo_software_engineering_contract.json
 python3 scripts/specx_cli.py compile examples/demo_software_engineering_contract.json
 python3 scripts/specx_cli.py explain examples/demo_software_engineering_contract.json
@@ -56,6 +57,7 @@ Included MCP tools:
 - `specx.validate`
 - `specx.compile`
 - `specx.verify`
+- `specx.verify_result`
 - `specx.explain`
 - `specx.init`

@@ -88,6 +90,18 @@ Each gate must include `gate_id`, `condition`, `on_pass`, and `on_failure`. `fai

 If required evidence, tools, specs, gates, artifacts, or verification policy are missing, the workflow must return `ok=false` with `failure_state` and `details`. It must not claim success.

+## Execution Result Shape
+
+Contracts are pre-execution constraints. Execution results are post-execution acceptance records.
+
+`verify-result` checks that a result has matching `contract_id`, gate results for every contract gate, artifacts for every expected artifact, explicit status, failure semantics checks, and correct `failure_state` behavior.
+
+Execution statuses:
+
+- `passed`: every required gate and artifact is present; `failure_state` must be null or empty.
+- `failed`: execution failed; `failure_state` must be explicit.
+- `blocked`: execution stopped before completion; `failure_state` must be explicit.
+
 ## Demos

 Demo 1: Software engineering
@@ -116,13 +130,15 @@ Demo 3: Multi-agent system
 - `docs/comparison.md`
 - `docs/mcp-tools.md`
 - `docs/contract-schema-v0.1.md`
+- `docs/execution-result-v0.1.md`
 - `docs/cli.md`

 ## Roadmap

 P0:

 - Contract schema v0.1 adoption feedback.
+- Execution result adoption feedback.
 - MCP integration tests against real Codex marketplace install.

 P1:

docs/cli.md

Lines changed: 10 additions & 0 deletions
@@ -22,6 +22,16 @@ python3 scripts/specx_cli.py verify ./specx.contract.json

 Failure returns `ok=false`, `failure_state`, and structured `details`.

+## specx verify-result
+
+Verify an execution result against the governing contract:
+
+```bash
+python3 scripts/specx_cli.py verify-result ./specx.execution_result.json --contract ./specx.contract.json
+```
+
+`verify-result` checks the execution result schema, contract id, expected artifacts, gate results, verification checks, and failure-state semantics. A complete `failed` or `blocked` result can pass protocol verification while still reporting `execution_status: failed` or `execution_status: blocked`.
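
For callers that want to drive this command from Python rather than a shell, a minimal sketch follows. It uses only the invocation documented above; the helper names are hypothetical, and the assumption that the CLI prints a JSON payload on stdout is inferred from the CI workflow, which redirects stdout to a file and parses it as JSON. The exit code for each outcome is not documented here, so the sketch returns it without interpreting it.

```python
import json
import subprocess
import sys


def build_verify_result_command(result_path, contract_path,
                                cli_path="scripts/specx_cli.py"):
    """Return the argv list for the documented verify-result call."""
    return [
        sys.executable,
        str(cli_path),
        "verify-result",
        str(result_path),
        "--contract",
        str(contract_path),
    ]


def run_verify_result(result_path, contract_path):
    """Run the CLI and parse stdout as JSON (assumed output format)."""
    completed = subprocess.run(
        build_verify_result_command(result_path, contract_path),
        capture_output=True,
        text=True,
        check=False,  # do not raise: the caller inspects the payload
    )
    return completed.returncode, json.loads(completed.stdout)
```

A caller would typically check `payload["ok"]` first and only then read the reported execution status, so that a verified `failed` record is not mistaken for a verification error.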
+
 ## Other Commands

 ```bash

docs/execution-result-v0.1.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+# Execution Result v0.1
+
+The contract is the pre-execution constraint. The execution result is the post-execution acceptance record.
+
+`verify-result` checks that an execution result is complete and aligned with its governing contract. It does not turn a failed or blocked workflow into success.
+
+## Required Fields
+
+- `result_id`
+- `schema_version`: must be `"0.1"`
+- `contract_id`
+- `status`: `passed`, `failed`, or `blocked`
+- `executed_agents`
+- `tool_calls`
+- `evidence_collected`
+- `gate_results`
+- `artifacts`
+- `verification_results`
+- `failure_state`
+- `risk_notes`
+
+## Gate Results
+
+Each `gate_results[]` item must include:
+
+- `gate_id`
+- `status`
+- `evidence`
+- `details`
+
+Every contract gate must have a matching gate result.
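
The field lists above can be expressed as a small structural check. This is a hand-rolled sketch for illustration only: the commit ships a JSON Schema for this purpose, and the function name `find_structural_problems` and its message strings are hypothetical.

```python
REQUIRED_FIELDS = (
    "result_id", "schema_version", "contract_id", "status",
    "executed_agents", "tool_calls", "evidence_collected",
    "gate_results", "artifacts", "verification_results",
    "failure_state", "risk_notes",
)
GATE_RESULT_FIELDS = ("gate_id", "status", "evidence", "details")


def find_structural_problems(result: dict) -> list[str]:
    """List missing top-level fields and incomplete gate results."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if name not in result
    ]
    if result.get("schema_version") != "0.1":
        problems.append('schema_version must be "0.1"')
    for index, gate in enumerate(result.get("gate_results", [])):
        if not isinstance(gate, dict):
            problems.append(f"gate_results[{index}] is not an object")
            continue
        for name in GATE_RESULT_FIELDS:
            if name not in gate:
                problems.append(
                    f"gate_results[{index}] missing field: {name}"
                )
    return problems
```

Returning a list of problems rather than raising on the first one keeps the report complete, which suits a fail-closed verifier that has to explain every gap.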
+
+## Status Semantics
+
+- `passed`: all required contract gates and artifacts are present; `failure_state` must be null or empty.
+- `failed`: execution failed; `failure_state` must be non-empty.
+- `blocked`: execution stopped because a required gate, evidence item, tool, or approval was unavailable; `failure_state` must be non-empty.
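
These three rules reduce to a single relationship between `status` and `failure_state`. A minimal sketch, assuming "empty" means any falsy value (null, empty string, empty collection); the function name is hypothetical and this is not the shipped implementation.

```python
def check_status_semantics(result: dict) -> list[str]:
    """Check that failure_state agrees with the declared status."""
    status = result.get("status")
    has_failure_state = bool(result.get("failure_state"))
    if status == "passed":
        if has_failure_state:
            return ["passed result must not carry a failure_state"]
        return []
    if status in ("failed", "blocked"):
        if not has_failure_state:
            return [f"{status} result requires a non-empty failure_state"]
        return []
    # Anything outside the three documented statuses is rejected.
    return [f"unknown status: {status!r}"]


print(check_status_semantics({"status": "blocked", "failure_state": ""}))
# ['blocked result requires a non-empty failure_state']
```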
+
+## Fake Success Prevention
+
+`verify-result` prevents fake success by checking:
+
+- `contract_id` matches the contract.
+- Every contract gate has a result.
+- Every `expected_artifacts` item is represented.
+- `passed` results do not carry a failure state.
+- `failed` and `blocked` results have an explicit failure state.
+- `verification_results` includes `no_fake_success`, `no_silent_fallback`, and `explicit_failure_state`.
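
The contract-facing checks in this list (id match, gate coverage, artifact coverage, required verification checks) can be sketched as set comparisons. The contract layout used here is an assumption: a `gates` list of objects with `gate_id`, and `expected_artifacts` as a list of artifact id strings. Only `contract_id`, `gate_id`, and `expected_artifacts` are named in the documentation; the rest is hypothetical, as is the function name.

```python
REQUIRED_CHECKS = (
    "no_fake_success", "no_silent_fallback", "explicit_failure_state",
)


def check_against_contract(result: dict, contract: dict) -> list[str]:
    """Compare a result record with its contract (assumed layout)."""
    problems = []
    if result.get("contract_id") != contract.get("contract_id"):
        problems.append("contract_id does not match the contract")

    reported_gates = {g.get("gate_id") for g in result.get("gate_results", [])}
    for gate in contract.get("gates", []):  # "gates" is an assumed key
        if gate.get("gate_id") not in reported_gates:
            problems.append(f"missing gate result: {gate.get('gate_id')}")

    reported = {a.get("artifact_id") for a in result.get("artifacts", [])}
    for artifact_id in contract.get("expected_artifacts", []):  # assumed ids
        if artifact_id not in reported:
            problems.append(f"missing artifact: {artifact_id}")

    checks = result.get("verification_results", {})
    for name in REQUIRED_CHECKS:
        if name not in checks:
            problems.append(f"verification_results missing check: {name}")
    return problems


contract = {
    "contract_id": "research-contract-001",
    "gates": [{"gate_id": "source_gate"}, {"gate_id": "decision_packet_gate"}],
    "expected_artifacts": ["research_plan"],
}
result = {
    "contract_id": "research-contract-001",
    "gate_results": [{"gate_id": "source_gate"}],
    "artifacts": [{"artifact_id": "research_plan"}],
    "verification_results": {name: True for name in REQUIRED_CHECKS},
}
print(check_against_contract(result, contract))
# ['missing gate result: decision_packet_gate']
```

A result that silently omits a gate is reported as a problem even though every gate it does mention looks fine, which is the behavior the list above describes.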
+
+## Samples
+
+- `examples/sample_execution_result_passed.json`
+- `examples/sample_execution_result_failed.json`
+- `examples/sample_execution_result_blocked.json`
+
+Samples are protocol fixtures. They are not proof of real external execution.

docs/mcp-tools.md

Lines changed: 5 additions & 0 deletions
@@ -5,6 +5,7 @@ SpecX exposes four MCP tools:
 - `specx.validate`
 - `specx.compile`
 - `specx.verify`
+- `specx.verify_result`
 - `specx.explain`
 - `specx.init`

@@ -53,6 +54,10 @@ Compiles a valid contract into a governed execution plan with agents, tools, evi

 Fails closed when gates, expected artifacts, required agents, required tools, or `no_fake_success` / `no_silent_fallback` constraints are missing.

+### specx.verify_result
+
+Verifies a SpecX execution result against its governing contract. It checks `contract_id`, gate coverage, artifact coverage, verification checks, and status/failure-state semantics.
+
 ### specx.init

 Creates a verified Contract Schema v0.1 skeleton from `research`, `software_refactor`, or `content_pipeline`.

docs/quickstart.md

Lines changed: 8 additions & 1 deletion
@@ -11,7 +11,7 @@ codex plugin marketplace add BTCNAI/specx-codex-marketplace
 Pinned install:

 ```bash
-codex plugin marketplace add https://github.com/BTCNAI/specx-codex-marketplace.git --ref v0.3.0
+codex plugin marketplace add https://github.com/BTCNAI/specx-codex-marketplace.git --ref v0.4.0
 ```

 ## Initialize A Contract
@@ -34,6 +34,12 @@ python3 scripts/specx_cli.py init --template content_pipeline --output ./specx.c
 python3 scripts/specx_cli.py verify ./specx.contract.json
 ```

+## Verify An Execution Result
+
+```bash
+python3 scripts/specx_cli.py verify-result examples/sample_execution_result_passed.json --contract templates/research.contract.json
+```
+
 Expected success:

 ```json
@@ -70,6 +76,7 @@ The current main branch exposes MCP tools:
 - `specx.validate`
 - `specx.compile`
 - `specx.verify`
+- `specx.verify_result`
 - `specx.explain`

 See `docs/mcp-tools.md`.
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+{
+  "result_id": "sample-result-blocked-001",
+  "schema_version": "0.1",
+  "contract_id": "research-contract-001",
+  "status": "blocked",
+  "executed_agents": [
+    {
+      "agent_id": "research_planner",
+      "status": "blocked"
+    },
+    {
+      "agent_id": "evidence_reviewer",
+      "status": "blocked"
+    }
+  ],
+  "tool_calls": [
+    {
+      "tool_id": "source_reader",
+      "status": "blocked"
+    },
+    {
+      "tool_id": "citation_tracker",
+      "status": "blocked"
+    },
+    {
+      "tool_id": "artifact_writer",
+      "status": "blocked"
+    }
+  ],
+  "evidence_collected": [
+    "source_list",
+    "claim_to_source_map",
+    "risk_register"
+  ],
+  "gate_results": [
+    {
+      "gate_id": "source_gate",
+      "status": "blocked",
+      "evidence": [
+        "source_list"
+      ],
+      "details": "Claim-to-source mapping was incomplete, so execution stopped at the source gate."
+    },
+    {
+      "gate_id": "decision_packet_gate",
+      "status": "blocked",
+      "evidence": [
+        "risk_register"
+      ],
+      "details": "Decision packet release remained blocked because the source gate did not pass."
+    }
+  ],
+  "artifacts": [
+    {
+      "artifact_id": "research_plan",
+      "status": "produced"
+    },
+    {
+      "artifact_id": "evidence_matrix",
+      "status": "blocked"
+    },
+    {
+      "artifact_id": "decision_packet",
+      "status": "blocked"
+    }
+  ],
+  "verification_results": {
+    "no_fake_success": true,
+    "no_silent_fallback": true,
+    "explicit_failure_state": true
+  },
+  "failure_state": "blocked_missing_claim_to_source_map",
+  "risk_notes": [
+    "Blocked result documents the missing evidence instead of silently falling back."
+  ]
+}
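
A consumer of a blocked record usually wants to know what stopped and which outputs are missing. The sketch below pulls that out of a record shaped like the sample above. The helper name `summarize_blockers` is hypothetical, and the inline data is a trimmed copy of the sample, not a separate fixture.

```python
def summarize_blockers(result: dict) -> dict:
    """Collect the failure state plus every blocked gate and artifact."""
    return {
        "failure_state": result.get("failure_state"),
        "blocked_gates": [
            gate["gate_id"]
            for gate in result.get("gate_results", [])
            if gate.get("status") == "blocked"
        ],
        "blocked_artifacts": [
            artifact["artifact_id"]
            for artifact in result.get("artifacts", [])
            if artifact.get("status") == "blocked"
        ],
    }


# Trimmed copy of the blocked sample shown above.
sample = {
    "status": "blocked",
    "failure_state": "blocked_missing_claim_to_source_map",
    "gate_results": [
        {"gate_id": "source_gate", "status": "blocked"},
        {"gate_id": "decision_packet_gate", "status": "blocked"},
    ],
    "artifacts": [
        {"artifact_id": "research_plan", "status": "produced"},
        {"artifact_id": "evidence_matrix", "status": "blocked"},
        {"artifact_id": "decision_packet", "status": "blocked"},
    ],
}
summary = summarize_blockers(sample)
print(summary["blocked_gates"])      # ['source_gate', 'decision_packet_gate']
print(summary["blocked_artifacts"])  # ['evidence_matrix', 'decision_packet']
```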
