Commit 9ba42cf

Validate JSON CLI flows in CI
1 parent 2e4b0b9 commit 9ba42cf

2 files changed

Lines changed: 34 additions & 0 deletions

.github/workflows/ci.yml

Lines changed: 18 additions & 0 deletions
@@ -26,8 +26,12 @@ jobs:
 python examples/langgraph_integration.py
 python examples/openai_agents_integration.py
 agentci summarize examples/openai_agents_episode.json
+agentci summarize examples/openai_agents_episode.json --json > /tmp/agentci-summary.json
+python -c "import json; data=json.load(open('/tmp/agentci-summary.json')); assert data['episode_id'] == 'openai-agents-demo'; assert data['tool_calls'] >= 1"
 agentci diff-html examples/math_episode.json examples/math_episode_candidate.json examples/math_diff.html
 agentci assert-regression examples/math_episode.json examples/math_episode_latency_candidate.json --ignore-diff-prefix metric:latency_ms
+agentci assert-regression examples/math_episode.json examples/math_episode_latency_candidate.json --ignore-diff-prefix metric:latency_ms --json > /tmp/agentci-regression.json
+python -c "import json; data=json.load(open('/tmp/agentci-regression.json')); assert data['passed'] is True"
 agentci detect-flaky examples/math_episode.json examples/math_episode_latency_candidate.json examples/math_episode_candidate.json
 
tracepack:
@@ -48,8 +52,12 @@ jobs:
 run: |
 python examples/make_sample_episodes.py
 tracepack scan examples/source_episodes
+tracepack scan examples/source_episodes --json > /tmp/tracepack-scan.json
+python -c "import json; data=json.load(open('/tmp/tracepack-scan.json')); assert data['episode_count'] == 3; assert data['failures'] == 1"
 tracepack build examples/source_episodes examples/demo_pack --only-failures --redact --max-per-signature 1
 tracepack inspect examples/demo_pack
+tracepack inspect examples/demo_pack --json > /tmp/tracepack-inspect.json
+python -c "import json; data=json.load(open('/tmp/tracepack-inspect.json')); assert data['case_count'] == 1; assert data['redacted'] is True"
 tracepack export-jsonl examples/demo_pack examples/demo_pack.jsonl
 tracepack export-chat examples/demo_pack examples/demo_chat.jsonl
 
@@ -71,15 +79,23 @@ jobs:
 run: |
 failmap cluster examples/sample_pack examples/clusters.json
 failmap summarize examples/clusters.json
+failmap summarize examples/clusters.json --json > /tmp/failmap-summary.json
+python -c "import json; data=json.load(open('/tmp/failmap-summary.json')); assert data['cluster_count'] >= 1; assert data['case_count'] >= 1"
 failmap markdown examples/clusters.json examples/report.md
 failmap compare examples/baseline_clusters.json examples/candidate_clusters.json examples/compare.json
 failmap compare-summary examples/compare.json
+failmap compare-summary examples/compare.json --json > /tmp/failmap-compare.json
+python -c "import json; data=json.load(open('/tmp/failmap-compare.json')); assert 'summary' in data; assert data['cluster_count'] >= 1"
 failmap compare-markdown examples/compare.json examples/compare.md
 failmap issue-drafts examples/compare.json examples/issues --rules examples/triage_rules.json
 failmap issue-bundle examples/issues examples/bundle
 failmap issue-bundle-summary examples/bundle/bundle.json
+failmap issue-bundle-summary examples/bundle/bundle.json --json > /tmp/failmap-bundle.json
+python -c "import json; data=json.load(open('/tmp/failmap-bundle.json')); assert data['draft_count'] >= 1"
 failmap trend examples/trends.json examples/baseline_clusters.json examples/candidate_clusters.json examples/release3_clusters.json
 failmap trend-summary examples/trends.json
+failmap trend-summary examples/trends.json --json > /tmp/failmap-trends.json
+python -c "import json; data=json.load(open('/tmp/failmap-trends.json')); assert data['snapshot_count'] == 3"
 failmap trend-markdown examples/trends.json examples/trends.md
 
 packslice:
@@ -100,4 +116,6 @@ jobs:
 run: |
 packslice split examples/sample_pack examples/split_demo --group-by signature --train-ratio 70 --eval-ratio 15 --test-ratio 15
 packslice summarize examples/split_demo
+packslice summarize examples/split_demo --json > /tmp/packslice-summary.json
+python -c "import json; data=json.load(open('/tmp/packslice-summary.json')); assert data['total_cases'] == 6; assert len(data['splits']) == 3"
 packslice markdown examples/split_demo examples/split_demo/REPORT.md

README.md

Lines changed: 16 additions & 0 deletions
@@ -54,6 +54,19 @@ FailMap -> cluster failures, compare releases, generate triage issues, bundle
 PackSlice -> split packs into balanced train/eval/test datasets
 ```
 
+## Machine-readable CLI story
+
+All four projects now support JSON-friendly CLI flows, so they can be chained in CI without scraping human text output:
+
+```bash
+agentci summarize projects/agentci/examples/math_episode.json --json
+tracepack scan projects/tracepack/examples/source_episodes --json
+failmap summarize projects/failmap/examples/clusters.json --json
+packslice summarize projects/packslice/examples/split_demo --json
+```
+
+That makes it easier to build release checks, artifact pipelines, and automated dashboards on top of the same OSS commands shown in the READMEs.
+
 ## Monorepo structure
 
 ```text
@@ -98,6 +111,7 @@ tracepack build examples/source_episodes examples/demo_pack --only-failures --re
 tracepack inspect examples/demo_pack
 tracepack export-jsonl examples/demo_pack examples/demo_pack.jsonl
 tracepack export-chat examples/demo_pack examples/demo_chat.jsonl
+tracepack scan examples/source_episodes --json
 ```
 
 ### FailMap
@@ -111,6 +125,7 @@ failmap compare examples/baseline_clusters.json examples/candidate_clusters.json
 failmap issue-drafts examples/compare.json examples/issues --rules examples/triage_rules.json
 failmap issue-bundle examples/issues examples/bundle
 failmap trend examples/trends.json examples/baseline_clusters.json examples/candidate_clusters.json examples/release3_clusters.json
+failmap compare-summary examples/compare.json --json
 ```
 
 ### PackSlice
@@ -123,6 +138,7 @@ pip install -e .
 packslice split examples/sample_pack examples/split_demo --group-by signature
 packslice summarize examples/split_demo
 packslice markdown examples/split_demo examples/split_demo/REPORT.md
+packslice summarize examples/split_demo --json
 ```
 
 ## Why these projects have star potential
