Commit ba329f3

avrabe and claude authored
feat(ci): rivet-driven verification gate + sticky PR comment (#221)

* feat(ci): rivet-driven verification gate + sticky PR comment

Makes artifacts/verification.yaml *executable* rather than purely
descriptive. A new CI job iterates every `type: feature` artifact, runs its
`fields.steps[].run` commands via the existing rivet CLI, aggregates
per-artifact pass/fail, and upserts a single marker-tagged PR comment with
the counts and failed artifact IDs.

Same script runs locally:

    tools/run_verification.py --filter '(and (= type "feature") (has-tag "v093"))'

Per-PR override: add `Verify-Filter: <sexp>` to the PR body to scope what
runs.

What landed:
- tools/run_verification.py: Python (stdlib only). Calls `rivet list` +
  `rivet get`, executes each step under `bash -c`, writes a
  verification-results.json with passed/failed/skipped lists.
- tools/post_verification_comment.py: finds the marker comment via `gh api`
  and PATCHes the body; creates a new comment only on first run. PR-comment
  markdown shows N/M passed, a table, and a `<details>` block of failed IDs.
- .github/workflows/verification-gate.yml: new CI job on PRs. Pulls PR body
  via env (avoiding script-injection on `${{ ... }}` interpolation).
- artifacts/verification.yaml: quote `cargo test ... enumerate::` step
  commands (line ~1615). The unquoted trailing `::` made rivet's YAML parser
  silently skip the entire file. Required for any rivet command to see the
  verification artifacts at all.
- artifacts/{requirements,verification}.yaml: REQ-VERIFY-GATE-001 +
  TEST-VERIFY-GATE-RUNNER documenting the new gate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ci): specify package for rivet install (rivet repo has multiple binary packages)

The rivet workspace ships `rivet-cli` (the user-facing CLI binary `rivet`)
and `rivet-fuzz` side-by-side. `cargo install --git ... --bin rivet` fails
because cargo can't disambiguate which package owns the binary; add
`--package rivet-cli` to select the right one.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ci): use positional crate arg for cargo install (not --package)

cargo install rejects --package; the crate name is the positional argument.
Switch from `--package rivet-cli` to the trailing positional `rivet-cli`,
which is the form cargo's error message itself suggests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ci): install rivet v0.7.0 (b7a17be); the SHA I used was etch's, not rivet's

The 4c06709 SHA I pinned earlier is etch's revision (the layout-engine
sibling crate in the rivet workspace). It pre-dates rivet's `--filter` flag,
so the verification gate ran against an old CLI that errored on `--filter`.
Switch to b7a17be (rivet v0.7.0), the first release that ships
`rivet list --filter <sexp>` (verified locally: `rivet --version` reports
`0.7.0 (b7a17bef main 2026-04-30)`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ci): use valid rivet filter operator + narrow default scope to current milestone

Two corrections to the verification gate's default filter:

1. `has-status` is not a rivet operator. rivet v0.7.0 supports the family
   and/or/not/implies/excludes/=/!=/>/</has-tag/has-field/in/matches/contains/linked-*.
   Use `(= status "passing")` semantics where needed.
2. Running every `type: feature` artifact is 124 entries. They invoke
   `cargo test -p <crate>` across the whole workspace and easily exceed the
   30-minute job timeout. Narrow the default to the current-milestone scope
   (v093 + v0100 tags = ~12 artifacts), which runs in minutes.

Per-PR override via `Verify-Filter: <sexp>` in the PR body is unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ci): drop gh CLI dependency + correct TEST-BINDING-NESTED-PATH step name

Two end-to-end fixes surfaced by the gate's first real run on PR #221:

1. The post-comment script used the `gh` CLI, which is not installed on the
   self-hosted runners. Rewrote `post_verification_comment.py` to call the
   GitHub REST API directly via stdlib `urllib.request`, with no external
   deps. Same sticky-comment behavior (marker lookup, PATCH or POST).
2. TEST-BINDING-NESTED-PATH's second step was
   `cargo test -p spar-cli applies_to_nested`, but the spar-cli crate's
   *package* name is `spar`, not `spar-cli`. The step never matched anything
   until the gate started actually executing it. Corrected to
   `cargo test -p spar --test applies_to_nested`, which runs the 3
   integration tests cleanly.

This is exactly the drift the gate is built to catch: the YAML claimed a
test was passing, but the test invocation was wrong all along.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

1 parent b7c8834 commit ba329f3

5 files changed

Lines changed: 477 additions & 3 deletions

.github/workflows/verification-gate.yml

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+name: Verification Gate
+
+on:
+  pull_request:
+    branches: [main]
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
+  cancel-in-progress: true
+
+permissions:
+  contents: read
+  pull-requests: write
+
+jobs:
+  rivet-verification:
+    name: Verification Gate (rivet-driven)
+    runs-on: [self-hosted, linux, x64, rust-cpu]
+    timeout-minutes: 30
+    env:
+      CARGO_TERM_COLOR: always
+      RUSTFLAGS: -D warnings
+      CARGO_INCREMENTAL: 0
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: dtolnay/rust-toolchain@stable
+
+      - uses: Swatinem/rust-cache@v2
+
+      - name: Install rivet
+        run: |
+          # Pin to rivet v0.7.0 (commit b7a17be) — the release that first ships
+          # `rivet list --filter <sexp>`, which this gate depends on. (The
+          # 4c06709 SHA I tried first is *etch*'s rev — etch is a sibling crate
+          # in the rivet workspace but unrelated to the CLI.)
+          # The rivet repo is a multi-package workspace; the trailing positional
+          # `rivet-cli` selects which crate owns the `rivet` binary, otherwise
+          # cargo errors with "multiple packages with binaries found:
+          # rivet-cli, rivet-fuzz".
+          cargo install --locked --git https://github.com/pulseengine/rivet \
+            --rev b7a17bef \
+            --bin rivet \
+            rivet-cli
+
+      - name: Pick verification filter (per-PR override via body)
+        id: pick
+        env:
+          PR_BODY: ${{ github.event.pull_request.body }}
+        run: |
+          # Default: every `type: feature` artifact whose status is currently
+          # `passing`. PRs may override by adding a single line to the body:
+          # Verify-Filter: (and (= type "feature") (has-tag "v093"))
+          # Default scope: current-milestone artifacts only (v0.9.x bug
+          # closeout + v0.10.x work). Running every `(= type "feature")`
+          # artifact would re-run ~124 cargo test invocations and blow the
+          # 30-minute job timeout. Per-PR override via `Verify-Filter:` line.
+          # rivet v0.7.0 supports: and/or/not/implies/excludes/=/!=/>/</has-tag/
+          # has-field/in/matches/contains/linked-* (no `has-status`).
+          DEFAULT='(and (= type "feature") (or (has-tag "v093") (has-tag "v0100")))'
+          OVERRIDE=$(printf '%s' "$PR_BODY" | grep -m1 -E '^Verify-Filter:' | sed 's/^Verify-Filter:[[:space:]]*//' || true)
+          if [ -n "${OVERRIDE:-}" ]; then
+            printf 'filter=%s\n' "$OVERRIDE" >>"$GITHUB_OUTPUT"
+          else
+            printf 'filter=%s\n' "$DEFAULT" >>"$GITHUB_OUTPUT"
+          fi
+
+      - name: Run rivet verification gate
+        id: verify
+        continue-on-error: true
+        env:
+          FILTER: ${{ steps.pick.outputs.filter }}
+        run: |
+          tools/run_verification.py \
+            --filter "$FILTER" \
+            --results-json verification-results.json
+
+      - name: Upload results artifact
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: verification-results
+          path: verification-results.json
+          if-no-files-found: warn
+
+      - name: Post sticky PR comment
+        if: github.event_name == 'pull_request' && always()
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          PR_NUMBER: ${{ github.event.pull_request.number }}
+        run: |
+          tools/post_verification_comment.py "$PR_NUMBER"
+
+      - name: Fail job if any verification artifact failed
+        if: steps.verify.outcome != 'success'
+        run: |
+          echo "::error::One or more verification artifacts failed; see PR comment for details"
+          exit 1
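
The "Pick verification filter" step above can be read as a small pure function: take the PR body, return the first `Verify-Filter:` line's value, or fall back to the default. The following is a minimal Python sketch of that selection logic, not code from this commit. The name `pick_filter` is hypothetical, and the CRLF normalisation is an added assumption (PR bodies may arrive with Windows line endings), which the shell version does not do explicitly.

```python
import re
from typing import Optional

# Default filter string copied from the workflow step above.
DEFAULT_FILTER = '(and (= type "feature") (or (has-tag "v093") (has-tag "v0100")))'

# First line that *starts* with "Verify-Filter:", mirroring
# `grep -m1 -E '^Verify-Filter:'` followed by the sed prefix strip.
_OVERRIDE = re.compile(r"^Verify-Filter:[ \t]*(.*)$", re.MULTILINE)


def pick_filter(pr_body: Optional[str], default: str = DEFAULT_FILTER) -> str:
    """Return the per-PR override if present and non-empty, else the default.

    Hypothetical helper for illustration; the workflow itself uses grep/sed.
    """
    if not pr_body:
        return default
    # Assumption: normalise CRLF so a trailing "\r" never leaks into the filter.
    match = _OVERRIDE.search(pr_body.replace("\r\n", "\n"))
    if match is None:
        return default
    override = match.group(1).strip()
    return override or default
```

As in the shell step, an override line with an empty value falls back to the default, and a `Verify-Filter:` that appears mid-line is ignored.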

artifacts/requirements.yaml

Lines changed: 16 additions & 0 deletions
@@ -2021,4 +2021,20 @@ artifacts:
     status: implemented
     tags: [mermaid, emission, v0100]
 
+  # ── CI / Verification gate ──────────────────────────────────────────────
+
+  - id: REQ-VERIFY-GATE-001
+    type: requirement
+    title: Rivet-driven verification gate executes artifact steps in CI
+    description: >
+      A CI workflow shall iterate every `type: feature` artifact (filtered by
+      s-expression), execute each artifact's `fields.steps[].run` commands,
+      aggregate per-artifact pass/fail, and post a sticky PR comment listing
+      counts and failed artifact IDs. Same script (`tools/run_verification.py`)
+      runs locally for fast feedback. This makes `artifacts/verification.yaml`
+      executable rather than purely descriptive, catching drift between what
+      the YAML claims and what's actually green.
+    status: implemented
+    tags: [ci, rivet, verification, v0100]
+
 # Research findings tracked separately in research/findings.yaml

artifacts/verification.yaml

Lines changed: 23 additions & 3 deletions
@@ -1612,8 +1612,8 @@ artifacts:
       method: automated-test
       steps:
         - run: cargo test -p spar --test moves_enumerate_objectives
-        - run: cargo test -p spar-solver enumerate::
-        - run: cargo test -p spar moves::enumerate_tests
+        - run: "cargo test -p spar-solver enumerate::"
+        - run: "cargo test -p spar moves::enumerate_tests"
     status: passing
     tags: [migration, track-e, v080, cli, solver]
     links:
@@ -2519,7 +2519,7 @@ artifacts:
       method: automated-test
       steps:
         - run: cargo test -p spar-hir-def overlay
-        - run: cargo test -p spar-cli applies_to_nested
+        - run: cargo test -p spar --test applies_to_nested
     status: passing
     tags: [binding, hir-def, v093]
     links:
@@ -2627,3 +2627,23 @@ artifacts:
     links:
       - type: satisfies
         target: REQ-MERMAID-001
+
+  # ── CI / Verification gate ──────────────────────────────────────────────
+
+  - id: TEST-VERIFY-GATE-RUNNER
+    type: feature
+    title: Verification gate runner executes a tag-filtered subset and emits JSON
+    description: >
+      Smoke-test of `tools/run_verification.py`: filter to the
+      classifier-match tag (1 artifact), verify that the runner exits 0 and
+      writes a verification-results.json with passed_count=1, failed_count=0.
+    fields:
+      method: automated-test
+      steps:
+        - run: "tools/run_verification.py --filter '(and (= type \"feature\") (has-tag \"classifier-match\"))' --results-json /tmp/verify-smoke.json"
+        - run: "python3 -c 'import json,sys; d=json.load(open(\"/tmp/verify-smoke.json\")); sys.exit(0 if d[\"passed_count\"]==1 and d[\"failed_count\"]==0 else 1)'"
+    status: passing
+    tags: [ci, rivet, verification, v0100]
+    links:
+      - type: satisfies
+        target: REQ-VERIFY-GATE-001
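
The diff for `tools/run_verification.py` is not shown on this page, so the following is only a rough sketch of the aggregation the smoke test above relies on: run each artifact's steps under `bash -c`, bucket the artifact as passed, failed, or skipped, and emit the summary keys that `post_verification_comment.py` reads. The function names, the in-memory `artifacts` mapping (standing in for `rivet list` + `rivet get` output), and the stop-at-first-failing-step behaviour are all assumptions, not the actual runner.

```python
import subprocess
from typing import Any, Dict, List


def run_artifact(steps: List[str]) -> str:
    """Classify one artifact: 'skipped' (no steps), 'passed', or 'failed'."""
    if not steps:
        return "skipped"
    for command in steps:
        # Each step runs under `bash -c`, as the commit message describes.
        # Assumption: stop at the first failing step.
        if subprocess.run(["bash", "-c", command]).returncode != 0:
            return "failed"
    return "passed"


def summarise(artifacts: Dict[str, List[str]], flt: str) -> Dict[str, Any]:
    """Build a results dict with the keys the comment script consumes."""
    buckets: Dict[str, List[str]] = {"passed": [], "failed": [], "skipped": []}
    for artifact_id, steps in artifacts.items():
        buckets[run_artifact(steps)].append(artifact_id)
    return {
        "filter": flt,
        "total": len(artifacts),
        "passed": buckets["passed"],
        "failed": buckets["failed"],
        "skipped": buckets["skipped"],
        "passed_count": len(buckets["passed"]),
        "failed_count": len(buckets["failed"]),
        "skipped_count": len(buckets["skipped"]),
    }
```

A gate built this way would exit non-zero whenever `failed_count` is greater than zero, which is what the workflow's final "Fail job" step keys off via `steps.verify.outcome`.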

tools/post_verification_comment.py

Lines changed: 171 additions & 0 deletions
@@ -0,0 +1,171 @@
+#!/usr/bin/env python3
+"""Post (or update) a sticky PR comment summarising rivet verification results.
+
+Reads the JSON written by `tools/run_verification.py` and calls the GitHub
+REST API directly to upsert a single marker-tagged comment on the PR.
+Re-running on the same PR replaces the prior body rather than appending
+another comment. Pure stdlib (urllib) — no `gh` CLI dependency.
+
+Usage:
+    tools/post_verification_comment.py <pr-number> [--results-json PATH] [--repo OWNER/NAME]
+
+Required env:
+    GH_TOKEN (or GITHUB_TOKEN) with `pull-requests: write`.
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import os
+import sys
+import urllib.error
+import urllib.request
+from pathlib import Path
+
+MARKER = "<!-- rivet-verification-gate -->"
+API = "https://api.github.com"
+
+
+def github_request(
+    method: str, path: str, token: str, body: dict | None = None
+) -> tuple[int, bytes]:
+    url = f"{API}{path}"
+    data = json.dumps(body).encode("utf-8") if body is not None else None
+    req = urllib.request.Request(
+        url,
+        data=data,
+        method=method,
+        headers={
+            "Accept": "application/vnd.github+json",
+            "Authorization": f"Bearer {token}",
+            "X-GitHub-Api-Version": "2022-11-28",
+            "User-Agent": "spar-verification-gate",
+            "Content-Type": "application/json" if data else "application/octet-stream",
+        },
+    )
+    try:
+        with urllib.request.urlopen(req) as resp:
+            return resp.status, resp.read()
+    except urllib.error.HTTPError as e:
+        return e.code, e.read()
+
+
+def render_body(results: dict) -> str:
+    passed = results["passed_count"]
+    failed = results["failed_count"]
+    skipped = results["skipped_count"]
+    total = results["total"]
+    failed_ids = results["failed"]
+    flt = results["filter"]
+
+    if failed == 0:
+        status = f"✅ **{passed}/{total}** passed"
+    else:
+        status = f"❌ **{passed}/{total}** passed — **{failed}** failed"
+
+    failed_section = (
+        "\n".join(f"- `{i}`" for i in failed_ids) if failed_ids else "_(none)_"
+    )
+
+    return f"""{MARKER}
+## Rivet verification gate
+
+{status}
+
+| | count |
+|---|---:|
+| Passed | {passed} |
+| Failed | {failed} |
+| Skipped (no steps) | {skipped} |
+
+**Filter:** `{flt}`
+
+<details><summary>Failed artifacts</summary>
+
+{failed_section}
+
+</details>
+
+<sub>Updated automatically by `tools/post_verification_comment.py`. Source of truth: `artifacts/verification.yaml`.</sub>"""
+
+
+def find_marker_comment(repo: str, pr: int, token: str) -> int | None:
+    """Page through PR comments looking for the marker. Returns comment id or None."""
+    page = 1
+    while True:
+        status, body = github_request(
+            "GET",
+            f"/repos/{repo}/issues/{pr}/comments?per_page=100&page={page}",
+            token,
+        )
+        if status != 200:
+            print(f"GET comments failed: {status} {body[:200]}", file=sys.stderr)
+            return None
+        comments = json.loads(body)
+        if not comments:
+            return None
+        for c in comments:
+            if MARKER in (c.get("body") or ""):
+                return c["id"]
+        if len(comments) < 100:
+            return None
+        page += 1
+
+
+def upsert_comment(repo: str, pr: int, body: str, token: str) -> None:
+    existing = find_marker_comment(repo, pr, token)
+    if existing is not None:
+        print(f"updating comment {existing}", file=sys.stderr)
+        status, resp = github_request(
+            "PATCH",
+            f"/repos/{repo}/issues/comments/{existing}",
+            token,
+            {"body": body},
+        )
+    else:
+        print("creating new comment", file=sys.stderr)
+        status, resp = github_request(
+            "POST",
+            f"/repos/{repo}/issues/{pr}/comments",
+            token,
+            {"body": body},
+        )
+    if status not in (200, 201):
+        print(f"comment upsert failed: {status} {resp[:300]}", file=sys.stderr)
+        sys.exit(2)
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("pr", type=int, help="pull-request number")
+    parser.add_argument(
+        "--results-json",
+        default="verification-results.json",
+        type=Path,
+        help="path to the JSON summary (default: %(default)s)",
+    )
+    parser.add_argument(
+        "--repo",
+        default=os.environ.get("GH_REPO", "pulseengine/spar"),
+        help="OWNER/NAME (default: %(default)s)",
+    )
+    args = parser.parse_args()
+
+    token = os.environ.get("GH_TOKEN") or os.environ.get("GITHUB_TOKEN")
+    if not token:
+        print("GH_TOKEN or GITHUB_TOKEN required", file=sys.stderr)
+        return 2
+
+    if not args.results_json.is_file():
+        print(f"no {args.results_json} found; nothing to post", file=sys.stderr)
+        return 0
+
+    results = json.loads(args.results_json.read_text())
+    body = render_body(results)
+    upsert_comment(args.repo, args.pr, body, token)
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
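
The sticky-comment behaviour in the script above comes down to one decision: if any existing comment contains the marker, PATCH it; otherwise POST a new one. The sketch below isolates that decision so it can be exercised without network access. `plan_upsert` and its `pages` argument (an iterable of already-fetched comment pages) are hypothetical names introduced for illustration; the script itself fetches pages through `github_request`.

```python
from typing import Iterable, List, Optional, Tuple

# Same marker string as in tools/post_verification_comment.py.
MARKER = "<!-- rivet-verification-gate -->"


def plan_upsert(
    pages: Iterable[List[dict]], marker: str = MARKER
) -> Tuple[str, Optional[int]]:
    """Decide between updating an existing marker comment and creating one.

    Returns ("PATCH", comment_id) for the first comment whose body contains
    the marker, or ("POST", None) when no page contains it.
    """
    for comments in pages:
        for comment in comments:
            # A comment body can be null in API responses, hence the `or ""`.
            if marker in (comment.get("body") or ""):
                return "PATCH", comment["id"]
    return "POST", None
```

Because the marker is an HTML comment, it stays invisible in the rendered PR comment while still being matchable in the raw body.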
