Commit 48f7b73

Add GitHub Actions CI usage example
Adds a copy-pasteable GitHub Actions workflow example under docs/examples for running security regression scenarios in CI. Documents trace-file mode, result artifact upload, and explicit result checking that fails CI on fail or error while allowing not_run for recognized but unimplemented assertions. Closes #18.
2 parents b5baa87 + bc3ec0f commit 48f7b73

2 files changed

Lines changed: 200 additions & 0 deletions

docs/ci-github-actions.md

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
# CI with GitHub Actions

When copied into `.github/workflows/security-regression.yml`, the example
workflow runs on every push to `main` and on every pull request targeting
`main`. It installs the harness, validates scenario files, runs assertions
against pre-recorded traces, and explicitly checks the result JSON so that the
job fails if any scenario reports `fail` or `error`.

## Where the example workflow lives

The example workflow is at:

```
docs/examples/github-actions/security-regression.yml
```

## How pass and fail actually work

`agent-harness run` writes machine-readable result JSON to the path you give
`--out`. The `result` field in that JSON will be `pass`, `fail`, `not_run`,
or `error`.

Because the outcome is recorded in that file, the workflow adds an explicit
result-checking step after all the `agent-harness run` steps. It reads every
JSON file in `results/`, looks for `"result": "fail"` or `"result": "error"`,
and calls `sys.exit(1)` if any are found. That is what actually fails the job.

A result of `"error"` means the harness did not complete the regression check
correctly, so this example treats it as a CI failure.

```
harness writes JSON → result-checking step reads JSON → step exits 1 → job fails
```

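The flow above can be sketched as a small standalone function. This is a minimal sketch that mirrors the logic of the workflow's inline check, not the harness's own code. The name `find_failures` is hypothetical, and the sketch assumes each result file is a JSON object with `result` and `scenario_id` fields, as the workflow's check does.

```python
import json
import pathlib

# Statuses that this example treats as CI failures.
FAILING = {"fail", "error"}


def find_failures(results_dir):
    """Return one message per result file whose status is fail or error."""
    failures = []
    for path in sorted(pathlib.Path(results_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        if data.get("result") in FAILING:
            failures.append(
                f"{path.name}: {data.get('scenario_id')} returned {data.get('result')}"
            )
    return failures
```

A CI step built on this would print the returned messages and call `sys.exit(1)` when the list is non-empty.
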
## A note on `not_run`

Some assertions are recognized by the harness but not fully implemented yet.
`no_secret_disclosure` is one example. When an assertion has no implementation,
it comes back as `not_run` rather than `pass` or `fail`.

The basic goal-hijack scenario includes `no_secret_disclosure`, so you will
see `not_run` in that result. This is expected. The result-checking step treats
`"result": "fail"` and `"result": "error"` as CI failures, but allows
`"not_run"` so recognized-but-unimplemented assertions do not break the build.
The README documents which assertions are currently implemented.

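The policy described above can be expressed as a small decision function. This is an illustrative sketch only: `count_statuses` and `should_fail` are hypothetical names, and the functions take already-parsed result objects rather than reading files.

```python
from collections import Counter

# Statuses the example workflow treats as CI failures. "not_run" is
# deliberately absent so unimplemented assertions do not break the build.
FAILING_STATUSES = {"fail", "error"}


def count_statuses(results):
    """Count how many parsed result objects carry each status string."""
    return Counter(r.get("result") for r in results)


def should_fail(results):
    """Return True when at least one result has a failing status."""
    counts = count_statuses(results)
    return any(counts[status] > 0 for status in FAILING_STATUSES)
```

Printing the `not_run` count from `count_statuses` in the job log is one way to keep skipped assertions visible without failing the build.
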
## Run mode

This workflow uses `--trace-file` mode. It evaluates assertions against a
pre-recorded JSON trace without starting a live agent.

That makes it a good fit for CI: no server to start, no API keys required,
and the same input always produces the same result.

To test against a live agent instead, see `--live` mode in the README. You
would need an HTTP agent server running before the harness step starts.

## What the workflow does

1. Check out the repository
2. Set up Python 3.11
3. Install the harness with `python -m pip install -e .`
4. Create the `results/` directory
5. Validate each scenario file with `agent-harness validate`
6. Run each scenario with `agent-harness run --trace-file ... --out results/....json`
7. Read every file in `results/` and exit 1 if any has `"result": "fail"` or `"result": "error"`
8. Upload result JSON files as artifacts (runs even when step 7 fails, because of `if: always()`)

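Step 7 globs `results/*.json`, so as written it also passes when the directory contains no result files, for example if an earlier step wrote its output elsewhere. The following is a hedged sketch of a stricter variant and is not part of the example workflow. `check_results` is a hypothetical name, and the set of known statuses is taken from the four values this document lists for the `result` field.

```python
import json
import pathlib

KNOWN_STATUSES = {"pass", "fail", "not_run", "error"}
FAILING_STATUSES = {"fail", "error"}


def check_results(results_dir):
    """Return a list of problems; an empty list means the check passed."""
    paths = sorted(pathlib.Path(results_dir).glob("*.json"))
    if not paths:
        # Nothing to inspect is itself a problem in this stricter variant.
        return [f"no result files found in {results_dir}"]

    problems = []
    for path in paths:
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{path.name}: invalid JSON ({exc.msg})")
            continue
        status = data.get("result") if isinstance(data, dict) else None
        if status not in KNOWN_STATUSES:
            problems.append(f"{path.name}: unrecognized result {status!r}")
        elif status in FAILING_STATUSES:
            problems.append(f"{path.name}: returned {status}")
    return problems
```

Whether an empty `results/` directory should fail the build is a policy choice; the example workflow as committed does not enforce it.
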
## Viewing results

Result JSON files are uploaded as a workflow artifact named
`regression-results` after every run. The artifact upload step has
`if: always()`, so you get the files whether the job passed or failed.

To find them:

1. Go to the Actions tab in your repository
2. Click the workflow run
3. Scroll to Artifacts
4. Download `regression-results`

Each file contains the scenario ID, result status, which assertions ran, and
the evidence for any failure.

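Once the artifact is downloaded and unzipped, a short script can print one line per result file. This is a sketch under assumptions: `summarize_results` is a hypothetical helper, and it reads only the `scenario_id` and `result` fields that the workflow's check already relies on. The field names for assertions and evidence are not shown in this document, so the sketch does not touch them.

```python
import json
import pathlib


def summarize_results(artifact_dir):
    """Return one tab-separated 'file, scenario_id, result' line per JSON file."""
    lines = []
    for path in sorted(pathlib.Path(artifact_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        scenario = data.get("scenario_id", "<unknown>")
        status = data.get("result", "<missing>")
        lines.append(f"{path.name}\t{scenario}\t{status}")
    return lines
```

For example, `print("\n".join(summarize_results("regression-results")))` would list the files, assuming the artifact was unzipped into a directory of that name.
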
## Adapting this for your own project

1. Copy `docs/examples/github-actions/security-regression.yml` into
   `.github/workflows/security-regression.yml` in your repository
2. Put your scenario files in a `scenarios/` directory
3. Put your trace files in `examples/traces/`
4. Update the `agent-harness validate` and `agent-harness run` commands to
   point to your files
5. Add one `agent-harness run` step per scenario

The result-checking step at the end works across however many scenarios you
add. It globs `results/*.json`, so you do not need to update it when you add
new scenarios.

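When there are many scenarios, the per-scenario commands can be generated rather than written by hand. The sketch below only builds argument lists from the `validate`, `run`, `--trace-file`, and `--out` forms shown in this document. `build_commands` is a hypothetical helper, and naming the output file after the scenario file's stem is an assumption; the example workflow picks its output names by hand.

```python
import pathlib


def build_commands(scenario, trace, results_dir="results"):
    """Return the validate and run argument lists for one scenario."""
    # Assumption: the output file is named after the scenario file's stem.
    stem = pathlib.PurePosixPath(scenario).stem
    out = pathlib.PurePosixPath(results_dir) / f"{stem}.json"
    validate = ["agent-harness", "validate", scenario]
    run = [
        "agent-harness", "run", scenario,
        "--trace-file", trace,
        "--out", str(out),
    ]
    return validate, run
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Stem-based names collide if two categories contain a scenario file with the same name, so a real script would need a different naming scheme in that case.
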
## Adding a new scenario

When you write a new scenario, add two steps to the workflow:

```yaml
- name: Validate my new scenario
  run: agent-harness validate scenarios/my_category/my_scenario.yaml

- name: Run my new scenario
  run: |
    agent-harness run scenarios/my_category/my_scenario.yaml \
      --trace-file examples/traces/my_trace.json \
      --out results/my_scenario.json
```

The result-checking step picks up the new output file automatically.

## Related

- [Trace format](trace-format.md)
- [README](../README.md)
docs/examples/github-actions/security-regression.yml

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
name: Agent Security Regression

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

permissions:
  contents: read

jobs:
  security-regression:
    name: Run security regression scenarios
    runs-on: ubuntu-latest

    steps:
      - name: Check out repository
        uses: actions/checkout@v6

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.11"

      - name: Install harness
        run: python -m pip install -e .

      - name: Create results directory
        run: mkdir -p results

      - name: Validate scenarios
        run: |
          agent-harness validate scenarios/goal_hijack/basic.yaml
          agent-harness validate scenarios/goal_hijack/outbound_email_exfiltration_001.yaml

      - name: Run goal hijack scenario (basic)
        run: |
          agent-harness run scenarios/goal_hijack/basic.yaml \
            --trace-file examples/traces/no_denied_tool_call.json \
            --out results/goal_hijack_basic.json

      - name: Run outbound email exfiltration scenario
        run: |
          agent-harness run scenarios/goal_hijack/outbound_email_exfiltration_001.yaml \
            --trace-file examples/traces/no_denied_tool_call.json \
            --out results/goal_hijack_exfiltration.json

      - name: Fail if any regression result failed or errored
        run: |
          python - <<'PY'
          import json
          import pathlib
          import sys

          failed = []

          for path in pathlib.Path("results").glob("*.json"):
              result = json.loads(path.read_text(encoding="utf-8"))
              if result.get("result") in {"fail", "error"}:
                  failed.append(
                      f"{path}: {result.get('scenario_id')} returned {result.get('result')}"
                  )

          if failed:
              print("Security regression failures detected:")
              for item in failed:
                  print(f"- {item}")
              sys.exit(1)

          print("No failing or errored security regression results detected.")
          PY

      - name: Upload result artifacts
        if: always()
        uses: actions/upload-artifact@v6
        with:
          name: regression-results
          path: results/
