feat: add GitHub Actions CI example for security regression testing by RajGajjar-01 · Pull Request #45 · OWASP/Agent-Security-Regression-Harness

RajGajjar-01 · 2026-05-04T20:33:18Z

Closes #18

What this adds

.github/workflows/security-regression.yml: a GitHub Actions workflow that runs security regression scenarios on every push and pull request
docs/ci-github-actions.md: explains how the workflow works, how pass and fail are triggered, and how to adapt it for another project

Approach

The workflow uses --trace-file mode, so it requires no live agent and no API keys,
and it produces deterministic results on every run. The example uses the
scenarios and trace files already present in the repository.

Testing

Ran all commands locally before opening this PR.

mertsatilmaz

Thanks for working on this. This is the right direction and the docs are useful, but I cannot merge it as-is because the workflow currently does not actually guarantee failure when a regression is detected.

The main issue is this statement in the docs:

“The harness CLI exits with a non-zero code when an assertion fails.”

That is not true for the current CLI behavior. agent-harness run can emit result JSON with "result": "fail" while still exiting successfully. So the GitHub Actions workflow needs an explicit result-checking step after writing the JSON files.

Please update the workflow to parse the result JSON files and fail the job if any result has "result": "fail".

For example, after the agent-harness run ... --out results/...json steps, add a step like this:

- name: Fail if any regression result failed
  run: |
    python - <<'PY'
    import json
    import pathlib
    import sys

    failed = []

    for path in pathlib.Path("results").glob("*.json"):
        result = json.loads(path.read_text(encoding="utf-8"))
        if result.get("result") == "fail":
            failed.append(f"{path}: {result.get('scenario_id')} failed")

    if failed:
        print("Security regression failures detected:")
        for item in failed:
            print(f"- {item}")
        sys.exit(1)

    print("No failing security regression results detected.")
    PY

Please also update docs/ci-github-actions.md to explain the actual behavior:

the harness writes machine-readable result JSON
the workflow fails by checking for "result": "fail" in the result files
not_run may still appear when an assertion is recognized but not implemented yet
this first example treats only "fail" as a CI failure
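
The distinction in the points above (only "fail" blocks CI, while not_run is reported but tolerated) can be sketched in Python as follows. This is a minimal illustration, not the harness's own code: it assumes the status sits in a top-level "result" field as in the step above, and the function name summarize, the "unknown" fallback, and the sample file names are hypothetical.

```python
import json
import pathlib
import tempfile
from collections import Counter

# Assumption: the status lives in a top-level "result" field, as in the
# review snippet. Only "fail" blocks CI in this first example.
BLOCKING_STATUSES = {"fail"}


def summarize(results_dir):
    """Count result statuses and collect the files that should fail CI."""
    counts = Counter()
    blocking = []
    for path in sorted(pathlib.Path(results_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # "unknown" is a hypothetical fallback for files without a "result" key.
        status = data.get("result", "unknown")
        counts[status] += 1
        if status in BLOCKING_STATUSES:
            blocking.append(path.name)
    return counts, blocking


if __name__ == "__main__":
    # Hypothetical sample files, written to a temporary directory.
    samples = {
        "goal-hijack.json": {"scenario_id": "goal-hijack", "result": "not_run"},
        "other.json": {"scenario_id": "other", "result": "fail"},
    }
    with tempfile.TemporaryDirectory() as tmp:
        for name, payload in samples.items():
            pathlib.Path(tmp, name).write_text(json.dumps(payload), encoding="utf-8")
        counts, blocking = summarize(tmp)
    print("status counts:", dict(counts))
    print("blocking files:", blocking)
```

Printing the per-status counts alongside the blocking list keeps not_run results visible in the job log even though they do not fail the job.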

A few smaller requests:

I do not want this PR to add another active workflow to this repository yet. Please move the example workflow out of .github/workflows/ and place it under something like:

docs/examples/github-actions/security-regression.yml

Then update docs/ci-github-actions.md to explain that users can copy that file into .github/workflows/security-regression.yml in their own project. That keeps this PR as a CI usage example without adding extra CI jobs to this repository on every PR.

The basic goal-hijack scenario currently includes no_secret_disclosure, which still reports not_run. That is okay if documented, but the docs should not imply every assertion is fully enforced in that example.

Please keep the artifact upload step with if: always() so result JSON files are still uploaded even when the result-checking step fails.

After these fixes, this should be a solid copy-pasteable GitHub Actions example that actually fails CI when a regression result is detected.

RajGajjar-01 · 2026-05-06T21:03:43Z

Hey @mertsatilmaz,

Thanks for the detailed feedback.
All of the requested changes are addressed in the updated commit.
Let me know if anything else needs adjusting.

mertsatilmaz · 2026-05-07T05:41:41Z

@RajGajjar-01 thanks for your contribution, and welcome to the team. I made some changes to the error handling and documentation. LGTM now.

mertsatilmaz

LGTM

RajGajjar-01 · 2026-05-07T07:56:55Z

Thank you.
Excited to contribute further.

mertsatilmaz requested changes May 5, 2026

RajGajjar-01 added 2 commits May 7, 2026 02:29

feat: add GitHub Actions CI example for security regression testing

5938810

docs: move workflow to examples dir and add result-checking step

2ff5342

RajGajjar-01 force-pushed the feat/github-actions-ci-example branch from 4720792 to 2ff5342 on May 6, 2026 21:01

mertsatilmaz added 2 commits May 7, 2026 06:37

Update regression check to include error results

a04107a

Enhance result-checking to include 'error' status

13b3e63

Update CI workflow description in GitHub Actions doc

bc3ec0f

Clarified the description of the workflow in the document.
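
The two result-checking commits above extend the check so that an "error" result also fails the job. The merged code is not shown in this thread, so the following is only a sketch of that idea under stated assumptions: the status is read from a top-level "result" field, "pass" is an assumed value for a passing result, and the names BLOCKING, blocking_results, and exit_code are hypothetical.

```python
# Assumption: "fail" and "error" are the values of the top-level "result"
# field that should fail the CI job; anything else (for example not_run)
# is tolerated.
BLOCKING = frozenset({"fail", "error"})


def blocking_results(results):
    """Return the parsed result dicts whose status should fail the job."""
    return [r for r in results if r.get("result") in BLOCKING]


def exit_code(results):
    """Map a list of parsed results to a process exit code (0 or 1)."""
    return 1 if blocking_results(results) else 0


if __name__ == "__main__":
    # Hypothetical parsed results; "pass" is an assumed status value.
    sample = [
        {"scenario_id": "example-a", "result": "pass"},
        {"scenario_id": "example-b", "result": "error"},
        {"scenario_id": "example-c", "result": "not_run"},
    ]
    for r in blocking_results(sample):
        print(f"- {r.get('scenario_id')}: {r.get('result')}")
    print("exit code:", exit_code(sample))
```

Keeping the blocking statuses in one set means a project that also wants not_run to fail CI would only need to change that set.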

mertsatilmaz approved these changes May 7, 2026

mertsatilmaz merged commit 48f7b73 into OWASP:main May 7, 2026
1 check passed

mertsatilmaz merged 5 commits into OWASP:main from RajGajjar-01:feat/github-actions-ci-example
