feat: add GitHub Actions CI example for security regression testing#45

Merged
mertsatilmaz merged 5 commits into OWASP:main from RajGajjar-01:feat/github-actions-ci-example
May 7, 2026

@RajGajjar-01
Contributor

Closes #18

What this adds

  • .github/workflows/security-regression.yml : a GitHub Actions workflow
    that runs security regression scenarios on every push and pull request
  • docs/ci-github-actions.md : explains how the workflow works, how pass
    and fail are triggered, and how to adapt it for another project

Approach

Uses --trace-file mode so the workflow requires no live agent, no API keys,
and produces deterministic results on every run. The example uses the
scenarios and trace files already present in the repository.

Testing

Ran all commands locally before opening this PR.

Collaborator

@mertsatilmaz mertsatilmaz left a comment

Thanks for working on this. This is the right direction and the docs are useful, but I cannot merge it as-is because the workflow currently does not actually guarantee failure when a regression is detected.

The main issue is this statement in the docs:

“The harness CLI exits with a non-zero code when an assertion fails.”

That is not true for the current CLI behavior. agent-harness run can emit result JSON with "result": "fail" while still exiting successfully. So the GitHub Actions workflow needs an explicit result-checking step after writing the JSON files.

Please update the workflow to parse the result JSON files and fail the job if any result has "result": "fail".

For example, after the agent-harness run ... --out results/...json steps, add a step like this:

- name: Fail if any regression result failed
  run: |
    python - <<'PY'
    import json
    import pathlib
    import sys

    failed = []

    for path in pathlib.Path("results").glob("*.json"):
        result = json.loads(path.read_text(encoding="utf-8"))
        if result.get("result") == "fail":
            failed.append(f"{path}: {result.get('scenario_id')} failed")

    if failed:
        print("Security regression failures detected:")
        for item in failed:
            print(f"- {item}")
        sys.exit(1)

    print("No failing security regression results detected.")
    PY

Please also update docs/ci-github-actions.md to explain the actual behavior:

  • the harness writes machine-readable result JSON
  • the workflow fails by checking for "result": "fail" in the result files
  • not_run may still appear when an assertion is recognized but not implemented yet
  • this first example treats only "fail" as a CI failure
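
The policy in the list above (only "fail" breaks the build, while not_run is reported but tolerated) could be sketched in Python roughly as follows. This is a minimal illustration, not the project's code: it assumes each result file is a JSON object with a top-level "result" field, as in the step above, and the helper names tally_results, exit_code, and summarize are hypothetical.

```python
import json
import pathlib
from collections import Counter


def tally_results(results_dir):
    """Count the top-level "result" value across every *.json file in results_dir."""
    counts = Counter()
    for path in sorted(pathlib.Path(results_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # A missing "result" key is counted as "unknown" (an assumption of this sketch).
        counts[str(data.get("result", "unknown"))] += 1
    return counts


def exit_code(counts):
    """Return 1 only if at least one result is "fail"; not_run does not fail CI."""
    return 1 if counts.get("fail", 0) > 0 else 0


def summarize(counts):
    """Build a one-line summary of statuses in alphabetical order."""
    return ", ".join(f"{status}={n}" for status, n in sorted(counts.items()))
```

In a workflow step this might be used as `counts = tally_results("results")`, followed by `print(summarize(counts))` and `sys.exit(exit_code(counts))`, so that not_run results stay visible in the job log without failing the job.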

A few smaller requests:

  1. I do not want this PR to add another active workflow to this repository yet. Please move the example workflow out of .github/workflows/ and place it under something like:

    docs/examples/github-actions/security-regression.yml

    Then update docs/ci-github-actions.md to explain that users can copy that file into .github/workflows/security-regression.yml in their own project. That keeps this PR as a CI usage example without adding extra CI jobs to this repository on every PR.

  2. The basic goal-hijack scenario currently includes no_secret_disclosure, which still reports not_run. That is okay if documented, but the docs should not imply every assertion is fully enforced in that example.

  3. Please keep the artifact upload step with if: always() so result JSON files are still uploaded even when the result-checking step fails.

After these fixes, this should be a solid copy-pasteable GitHub Actions example that actually fails CI when a regression result is detected.
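
For request 1, copying the example workflow into another project could look like the sketch below. It assumes the example ended up at the suggested path docs/examples/github-actions/security-regression.yml; the install_workflow helper and its parameters are hypothetical and only illustrate the copy step.

```python
import pathlib
import shutil

# Paths taken from the review comment; the example location is an assumption.
EXAMPLE = pathlib.Path("docs/examples/github-actions/security-regression.yml")
TARGET = pathlib.Path(".github/workflows/security-regression.yml")


def install_workflow(harness_repo, project_root, overwrite=False):
    """Copy the example workflow from a harness checkout into a project."""
    source = pathlib.Path(harness_repo) / EXAMPLE
    target = pathlib.Path(project_root) / TARGET
    # Refuse to clobber an existing workflow unless explicitly asked to.
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists; pass overwrite=True to replace it")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return target
```

Once the file sits under .github/workflows/ in the target project, GitHub Actions picks it up on the next push or pull request.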

@RajGajjar-01 RajGajjar-01 force-pushed the feat/github-actions-ci-example branch from 4720792 to 2ff5342 on May 6, 2026 21:01
@RajGajjar-01
Contributor Author

Hey @mertsatilmaz,

Thanks for the detailed feedback.
All of the requested changes are addressed in the updated commit.
Let me know if anything else needs adjusting.

@mertsatilmaz
Collaborator

@RajGajjar-01 thanks for your contribution, and welcome to the team. I made some changes to the error handling and the documentation. LGTM now.

Clarified the description of the workflow in the document.
Collaborator

@mertsatilmaz mertsatilmaz left a comment

LGTM

@mertsatilmaz mertsatilmaz merged commit 48f7b73 into OWASP:main May 7, 2026
1 check passed
@RajGajjar-01
Contributor Author

Thank you.
Excited to contribute further.

Development

Successfully merging this pull request may close these issues.

Good first issue: Add a GitHub Actions CI usage example

2 participants