This guide is the maintainer-oriented reference for gh-aw-threat-detection.
It is intentionally shorter than the parent gh-aw guide and only covers the surfaces that exist in this repository.
This repository contains the threat detection component used by GitHub Agentic Workflows.
Core responsibilities:
- read agent artifacts from an artifacts directory
- evaluate those artifacts for prompt injection, secret leakage, and malicious patch risk
- produce a machine-readable detection result before downstream safe-outputs handling proceeds
The canonical behavior specification for this repository lives in specs/threat-detection-spec.md.
Preferred setup:
- GitHub Codespaces or the local dev container in .devcontainer/devcontainer.json
The dev container keeps model-related tooling available, including GitHub CLI, Docker, Copilot CLI, and optional Vertex-backed Claude setup. Vertex credentials are optional and configured by setting VERTEX_API_JSON before container creation.
git clone https://github.com/github/gh-aw-threat-detection.git
cd gh-aw-threat-detection
make deps
make build

make deps-dev
make test
make build

Common commands:
make build # build the CLI
make test # run Go tests
make lint # run go vet
make fmt # format Go code
make security-scan # run gosec and govulncheck
make agent-finish # maintainer validation flow

What make agent-finish currently runs:
- deps-dev
- fmt
- lint
- build
- test
- security-scan
Use make help to see the full list of maintained targets.
/
├── .devcontainer/ # Codespaces and Dev Container setup
├── .github/workflows/ # CI and release workflows
├── cmd/threat-detect/ # CLI entrypoint
├── pkg/artifacts/ # Artifact loading and validation
├── pkg/detector/ # Threat detection logic and prompt templates
├── pkg/engine/ # AI engine abstraction and adapters
├── scratchpad/ # Design references retained from gh-aw where still relevant
├── skills/ # Small repo-relevant agent skills
├── specs/ # Threat detection specification
└── Makefile # Build and maintainer commands
The detector operates on an artifacts directory. The current expected shape is documented in README.md and specs/threat-detection-spec.md, including:
- aw-prompts/prompt.txt
- agent_output.json
- optional aw-*.patch
- optional aw-*.bundle
- optional comment-memory/*.md
When changing artifact handling, review:
The agentic CLI engine path (copilot, claude, codex) gives the detection
model an in-session, validated reporting channel instead of relying solely on
post-hoc transcript scraping:
- threat-detect report-result is an internal subcommand (dispatched in main() before global flag parsing; intentionally omitted from --help). It validates the verdict against the existing result schema, then atomically records canonical JSON to the path in --result-file / THREAT_DETECTION_RESULT_FILE. Invalid input prints THREAT_DETECTION_RESULT_ERROR: and exits non-zero without recording (exit 2); a missing sink path exits 3. The first valid write wins (idempotent); a later valid call prints "already recorded" without overwriting.
- Before each engine run, pkg/engine provisions a threat_detection_result wrapper script on PATH (provisionResultTool) that execs report-result, and sets THREAT_DETECTION_RESULT_FILE. watchResultSink polls the sink and cancels the engine subprocess as soon as a valid result is recorded (early termination). A subprocess kill is treated as success when a valid sink result exists.
- analyzeWithRetries prefers the sink result (detector.ReadResultFile) and only falls back to detector.ParseResult transcript scraping when the sink is absent or invalid. Helpers live in pkg/detector/result.go (WriteResultFile, ReadResultFile, BuildResultFromReport, ValidateReportFields).
- Claude is invoked with --allowed-tools Bash when the sink is enabled so it can execute the wrapper; engines that cannot run shell tools fall back to the legacy THREAT_DETECTION_RESULT:{...} transcript line, which remains supported.
Threat detection sits upstream of safe-outputs. This repo does not implement the full safe-outputs system, but changes here should preserve the assumptions made by downstream consumers.
Useful references:
- scratchpad/safe-outputs-specification.md
- scratchpad/safe-output-environment-variables.md
- scratchpad/safe-output-messages.md
When changing code in this repository:
- keep behavior aligned with specs/threat-detection-spec.md
- prefer small, local packages and targeted tests
- update README or spec text when behavior changes
- preserve the JSON result contract unless the spec intentionally changes it
Useful retained references:
- scratchpad/code-organization.md
- scratchpad/validation-architecture.md
- scratchpad/go-type-patterns.md
- scratchpad/styles-guide.md
- scratchpad/errors.md
- scratchpad/testing.md
- skills/console-rendering/SKILL.md
- skills/error-messages/SKILL.md
Common reset flow:
make clean
make deps
make build
make test

If tooling is missing:
- run make deps-dev to install maintainer dependencies
- reopen the dev container if local environment drift is suspected
This section stays in place even though the release flow is still being built out.
Releases follow a prerelease → promote model:
- Create tag (manual): a maintainer triggers the Create Release Tag workflow via Actions → Create Release Tag → Run workflow and selects a patch or minor bump. The workflow validates main and pushes the next vX.Y.Z tag.
- Build & Publish (automated): pushing a tag matching v* triggers the release workflow. It builds the threat-detect-linux-amd64 binary, attaches it (plus checksums.txt) to a prerelease on GitHub, and records the asset sha256 in the release notes. The release-publish environment gate pauses the workflow before publishing so maintainers can abort if needed.
- Promote (manual): after verifying the prerelease, a maintainer triggers the promote-release workflow via Actions → Promote Release → Run workflow, entering the tag name. This workflow (gated by the release-promote environment):
  - verifies the release is still a prerelease
  - re-downloads the asset and verifies its sha256 against the recorded value
  - marks the GitHub release as stable and explicitly selects it as Latest (--prerelease=false --latest)
The GitHub "Latest" release pointer only moves when a maintainer explicitly promotes. This gives the team time to validate a release before it becomes the default for users downloading the latest stable asset.
In addition to release tags, every push to main triggers the
publish-main workflow, which builds the
binary and republishes a single rolling main pre-release:
- The main pre-release always carries the threat-detect-linux-amd64 asset from the most recent successful build of main, versioned main-<shortsha>.
These are unverified branch builds. The main pre-release is not eligible
for promotion. The Latest stable release pointer is
unaffected by this workflow and continues to track the most recently promoted release.
The main CI workflow in .github/workflows/ci.yml runs:
- go vet
- go test -race
- go build
- Go to Actions → Create Release Tag → Run workflow.
- Select patch or minor. The major option may still appear in the workflow UI, but the workflow currently rejects it until major releases are enabled.
- Run the workflow. It validates main, computes the next semantic-version tag, and pushes it to trigger the release workflow.
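The tag computation in the last step follows ordinary semantic-version bumping. As a rough sketch of that arithmetic, assuming tags of the form vX.Y.Z with no prerelease suffix (nextTag is a hypothetical function; the real logic lives in the Create Release Tag workflow and may differ):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextTag (hypothetical) returns the tag that follows current for the given
// bump kind. It mirrors the guide's rule that major bumps are rejected.
func nextTag(current, bump string) (string, error) {
	parts := strings.Split(strings.TrimPrefix(current, "v"), ".")
	if !strings.HasPrefix(current, "v") || len(parts) != 3 {
		return "", fmt.Errorf("tag %q is not in vX.Y.Z form", current)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return "", fmt.Errorf("tag %q has an invalid component %q", current, p)
		}
		nums[i] = n
	}
	switch bump {
	case "patch":
		nums[2]++
	case "minor":
		nums[1]++
		nums[2] = 0 // a minor bump resets the patch number
	case "major":
		return "", fmt.Errorf("major bumps are not enabled")
	default:
		return "", fmt.Errorf("unknown bump %q", bump)
	}
	return fmt.Sprintf("v%d.%d.%d", nums[0], nums[1], nums[2]), nil
}

func main() {
	for _, bump := range []string{"patch", "minor", "major"} {
		tag, err := nextTag("v1.4.2", bump)
		if err != nil {
			fmt.Println(bump, "->", "error:", err)
			continue
		}
		fmt.Println(bump, "->", tag)
	}
	// patch -> v1.4.3
	// minor -> v1.5.0
	// major -> error: major bumps are not enabled
}
```

The v1.4.2 starting tag is an arbitrary example, not a real release of this repository.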
After the tag is pushed:
- Approve the release-publish environment gate when the workflow pauses.
- Verify the prerelease on the Releases page and test the version-tagged threat-detect-linux-amd64 asset.
- When satisfied, go to Actions → Promote Release, enter the tag, and run the workflow. Approve the release-promote environment gate.
- Confirm the Latest release now resolves to the new version.
If a promoted release is later found unsafe, delete or replace the affected GitHub release and promote a known-good version instead.