
feat(ci): scan container image + SAST on PRs — weekly-only cadence let a HIGH CVE sit on main for 6 days #235

@theagenticguy

Problem

The container-image and SAST scanners (trivy image, grype, semgrep) run only in the weekly security.yml workflow (schedule: cron "0 12 * * 1", Monday 12:00 UTC). PR builds in build.yml explicitly disable them:

# .github/workflows/build.yml:59
MISE_DISABLE_TOOLS: "aqua:aquasecurity/trivy,grype,semgrep"

So a HIGH/CRITICAL vulnerability that is introduced in, or newly disclosed against, the agent container is invisible to PR review and can sit on main for up to 7 days until the next scheduled scan.

This is not hypothetical — it just happened:

  • CVE-2026-48501 (HIGH, GHSA-8xvp-7hj6-mcj9) was disclosed against github.com/cli/cli/v2; fix landed in gh v2.93.0 on 2026-05-27.
  • The agent image pins ARG GH_VERSION=2.92.0 (agent/Dockerfile:2) and builds gh from source, so trivy flagged it.
  • The weekly run on 2026-06-01 went red and auto-filed issue #226, "Security suite failed (main @ 4b77329)"; main's security suite stayed failing from 2026-05-26 through 2026-06-01 (3 consecutive failing scheduled runs).
  • The CVE itself is being fixed by PR #234, "fix(security): fix gh issue" (bump to 2.93.0). This issue is about the detection-latency gap, not the CVE.

The auto-filed #226 is a raw log dump with no triage, and security.yml grants issues: write but has no dedup logic, so every weekly failure opens a fresh near-duplicate issue.

Why it matters

  • A reference sample for autonomous coding agents that builds and runs untrusted-ish workloads should detect image CVEs at PR time, not up to a week later.
  • The fast PR scanners (gitleaks, osv-scanner) already run on PRs; the gap is specifically the heavier image/SAST scanners that were disabled for build-time/cost reasons.

Proposed approach

Add a dedicated security job that runs the image + SAST scanners on PRs and on push-to-main. Keep it separate from the existing fast build job so it runs in parallel and does not slow the inner loop. Options (pick one in discussion):

  1. PR-triggered security job (preferred): a new job in security.yml (or a security-pr.yml) gated on pull_request + push: [main] that runs mise //agent:security:image, grype, and semgrep against the just-built image artifact. Reuse the image already built by build.yml via artifact handoff to avoid a second image build.
  2. Re-enable selectively in build.yml: drop trivy/grype/semgrep from MISE_DISABLE_TOOLS but move them to a parallel matrix leg with continue-on-error: false and a required status check, so fast lint/test feedback isn't blocked on vulnerability DB downloads.
  3. Keep weekly full scan, but add a lightweight PR-time trivy image --severity HIGH,CRITICAL --exit-code 1 on the built image as a required check.
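
To make the options concrete, here is a minimal sketch of the artifact-handoff idea combined with the trivy gate from option 3. It is illustrative only: the workflow file name, job names, image tag (agent:scan), artifact name (agent-image), and the Docker build context are assumptions, and in the real repo the image would come from the existing build job rather than a second build. It also assumes trivy is installed through the repo's mise config once MISE_DISABLE_TOOLS is left unset.

```yaml
# Hypothetical .github/workflows/security-pr.yml (names below are assumptions).
name: security-pr

on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: read

jobs:
  build-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build the agent image once and export it as a tarball
        # Build context "agent" is an assumption; only agent/Dockerfile is known.
        run: |
          docker build -t agent:scan -f agent/Dockerfile agent
          docker save -o agent-image.tar agent:scan
      - uses: actions/upload-artifact@v4
        with:
          name: agent-image
          path: agent-image.tar
          retention-days: 1

  image-scan:
    needs: build-image
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # MISE_DISABLE_TOOLS is deliberately not set in this job,
      # so the scanners pinned in the mise config get installed.
      - uses: jdx/mise-action@v2
      - uses: actions/download-artifact@v4
        with:
          name: agent-image
      - name: Fail on HIGH/CRITICAL image CVEs
        run: trivy image --input agent-image.tar --severity HIGH,CRITICAL --exit-code 1
```

Because image-scan is its own job, lint and test jobs are not blocked on it. Marking it as a required status check is a branch-protection setting, not something the workflow file controls.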

Also (small, independent): make security.yml's "open issue on failure" step idempotent — search for an existing open Security suite failed issue and comment/update instead of opening a new one each week.
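
One possible shape for the idempotent step, using the gh CLI that is preinstalled on GitHub-hosted runners. This is a sketch, not the current security.yml step: the step name, the TITLE_PREFIX variable, and the comment body are assumptions, and the job is assumed to already have issues: write.

```yaml
# Hypothetical replacement for the "open issue on failure" step.
steps:
  - name: Open or update the failure issue
    if: failure()
    env:
      GH_TOKEN: ${{ github.token }}
      TITLE_PREFIX: "Security suite failed"
    run: |
      # Find the first open issue whose title contains the prefix, if any.
      existing="$(gh issue list --repo "$GITHUB_REPOSITORY" --state open \
        --search "\"$TITLE_PREFIX\" in:title" \
        --json number --jq '.[0].number // empty')"
      run_url="${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}"
      body="Security suite failed on ${GITHUB_SHA::7}: ${run_url}"
      if [ -n "$existing" ]; then
        # An open issue already exists: comment instead of filing a duplicate.
        gh issue comment "$existing" --repo "$GITHUB_REPOSITORY" --body "$body"
      else
        gh issue create --repo "$GITHUB_REPOSITORY" \
          --title "$TITLE_PREFIX (main @ ${GITHUB_SHA::7})" --body "$body"
      fi
```

Matching on a title prefix keeps the existing issue naming intact. A dedicated label would be a sturdier key if titles ever change.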

Acceptance criteria

  • A HIGH/CRITICAL image CVE (e.g. an intentionally pinned old gh) causes a PR check to fail, not just the weekly scheduled run.
  • PR-time security scanning does not add more than ~2–3 min to the critical path (run in parallel; reuse the built image artifact; cache trivy/grype DBs).
  • Push-to-main runs the same image/SAST scan so regressions merged via admin override are caught immediately.
  • security.yml failure-issue creation is deduplicated: at most one open Security suite failed issue at a time; subsequent failures comment on it.
  • Docs (docs/guides/ security/CI section) note the PR-time vs. weekly scanning split and the rationale.
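
The DB-caching part of the time budget could look roughly like the fragment below. It assumes the scanners use their default Linux cache directories (~/.cache/trivy and ~/.cache/grype); the step id and cache key names are made up for illustration.

```yaml
# Hypothetical steps to place before the scanners run in the scan job.
steps:
  - name: Compute a daily cache key
    id: day
    run: echo "stamp=$(date -u +%F)" >> "$GITHUB_OUTPUT"
  - name: Restore trivy and grype vulnerability DBs
    uses: actions/cache@v4
    with:
      path: |
        ~/.cache/trivy
        ~/.cache/grype
      # A new key each day lets the refreshed DB be saved once per day;
      # restore-keys falls back to the most recent earlier cache.
      key: scanner-db-${{ runner.os }}-${{ steps.day.outputs.stamp }}
      restore-keys: |
        scanner-db-${{ runner.os }}-
```

With this layout the first run of the day restores yesterday's DB, lets the scanners fetch any update, and saves the result under the new key. Later runs that day hit the cache exactly.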

Filed by Laith Al-Saadoon, drafted by Bonk (ABCA nightly review, 2026-06-02). The root-cause analysis — tracing #226's red status to the weekly-only scan cadence and the build.yml scanner disable — was done by Bonk; Laith reviewed and is filing it.

— Laith + Bonk 🛰️

Labels

  • ci-cd: Build pipeline, deploy.yml, CI perf/caching, GitHub Actions workflows
  • enhancement: New feature or request
  • security: Cedar/HITL, IAM least-privilege, secrets, PII/DLP, guardrails, supply-chain/CVE
  • tooling