feat(agent-docs-audit): policy, deterministic scanner, and weekly semantic audit workflow by caio-pizzol · Pull Request #3296 · superdoc-dev/superdoc

caio-pizzol · 2026-05-14T13:37:11Z

Adds a layered audit for agent-context docs (CLAUDE.md, AGENTS.md, .claude/rules/*.md) against a new agent-docs-policy.md. Modeled on risk-assess.yml: L1 deterministic scan (sizes, symlinks, broken refs), L2 Haiku triage on flagged docs (~$0.01/doc), and scheduled/manual-only L3 Sonnet deep analysis via @anthropic-ai/claude-agent-sdk with Read/Glob/Grep/Bash tools (~$0.20/flagged doc).

Triggers: weekly Monday cron, doc-path PRs, manual dispatch.
PRs run L1 only. The audited input is prompt text that a PR author can modify; running a tool-using model with Bash + ANTHROPIC_API_KEY over PR-authored markdown would be a prompt-injection path. Scheduled and workflow_dispatch runs use the full L1+L2+L3 pipeline against main, where the input has passed code review.
Warning-only: uploads artifacts and writes a Step Summary. No PR comments, no failing CI. We can promote behavior after seeing a few real runs.
AI layers skip gracefully when ANTHROPIC_API_KEY is missing.
Policy codifies size budgets, placement rules (root vs nested vs .claude/rules/), the verifiable-claims standard, and five finding labels (KEEP / TRIM / MOVE / UPDATE / INVESTIGATE).
Manual prototype run against current main: 5 of 9 L1-flagged docs reached L3, 15 concrete findings for ~$1.19 total. Notable catch the deterministic layer cannot make: `blockIdToEntry` identifier in `packages/layout-engine/AGENTS.md` does not exist in `renderer.ts` (the real symbol is `pageIndexToState`).
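The gating rule in the list above (PRs run L1 only; scheduled and manual runs add L2 and L3; AI layers are skipped without a key) could be sketched roughly as follows. This is an illustrative sketch, not the workflow's actual code: `selectLayers` is a hypothetical helper, and the event names are the standard GitHub Actions ones.

```javascript
// Hypothetical sketch of the layer-gating rule; not the code in agent-docs-audit.yml.
function selectLayers(eventName, hasApiKey) {
  // L1 is deterministic and makes no model calls, so it always runs.
  const layers = ['L1'];

  // Only runs against main (cron or manual dispatch) see reviewed input.
  const trusted = eventName === 'schedule' || eventName === 'workflow_dispatch';

  // AI layers need both a trusted trigger and an API key; otherwise skip quietly.
  if (trusted && hasApiKey) {
    layers.push('L2', 'L3');
  }
  return layers;
}
```

In a workflow step this might be called as `selectLayers(process.env.GITHUB_EVENT_NAME, Boolean(process.env.ANTHROPIC_API_KEY))`, so a `pull_request` event never reaches the tool-using model regardless of whether the secret is present.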

Required secret: `ANTHROPIC_API_KEY`. Already used by `risk-assess.yml`, no new secret needed.

Follow-ups (not in this PR): act on the 15 findings from the prototype run; revisit PR-time L2/L3 enablement once we have a tighter sandbox (e.g., disallow Bash entirely); consider a `--max-budget-usd` cap for safety.
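One way a budget cap like the one floated above might work is to estimate cost from the approximate per-doc figures in the description and limit how many docs reach L3. The sketch below is an assumption for illustration only: `planRun` and its parameters are hypothetical, and the per-doc costs are the rough estimates quoted in this PR, not measured prices.

```javascript
// Approximate per-doc costs from the PR description, in cents to avoid float drift.
const COST_L2_CENTS = 1;  // ~$0.01 per flagged doc (Haiku triage)
const COST_L3_CENTS = 20; // ~$0.20 per doc that reaches deep analysis (Sonnet)

// Hypothetical helper: decide how many L3 candidates fit within a USD budget.
// Assumes L2 triage always runs on every flagged doc; only L3 is capped.
function planRun(flaggedCount, l3Candidates, maxBudgetUsd) {
  const budgetCents = Math.round(maxBudgetUsd * 100);
  const l2Cents = flaggedCount * COST_L2_CENTS;
  const remaining = Math.max(0, budgetCents - l2Cents);
  const affordable = Math.floor(remaining / COST_L3_CENTS);
  const l3Count = Math.min(l3Candidates, affordable);
  return {
    l3Count,
    skipped: l3Candidates - l3Count,
    estimatedUsd: (l2Cents + l3Count * COST_L3_CENTS) / 100,
  };
}
```

With the prototype's numbers (9 flagged docs, 5 reaching L3) and a $2.00 cap, `planRun(9, 5, 2.0)` estimates $1.09, in the same range as the ~$1.19 reported. With a $0.50 cap, only 2 of the 5 candidates would go to L3.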

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cdadcc4ea0

…workflow

Three-layer audit modeled on risk-assess.mjs:

- L1 (.github/scripts/agent-docs-l1.mjs): deterministic scan. Walks agent-context docs, counts lines, classifies AGENTS/CLAUDE pairs, detects broken @imports, broken path refs (with context-aware resolution), and unresolved pnpm commands. No model calls.
- L2 (.github/scripts/agent-docs-audit.mjs Haiku triage): given an L1-flagged doc + policy, decides via tool-use whether the doc needs deep review. Cheap (~$0.01).
- L3 (same file, Sonnet via claude-agent-sdk): reads the doc, uses Read/Glob/Grep/Bash to verify concrete claims (paths, identifiers, commands, architecture). Emits structured KEEP/TRIM/MOVE/UPDATE/INVESTIGATE findings. ~$0.20/doc.

Workflow (.github/workflows/agent-docs-audit.yml): triggers on PR doc-path changes, weekly Monday cron, and workflow_dispatch. Skips AI layers gracefully if ANTHROPIC_API_KEY missing (fork PRs). Warning-only: uploads /tmp/agent-docs-audit.json and /tmp/agent-docs-audit-summary.md as artifacts plus a Step Summary, no PR comments and no failing CI yet.

Policy (agent-docs-policy.md, 91 lines): codifies size budgets, placement rules, write/do-not-write criteria, verifiable claims standard, and the five finding labels.

Manual prototype run against current main: 5 of 9 L1-flagged docs passed Haiku triage to Sonnet review, 15 concrete findings produced for $1.19 total. Notable finds the deterministic scanner alone cannot catch: `blockIdToEntry` identifier in packages/layout-engine/AGENTS.md does not exist in renderer.ts (stale symbol).
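To make the L1 idea concrete, here is a minimal sketch of two of the deterministic checks named above: a line-count budget and broken `@import` detection. It is not the code in agent-docs-l1.mjs. The budget value, the `@path.md` import pattern, and the `scanDoc` name are assumptions for illustration, and file existence is passed in as a set so the function stays pure.

```javascript
// Assumed budget for illustration; the real value lives in agent-docs-policy.md.
const LINE_BUDGET = 200;

// Hypothetical L1-style check. `existingPaths` is a Set of repo-relative paths
// known to exist; a real scanner would consult the file system instead.
function scanDoc(path, text, existingPaths) {
  const flags = [];
  const lines = text.split('\n');

  if (lines.length > LINE_BUDGET) {
    flags.push({ kind: 'over-budget', detail: `${lines.length} lines > ${LINE_BUDGET}` });
  }

  // Assumed import syntax: an `@` followed by a markdown path, at the start of
  // the text or after whitespace (so email-like `user@host` is not matched).
  const importPattern = /(?:^|\s)@([\w./-]+\.md)\b/g;
  for (const match of text.matchAll(importPattern)) {
    if (!existingPaths.has(match[1])) {
      flags.push({ kind: 'broken-import', detail: match[1] });
    }
  }

  return { path, lineCount: lines.length, flags };
}
```

A doc with no flags would skip the AI layers entirely, which is what keeps the per-run cost tied to the number of flagged docs rather than the number of docs scanned.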

caio-pizzol requested a review from a team as a code owner May 14, 2026 13:37

superdoc-bot Bot added the review: quick label May 14, 2026

chatgpt-codex-connector Bot reviewed May 14, 2026

Comment thread .github/scripts/agent-docs-l1.mjs Outdated

Comment thread .github/scripts/agent-docs-l1.mjs Outdated

caio-pizzol force-pushed the caio/agent-docs-audit branch from cdadcc4 to 2a236a2 Compare May 14, 2026 13:48

caio-pizzol force-pushed the caio/agent-docs-audit branch from 2a236a2 to 407b278 Compare May 14, 2026 13:53

caio-pizzol merged commit 051d208 into main May 14, 2026
12 checks passed

caio-pizzol deleted the caio/agent-docs-audit branch May 14, 2026 13:55

caio-pizzol mentioned this pull request May 14, 2026

docs(agent-docs): enforce canonical AGENTS symlinks #3298

Merged

caio-pizzol merged 1 commit into main from caio/agent-docs-audit
