Commit d4d2eb1

feat(agent-docs-audit): delta-only high-confidence CI gate (#3301)
* feat(agent-docs-audit): delta-only high-confidence CI gate

  Fails the workflow when a PR introduces new high-confidence findings vs base, while leaving existing baseline debt and heuristic classes warning-only.

  High-confidence classes that block CI:
  - broken @imports
  - broken symlink targets
  - linked-inverted pairs
  - unexpected-duplicate pairs

  Heuristic/advisory classes (still warning-only via the comment):
  - broken path refs (backtick regex, known false-positive prone)
  - budget warnings
  - unresolved pnpm commands

  Mechanics: a separate gate script worktree-scans origin/$BASE_REF and diffs high-confidence finding identities against the PR head, scoped to files (or pair-dirs) the PR actually changed. Result is written to /tmp/agent-docs-gate.json; the comment script reads it and prepends a 'Blocking' header so reviewers see why CI is red without opening logs.

* feat(agent-docs-audit): block pair-to-single regressions

  Adds a fifth high-confidence blocking class: when a PR transitions a dir from a paired classification (linked, linked-inverted, unexpected-duplicate, intentional-different) to 'single', the gate fails. Legitimate-single dirs are unaffected since this check is delta-only.

  Also fixes a gap where the comment script took the 'resolved' path when the PR deleted an agent-doc file, hiding the gate's Blocking banner. The banner now renders in both the findings and clean paths whenever the gate result file marks blocking=true.
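
The "diffs high-confidence finding identities" step described in the commit message can be illustrated with a minimal sketch. The helper name `newFindingsById` and the sample findings are assumptions made for illustration; only the `id` string format (`import:<path>:<importPath>`, `symlink:<path>`) is taken from the gate script in this commit.

```javascript
// Sketch only: newFindingsById is a hypothetical helper name. The finding
// shape ({ type, relPath, id, ... }) mirrors the gate script in this commit.
function newFindingsById(baseFindings, headFindings) {
  const baseIds = new Set(baseFindings.map((f) => f.id));
  // Keep only findings whose identity is absent from the base scan.
  return headFindings.filter((f) => !baseIds.has(f.id));
}

// Made-up sample data: one pre-existing broken import, one new broken symlink.
const baseFindings = [
  { type: 'broken-import', relPath: 'AGENTS.md', importPath: './old.md', id: 'import:AGENTS.md:./old.md' },
];
const headFindings = [
  ...baseFindings,
  { type: 'broken-symlink', relPath: 'pkg/CLAUDE.md', target: 'AGENTS.md', id: 'symlink:pkg/CLAUDE.md' },
];

const introduced = newFindingsById(baseFindings, headFindings);
console.log(introduced.map((f) => f.id)); // [ 'symlink:pkg/CLAUDE.md' ]
```

Because the comparison is keyed on a stable identity string rather than on whole objects, baseline debt that exists on both sides drops out and only the finding introduced on the head side remains.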
1 parent a4449ba commit d4d2eb1

3 files changed

Lines changed: 275 additions & 22 deletions

.github/scripts/agent-docs-pr-comment.mjs

Lines changed: 45 additions & 18 deletions
@@ -7,7 +7,7 @@
77
*/
88

99
import { execFileSync } from 'node:child_process';
10-
import { writeFileSync } from 'node:fs';
10+
import { existsSync, readFileSync, writeFileSync } from 'node:fs';
1111
import { tmpdir } from 'node:os';
1212
import { dirname, join, relative, resolve } from 'node:path';
1313
import { computeFlags, pairFlaggedForReview, runL1Scan } from './agent-docs-l1.mjs';
@@ -17,8 +17,18 @@ const PR = process.env.PR_NUMBER;
1717
const REPO = process.env.REPO ?? 'superdoc-dev/superdoc';
1818
const REPO_ROOT = resolve(process.env.REPO_ROOT ?? process.cwd());
1919
const SHA = process.env.GITHUB_SHA ?? 'unknown-sha';
20+
const GATE_RESULT_PATH = process.env.GATE_RESULT_PATH || '/tmp/agent-docs-gate.json';
2021
const DRY_RUN = process.argv.includes('--dry-run');
2122

23+
function readGateResult() {
24+
try {
25+
if (!existsSync(GATE_RESULT_PATH)) return null;
26+
return JSON.parse(readFileSync(GATE_RESULT_PATH, 'utf-8'));
27+
} catch {
28+
return null;
29+
}
30+
}
31+
2232
if (!PR && !DRY_RUN) {
2333
console.log('PR_NUMBER not set; not in a PR context. Skipping.');
2434
process.exit(0);
@@ -123,14 +133,26 @@ function formatPairFinding(finding) {
123133
].join('\n');
124134
}
125135

136+
function formatGateFinding(f) {
137+
if (f.type === 'broken-import') return `broken \`@import\` in \`${f.relPath}\`: \`${f.importPath}\``;
138+
if (f.type === 'broken-symlink') return `broken symlink \`${f.relPath}\` -> \`${f.target}\``;
139+
if (f.type === 'pair') return `pair drift in \`${f.dir}\`: ${f.classification} (${f.detail})`;
140+
if (f.type === 'pair-to-single') return `pair-to-single regression in \`${f.dir}\` (was ${f.wasClassification}): ${f.detail}`;
141+
return JSON.stringify(f);
142+
}
143+
126144
function buildFindingsBody(findings) {
127-
const lines = [
128-
MARKER,
129-
'## Agent docs audit',
130-
'',
131-
`Found deterministic findings on ${findings.length} changed agent-doc item(s).`,
132-
'',
133-
];
145+
const gate = readGateResult();
146+
const lines = [MARKER, '## Agent docs audit', ''];
147+
if (gate?.blocking) {
148+
lines.push(
149+
`**Blocking**: this PR introduces ${gate.newFindings.length} new high-confidence finding(s). CI will fail until resolved.`,
150+
);
151+
for (const f of gate.newFindings) lines.push(`- ${formatGateFinding(f)}`);
152+
lines.push('');
153+
}
154+
lines.push(`Found deterministic findings on ${findings.length} changed agent-doc item(s).`);
155+
lines.push('');
134156

135157
for (const finding of findings) {
136158
lines.push(finding.type === 'pair' ? formatPairFinding(finding) : formatFileFinding(finding));
@@ -143,15 +165,20 @@ function buildFindingsBody(findings) {
143165
}
144166

145167
function buildResolvedBody(changed) {
168+
const gate = readGateResult();
169+
const lines = [MARKER, '## Agent docs audit', ''];
170+
if (gate?.blocking) {
171+
lines.push(
172+
`**Blocking**: this PR introduces ${gate.newFindings.length} new high-confidence finding(s). CI will fail until resolved.`,
173+
);
174+
for (const f of gate.newFindings) lines.push(`- ${formatGateFinding(f)}`);
175+
lines.push('');
176+
}
177+
lines.push(`All changed agent-doc files are clean (in-file checks) as of \`${SHA.slice(0, 12)}\`.`);
178+
lines.push('');
146179
const files = changed.map((path) => `\`${path}\``).join(', ');
147-
return [
148-
MARKER,
149-
'## Agent docs audit',
150-
'',
151-
`All changed agent-doc files are clean as of \`${SHA.slice(0, 12)}\`.`,
152-
'',
153-
files ? `Checked: ${files}` : 'No changed agent-doc files detected.',
154-
].join('\n');
180+
lines.push(files ? `Checked: ${files}` : 'No changed agent-doc files detected.');
181+
return lines.join('\n');
155182
}
156183

157184
function getExistingCommentId() {
@@ -207,10 +234,10 @@ if (DRY_RUN) {
207234
process.exit(0);
208235
}
209236

210-
if (findings.length === 0) {
237+
if (findings.length === 0 && !readGateResult()?.blocking) {
211238
const existing = getExistingCommentId();
212239
if (!existing) {
213-
console.log('No L1 findings and no previous sticky comment. Skipping comment.');
240+
console.log('No L1 findings, gate not blocking, and no previous sticky comment. Skipping comment.');
214241
process.exit(0);
215242
}
216243
}
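
In the diff above, `buildFindingsBody` and `buildResolvedBody` each build the same "Blocking" banner. As a rough sketch of how that shared piece behaves, the logic can be isolated into one function. The name `blockingBannerLines` and the sample gate result are hypothetical and not part of the commit; the banner text and the `{ blocking, newFindings }` shape are copied from the diff.

```javascript
// Sketch only: blockingBannerLines is a hypothetical helper, not part of the commit.
// `gate` is the parsed gate result ({ blocking, newFindings }) or null when the
// result file is missing or unreadable.
function blockingBannerLines(gate, formatFinding = (f) => JSON.stringify(f)) {
  if (!gate?.blocking) return [];
  return [
    `**Blocking**: this PR introduces ${gate.newFindings.length} new high-confidence finding(s). CI will fail until resolved.`,
    ...gate.newFindings.map((f) => `- ${formatFinding(f)}`),
    '', // blank line separating the banner from the rest of the comment
  ];
}

// Made-up gate result with a single broken symlink.
const sampleGate = {
  blocking: true,
  newFindings: [{ type: 'broken-symlink', relPath: 'pkg/CLAUDE.md', target: 'AGENTS.md' }],
};
const bannerLines = blockingBannerLines(
  sampleGate,
  (f) => `broken symlink \`${f.relPath}\` -> \`${f.target}\``,
);
console.log(bannerLines.join('\n'));
console.log(blockingBannerLines(null).length); // 0
```

Returning an empty array for a missing or non-blocking result lets either body builder spread the banner in unconditionally, which matches the intent that the banner renders on both the findings path and the clean path.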

.github/scripts/agent-docs-pr-gate.mjs

Lines changed: 196 additions & 0 deletions
@@ -0,0 +1,196 @@
1+
#!/usr/bin/env node
2+
/**
3+
* Delta-only high-confidence gate. Fails the workflow when the PR introduces
4+
* NEW high-confidence agent-doc findings vs base.
5+
*
6+
* High-confidence classes:
7+
* - broken @imports
8+
* - broken symlink targets
9+
* - linked-inverted and unexpected-duplicate pairs
10+
* - pair-to-single regressions (paired in base, 'single' in head)
11+
*
12+
* Heuristic / advisory classes are explicitly excluded to keep the false-
13+
* positive rate near zero: brokenPathRefs (backtick regex), budget warnings,
14+
* unresolvedCommands.
15+
*
16+
* Writes the result to GATE_RESULT_PATH so the comment step can surface
17+
* "Blocking" state inline. Exits 1 if blocking, 0 otherwise.
18+
*/
19+
20+
import { execFileSync } from 'node:child_process';
21+
import { mkdtempSync, rmSync, writeFileSync } from 'node:fs';
22+
import { tmpdir } from 'node:os';
23+
import { dirname, join, resolve } from 'node:path';
24+
import { runL1Scan } from './agent-docs-l1.mjs';
25+
26+
const REPO_ROOT = resolve(process.env.REPO_ROOT ?? process.cwd());
27+
const BASE_REF = process.env.BASE_REF || 'main';
28+
const PR = process.env.PR_NUMBER;
29+
const REPO = process.env.REPO ?? 'superdoc-dev/superdoc';
30+
const RESULT_PATH = process.env.GATE_RESULT_PATH || '/tmp/agent-docs-gate.json';
31+
const DRY_RUN = process.argv.includes('--dry-run');
32+
33+
function isAgentDocPath(path) {
34+
if (/(?:^|\/)(?:AGENTS|CLAUDE)(?:\.local)?\.md$/.test(path)) return true;
35+
return /(?:^|\/)\.claude\/rules\/.+\.md$/.test(path);
36+
}
37+
38+
function getChangedAgentDocs() {
39+
if (DRY_RUN) {
40+
const idx = process.argv.indexOf('--files');
41+
if (idx < 0) return [];
42+
return (process.argv[idx + 1] || '').split(',').map((s) => s.trim()).filter(Boolean).filter(isAgentDocPath);
43+
}
44+
if (!PR) return [];
45+
try {
46+
const out = execFileSync('gh', ['pr', 'diff', PR, '--repo', REPO, '--name-only'], { encoding: 'utf-8' });
47+
return out.split('\n').map((s) => s.trim()).filter(Boolean).filter(isAgentDocPath);
48+
} catch (err) {
49+
console.log(`Could not list PR changed files: ${err.message}`);
50+
return [];
51+
}
52+
}
53+
54+
function changedPairDirs(paths) {
55+
const dirs = new Set();
56+
for (const path of paths) {
57+
if (/(?:^|\/)(?:AGENTS|CLAUDE)(?:\.local)?\.md$/.test(path)) {
58+
dirs.add(dirname(path));
59+
}
60+
}
61+
return dirs;
62+
}
63+
64+
function highConfidenceFindings(scan) {
65+
const findings = [];
66+
for (const file of scan.files) {
67+
if (file.brokenSymlinkTarget) {
68+
findings.push({
69+
type: 'broken-symlink',
70+
relPath: file.relPath,
71+
target: file.brokenSymlinkTarget,
72+
id: `symlink:${file.relPath}`,
73+
});
74+
}
75+
if (file.isSymlink) continue;
76+
for (const importPath of file.brokenImports) {
77+
findings.push({
78+
type: 'broken-import',
79+
relPath: file.relPath,
80+
importPath,
81+
id: `import:${file.relPath}:${importPath}`,
82+
});
83+
}
84+
}
85+
for (const pair of scan.pairs) {
86+
if (pair.classification === 'linked-inverted' || pair.classification === 'unexpected-duplicate') {
87+
findings.push({
88+
type: 'pair',
89+
dir: pair.dir,
90+
classification: pair.classification,
91+
detail: pair.detail,
92+
id: `pair:${pair.dir}:${pair.classification}`,
93+
});
94+
}
95+
}
96+
return findings;
97+
}
98+
99+
function prepareBaseSnapshot() {
100+
execFileSync('git', ['fetch', '--no-tags', '--depth=1', 'origin', BASE_REF], { cwd: REPO_ROOT, stdio: 'inherit' });
101+
const baseDir = mkdtempSync(join(tmpdir(), 'agent-docs-base-'));
102+
execFileSync('git', ['worktree', 'add', '--detach', baseDir, `origin/${BASE_REF}`], { cwd: REPO_ROOT, stdio: 'inherit' });
103+
return baseDir;
104+
}
105+
106+
function cleanupBaseSnapshot(baseDir) {
107+
try {
108+
execFileSync('git', ['worktree', 'remove', '--force', baseDir], { cwd: REPO_ROOT, stdio: 'ignore' });
109+
} catch {
110+
rmSync(baseDir, { recursive: true, force: true });
111+
}
112+
}
113+
114+
function writeResult(result) {
115+
writeFileSync(RESULT_PATH, JSON.stringify(result, null, 2));
116+
}
117+
118+
const changed = getChangedAgentDocs();
119+
if (changed.length === 0) {
120+
console.log('No agent-doc files changed; gate is a no-op.');
121+
writeResult({ blocking: false, newFindings: [], changed: [] });
122+
process.exit(0);
123+
}
124+
125+
console.log(`Changed agent-doc files: ${changed.join(', ')}`);
126+
127+
const headScan = runL1Scan(REPO_ROOT);
128+
const headFindings = highConfidenceFindings(headScan);
129+
130+
let baseScan = null;
131+
let baseDir = null;
132+
try {
133+
if (DRY_RUN) {
134+
const baseFromFlag = process.argv.indexOf('--base-root');
135+
if (baseFromFlag >= 0 && process.argv[baseFromFlag + 1]) {
136+
baseScan = runL1Scan(resolve(process.argv[baseFromFlag + 1]));
137+
}
138+
} else {
139+
baseDir = prepareBaseSnapshot();
140+
baseScan = runL1Scan(baseDir);
141+
}
142+
} finally {
143+
if (baseDir) cleanupBaseSnapshot(baseDir);
144+
}
145+
146+
const baseFindings = baseScan ? highConfidenceFindings(baseScan) : [];
147+
const baseIds = new Set(baseFindings.map((f) => f.id));
148+
const newFindings = headFindings.filter((f) => !baseIds.has(f.id));
149+
150+
// Pair-to-single regression: base had a paired classification (linked,
151+
// linked-inverted, unexpected-duplicate, intentional-different), head has
152+
// 'single' in the same dir. Bare 'single' is legitimate for fresh packages,
153+
// so this is meaningful only as a delta.
154+
if (baseScan) {
155+
const baseDirHadPair = new Map();
156+
for (const pair of baseScan.pairs) {
157+
if (pair.classification !== 'single') baseDirHadPair.set(pair.dir, pair.classification);
158+
}
159+
for (const pair of headScan.pairs) {
160+
if (pair.classification !== 'single') continue;
161+
if (!baseDirHadPair.has(pair.dir)) continue;
162+
newFindings.push({
163+
type: 'pair-to-single',
164+
dir: pair.dir,
165+
detail: pair.detail,
166+
wasClassification: baseDirHadPair.get(pair.dir),
167+
id: `pair-to-single:${pair.dir}`,
168+
});
169+
}
170+
}
171+
172+
const changedSet = new Set(changed);
173+
const dirSet = changedPairDirs(changed);
174+
175+
const scoped = newFindings.filter((f) => {
176+
if (f.type === 'pair' || f.type === 'pair-to-single') return dirSet.has(f.dir);
177+
return changedSet.has(f.relPath);
178+
});
179+
180+
const result = { blocking: scoped.length > 0, newFindings: scoped, changed };
181+
writeResult(result);
182+
183+
if (result.blocking) {
184+
console.log('\nBlocking — new high-confidence findings introduced by this PR:');
185+
for (const f of scoped) {
186+
if (f.type === 'broken-import') console.log(` - broken @import in ${f.relPath}: ${f.importPath}`);
187+
else if (f.type === 'broken-symlink') console.log(` - broken symlink ${f.relPath} -> ${f.target}`);
188+
else if (f.type === 'pair') console.log(` - pair ${f.dir} ${f.classification}: ${f.detail}`);
189+
else if (f.type === 'pair-to-single') console.log(` - pair-to-single in ${f.dir} (was ${f.wasClassification}): ${f.detail}`);
190+
}
191+
console.log(`\nWrote ${RESULT_PATH}`);
192+
process.exit(1);
193+
}
194+
195+
console.log('No new high-confidence findings introduced by this PR. Gate passes.');
196+
process.exit(0);
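
The pair-to-single check and the final scoping filter in the gate script can be exercised on plain data. The following is a minimal sketch under stated assumptions: the helper names `pairToSingleRegressions` and `scopeToChanged`, the directory names, and the `detail` strings are all hypothetical; the pair shape (`dir`, `classification`, `detail`) follows how the script reads `scan.pairs`.

```javascript
// Sketch only: helper names and sample data are hypothetical; the pair shape
// ({ dir, classification, detail }) follows how the gate script reads scan.pairs.
function pairToSingleRegressions(basePairs, headPairs) {
  const baseClassByDir = new Map();
  for (const pair of basePairs) {
    if (pair.classification !== 'single') baseClassByDir.set(pair.dir, pair.classification);
  }
  // A regression is a dir that is 'single' at head but was paired at base.
  return headPairs
    .filter((pair) => pair.classification === 'single' && baseClassByDir.has(pair.dir))
    .map((pair) => ({
      type: 'pair-to-single',
      dir: pair.dir,
      detail: pair.detail,
      wasClassification: baseClassByDir.get(pair.dir),
      id: `pair-to-single:${pair.dir}`,
    }));
}

// Pair-level findings are scoped by directory, file-level findings by path.
function scopeToChanged(findings, changedFiles, changedDirs) {
  return findings.filter((f) =>
    f.type === 'pair' || f.type === 'pair-to-single' ? changedDirs.has(f.dir) : changedFiles.has(f.relPath),
  );
}

const basePairs = [
  { dir: 'packages/a', classification: 'linked', detail: 'CLAUDE.md -> AGENTS.md' },
  { dir: 'packages/b', classification: 'single', detail: 'AGENTS.md only' },
];
const headPairs = [
  { dir: 'packages/a', classification: 'single', detail: 'AGENTS.md only' },
  { dir: 'packages/b', classification: 'single', detail: 'AGENTS.md only' },
  { dir: 'packages/c', classification: 'single', detail: 'AGENTS.md only' },
];

const regressions = pairToSingleRegressions(basePairs, headPairs);
const scopedRegressions = scopeToChanged(
  regressions,
  new Set(['packages/a/CLAUDE.md']),
  new Set(['packages/a']),
);
console.log(scopedRegressions.map((f) => `${f.dir} (was ${f.wasClassification})`)); // [ 'packages/a (was linked)' ]
```

In this sample, `packages/b` (single in both scans) and `packages/c` (new at head) produce nothing, which is the "legitimate-single dirs are unaffected" behavior the commit message describes.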

.github/workflows/agent-docs-audit.yml

Lines changed: 34 additions & 4 deletions
@@ -10,8 +10,12 @@ name: Agent docs audit
1010
# concrete claims via Read/Glob/Grep and produces
1111
# KEEP/TRIM/MOVE/UPDATE/INVESTIGATE findings.
1212
#
13-
# Warning-only for now. Uploads artifacts and writes a Step Summary; does not
14-
# post PR comments, does not fail the workflow on findings.
13+
# PR runs post a diff-scoped sticky comment with L1 findings, then enforce a
14+
# delta-only high-confidence gate: the workflow fails when the PR introduces
15+
# new broken @imports, broken symlink targets, unexpected pair drift
16+
# (linked-inverted, unexpected-duplicate), or pair-to-single regressions.
17+
# Existing baseline debt on touched files does not fail CI. Heuristic classes
18+
# (broken path refs, budget warnings, unresolved commands) remain advisory.
1519
#
1620
# AI layers are skipped automatically if ANTHROPIC_API_KEY is unavailable
1721
# (fork PRs, secret not set). In that case the L1 report still uploads.
@@ -125,14 +129,40 @@ jobs:
125129
if-no-files-found: warn
126130
retention-days: 30
127131

132+
# Delta-only high-confidence gate. Runs before the comment step so the
133+
# sticky can surface "Blocking" state inline. continue-on-error keeps this
134+
# step from failing the job directly; "Enforce delta gate" below does that.
135+
- name: Delta gate (PR only, scan-and-write)
136+
if: github.event_name == 'pull_request'
137+
id: gate
138+
continue-on-error: true
139+
env:
140+
PR_NUMBER: ${{ github.event.pull_request.number }}
141+
REPO: ${{ github.repository }}
142+
REPO_ROOT: ${{ github.workspace }}
143+
BASE_REF: ${{ github.base_ref }}
144+
GATE_RESULT_PATH: /tmp/agent-docs-gate.json
145+
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
146+
run: node .github/scripts/agent-docs-pr-gate.mjs
147+
128148
# Diff-scoped sticky PR comment. Pull_request runs are L1-only; this
129149
# surfaces deterministic findings only for agent-doc files touched by the
130-
# PR. No AI, no Bash, no secrets.
150+
# PR. Reads the gate result file if present to add a "Blocking" header.
151+
# always() so the comment posts even when the gate step exits non-zero.
131152
- name: Post sticky PR comment with L1 findings
132-
if: github.event_name == 'pull_request'
153+
if: ${{ always() && github.event_name == 'pull_request' }}
133154
env:
134155
PR_NUMBER: ${{ github.event.pull_request.number }}
135156
REPO: ${{ github.repository }}
136157
REPO_ROOT: ${{ github.workspace }}
158+
GATE_RESULT_PATH: /tmp/agent-docs-gate.json
137159
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
138160
run: node .github/scripts/agent-docs-pr-comment.mjs
161+
162+
# Enforce the gate result. Separate step so the comment posts first; this
163+
# step is the one that turns the job red when blocking findings exist.
164+
- name: Enforce delta gate
165+
if: ${{ always() && github.event_name == 'pull_request' && steps.gate.outcome == 'failure' }}
166+
run: |
167+
echo "::error::Agent docs audit gate failed — PR introduces new high-confidence findings. See sticky comment."
168+
exit 1
