CI: fail build when audit-harness citation markers leak into compiled output #1678
marcin-kordas-hoc wants to merge 7 commits into
… output

Adds a post-build scan step to `.github/workflows/build.yml` that greps `dist/`, `commonjs/`, and `es/` for two internal-only marker patterns:

- `\[V[0-9]+\]`: audit-harness citation markers used in spec drafts
- `§[[:space:]]*Sources`: section heading used in audit-harness footers

Both are conventions from the audit-harness tooling and belong in internal docs/prompts only. If they ever appear in compiled JS, a comment or string literal has slipped through from a spec draft into shipped output; the scan then fails the workflow with the offending file path and line number.

The step runs after `npm run bundle-all` (which produces the three output directories) and skips gracefully if a directory is missing, so unrelated build failures aren't masked by this guardrail.

Manual verification:

- Synthesized `dist/foo.js` containing both markers: grep matched both lines and exited 1 with a clear message.
- Repeated with clean JS: grep exited 0.
- Repeated with no output dirs: step exited 0 (skip path).
Performance comparison of head (542062c) vs base (508d78f) |
Bugbot review #3296952334 flagged that the `if grep ...` form treats grep's exit code 2 (scan/IO error) identically to exit code 1 (no matches), so a permission or read error on dist/, commonjs/, or es/ would silently green-light the step. Split the rc into 0/1/other and fail the step explicitly on any non-zero, non-1 result.
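A minimal sketch of the 0/1/other split described in this commit. `scan_dir` and the fixture directories are hypothetical names used for illustration, not the PR's actual code; the pattern is the `[V<n>]` form from the first commit.

```shell
#!/usr/bin/env bash
# Sketch only: scan_dir is a hypothetical helper, not the code from the PR.
# grep exits 0 when it finds a match, 1 when it finds none, and >1 on an error.
scan_dir() {
  local dir="$1" rc=0
  grep -rnE '\[V[0-9]+\]' "$dir" || rc=$?
  case "$rc" in
    0) echo "markers found in $dir" >&2; return 1 ;;         # dirty: fail the step
    1) return 0 ;;                                            # clean: no matches
    *) echo "grep failed on $dir (rc=$rc)" >&2; return 2 ;;   # scan/IO error: fail loudly
  esac
}

# Throwaway fixtures: one clean directory, one dirty directory, one missing path.
demo="$(mktemp -d)"
mkdir -p "$demo/clean" "$demo/dirty"
echo 'var a = 1;' > "$demo/clean/a.js"
echo '// see [V3]' > "$demo/dirty/b.js"

clean_rc=0; scan_dir "$demo/clean" || clean_rc=$?
dirty_rc=0; scan_dir "$demo/dirty" || dirty_rc=$?
error_rc=0; scan_dir "$demo/does-not-exist" 2>/dev/null || error_rc=$?
rm -rf "$demo"
```

Capturing the status with `|| rc=$?` rather than branching on `if grep ...` is what keeps the error case distinguishable from the no-match case.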
Validates the build.yml marker-scan step against synthetic fixtures: clean
build, marker in dist/*.js, marker in dist/*.js.map (sourcesContent), marker
in commonjs/*.js, marker in es/*.mjs. Wired as a single self-test step in
build.yml that runs once per OS (node 22, ci install).
Empirically confirmed (probed by planting a marker in src/index.ts and
running `npm run bundle-all`) that source comments survive into:
- commonjs/index.js and es/index.mjs (babel preserves comments)
- dist/hyperformula{,.full}.js (webpack development build preserves comments)
- dist/hyperformula.js.map (`sourcesContent` embeds full original source)
All three surfaces are inside the existing `grep -rn dist commonjs es` scope,
so the scan already covers source-maps. The new self-test pins this behavior
so a future bundler/comment-stripping change cannot silently erode coverage.
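To illustrate the source-map case, here is a small sketch with a hand-written fixture. The file names and the map contents are made up for the example; only the `sourcesContent` behaviour and the recursive grep come from the commit message above.

```shell
#!/usr/bin/env bash
# Sketch only: a hypothetical fixture, not one of the PR's real test fixtures.
fixture="$(mktemp -d)"
mkdir -p "$fixture/dist"

# The compiled bundle is clean, but the source map embeds the original source
# (including its comments) under "sourcesContent".
echo 'export const x = 1;' > "$fixture/dist/bundle.js"
cat > "$fixture/dist/bundle.js.map" <<'EOF'
{"version":3,"sources":["src/index.ts"],"sourcesContent":["// spec note [V7]\nexport const x = 1;\n"],"mappings":""}
EOF

# A recursive grep over dist/ reads the .map file like any other file.
map_hits="$(grep -rlE '\[V[0-9]+\]' "$fixture/dist" || true)"
echo "$map_hits"
rm -rf "$fixture"
```

Because the scan is a plain recursive grep, no special source-map handling is needed; the fixture only pins the fact that the map file is inside the scanned tree.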
Tier-2 hardening: integration test for marker scan + source-map coverage
Added
Empirical answer to the SFDIPOT P0 question: do source-maps carry source comments? Yes. Probed by planting
The existing Also added an inline comment block in Verification (local):
New head SHA: |
…test cannot drift The verify step in build.yml previously inlined the audit-marker grep logic while scripts/test-marker-scan.sh kept its own duplicate copy. A workflow-only edit could silently desynchronize the live scan from the self-test fixtures that are supposed to guard it. Move the scan into scripts/marker-scan.sh as a single parameterized entry point (accepts paths as $@, exit 0=clean, 1=dirty, 2+=error). The workflow step now invokes `bash scripts/marker-scan.sh dist commonjs es`, and the self-test drives the SAME script against synthetic fixture roots.
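The single-entry-point idea can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of `scripts/marker-scan.sh`: it is written as a function named `marker_scan` (hypothetical) so the example stays self-contained, and it uses the two patterns from the first commit.

```shell
#!/usr/bin/env bash
# Sketch only: marker_scan is a hypothetical stand-in for the shared script.
# Contract taken from the commit message: paths come in as "$@",
# status 0 = clean, 1 = dirty, 2 = scan error.
marker_scan() {
  local dirty=0 dir rc
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "skip: $dir (missing)" >&2   # missing output dirs are not an error
      continue
    fi
    rc=0
    grep -rnE '\[V[0-9]+\]|§[[:space:]]*Sources' "$dir" || rc=$?
    case "$rc" in
      0) dirty=1 ;;       # at least one marker found
      1) ;;               # no matches in this directory
      *) return 2 ;;      # grep itself failed
    esac
  done
  return "$dirty"
}

# The same function serves the "live" call and the fixture-driven self-test.
root="$(mktemp -d)"
mkdir -p "$root/dist" "$root/es"
echo 'module.exports = {};' > "$root/dist/index.js"
echo '/* § Sources */' > "$root/es/index.mjs"

live_rc=0; marker_scan "$root/dist" "$root/commonjs" || live_rc=$?
test_rc=0; marker_scan "$root/dist" "$root/es" || test_rc=$?
rm -rf "$root"
```

With one implementation behind both callers, an edit to the scan logic changes what the self-test exercises as well, which is the drift the commit is meant to prevent.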
…es output marker-scan grepped only the legacy [V<n>] form, so current markers ([vrf_1], [dec_3], ...) passed the gate; it also scanned only dist/commonjs/es, missing the typings/ and languages/ bundle outputs (both preserve source comments). Extend the grep to the lowercase prefix+_digits grammar, add typings+languages to the CI invocation and the self-test dir list, and add fixtures for both gaps.
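A quick way to check what the extended grammar accepts is to run sample lines through it. The pattern below is the one listed in the PR summary; the `matches` helper and the sample strings are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: matches is a hypothetical helper for probing the regex.
pattern='\[(V[0-9]+|(vrf|dec|con|que|wrg|crf)_[0-9]+)\]'
matches() { printf '%s\n' "$1" | grep -qE "$pattern"; }

legacy=no;  matches '// cites [V12]'   && legacy=yes    # legacy [V<n>] form
current=no; matches '/* [vrf_1] */'    && current=yes   # lowercase prefix + _digits
array=no;   matches 'const a = xs[0];' && array=yes     # ordinary indexing
upper=no;   matches '[VRF_1]'          && upper=yes     # uppercase prefix is not in the grammar

echo "legacy=$legacy current=$current array=$array upper=$upper"
```

The two negative samples show that ordinary array indexing and prefixes outside the listed set do not trip the gate.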
… in build-output scan
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 542062c.
    local root="$3"

    local rc=0
    run_marker_scan "$root" >/tmp/scan-out 2>&1 || rc=$?
Scan output file bypasses temp directory cleanup trap
Low Severity
`assert_scan` writes scan output to a hardcoded `/tmp/scan-out` path instead of using the already-allocated `$TMP_ROOT` directory. This file is not cleaned up by the `trap 'rm -rf "$TMP_ROOT"' EXIT` handler, which is inconsistent with the script's own temp-file management design. Using `$TMP_ROOT/scan-out` would keep all artifacts under the managed directory and ensure proper cleanup.
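A sketch of the suggested change, under some assumptions: the first two arguments of `assert_scan` (a label and an expected exit status) are guesses, since only `local root="$3"` is visible in the diff, and `run_marker_scan` is replaced here by a stub so the example runs on its own.

```shell
#!/usr/bin/env bash
# Sketch only: argument names and the run_marker_scan stub are assumptions.
TMP_ROOT="$(mktemp -d)"
trap 'rm -rf "$TMP_ROOT"' EXIT   # one cleanup handler covers every temp artifact

# Stub standing in for the real scan: prints one fake hit and reports "dirty".
run_marker_scan() { echo "$1/dist/foo.js:1:// [V3]"; return 1; }

assert_scan() {
  local label="$1" expected="$2" root="$3"
  local rc=0
  # Scan output goes under $TMP_ROOT, so the EXIT trap removes it too.
  run_marker_scan "$root" >"$TMP_ROOT/scan-out" 2>&1 || rc=$?
  if [ "$rc" -ne "$expected" ]; then
    echo "FAIL: $label (expected $expected, got $rc)" >&2
    cat "$TMP_ROOT/scan-out" >&2
    return 1
  fi
  echo "ok: $label"
}

assert_scan "dirty fixture" 1 "$TMP_ROOT/fixture"
```

Keeping the capture file inside `$TMP_ROOT` also avoids collisions when two runs share a machine, since `mktemp -d` gives each run its own directory.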
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files

@@            Coverage Diff            @@
##           develop    #1678   +/-   ##
=========================================
  Coverage    97.16%   97.16%
=========================================
  Files          175      176    +1
  Lines        15319    15322    +3
  Branches      3356     3356
=========================================
+ Hits         14884    14887    +3
  Misses         427      427
  Partials         8        8


Summary

Adds a defensive post-build scan to `.github/workflows/build.yml` that fails the workflow if internal audit-harness markers leak into compiled JS output.

The markers (`[V<n>]` / `[vrf_n]` citation tags and the `§AuditSources` footer) are an internal convention used in spec drafts and agent prompts. They must never appear in shipped JS; if they do, something slipped from a comment or string literal in source into bundled output.

What changed

- `.github/workflows/build.yml`: two new steps after Build (`npm run bundle-all`):
  - `scripts/marker-scan.sh` over `dist`, `commonjs`, `es`, `typings`, `languages`; exits 1 with line-numbered output on any hit.
  - `scripts/test-marker-scan.sh` on one matrix slice (Node 22 + `npm ci`) to keep the live scan and fixture coverage aligned.
- `scripts/marker-scan.sh` (73 lines): centralizes the grep scan; skips missing dirs, distinguishes "no match" (exit 0) from I/O errors (exit 2).
- `scripts/test-marker-scan.sh` (160 lines): asserts scan behaviour against clean vs. dirty synthetic fixtures covering webpack dist, source maps, CommonJS, and ESM outputs.

Patterns

- `\[(V[0-9]+|(vrf|dec|con|que|wrg|crf)_[0-9]+)\]`: matches `[V1]` / `[V12]` and the current prefixed form `[vrf_3]`, `[dec_1]`, `[con_…]`, `[que_…]`, `[wrg_…]`, `[crf_…]`
- `§[[:space:]]*AuditSources`: matches `§AuditSources` (or `§ AuditSources`)

Why

Tiny, self-contained guardrail. No new dependency on external repos or npm packages: bash only (`scripts/marker-scan.sh` + `scripts/test-marker-scan.sh`). Fires on the same matrix as the existing build (Node 20/22/24 across Linux/Windows/macOS), but since the step uses `shell: bash` it runs identically everywhere.

Test plan

- `dist/foo.js` containing `[V3]` / `[vrf_3]` and `§AuditSources` triggers exit 1 with line-numbered output

Note
Low Risk
Adds bash-only CI checks after the existing build; no runtime, auth, or application logic changes.
Overview
Adds a post-build CI gate so internal audit-harness tokens (`[V<n>]`, prefixed tags like `[vrf_3]`, and `§AuditSources`) cannot ship in compiled artifacts.

After `npm run bundle-all`, build.yml runs `scripts/marker-scan.sh` over `dist`, `commonjs`, `es`, `typings`, and `languages`, failing the job with line-numbered hits when grep finds a match. A second step on one matrix slice (Node 22 + `npm ci`) runs `scripts/test-marker-scan.sh`, which drives the same scan script against synthetic clean/dirty fixtures (bundles, source maps, CommonJS, ESM, typings) so workflow logic and tests stay aligned.

`marker-scan.sh` centralizes the regex scan, skips missing output dirs, treats "no matches" as success, and does not treat grep I/O errors as a clean pass.

Reviewed by Cursor Bugbot for commit 25a566a. Bugbot is set up for automated code reviews on this repo.