Commit c26b9ff

Copilot and pethers authored
Trim 05-analysis-gate.md comments to fit ≤550-line architecture cap
Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/df4ba618-1c2e-47a9-81d3-23d80a422152
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
1 parent df93e18 commit c26b9ff

1 file changed

Lines changed: 17 additions & 42 deletions

.github/prompts/05-analysis-gate.md

@@ -37,9 +37,7 @@ ANALYSIS_DIR="analysis/daily/$ARTICLE_DATE/$SUBFOLDER"
 DOK_RE='[Hh][A-Za-z0-9]{3,}[0-9]+'
 EVIDENCE_RE='[Hh][A-Za-z0-9]{3,}[0-9]+|riksdagen\.se|regeringen\.se|scb\.se|statskontoret\.se|worldbank\.org|api\.imf\.org|data\.imf\.org|www\.imf\.org'
 FAIL=0
-
-# Materialise required-file lists via /tmp lists (the AWF sandbox forbids
-# inline bash arrays — see 01-bash-and-shell-safety.md §Banned expansion patterns).
+# Materialise required-file lists via /tmp (AWF sandbox forbids inline bash arrays; see 01-bash-and-shell-safety.md).
 GATE_REQ_LIST="/tmp/gate-req-$$"; GATE_PASS2_LIST="/tmp/gate-pass2-$$"
 GATE_SYNTH_LIST="/tmp/gate-synth-$$"; GATE_DOK_LIST="/tmp/gate-doks-$$"
 trap 'rm -f "$GATE_REQ_LIST" "$GATE_PASS2_LIST" "$GATE_SYNTH_LIST" "$GATE_DOK_LIST"' EXIT
@@ -74,8 +72,7 @@ while IFS= read -r f; do
 [ -s "$ANALYSIS_DIR/$f" ] || { echo "❌ missing/empty: $f"; FAIL=1; }
 done < "$GATE_REQ_LIST"

-# Check 2 — per-document coverage against manifest (avoid process substitution
-# per 01-bash-and-shell-safety.md §Shell hygiene).
+# Check 2 — per-document coverage against manifest (avoid process substitution per 01-bash-and-shell-safety.md).
 if [ -s "$ANALYSIS_DIR/data-download-manifest.md" ]; then
 grep -oE "$DOK_RE" "$ANALYSIS_DIR/data-download-manifest.md" | sort -u > "$GATE_DOK_LIST"
 DOK_COUNT=$(wc -l < "$GATE_DOK_LIST" | tr -d ' ')
@@ -175,10 +172,7 @@ if [ -s "$ANALYSIS_DIR/executive-brief.md" ]; then
 || { echo "❌ executive-brief.md: missing '## BLUF' section"; FAIL=1; }
 grep -qE '^##[[:space:]].*(Decision|Decisions[[:space:]]+This[[:space:]]+Brief)' "$ANALYSIS_DIR/executive-brief.md" \
 || { echo "❌ executive-brief.md: missing 'Decisions' section"; FAIL=1; }
-# H1 quality scan — the executive-brief H1 ships as <title> / og:title /
-# JSON-LD headline / sitemap card across all 14 languages. Block the
-# template placeholder and boilerplate-only headings. Prefer Markdown H1,
-# then fall back to the template's centered HTML <h1> form.
+# H1 quality scan — ships as <title>/og:title/JSON-LD headline/sitemap card across 14 languages.
 EB_H1="$(grep -m1 -E '^#[[:space:]]+' "$ANALYSIS_DIR/executive-brief.md" || true)"
 if [ -z "$EB_H1" ]; then
 EB_H1="$(grep -m1 -oE '<h1[^>]*>[^<]+</h1>' "$ANALYSIS_DIR/executive-brief.md" || true)"
@@ -195,8 +189,7 @@ if [ -s "$ANALYSIS_DIR/executive-brief.md" ]; then
 *ai-generated\ political\ intelligence*)
 echo "❌ executive-brief.md: H1 contains banned phrase 'AI-generated political intelligence'"; FAIL=1 ;;
 esac
-# Strip leading H1 marker + emoji/whitespace + trailing dashes to detect
-# the bare-boilerplate `# Executive Brief` case.
+# Strip leading H1 marker + emoji/whitespace + trailing dashes to detect bare-boilerplate `# Executive Brief`.
 EB_H1_PLAIN="$(printf '%s' "$EB_H1_LOWER" \
 | sed -E 's/^#[[:space:]]+//' \
 | sed -E 's/<[^>]+>//g' \
@@ -206,11 +199,7 @@ if [ -s "$ANALYSIS_DIR/executive-brief.md" ]; then
 echo "❌ executive-brief.md: H1 is bare boilerplate ('Executive Brief') — write a publishable story-oriented title (actor + active verb + instrument or number)"
 FAIL=1
 fi
-# Date-in-H1 guard (seo-metadata-contract.md §2.1) — title must not
-# contain a literal publication date. Catches ISO YYYY-MM-DD,
-# English day-first ("15 May 2026") + US-order ("May 15, 2026") +
-# Swedish long-form months. Mirrors scripts/agentic/analysis-gate.ts
-# checkExecutiveBrief — keep regex parity TS ↔ bash.
+# Date-in-H1 guard (seo-metadata-contract.md §2.1) — mirrors scripts/agentic/analysis-gate.ts checkExecutiveBrief.
 EB_H1_TEXT="$(printf '%s' "$EB_H1" \
 | sed -E 's/^#[[:space:]]+//' \
 | sed -E 's/<[^>]+>//g')"
@@ -227,9 +216,7 @@ if [ -s "$ANALYSIS_DIR/executive-brief.md" ]; then
 echo "❌ executive-brief.md: H1 contains a literal Swedish long-form date — dates belong in article:published_time, not the SERP <title>"
 FAIL=1
 fi
-# Trailing-punctuation / dangling-connector guard — H1 must be a
-# complete grammatical phrase. Catches `Sweden Evening Analysis,`,
-# `Week Ahead: Aid Accountability,`, `… opposition for`, etc.
+# Trailing-punctuation / dangling-connector guard — H1 must be a complete grammatical phrase.
 EB_H1_TRIM="$(printf '%s' "$EB_H1_TEXT" | sed -E 's/[[:space:]]+$//')"
 case "$EB_H1_TRIM" in
 *,|*\;|*:|*—|*–|*-)
@@ -241,11 +228,7 @@ if [ -s "$ANALYSIS_DIR/executive-brief.md" ]; then
 echo "❌ executive-brief.md: H1 ends with a coordinating connector or article ('and', 'or', 'with', 'the', …) — complete the headline"
 FAIL=1
 fi
-# Across-days uniqueness check (Phase 2 — period-aggregation duplicate-card
-# guard). The full normalised comparison lives in
-# scripts/agentic/analysis-gate.ts checkExecutiveBrief; this bash
-# check is a fast pre-flight that compares the raw H1 line against
-# the prior 7 sibling daily folders for the same subfolder.
+# Across-days uniqueness check (Phase 2 dup-card guard); full normalised comparison in analysis-gate.ts.
 EB_DAILY_DIR="$(dirname "$ANALYSIS_DIR")"
 EB_DAILY_ROOT="$(dirname "$EB_DAILY_DIR")"
 EB_CURR_DATE="$(basename "$EB_DAILY_DIR")"
@@ -268,8 +251,7 @@ if [ -s "$ANALYSIS_DIR/executive-brief.md" ]; then
 fi
 fi
 else
-# No H1 at all — the renderer has nothing to seed the SERP <title>
-# from and will silently fall back to a BLUF-sentence fragment.
+# No H1 — renderer has nothing to seed SERP <title> and falls back to a BLUF-sentence fragment.
 echo "❌ executive-brief.md: no '# H1' heading found — the H1 is the SERP <title> source across all 14 languages; add a publishable story-oriented title"
 FAIL=1
 fi
@@ -338,9 +320,8 @@ if [ -s "$ANALYSIS_DIR/coalition-mathematics.md" ]; then
 || { echo "❌ coalition-mathematics.md: missing seat-count / vote-breakdown table"; FAIL=1; }
 fi

-# Check 9b — Statskontoret evidence in implementation-feasibility.md.
-# When the file names a recognised agency it MUST carry a statskontoret.se URL
-# or the literal `none found` in the `| **Statskontoret relevance** | ... |` row.
+# Check 9b — Statskontoret evidence in implementation-feasibility.md. When file names a recognised
+# agency it MUST carry a statskontoret.se URL or literal `none found` in `| **Statskontoret relevance** |` row.
 AGENCY_RE='Kriminalvård(en)?|Polismyndigheten|Försäkringskassan|Skatteverket|Migrationsverket|Arbetsförmedlingen|Socialstyrelsen|Transportstyrelsen|Trafikverket|Naturvårdsverket|Energimyndigheten'
 STATSKONTORET_RELEVANCE_RE='^\|[[:space:]]*\*\*Statskontoret relevance\*\*[[:space:]]*\|[[:space:]]*([^|]*statskontoret\.se[^|]*|[^|]*none found[^|]*)\|'
 if [ -s "$ANALYSIS_DIR/implementation-feasibility.md" ]; then
@@ -395,10 +376,8 @@ sys.exit(bad)
 PYEOF
 fi

-# Check 10 — top-2 full-text availability. When the manifest contains a
-# "Full-Text Fetch Outcomes" table (from --auto-full-text-top-n), ≥ 2 top
-# documents must have full_text_available=true. A `full-text-fallback:` annotation
-# bypasses the check.
+# Check 10 — top-2 full-text availability. When manifest has "Full-Text Fetch Outcomes" table (from
+# --auto-full-text-top-n), ≥ 2 top docs must have full_text_available=true. `full-text-fallback:` bypasses.
 MANIFEST="$ANALYSIS_DIR/data-download-manifest.md"
 if [ -s "$MANIFEST" ] && grep -q "## Full-Text Fetch Outcomes" "$MANIFEST" \
 && ! grep -q "full-text-fallback:" "$MANIFEST"; then
@@ -408,8 +387,7 @@ if [ -s "$MANIFEST" ] && grep -q "## Full-Text Fetch Outcomes" "$MANIFEST" \
 fi

 # Check 12 — Editorial QA gate (validate-article.ts: banned phrases, citation density, vintage discipline).
-# Runs against the aggregated article.md when present; if the aggregator hasn't run yet the
-# article gate is informational (logged), because the editorial validator's domain is post-aggregation.
+# Runs on aggregated article.md when present; informational when aggregator hasn't run yet.
 ART_MD_GATE="$ANALYSIS_DIR/article.md"
 if [ -s "$ART_MD_GATE" ]; then
 if command -v npx >/dev/null 2>&1; then
@@ -421,10 +399,8 @@ else
 echo "ℹ️ Check 12 (editorial QA): $ART_MD_GATE not yet produced — skipped (run after aggregator)"
 fi

-# Check 13 — Analysis language (English-only)
-# Block when any analysis artifact (excluding executive-brief_<lang>.md translation siblings)
-# exceeds the Swedish-density threshold. The script exits 0 on pass and exits 1 with a
-# per-file violation list on fail.
+# Check 13 — Analysis language (English-only). Blocks any analysis artifact (excluding
+# executive-brief_<lang>.md siblings) exceeding the Swedish-density threshold. Exits 0/1.
 if command -v npx >/dev/null 2>&1; then
 npx tsx scripts/check-analysis-language.ts "$ANALYSIS_DIR" || FAIL=1
 else
@@ -448,7 +424,7 @@ Same-day re-runs are **improvement runs** (not skip runs) when `03-data-download

 ### Check 12 ordering note

-Check 12 (`scripts/validate-article.ts`) is the **editorial QA gate** and runs on the aggregated `article.md`. The blocking branch in §Implementation only fires when `article.md` is already on disk; the inline gate runs before aggregation, so on a first pass the article validator is **informational** (the gate logs `ℹ️ Check 12 (editorial QA): … skipped (run after aggregator)`). Workflows MUST re-invoke the gate (or call `npx tsx scripts/validate-article.ts $ANALYSIS_DIR/article.md` directly) **after** `scripts/aggregate-analysis.ts` writes `article.md` so the editorial checks (banned phrases, citation density, `economicProvenance` vintage) become blocking before staging. See `06-article-generation.md §Step 1b — Editorial QA re-check (post-aggregation)` for the post-aggregation invocation pattern.
+Check 12 (`scripts/validate-article.ts`) is the **editorial QA gate** on aggregated `article.md`. The blocking branch in §Implementation only fires when `article.md` is on disk; the inline gate runs before aggregation, so on first pass the validator is **informational** (logs `ℹ️ Check 12 (editorial QA): … skipped (run after aggregator)`). Workflows MUST re-invoke the gate (or call `npx tsx scripts/validate-article.ts $ANALYSIS_DIR/article.md` directly) **after** `scripts/aggregate-analysis.ts` writes `article.md`. See `06-article-generation.md §Step 1b — Editorial QA re-check (post-aggregation)`.

 ## Supplementary checks

@@ -476,8 +452,7 @@ IS_AGGREGATION=0; IS_TIER_C=0; IS_MULTI_RUN=0; RUN_COUNT=1
 [[ "${ANALYSIS_RUN_COUNT:-}" =~ ^[0-9]+$ ]] && RUN_COUNT="${ANALYSIS_RUN_COUNT}"
 (( RUN_COUNT >= 2 )) && IS_MULTI_RUN=1
 if (( IS_AGGREGATION == 1 || IS_TIER_C == 1 || IS_MULTI_RUN == 1 )); then
-# No inline bash arrays — see 01-bash-and-shell-safety.md §Banned expansion patterns.
-SUPP_LIST="/tmp/gate-supp-$$"; : > "$SUPP_LIST"
+SUPP_LIST="/tmp/gate-supp-$$"; : > "$SUPP_LIST" # /tmp list (no inline bash arrays)
 (( IS_AGGREGATION == 1 || IS_TIER_C == 1 )) && \
 printf '%s\n' analysis-index.md reference-analysis-quality.md mcp-reliability-audit.md workflow-audit.md >> "$SUPP_LIST"
 (( IS_AGGREGATION == 1 )) && printf '%s\n' cross-session-intelligence.md session-baseline.md >> "$SUPP_LIST"
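
The `/tmp` list pattern that the trimmed comments refer to can be sketched in isolation. This is a minimal illustration under stated assumptions, not code from the repository: `DEMO_DIR`, `REQ_LIST`, `MISSING`, and the file names are hypothetical stand-ins. Only the `printf` into a list file, the `while IFS= read -r` loop fed by a redirect, and the `[ -s … ]` test mirror what the diff shows.

```shell
#!/usr/bin/env bash
# Sketch: hold the required-file list in a temp file instead of a bash array.
# DEMO_DIR, REQ_LIST, MISSING and the file names are hypothetical.
DEMO_DIR="$(mktemp -d)"
REQ_LIST="/tmp/gate-demo-req-$$"
trap 'rm -rf "$DEMO_DIR" "$REQ_LIST"' EXIT

printf 'present\n' > "$DEMO_DIR/executive-brief.md"   # exists and is non-empty
: > "$DEMO_DIR/empty.md"                               # exists but is empty

# One required file name per line; absent.md is never created.
printf '%s\n' executive-brief.md empty.md absent.md > "$REQ_LIST"

FAIL=0
MISSING=""
# Redirecting the list into the loop (rather than piping) keeps the loop in
# the current shell, so FAIL and MISSING are still set after `done`.
while IFS= read -r f; do
  [ -s "$DEMO_DIR/$f" ] || { echo "missing/empty: $f"; MISSING="$MISSING $f"; FAIL=1; }
done < "$REQ_LIST"

echo "FAIL=$FAIL"
```

Run as written, the sketch reports `empty.md` and `absent.md` and ends with `FAIL=1`, because `[ -s … ]` rejects both zero-length and non-existent files.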
