Commit f3d592b

Copilot and pethers authored
fix(agentic): address PR #1941 review + failing vitest (force_generation, timeout <=55, Tier-C pre-flight, v0.69.3 recompile)

- tests/workflow-architecture.test.ts:
  * restore failing force_generation check by re-introducing `force_generation=false` literal in each article-type workflow .md (Run-mode selection section)
  * tighten no-timeout test from <= 90 min to <= 55 min (matches current standard)
- .github/prompts/00-base-contract.md: replace "25-min only reliable mitigation" with a mode-aware guidance that aligns with the ~60-min token window and 55-min workflow cap
- .github/prompts/07-commit-and-pr.md: Deadline enforcement now a mode-aware table (Analysis mode target 40–45min / hard 48min; Article mode target 20–25min / hard 30min) instead of a single 25-min hard rule that would force early PRs on healthy analysis runs
- .github/prompts/03-data-download.md:
  * Tier-C pre-flight now conditionally extends REQ with the 5 Tier-C artifacts when $SUBFOLDER matches evening-analysis | week-ahead | month-ahead | weekly-review | monthly-review | deep-inspection | realtime-*
  * reconcile folder-reuse rule with legacy auto-suffix: base folder reused when force_generation=false; suffix only as escape hatch when force_generation=true
- Installed gh-aw v0.69.3 (the CLI pinned in compile-agentic-workflows.yml) and recompiled all 12 .lock.yml → compiler_version now v0.69.3 and .github/aw/actions-lock.json picks up github/gh-aw-actions/setup-cli@v0.69.3
- Full vitest run: 107 files, 4324/4324 tests passing

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/aba5a55a-e7b3-4703-bd9d-a06176e7fa5b
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
1 parent 9bddc50 commit f3d592b

28 files changed

Lines changed: 864 additions & 476 deletions

.github/aw/actions-lock.json

Lines changed: 8 additions & 3 deletions
@@ -40,10 +40,15 @@
       "version": "v7.0.1",
       "sha": "043fb46d1a93c77aae656e7c1c64a875d1fc6a0a"
     },
-    "github/gh-aw-actions/setup@v0.68.3": {
+    "github/gh-aw-actions/setup-cli@v0.69.3": {
+      "repo": "github/gh-aw-actions/setup-cli",
+      "version": "v0.69.3",
+      "sha": "006ffd856b868b71df342dbe0ba082a963249b31"
+    },
+    "github/gh-aw-actions/setup@v0.69.3": {
       "repo": "github/gh-aw-actions/setup",
-      "version": "v0.68.3",
-      "sha": "ba90f2186d7ad780ec640f364005fa24e797b360"
+      "version": "v0.69.3",
+      "sha": "006ffd856b868b71df342dbe0ba082a963249b31"
     },
     "github/gh-aw/actions/setup@v0.43.18": {
       "repo": "github/gh-aw/actions/setup",

.github/prompts/00-base-contract.md

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ Same-day re-runs always use the same `$ANALYSIS_DIR` folder — never create a p
 
 To mitigate MCP idle-connection drops, workflows set `sandbox.mcp.keepalive-interval: 300` (5-minute ping). This keeps MCP connections alive but does **not** refresh the Copilot API token.
 
-**The only reliable mitigation is to call `safeoutputs___create_pull_request` within 25 minutes of agent start** before the token nears expiry. See `07-commit-and-pr.md §Deadline enforcement` for the mandatory early-PR procedure.
+**The reliable mitigation is to ensure `safeoutputs___create_pull_request` is called well before the session approaches expiry.** Plan the run so the PR is created before the agent passes ~45 minutes of work — that leaves ~10 minutes of safety margin on the 55-minute `timeout-minutes` cap and ~15 minutes on the ~60-minute token window for staging and safe-outputs publishing. See `07-commit-and-pr.md §Deadline enforcement` for the mandatory PR-timing procedure.
 
 Do not add per-phase checkpoint PRs or repo-memory push steps.
 
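The margin arithmetic in the new guidance can be checked with a few lines of shell. This is only an illustrative sketch: the function name `pr_margins` and the two constant names are hypothetical, and the 55-minute and 60-minute values are taken from the prompt text in this hunk.

```bash
# Limits quoted in the prompt text (constant names are hypothetical).
TIMEOUT_CAP_MIN=55    # workflow timeout-minutes cap
TOKEN_WINDOW_MIN=60   # approximate Copilot API token lifetime

# Print the remaining margin, in minutes, against both limits
# for a PR created at the given minute of agent work.
pr_margins() {
  local pr_at_min="$1"
  echo "cap_margin=$((TIMEOUT_CAP_MIN - pr_at_min)) token_margin=$((TOKEN_WINDOW_MIN - pr_at_min))"
}

pr_margins 45   # prints: cap_margin=10 token_margin=15
```

A PR created at minute 45 therefore matches the "~10 minutes" and "~15 minutes" figures in the added paragraph.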

.github/prompts/03-data-download.md

Lines changed: 16 additions & 3 deletions
@@ -6,24 +6,37 @@ Run this check as the **first action** after MCP pre-warm, before any download:
 
 ```bash
 ANALYSIS_DIR="analysis/daily/$ARTICLE_DATE/$SUBFOLDER"
+
+# 9 core artifacts required by every workflow
 REQ=(synthesis-summary.md swot-analysis.md risk-assessment.md threat-analysis.md \
      stakeholder-perspectives.md significance-scoring.md classification-results.md \
      cross-reference-map.md data-download-manifest.md)
+
+# Tier-C workflows require 5 additional artifacts (evening-analysis, week-ahead,
+# month-ahead, weekly-review, monthly-review, realtime-*, deep-inspection).
+# See ext/tier-c-aggregation.md for the full list.
+case "$SUBFOLDER" in
+  evening-analysis|week-ahead|month-ahead|weekly-review|monthly-review|deep-inspection|realtime-*)
+    REQ+=(README.md executive-brief.md scenario-analysis.md \
+          comparative-international.md methodology-reflection.md)
+    ;;
+esac
+
 SKIP_ANALYSIS=false
 ALL_PRESENT=true
 for f in "${REQ[@]}"; do
   [ -s "$ANALYSIS_DIR/$f" ] || { ALL_PRESENT=false; break; }
 done
 [ "$ALL_PRESENT" = "true" ] && SKIP_ANALYSIS=true
-echo "SKIP_ANALYSIS=$SKIP_ANALYSIS (analysis folder present: $ALL_PRESENT)"
+echo "SKIP_ANALYSIS=$SKIP_ANALYSIS (required artifacts present: $ALL_PRESENT, count: ${#REQ[@]})"
 ```
 
 | `SKIP_ANALYSIS` | Mode | Next step |
 |-----------------|------|-----------|
 | `false` | **Analysis mode** | Continue with download pipeline below → `04-analysis-pipeline.md` → analysis-only PR (see `07-commit-and-pr.md`). Do **not** generate articles in this run. |
 | `true` | **Article mode** | Skip the entire download pipeline and `04-analysis-pipeline.md`. Proceed directly to `06-article-generation.md`. Optionally re-query the API and compare against `data-download-manifest.md`; add only genuinely new `dok_id` entries found since the analysis ran. |
 
-> **Folder reuse rule**: the same `$ANALYSIS_DIR` is always reused across runs for the same `$ARTICLE_DATE` + `$SUBFOLDER`. Never create `propositions-2`, `propositions-3`, etc. for the same date unless `force_generation=true`.
+> **Folder reuse rule**: the same `$ANALYSIS_DIR` is always reused across runs for the same `$ARTICLE_DATE` + `$SUBFOLDER` when `force_generation=false`. The legacy auto-suffix behaviour (`propositions-2`, `propositions-3`, …) is retained **only** as an explicit escape hatch when `force_generation=true`, so that a forced rerun on a merged day can produce a fresh parallel analysis without trampling the existing one.
 
 ## Goal
 
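To see how the Tier-C `case` pattern changes the pre-flight requirement, the artifact list can be wrapped in a small counting function. This is a sketch, not part of the commit: `required_count` is a hypothetical name, and `propositions` and `realtime-1430` are example `$SUBFOLDER` values chosen for illustration.

```bash
# Hypothetical helper: print how many artifacts the pre-flight check
# would require for a given subfolder name.
required_count() {
  local subfolder="$1"
  # Positional parameters stand in for the REQ array so the sketch
  # also runs under plain sh.
  set -- synthesis-summary.md swot-analysis.md risk-assessment.md threat-analysis.md \
         stakeholder-perspectives.md significance-scoring.md classification-results.md \
         cross-reference-map.md data-download-manifest.md
  case "$subfolder" in
    evening-analysis|week-ahead|month-ahead|weekly-review|monthly-review|deep-inspection|realtime-*)
      # Tier-C workflows add 5 artifacts on top of the 9 core ones.
      set -- "$@" README.md executive-brief.md scenario-analysis.md \
             comparative-international.md methodology-reflection.md
      ;;
  esac
  echo "$#"
}

required_count propositions    # prints 9
required_count realtime-1430   # prints 14
```

Because `realtime-*` is a glob pattern, any `realtime-$HHMM` subfolder gets the extended list, while a non-Tier-C subfolder keeps the 9 core artifacts.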

@@ -45,7 +58,7 @@ Populate `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/` with raw Riksdag/Regering da
 | news-realtime-monitor | `realtime-$HHMM` |
 | news-article-generator (`deep-inspection`) | `deep-inspection` |
 
-If the base subfolder already contains `synthesis-summary.md` from a prior merged run **and** `force_generation=false`, auto-suffix: `propositions-2`, `propositions-3`, …
+If `force_generation=true` is supplied on a day whose base subfolder already contains `synthesis-summary.md` from a prior merged run, auto-suffix the subfolder (`propositions-2`, `propositions-3`, …) so the forced rerun does not overwrite the merged analysis. Under the default `force_generation=false`, the same base subfolder is reused across runs — see §Pre-flight above.
 
 ## Download pipeline
 
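The reconciled suffix rule could be sketched as the shell function below. The prompt does not say how the suffix number is chosen, so two things here are assumptions: the function picks the first `-N` (N ≥ 2) folder that has no non-empty `synthesis-summary.md`, and it treats the presence of that file as a proxy for "a prior merged run". The name `resolve_subfolder` is hypothetical.

```bash
# Hypothetical helper: decide which subfolder a run should write to.
#   $1 = day directory (e.g. analysis/daily/$ARTICLE_DATE)
#   $2 = base subfolder name (e.g. propositions)
#   $3 = force_generation ("true" or "false")
resolve_subfolder() {
  local day_dir="$1" base="$2" force="$3" n=2
  # Default path: reuse the base folder unless a forced rerun would
  # overwrite an existing synthesis-summary.md.
  if [ "$force" != "true" ] || [ ! -s "$day_dir/$base/synthesis-summary.md" ]; then
    echo "$base"
    return 0
  fi
  # Assumption: take the first suffix whose folder holds no analysis yet.
  while [ -s "$day_dir/$base-$n/synthesis-summary.md" ]; do
    n=$((n + 1))
  done
  echo "$base-$n"
}
```

With `force_generation=false` the function always returns the base name, which matches the folder-reuse rule; a suffix only appears when a forced rerun meets an existing analysis.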

.github/prompts/07-commit-and-pr.md

Lines changed: 9 additions & 2 deletions
@@ -102,11 +102,18 @@ In every other case, commit whatever exists and call `create_pull_request` once.
 
 > **Root cause**: The Copilot API session is bound to the `github.token` baked in at step start. That token expires at approximately **60 minutes** and is never refreshed mid-run (gh-aw issue #24920). Every tool call and inference request fails silently after that point — the agent appears to run but makes no progress and the PR is never created. Setup steps consume ~5 minutes, so the agent has at most **~55 minutes** of usable session time, and safe-outputs publishing needs several minutes on top.
 
-**If the run exceeds 25 minutes with no safe-output call yet:**
+The target PR-creation window depends on which mode the run is in (see `03-data-download.md §Pre-flight`):
+
+| Mode | Target PR window | Hard deadline |
+|------|------------------|---------------|
+| Run 1 — Analysis | 40–45 min after agent start | **48 min** |
+| Run 2 — Articles | 20–25 min after agent start | **30 min** |
+
+**If the run exceeds its hard deadline with no safe-output call yet:**
 
 1. Stop analysis / article work immediately.
 2. Stage whatever exists on disk (analysis artifacts and/or partial articles).
 3. Commit with message including `[early-pr]` to signal partial content.
 4. Call `safeoutputs___create_pull_request` with label `analysis-only` if articles are incomplete.
 
-Do not attempt to "save" work via a second PR — there is no second PR. Creating the PR early is always better than losing all work to a token expiry.
+Do not attempt to "save" work via a second PR — there is no second PR. Creating the PR early is always better than losing all work to a token expiry. The hard deadlines above leave ~7 minutes of margin on the 55-minute `timeout-minutes` cap for staging and safe-outputs publishing before the ~60-minute Copilot API token expiry.
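The mode-aware table can be read as a simple lookup plus a comparison. The sketch below is illustrative only: the function names and the mode labels `analysis` and `article` are hypothetical, and "exceeds" is interpreted as strictly greater than the hard deadline.

```bash
# Hypothetical lookup of the hard deadline (minutes after agent start).
hard_deadline_min() {
  case "$1" in
    analysis) echo 48 ;;   # Run 1: Analysis
    article)  echo 30 ;;   # Run 2: Articles
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Exit status 0 when the early-PR procedure should start, i.e. the elapsed
# time has passed the hard deadline and no safe-output call has been made.
# Exit status 2 signals an unknown mode.
must_force_early_pr() {
  local mode="$1" elapsed_min="$2" deadline
  deadline="$(hard_deadline_min "$mode")" || return 2
  [ "$elapsed_min" -gt "$deadline" ]
}

if must_force_early_pr analysis 49; then
  echo "hard deadline passed: stage, commit with [early-pr], create the PR"
fi
```

Under these assumptions an analysis run at minute 49 triggers the procedure, while an article run at minute 25 is still inside its target window.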
