Commit ffe7f31

Merge pull request #1941 from Hack23/copilot/update-keepalive-interval-configuration
feat(agentic): two-run analysis/articles pipeline with automatic mode detection
2 parents a82dae4 + f3d592b commit ffe7f31

30 files changed

Lines changed: 747 additions & 1934 deletions

.github/prompts/00-base-contract.md

Lines changed: 24 additions & 58 deletions

````diff
@@ -36,71 +36,37 @@ Before producing any analysis or article content, the agent MUST have read:
 
 
 No article sentence may be drafted until every required analysis artifact exists on disk and the gate in `05-analysis-gate.md` reports pass.
 
-## Pipeline (fixed order)
+## Two-run pipeline (primary model)
 
+Every run selects one of two modes automatically — see `03-data-download.md §Pre-flight`:
+
+**Run 1 — Analysis** (when `$ANALYSIS_DIR` is missing or incomplete):
 ```
-Download → Read methodology → Read templates → Analysis Pass 1 → Analysis Pass 2 →
-Analysis Gate → Article (if applicable) → Stage → Commit → ONE create_pull_request
+MCP pre-warm → Download → Read methodology → Read templates →
+Analysis Pass 1 → Pass 1 snapshot → Analysis Pass 2 → Analysis Gate →
+Stage analysis → Commit → ONE create_pull_request (analysis-only)
 ```
 
-No step may be skipped, reordered, or executed in parallel with its successor.
-
-## Phase checkpoint — persist every phase to repo memory
-
-Valuable analysis must never be lost. After each pipeline phase completes, snapshot its output to the gh-aw repo-memory mount at `$GH_AW_MEMORY_DIR` (runtime default `/tmp/gh-aw/repo-memory/default`). gh-aw pushes that directory to the `memory/news-generation` branch in a **separate post-job** — so checkpoints survive even if the content PR job fails, crashes, or times out.
-
-### Mandatory checkpoint points
-
-| After phase | Phase label | Source(s) |
-|-------------|-------------|-----------|
-| 03 Data download | `phase-03-download` | `$ANALYSIS_DIR` (manifest + fetched data summaries) |
-| 04 Analysis Pass 1 | `phase-04-pass1` | `$ANALYSIS_DIR` top-level artifacts |
-| 04 Analysis Pass 2 | `phase-04-pass2` | `$ANALYSIS_DIR` top-level artifacts |
-| 05 Gate pass | `phase-05-gate` | `$ANALYSIS_DIR` top-level artifacts |
-| 06 Article generated | `phase-06-article` | `$ANALYSIS_DIR` + today's `news/${ARTICLE_DATE}-*.html` |
-| 07 Immediately before `create_pull_request` | `phase-07-final` | `$ANALYSIS_DIR` + articles from `news/${ARTICLE_DATE}-*.html` |
-| `news-translate` per batch | `phase-translate-<lang>` | Translated `news/${ARTICLE_DATE}-*.html` |
-
-Each checkpoint is mandatory. Skipping them forfeits the only cross-run safety net for analysis work.
-
-### Reusable snippet
-
-Run this bash block at the end of every phase (pass the phase label as `$1`). Article HTML is written directly under the flat `news/` directory, so checkpoint copies must use `news/${ARTICLE_DATE}-*.html` rather than `news/$YYYY/$MM/$DD/*.html`:
-
-```bash
-set -Eeuo pipefail
-: "${GH_AW_MEMORY_DIR:=/tmp/gh-aw/repo-memory/default}"
-: "${ARTICLE_DATE:?ARTICLE_DATE required for checkpoint}"
-: "${SUBFOLDER:?SUBFOLDER required for checkpoint (use batch/<lang> for news-translate)}"
-PHASE="${1:?phase label required, e.g. phase-04-pass1}"
-ANALYSIS_DIR="${ANALYSIS_DIR:-analysis/daily/$ARTICLE_DATE/$SUBFOLDER}"
-DEST="$GH_AW_MEMORY_DIR/$ARTICLE_DATE/$SUBFOLDER/$PHASE"
-mkdir -p "$DEST" 2>/dev/null || { echo "[checkpoint] mkdir failed for $DEST — continuing"; exit 0; }
-# Snapshot top-level analysis artifacts (never documents/ — often 100+ files — and never pass1/).
-if [ -d "$ANALYSIS_DIR" ]; then
-  find "$ANALYSIS_DIR" -maxdepth 1 -type f \( -name '*.md' -o -name '*.json' \) \
-    -exec cp -f {} "$DEST"/ \; 2>/dev/null || true
-fi
-# Snapshot today's produced article HTML from the flat news/ directory (if any exists at this phase).
-if [ -d "news" ]; then
-  find "news" -maxdepth 1 -type f -name "${ARTICLE_DATE}-*.html" \
-    -exec cp -f {} "$DEST"/ \; 2>/dev/null || true
-fi
-COUNT="$(find "$DEST" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' ')"
-echo "[checkpoint] $PHASE → $DEST ($COUNT files)"
-exit 0
+**Run 2 — Articles** (when `$ANALYSIS_DIR` already contains all 9 core artifacts):
+```
+MCP pre-warm → Detect existing analysis → Read all artifacts into context →
+Optionally check for new data → Article Pass 1 → Article Pass 2 →
+Stage articles → Commit → ONE create_pull_request (articles)
 ```
 
-### Checkpoint rules
+No step may be skipped within a run. Runs must not overlap for the same `$ARTICLE_DATE` + `$SUBFOLDER`.
+
+Same-day re-runs always use the same `$ANALYSIS_DIR` folder — never create a parallel folder for the same date + type combination unless `force_generation=true`.
+
+## Session keepalive requirement
+
+> ⚠️ **Critical**: The Copilot API creates a server-side session when the agent starts. That session is bound to the `github.token` baked in at step start — it is **never refreshed** mid-run. The session expires at approximately **60 minutes** (gh-aw issue #24920). After expiry, all tool calls and inference requests fail silently. The workflow appears to run but makes zero progress, and **the PR is never created**.
+
+To mitigate MCP idle-connection drops, workflows set `sandbox.mcp.keepalive-interval: 300` (5-minute ping). This keeps MCP connections alive but does **not** refresh the Copilot API token.
+
+**The reliable mitigation is to ensure `safeoutputs___create_pull_request` is called well before the session approaches expiry.** Plan the run so the PR is created before the agent passes ~45 minutes of work — that leaves ~10 minutes of safety margin on the 55-minute `timeout-minutes` cap and ~15 minutes on the ~60-minute token window for staging and safe-outputs publishing. See `07-commit-and-pr.md §Deadline enforcement` for the mandatory PR-timing procedure.
 
-| Rule | Rationale |
-|------|-----------|
-| **Never block on checkpoint failure** — always `exit 0`. | Repo-memory is a safety net, not a gate. |
-| Do **not** copy `$ANALYSIS_DIR/documents/` or `$ANALYSIS_DIR/pass1/`. | `documents/` exceeds the 50-file push cap; `pass1/` is local gate evidence only. |
-| Do **not** stage or commit anything under `$GH_AW_MEMORY_DIR`. | gh-aw's `push_repo_memory` post-job publishes it; see `07-commit-and-pr.md`. |
-| Prefer small summary `.md` / `.json` files (≤ 50 KB each, ≤ 50 per push). | gh-aw silently drops files exceeding the push caps. |
-| Re-run the snippet at every phase, even if earlier phases already snapshotted — it overwrites with the latest content. | Ensures the final state is always preserved, and earlier snapshots remain on the branch from prior runs. |
-| For `news-translate`, use `SUBFOLDER=batch/<lang-or-batch-id>` so memory paths don't collide with analysis runs. | Keeps the branch organised by article type. |
+Do not add per-phase checkpoint PRs or repo-memory push steps.
 
 ## Output contract
 
````
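The "PR before ~45 minutes of work" guidance above can be enforced mechanically. A minimal sketch, assuming the agent tracks elapsed time with bash's built-in `SECONDS` counter and an illustrative `PR_TARGET_MIN` knob (neither is part of the workflow contract):

```shell
# Sketch: warn once elapsed agent time approaches the PR-creation target.
# SECONDS (bash built-in) stands in for "seconds since the agent step began".
PR_TARGET_MIN="${PR_TARGET_MIN:-45}"   # assumed knob, not a real workflow input

elapsed_min=$(( SECONDS / 60 ))
remaining_min=$(( PR_TARGET_MIN - elapsed_min ))
if [ "$remaining_min" -le 0 ]; then
  echo "[deadline] target passed - stage, commit, and call create_pull_request NOW"
else
  echo "[deadline] ${remaining_min} min left before the PR should exist"
fi
```

Run between phases, this turns the soft timing guidance into an explicit signal rather than relying on the agent's own sense of elapsed time.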

.github/prompts/02-mcp-access.md

Lines changed: 2 additions & 3 deletions

````diff
@@ -4,15 +4,14 @@ Authoritative per-workflow surface: the `mcp-servers:` + `tools:` blocks in that
 
 ## Servers & tool naming
 
-News workflows declare three data MCP servers + the built-in `github` toolset (via `tools.github.toolsets: [all]`) + `bash` + `agentic-workflows` + `repo-memory`.
+News workflows declare three data MCP servers + the built-in `github` toolset (via `tools.github.toolsets: [all]`) + `bash` + `agentic-workflows`.
 
 | Server | Transport | Declared in | Tool-name style | Example tools |
 |--------|-----------|-------------|-----------------|---------------|
 | `riksdag-regering` | HTTP (Render) | workflow `mcp-servers:` | `snake_case` | `get_sync_status`, `search_dokument`, `get_voteringar`, `get_dokument_innehall` |
 | `scb` | container (`@jarib/pxweb-mcp`) | workflow `mcp-servers:` | `snake_case` | `search_tables`, `get_table_info`, `query_table` |
 | `world-bank` | container (`worldbank-mcp`) | workflow `mcp-servers:` | `kebab-case` | `get-economic-data`, `get-country-info`, `search-indicators` |
 | `github` | HTTP (Copilot MCP) | workflow `tools.github` | standard | full GitHub MCP toolset |
-| `repo-memory` | local helper | workflow `tools.repo-memory` | standard | persistent cross-run memory on `memory/news-generation` |
 | `bash` | local helper | workflow `tools.bash` | standard | shell execution |
 | `safeoutputs` | runner | always available | `snake_case` | `safeoutputs___create_pull_request`, `safeoutputs___noop`, `safeoutputs___dispatch_workflow` |
 
@@ -42,4 +41,4 @@ Run once at workflow start, then proceed — do not loop forever.
 
 ## Pre-warm step (CI job, not prompt)
 
-Every news workflow declares a **single** `curl`-based pre-warm step with ≤ 6 retries, ≤ 20 s apart. With `curl --max-time 30`, the worst-case runtime can exceed 4 minutes, so this is a best-effort pre-warm rather than a hard ≤ 2 minute guarantee. If a strict 2 minute cap is required, the workflow's `curl` timeout and/or retry policy must be reduced accordingly. No background pingers. The `safeoutputs` session is kept alive by completing work inside its ~30-minute idle window, not by opening interim PRs.
+Every news workflow declares a **single** `curl`-based pre-warm step with ≤ 6 retries, ≤ 20 s apart. With `curl --max-time 30`, the worst-case runtime can exceed 4 minutes, so this is a best-effort pre-warm rather than a hard ≤ 2 minute guarantee. If a strict 2 minute cap is required, the workflow's `curl` timeout and/or retry policy must be reduced accordingly. No background pingers. MCP session longevity is maintained via `sandbox.mcp.keepalive-interval: 300`.
````
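As a concrete reading of that retry policy, the pre-warm step body could look roughly like this. A best-effort sketch only: the `prewarm` function name and health-endpoint argument are illustrative, and the real step lives in workflow YAML, not in a prompt file. The worst-case arithmetic matches the paragraph above: 6 attempts at `--max-time 30` plus 5 sleeps of 20 s is about 280 s, i.e. over 4 minutes.

```shell
# Best-effort pre-warm: one curl per attempt, at most 6 attempts, 20 s apart.
prewarm() {
  local url="$1" max_tries="${2:-6}" sleep_secs="${3:-20}" try=1
  until curl -fsS --max-time 30 "$url" >/dev/null 2>&1; do
    if [ "$try" -ge "$max_tries" ]; then
      echo "[pre-warm] gave up after $max_tries attempts - continuing (best effort)"
      return 0   # never fail the job because of pre-warm trouble
    fi
    echo "[pre-warm] attempt $try failed; retrying in ${sleep_secs}s"
    sleep "$sleep_secs"
    try=$(( try + 1 ))
  done
  echo "[pre-warm] endpoint responded on attempt $try"
}

# Usage in the CI step (hypothetical variable): prewarm "$MCP_HEALTH_URL"
```

Returning 0 on exhaustion is the key design choice: a cold endpoint slows the run down but must never abort it.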

.github/prompts/03-data-download.md

Lines changed: 39 additions & 1 deletion

````diff
@@ -1,5 +1,43 @@
 # 03 — Data Download
 
+## Pre-flight: existing analysis check
+
+Run this check as the **first action** after MCP pre-warm, before any download:
+
+```bash
+ANALYSIS_DIR="analysis/daily/$ARTICLE_DATE/$SUBFOLDER"
+
+# 9 core artifacts required by every workflow
+REQ=(synthesis-summary.md swot-analysis.md risk-assessment.md threat-analysis.md \
+     stakeholder-perspectives.md significance-scoring.md classification-results.md \
+     cross-reference-map.md data-download-manifest.md)
+
+# Tier-C workflows require 5 additional artifacts (evening-analysis, week-ahead,
+# month-ahead, weekly-review, monthly-review, realtime-*, deep-inspection).
+# See ext/tier-c-aggregation.md for the full list.
+case "$SUBFOLDER" in
+  evening-analysis|week-ahead|month-ahead|weekly-review|monthly-review|deep-inspection|realtime-*)
+    REQ+=(README.md executive-brief.md scenario-analysis.md \
+          comparative-international.md methodology-reflection.md)
+    ;;
+esac
+
+SKIP_ANALYSIS=false
+ALL_PRESENT=true
+for f in "${REQ[@]}"; do
+  [ -s "$ANALYSIS_DIR/$f" ] || { ALL_PRESENT=false; break; }
+done
+[ "$ALL_PRESENT" = "true" ] && SKIP_ANALYSIS=true
+echo "SKIP_ANALYSIS=$SKIP_ANALYSIS (required artifacts present: $ALL_PRESENT, count: ${#REQ[@]})"
+```
+
+| `SKIP_ANALYSIS` | Mode | Next step |
+|-----------------|------|-----------|
+| `false` | **Analysis mode** | Continue with download pipeline below → `04-analysis-pipeline.md` → analysis-only PR (see `07-commit-and-pr.md`). Do **not** generate articles in this run. |
+| `true` | **Article mode** | Skip the entire download pipeline and `04-analysis-pipeline.md`. Proceed directly to `06-article-generation.md`. Optionally re-query the API and compare against `data-download-manifest.md`; add only genuinely new `dok_id` entries found since the analysis ran. |
+
+> **Folder reuse rule**: the same `$ANALYSIS_DIR` is always reused across runs for the same `$ARTICLE_DATE` + `$SUBFOLDER` when `force_generation=false`. The legacy auto-suffix behaviour (`propositions-2`, `propositions-3`, …) is retained **only** as an explicit escape hatch when `force_generation=true`, so that a forced rerun on a merged day can produce a fresh parallel analysis without trampling the existing one.
+
 ## Goal
 
 Populate `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/` with raw Riksdag/Regering data and a provenance manifest **before** any analysis starts.
@@ -20,7 +58,7 @@ Populate `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/` with raw Riksdag/Regering data
 | news-realtime-monitor | `realtime-$HHMM` |
 | news-article-generator (`deep-inspection`) | `deep-inspection` |
 
-If the base subfolder already contains `synthesis-summary.md` from a prior merged run **and** `force_generation=false`, auto-suffix: `propositions-2`, `propositions-3`, …
+If `force_generation=true` is supplied on a day whose base subfolder already contains `synthesis-summary.md` from a prior merged run, auto-suffix the subfolder (`propositions-2`, `propositions-3`, …) so the forced rerun does not overwrite the merged analysis. Under the default `force_generation=false`, the same base subfolder is reused across runs — see §Pre-flight above.
 
 ## Download pipeline
 
````
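The reuse/auto-suffix rule in this diff can be read as a small folder selector. A sketch under assumptions: `pick_analysis_dir` and the `FORCE_GENERATION` environment variable are illustrative names, and the real decision is made by the agent following the prompt, not by a committed script.

```shell
# Choose the analysis folder per the folder-reuse rule described above.
pick_analysis_dir() {
  local base="analysis/daily/$ARTICLE_DATE/$SUBFOLDER"
  # Default (force_generation=false): always reuse the same folder.
  if [ "${FORCE_GENERATION:-false}" != "true" ]; then
    echo "$base"
    return
  fi
  # Forced rerun, but no merged analysis to protect: still use the base folder.
  if [ ! -s "$base/synthesis-summary.md" ]; then
    echo "$base"
    return
  fi
  # Forced rerun over a merged day: auto-suffix (propositions-2, -3, ...).
  local n=2
  while [ -e "analysis/daily/$ARTICLE_DATE/$SUBFOLDER-$n" ]; do
    n=$(( n + 1 ))
  done
  echo "analysis/daily/$ARTICLE_DATE/$SUBFOLDER-$n"
}
```

Note how the escape hatch only triggers when both conditions hold: `force_generation=true` and a merged `synthesis-summary.md` already on disk.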

.github/prompts/04-analysis-pipeline.md

Lines changed: 2 additions & 0 deletions

````diff
@@ -36,6 +36,8 @@ Plus `documents/` subfolder with **one `{dok_id}-analysis.md` file per `dok_id`**
 
 ## Execution order
 
+> **Fast-path**: If `SKIP_ANALYSIS=true` (set by `03-data-download.md §Pre-flight`), skip all steps 1–5 below and proceed directly to `06-article-generation.md`. The full analysis already exists on disk from a prior run — do not re-run downloads, Pass 1, Pass 2, or the gate.
+
 1. **Read all 6 methodologies first** (one tool call per file, do not skip).
 2. **Read all 8 templates first.**
 3. **Pass 1 — Create** all 9 artifacts + every per-document file. Minimum 15 minutes of real work.
````
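The fast-path branch amounts to a single guard at the top of the execution order. A trivial sketch (the `decide_path` function name is illustrative; `SKIP_ANALYSIS` comes from the 03 pre-flight check):

```shell
# Branch on the pre-flight result before touching steps 1-5.
decide_path() {
  if [ "${SKIP_ANALYSIS:-false}" = "true" ]; then
    echo "fast-path: skip steps 1-5, go directly to 06-article-generation.md"
  else
    echo "full pipeline: downloads, Pass 1, Pass 2, analysis gate"
  fi
}

decide_path
```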

.github/prompts/07-commit-and-pr.md

Lines changed: 22 additions & 10 deletions

````diff
@@ -10,6 +10,17 @@
 
 Workflows declare `safe-outputs.create-pull-request.max: 1`. Attempting a second call is a workflow error.
 
+## Two-run PR strategy
+
+| Run mode | What to commit | PR title prefix | Labels | After PR |
+|----------|---------------|-----------------|--------|----------|
+| **Analysis mode** (`SKIP_ANALYSIS=false`) | `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/*.md` + `*.json` (never `pass1/`) | `📊 Analysis — ` | `analysis-only` + article-type | **Stop.** Do NOT generate articles. The next scheduled run will detect the analysis and enter Article mode automatically. |
+| **Article mode** (`SKIP_ANALYSIS=true`) | `news/$YYYY/$MM/$DD/$SLUG.{en,sv}.html` + chart JSON | `📰 ` | `agentic-news` + article-type | Dispatch `news-translate` for 12 remaining languages. |
+
+In **Analysis mode**: commit analysis artifacts, create the `analysis-only` PR, then exit. Zero articles are generated in this run. The analysis stays in the `$ANALYSIS_DIR` folder; the next run of this workflow for the same `$ARTICLE_DATE` will find it and proceed directly to articles.
+
+In **Article mode**: generate articles from existing analysis, commit, and create the articles PR.
+
 ## Stage → commit → PR
 
 1. **Stage scoped files only.** Never stage the whole repo.
@@ -21,8 +32,6 @@ Workflows declare `safe-outputs.create-pull-request.max: 1`. Attempting a second call is a workflow error.
 | Articles (core languages) | `news/$YYYY/$MM/$DD/$SLUG.{en,sv}.html` |
 | Translations (news-translate only) | `news/$YYYY/$MM/$DD/$SLUG.<lang>.html` |
 
-Repo-memory persistence is handled separately by `tools.repo-memory` and pushed to the `memory/news-generation` branch by the safe-outputs runner job. **Do not** create, stage, or commit any `memory/news-generation/*.json` files in the content PR — there is no `memory/` directory in the working tree of `main`.
-
 Never stage `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/documents/` wholesale — it often contains 100+ files. Stage only `documents/*.md` **if** your `documents/` stays under the safe-outputs 100-file cap; otherwise stage only summary files. Never stage `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/pass1/` — it is a local gate-evidence snapshot (see `04-analysis-pipeline.md`), not a deliverable.
 
 2. **100-file guard.** Before calling safeoutputs, count staged files. If the count > 99, unstage everything under `documents/` except `synthesis-summary.md` and re-check.
@@ -89,19 +98,22 @@ Call `safeoutputs___noop({"message": "<reason>"})` **only** if:
 
 In every other case, commit whatever exists and call `create_pull_request` once.
 
-## Final checkpoint — before the PR call
+## Deadline enforcement
 
-Immediately before calling `safeoutputs___create_pull_request`, run the **phase checkpoint** from `00-base-contract.md` with label `phase-07-final`. This snapshots the final authoritative analysis + article state to repo memory, so even if the PR call, the safe-outputs runner, or the post-job push fails, the last good state survives on the `memory/news-generation` branch.
+> **Root cause**: The Copilot API session is bound to the `github.token` baked in at step start. That token expires at approximately **60 minutes** and is never refreshed mid-run (gh-aw issue #24920). Every tool call and inference request fails silently after that point — the agent appears to run but makes no progress and the PR is never created. Setup steps consume ~5 minutes, so the agent has at most **~55 minutes** of usable session time, and safe-outputs publishing needs several minutes on top.
 
-For `news-translate`, run the checkpoint with label `phase-translate-<lang>` after each per-language batch succeeds (before the final PR call), so individual language translations are preserved even if later languages fail.
+The target PR-creation window depends on which mode the run is in (see `03-data-download.md §Pre-flight`):
 
-## Deadline enforcement
+| Mode | Target PR window | Hard deadline |
+|------|------------------|---------------|
+| Run 1 — Analysis | 40–45 min after agent start | **48 min** |
+| Run 2 — Articles | 20–25 min after agent start | **30 min** |
 
-If the run exceeds 40 minutes with no safe-output call yet:
+**If the run exceeds its hard deadline with no safe-output call yet:**
 
 1. Stop analysis / article work immediately.
-2. Stage whatever exists on disk.
-3. Commit.
+2. Stage whatever exists on disk (analysis artifacts and/or partial articles).
+3. Commit with message including `[early-pr]` to signal partial content.
 4. Call `safeoutputs___create_pull_request` with label `analysis-only` if articles are incomplete.
 
-Do not attempt to "save" work via a second PR — there is no second PR.
+Do not attempt to "save" work via a second PR — there is no second PR. Creating the PR early is always better than losing all work to a token expiry. The hard deadlines above leave ~7 minutes of margin on the 55-minute `timeout-minutes` cap for staging and safe-outputs publishing before the ~60-minute Copilot API token expiry.
````
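The mode-dependent deadline table in this diff can be turned into a mechanical check. A sketch under assumptions: `AGENT_START_EPOCH` is an invented variable holding epoch seconds recorded at agent start, and both function names are illustrative.

```shell
# Hard deadline per run mode: 48 min for analysis (Run 1), 30 min for articles (Run 2).
hard_deadline_min() {
  if [ "${SKIP_ANALYSIS:-false}" = "true" ]; then echo 30; else echo 48; fi
}

# Succeeds (exit 0) once the run has passed its hard deadline.
past_deadline() {
  local start="${AGENT_START_EPOCH:?record epoch seconds at agent start}"
  local elapsed_min=$(( ( $(date +%s) - start ) / 60 ))
  [ "$elapsed_min" -ge "$(hard_deadline_min)" ]
}

# Usage: if past_deadline; then stop work, stage what exists,
# commit with "[early-pr]", and call safeoutputs___create_pull_request once.
```

Keeping the deadline a function of `SKIP_ANALYSIS` means the same guard serves both run modes without duplicated logic.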
