Merged
82 changes: 24 additions & 58 deletions .github/prompts/00-base-contract.md
@@ -36,71 +36,37 @@ Before producing any analysis or article content, the agent MUST have read:

No article sentence may be drafted until every required analysis artifact exists on disk and the gate in `05-analysis-gate.md` reports pass.

## Pipeline (fixed order)
## Two-run pipeline (primary model)

Every run selects one of two modes automatically — see `03-data-download.md §Pre-flight`:

**Run 1 — Analysis** (when `$ANALYSIS_DIR` is missing or incomplete):
```
Download → Read methodology → Read templates → Analysis Pass 1 → Analysis Pass 2 →
Analysis Gate → Article (if applicable) → Stage → Commit → ONE create_pull_request
MCP pre-warm → Download → Read methodology → Read templates →
Analysis Pass 1 → Pass 1 snapshot → Analysis Pass 2 → Analysis Gate →
Stage analysis → Commit → ONE create_pull_request (analysis-only)
```

No step may be skipped, reordered, or executed in parallel with its successor.

## Phase checkpoint — persist every phase to repo memory

Valuable analysis must never be lost. After each pipeline phase completes, snapshot its output to the gh-aw repo-memory mount at `$GH_AW_MEMORY_DIR` (runtime default `/tmp/gh-aw/repo-memory/default`). gh-aw pushes that directory to the `memory/news-generation` branch in a **separate post-job** — so checkpoints survive even if the content PR job fails, crashes, or times out.

### Mandatory checkpoint points

| After phase | Phase label | Source(s) |
|-------------|-------------|-----------|
| 03 Data download | `phase-03-download` | `$ANALYSIS_DIR` (manifest + fetched data summaries) |
| 04 Analysis Pass 1 | `phase-04-pass1` | `$ANALYSIS_DIR` top-level artifacts |
| 04 Analysis Pass 2 | `phase-04-pass2` | `$ANALYSIS_DIR` top-level artifacts |
| 05 Gate pass | `phase-05-gate` | `$ANALYSIS_DIR` top-level artifacts |
| 06 Article generated | `phase-06-article` | `$ANALYSIS_DIR` + today's `news/${ARTICLE_DATE}-*.html` |
| 07 Immediately before `create_pull_request` | `phase-07-final` | `$ANALYSIS_DIR` + articles from `news/${ARTICLE_DATE}-*.html` |
| `news-translate` per batch | `phase-translate-<lang>` | Translated `news/${ARTICLE_DATE}-*.html` |

Each checkpoint is mandatory. Skipping them forfeits the only cross-run safety net for analysis work.

### Reusable snippet

Run this bash block at the end of every phase (pass the phase label as `$1`). Article HTML is written directly under the flat `news/` directory, so checkpoint copies must use `news/${ARTICLE_DATE}-*.html` rather than `news/$YYYY/$MM/$DD/*.html`:

```bash
set -Eeuo pipefail
: "${GH_AW_MEMORY_DIR:=/tmp/gh-aw/repo-memory/default}"
: "${ARTICLE_DATE:?ARTICLE_DATE required for checkpoint}"
: "${SUBFOLDER:?SUBFOLDER required for checkpoint (use batch/<lang> for news-translate)}"
PHASE="${1:?phase label required, e.g. phase-04-pass1}"
ANALYSIS_DIR="${ANALYSIS_DIR:-analysis/daily/$ARTICLE_DATE/$SUBFOLDER}"
DEST="$GH_AW_MEMORY_DIR/$ARTICLE_DATE/$SUBFOLDER/$PHASE"
mkdir -p "$DEST" 2>/dev/null || { echo "[checkpoint] mkdir failed for $DEST — continuing"; exit 0; }
# Snapshot top-level analysis artifacts (never documents/ — often 100+ files — and never pass1/).
if [ -d "$ANALYSIS_DIR" ]; then
find "$ANALYSIS_DIR" -maxdepth 1 -type f \( -name '*.md' -o -name '*.json' \) \
-exec cp -f {} "$DEST"/ \; 2>/dev/null || true
fi
# Snapshot today's produced article HTML from the flat news/ directory (if any exists at this phase).
if [ -d "news" ]; then
find "news" -maxdepth 1 -type f -name "${ARTICLE_DATE}-*.html" \
-exec cp -f {} "$DEST"/ \; 2>/dev/null || true
fi
COUNT="$(find "$DEST" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' ')"
echo "[checkpoint] $PHASE → $DEST ($COUNT files)"
exit 0
**Run 2 — Articles** (when `$ANALYSIS_DIR` already contains all 9 core artifacts):
```
MCP pre-warm → Detect existing analysis → Read all artifacts into context →
Optionally check for new data → Article Pass 1 → Article Pass 2 →
Stage articles → Commit → ONE create_pull_request (articles)
```

### Checkpoint rules
No step may be skipped within a run. Runs must not overlap for the same `$ARTICLE_DATE` + `$SUBFOLDER`.

Same-day re-runs always use the same `$ANALYSIS_DIR` folder — never create a parallel folder for the same date + type combination unless `force_generation=true`.

## Session keepalive requirement

> ⚠️ **Critical**: The Copilot API creates a server-side session when the agent starts. That session is bound to the `github.token` baked in at step start — it is **never refreshed** mid-run. The session expires at approximately **60 minutes** (gh-aw issue #24920). After expiry, all tool calls and inference requests fail silently. The workflow appears to run but makes zero progress, and **the PR is never created**.

To mitigate MCP idle-connection drops, workflows set `sandbox.mcp.keepalive-interval: 300` (5-minute ping). This keeps MCP connections alive but does **not** refresh the Copilot API token.

**The reliable mitigation is to ensure `safeoutputs___create_pull_request` is called well before the session approaches expiry.** Plan the run so the PR is created before the agent passes ~45 minutes of work — that leaves ~10 minutes of safety margin on the 55-minute `timeout-minutes` cap and ~15 minutes on the ~60-minute token window for staging and safe-outputs publishing. See `07-commit-and-pr.md §Deadline enforcement` for the mandatory PR-timing procedure.

| Rule | Rationale |
|------|-----------|
| **Never block on checkpoint failure** — always `exit 0`. | Repo-memory is a safety net, not a gate. |
| Do **not** copy `$ANALYSIS_DIR/documents/` or `$ANALYSIS_DIR/pass1/`. | `documents/` exceeds the 50-file push cap; `pass1/` is local gate evidence only. |
| Do **not** stage or commit anything under `$GH_AW_MEMORY_DIR`. | gh-aw's `push_repo_memory` post-job publishes it; see `07-commit-and-pr.md`. |
| Prefer small summary `.md` / `.json` files (≤ 50 KB each, ≤ 50 per push). | gh-aw silently drops files exceeding the push caps. |
| Re-run the snippet at every phase, even if earlier phases already snapshotted — it overwrites with the latest content. | Ensures the final state is always preserved, and earlier snapshots remain on the branch from prior runs. |
| For `news-translate`, use `SUBFOLDER=batch/<lang-or-batch-id>` so memory paths don't collide with analysis runs. | Keeps the branch organised by article type. |
Do not add per-phase checkpoint PRs or repo-memory push steps.

## Output contract

5 changes: 2 additions & 3 deletions .github/prompts/02-mcp-access.md
@@ -4,15 +4,14 @@ Authoritative per-workflow surface: the `mcp-servers:` + `tools:` blocks in that

## Servers & tool naming

News workflows declare three data MCP servers + the built-in `github` toolset (via `tools.github.toolsets: [all]`) + `bash` + `agentic-workflows` + `repo-memory`.
News workflows declare three data MCP servers + the built-in `github` toolset (via `tools.github.toolsets: [all]`) + `bash` + `agentic-workflows`.

| Server | Transport | Declared in | Tool-name style | Example tools |
|--------|-----------|-------------|-----------------|---------------|
| `riksdag-regering` | HTTP (Render) | workflow `mcp-servers:` | `snake_case` | `get_sync_status`, `search_dokument`, `get_voteringar`, `get_dokument_innehall` |
| `scb` | container (`@jarib/pxweb-mcp`) | workflow `mcp-servers:` | `snake_case` | `search_tables`, `get_table_info`, `query_table` |
| `world-bank` | container (`worldbank-mcp`) | workflow `mcp-servers:` | `kebab-case` | `get-economic-data`, `get-country-info`, `search-indicators` |
| `github` | HTTP (Copilot MCP) | workflow `tools.github` | standard | full GitHub MCP toolset |
| `repo-memory` | local helper | workflow `tools.repo-memory` | standard | persistent cross-run memory on `memory/news-generation` |
| `bash` | local helper | workflow `tools.bash` | standard | shell execution |
| `safeoutputs` | runner | always available | `snake_case` | `safeoutputs___create_pull_request`, `safeoutputs___noop`, `safeoutputs___dispatch_workflow` |

@@ -42,4 +41,4 @@ Run once at workflow start, then proceed — do not loop forever.

## Pre-warm step (CI job, not prompt)

Every news workflow declares a **single** `curl`-based pre-warm step with ≤ 6 retries, ≤ 20 s apart. With `curl --max-time 30`, the worst-case runtime can exceed 4 minutes, so this is a best-effort pre-warm rather than a hard ≤ 2-minute guarantee. If a strict 2-minute cap is required, the workflow's `curl` timeout and/or retry policy must be reduced accordingly. No background pingers. The `safeoutputs` session is kept alive by completing work inside its ~30-minute idle window, not by opening interim PRs.
Every news workflow declares a **single** `curl`-based pre-warm step with ≤ 6 retries, ≤ 20 s apart. With `curl --max-time 30`, the worst-case runtime can exceed 4 minutes, so this is a best-effort pre-warm rather than a hard ≤ 2-minute guarantee. If a strict 2-minute cap is required, the workflow's `curl` timeout and/or retry policy must be reduced accordingly. No background pingers. MCP session longevity is maintained via `sandbox.mcp.keepalive-interval: 300`.
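The bounded retry policy above can be sketched as follows. This is a minimal illustration: the `prewarm` helper name, the URL, and the attempt count and gap passed in the example call are placeholders, not real workflow values.

```shell
# Sketch of a best-effort pre-warm: bounded attempts, a fixed gap between
# them, and never a hard job failure. Helper name and arguments are
# illustrative assumptions, not the workflows' actual step definition.
prewarm() {
  url="$1"; max_attempts="$2"; gap="$3"; attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if curl -fsS --max-time 30 "$url" >/dev/null 2>&1; then
      echo "pre-warm ok (attempt $attempt)"
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    [ "$attempt" -lt "$max_attempts" ] && sleep "$gap"
    attempt=$((attempt + 1))
  done
  echo "pre-warm gave up after $max_attempts attempts"
  return 0  # best-effort only: never fail the job
}

# A real workflow would call something like: prewarm "$MCP_HEALTH_URL" 6 20
prewarm "https://example.invalid/health" 2 0
```

With 6 attempts and a 20 s gap, worst case is 6 × 30 s of `curl` plus 5 × 20 s of sleep, which is how the runtime can exceed 4 minutes.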
40 changes: 39 additions & 1 deletion .github/prompts/03-data-download.md
@@ -1,5 +1,43 @@
# 03 — Data Download

## Pre-flight: existing analysis check

Run this check as the **first action** after MCP pre-warm, before any download:

```bash
ANALYSIS_DIR="analysis/daily/$ARTICLE_DATE/$SUBFOLDER"

# 9 core artifacts required by every workflow
REQ=(synthesis-summary.md swot-analysis.md risk-assessment.md threat-analysis.md \
stakeholder-perspectives.md significance-scoring.md classification-results.md \
cross-reference-map.md data-download-manifest.md)

# Tier-C workflows require 5 additional artifacts (evening-analysis, week-ahead,
# month-ahead, weekly-review, monthly-review, realtime-*, deep-inspection).
# See ext/tier-c-aggregation.md for the full list.
case "$SUBFOLDER" in
evening-analysis|week-ahead|month-ahead|weekly-review|monthly-review|deep-inspection|realtime-*)
REQ+=(README.md executive-brief.md scenario-analysis.md \
comparative-international.md methodology-reflection.md)
;;
esac

SKIP_ANALYSIS=false
ALL_PRESENT=true
for f in "${REQ[@]}"; do
[ -s "$ANALYSIS_DIR/$f" ] || { ALL_PRESENT=false; break; }
done
[ "$ALL_PRESENT" = "true" ] && SKIP_ANALYSIS=true
echo "SKIP_ANALYSIS=$SKIP_ANALYSIS (required artifacts present: $ALL_PRESENT, count: ${#REQ[@]})"
```

| `SKIP_ANALYSIS` | Mode | Next step |
|-----------------|------|-----------|
| `false` | **Analysis mode** | Continue with download pipeline below → `04-analysis-pipeline.md` → analysis-only PR (see `07-commit-and-pr.md`). Do **not** generate articles in this run. |
| `true` | **Article mode** | Skip the entire download pipeline and `04-analysis-pipeline.md`. Proceed directly to `06-article-generation.md`. Optionally re-query the API and compare against `data-download-manifest.md`; add only genuinely new `dok_id` entries found since the analysis ran. |
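The optional Article-mode freshness check can be sketched as a set difference against the manifest. The manifest line format (`dok_id: X`), the file names, and the sample IDs below are illustrative assumptions, not a specified layout; the sketch builds its own throwaway files so it runs standalone.

```shell
# Sketch: find dok_id values from a fresh API re-query that are not yet
# listed in data-download-manifest.md. File names, manifest format, and
# IDs are illustrative only.
workdir="$(mktemp -d)"
cat > "$workdir/data-download-manifest.md" <<'EOF'
- dok_id: HA011
- dok_id: HA022
EOF
printf 'HA022\nHA033\n' > "$workdir/fresh-ids.txt"   # result of the re-query

# Extract known IDs from the manifest, sort both lists.
grep -oE 'dok_id: [A-Za-z0-9]+' "$workdir/data-download-manifest.md" \
  | awk '{print $2}' | sort -u > "$workdir/known-ids.txt"
sort -u "$workdir/fresh-ids.txt" > "$workdir/fresh-sorted.txt"

# comm -23: lines only in the fresh list, i.e. genuinely new since the
# analysis ran.
comm -23 "$workdir/fresh-sorted.txt" "$workdir/known-ids.txt" \
  > "$workdir/genuinely-new-ids.txt"
cat "$workdir/genuinely-new-ids.txt"
```

Only IDs in this output would be added to the analysis; everything already in the manifest is left untouched.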

> **Folder reuse rule**: the same `$ANALYSIS_DIR` is always reused across runs for the same `$ARTICLE_DATE` + `$SUBFOLDER` when `force_generation=false`. The legacy auto-suffix behaviour (`propositions-2`, `propositions-3`, …) is retained **only** as an explicit escape hatch when `force_generation=true`, so that a forced rerun on a merged day can produce a fresh parallel analysis without trampling the existing one.

## Goal

Populate `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/` with raw Riksdag/Regering data and a provenance manifest **before** any analysis starts.
@@ -20,7 +58,7 @@ Populate `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/` with raw Riksdag/Regering da
| news-realtime-monitor | `realtime-$HHMM` |
| news-article-generator (`deep-inspection`) | `deep-inspection` |

If the base subfolder already contains `synthesis-summary.md` from a prior merged run **and** `force_generation=false`, auto-suffix: `propositions-2`, `propositions-3`, …
If `force_generation=true` is supplied on a day whose base subfolder already contains `synthesis-summary.md` from a prior merged run, auto-suffix the subfolder (`propositions-2`, `propositions-3`, …) so the forced rerun does not overwrite the merged analysis. Under the default `force_generation=false`, the same base subfolder is reused across runs — see §Pre-flight above.
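The auto-suffix rule can be sketched as follows, against a throwaway directory. The date, subfolder name, and marker file are illustrative values only.

```shell
# Sketch of the force_generation auto-suffix rule. The base folder
# already holds a merged analysis and a -2 suffix from an earlier forced
# rerun, so the next free suffix is -3.
root="$(mktemp -d)"
base="$root/analysis/daily/2026-04-22/propositions"
mkdir -p "$base" "$base-2"
echo merged > "$base/synthesis-summary.md"   # prior merged run
force_generation=true

dir="$base"
if [ "$force_generation" = "true" ] && [ -s "$base/synthesis-summary.md" ]; then
  # Forced rerun on a merged day: take the first free -N suffix instead
  # of overwriting the existing analysis.
  n=2
  while [ -d "$base-$n" ]; do n=$((n + 1)); done
  dir="$base-$n"
fi
echo "ANALYSIS_DIR=$dir"
```

Under the default `force_generation=false` the `if` never fires and `dir` stays at the base folder, matching §Pre-flight.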

## Download pipeline

2 changes: 2 additions & 0 deletions .github/prompts/04-analysis-pipeline.md
@@ -36,6 +36,8 @@ Plus `documents/` subfolder with **one `{dok_id}-analysis.md` file per `dok_id`**

## Execution order

> **Fast-path**: If `SKIP_ANALYSIS=true` (set by `03-data-download.md §Pre-flight`), skip all steps 1–5 below and proceed directly to `06-article-generation.md`. The full analysis already exists on disk from a prior run — do not re-run downloads, Pass 1, Pass 2, or the gate.

1. **Read all 6 methodologies first** (one tool call per file, do not skip).
2. **Read all 8 templates first.**
3. **Pass 1 — Create** all 9 artifacts + every per-document file. Minimum 15 minutes of real work.
32 changes: 22 additions & 10 deletions .github/prompts/07-commit-and-pr.md
@@ -10,6 +10,17 @@

Workflows declare `safe-outputs.create-pull-request.max: 1`. Attempting a second call is a workflow error.

## Two-run PR strategy

| Run mode | What to commit | PR title prefix | Labels | After PR |
|----------|---------------|-----------------|--------|----------|
| **Analysis mode** (`SKIP_ANALYSIS=false`) | `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/*.md` + `*.json` (never `pass1/`) | `📊 Analysis — ` | `analysis-only` + article-type | **Stop.** Do NOT generate articles. The next scheduled run will detect the analysis and enter Article mode automatically. |
| **Article mode** (`SKIP_ANALYSIS=true`) | `news/$YYYY/$MM/$DD/$SLUG.{en,sv}.html` + chart JSON | `📰 ` | `agentic-news` + article-type | Dispatch `news-translate` for 12 remaining languages. |

In **Analysis mode**: commit analysis artifacts, create the `analysis-only` PR, then exit. Zero articles are generated in this run. The analysis stays in the `$ANALYSIS_DIR` folder; the next run of this workflow for the same `$ARTICLE_DATE` will find it and proceed directly to articles.

In **Article mode**: generate articles from existing analysis, commit, and create the articles PR.

## Stage → commit → PR

1. **Stage scoped files only.** Never stage the whole repo.
@@ -21,8 +32,6 @@ Workflows declare `safe-outputs.create-pull-request.max: 1`. Attempting a second
| Articles (core languages) | `news/$YYYY/$MM/$DD/$SLUG.{en,sv}.html` |
| Translations (news-translate only) | `news/$YYYY/$MM/$DD/$SLUG.<lang>.html` |

Repo-memory persistence is handled separately by `tools.repo-memory` and pushed to the `memory/news-generation` branch by the safe-outputs runner job. **Do not** create, stage, or commit any `memory/news-generation/*.json` files in the content PR — there is no `memory/` directory in the working tree of `main`.

Never stage `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/documents/` wholesale — it often contains 100+ files. Stage only `documents/*.md` **if** your `documents/` stays under the safe-outputs 100-file cap; otherwise stage only summary files. Never stage `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/pass1/` — it is a local gate-evidence snapshot (see `04-analysis-pipeline.md`), not a deliverable.

2. **100-file guard.** Before calling safeoutputs, count staged files. If the count > 99, unstage everything under `documents/` except `synthesis-summary.md` and re-check.
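The guard can be sketched as below. It is self-contained against a throwaway repo and uses a deliberately low cap so the behaviour is visible with a handful of files; the real rule uses the 99-file threshold described above.

```shell
# Self-contained sketch of the 100-file guard. Repo layout, file names,
# and the low cap are illustrative only.
repo="$(mktemp -d)"
git -C "$repo" init -q
git -C "$repo" -c user.email=ci@example.invalid -c user.name=ci \
  commit -q --allow-empty -m init
mkdir -p "$repo/documents"
echo core > "$repo/documents/synthesis-summary.md"
for i in 1 2 3; do echo d > "$repo/documents/doc-$i.md"; done
git -C "$repo" add .

cap=2   # stand-in for the real 99-file cap
staged="$(git -C "$repo" diff --cached --name-only | wc -l | tr -d ' ')"
if [ "$staged" -gt "$cap" ]; then
  # Unstage everything under documents/ except synthesis-summary.md.
  git -C "$repo" diff --cached --name-only -- documents/ \
    | grep -v 'synthesis-summary\.md$' \
    | while IFS= read -r f; do
        git -C "$repo" restore --staged -- "$f"
      done
  staged="$(git -C "$repo" diff --cached --name-only | wc -l | tr -d ' ')"
fi
echo "staged after guard: $staged"
```

After the guard only `documents/synthesis-summary.md` remains staged; the per-document files stay on disk for a later run.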
@@ -89,19 +98,22 @@ Call `safeoutputs___noop`({"message": "<reason>"}) **only** if:

In every other case, commit whatever exists and call `create_pull_request` once.

## Final checkpoint — before the PR call
## Deadline enforcement

Immediately before calling `safeoutputs___create_pull_request`, run the **phase checkpoint** from `00-base-contract.md` with label `phase-07-final`. This snapshots the final authoritative analysis + article state to repo memory, so even if the PR call, the safe-outputs runner, or the post-job push fails, the last good state survives on the `memory/news-generation` branch.
> **Root cause**: The Copilot API session is bound to the `github.token` baked in at step start. That token expires at approximately **60 minutes** and is never refreshed mid-run (gh-aw issue #24920). Every tool call and inference request fails silently after that point — the agent appears to run but makes no progress and the PR is never created. Setup steps consume ~5 minutes, so the agent has at most **~55 minutes** of usable session time, and safe-outputs publishing needs several minutes on top.

For `news-translate`, run the checkpoint with label `phase-translate-<lang>` after each per-language batch succeeds (before the final PR call), so individual language translations are preserved even if later languages fail.
The target PR-creation window depends on which mode the run is in (see `03-data-download.md §Pre-flight`):

## Deadline enforcement
| Mode | Target PR window | Hard deadline |
|------|------------------|---------------|
| Run 1 — Analysis | 40–45 min after agent start | **48 min** |
| Run 2 — Articles | 20–25 min after agent start | **30 min** |

If the run exceeds 40 minutes with no safe-output call yet:
**If the run exceeds its hard deadline with no safe-output call yet:**

1. Stop analysis / article work immediately.
2. Stage whatever exists on disk.
3. Commit.
2. Stage whatever exists on disk (analysis artifacts and/or partial articles).
3. Commit with a message including `[early-pr]` to signal partial content.
4. Call `safeoutputs___create_pull_request` with label `analysis-only` if articles are incomplete.
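The elapsed-time check behind this procedure can be sketched as follows. The start-time variable is an assumption: a real run would record it once at agent launch rather than at check time.

```shell
# Sketch of the hard-deadline guard. start_epoch would be captured once
# when the agent starts; here it is set at check time for illustration.
start_epoch="$(date +%s)"
HARD_DEADLINE_MIN=48   # Analysis mode; Article mode would use 30

elapsed_min=$(( ($(date +%s) - start_epoch) / 60 ))
if [ "$elapsed_min" -ge "$HARD_DEADLINE_MIN" ]; then
  echo "DEADLINE HIT (${elapsed_min} min): stop work, stage, commit [early-pr], create the PR now"
else
  echo "within budget: ${elapsed_min}/${HARD_DEADLINE_MIN} min"
fi
```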
**Comment on lines +103 to 117** (Copilot AI, Apr 22, 2026):

> The new deadline rule triggers an early PR after 25 minutes without a safe-output call, but the documented Analysis-mode run plan budgets ~43–45 minutes before PR creation. This would force most analysis runs to stop early and publish partial output, contradicting the two-run design and time budgets.
>
> To align the guidance with the stated ~60-minute token expiry and `timeout-minutes: 55`, consider setting this threshold closer to the real safety margin (e.g., ~45–50 minutes after agent start, leaving time for staging + safe-outputs), and ensure the same threshold is used consistently in 00-base-contract.md.

Do not attempt to "save" work via a second PR — there is no second PR.
Do not attempt to "save" work via a second PR — there is no second PR. Creating the PR early is always better than losing all work to a token expiry. The hard deadlines above leave ~7 minutes of margin on the 55-minute `timeout-minutes` cap for staging and safe-outputs publishing before the ~60-minute Copilot API token expiry.