Skip to content

Fix news-translate workflow losing translations after first PR call#1838

Merged
pethers merged 4 commits intomainfrom
copilot/improve-article-translations
Apr 19, 2026
Merged

Fix news-translate workflow losing translations after first PR call#1838
pethers merged 4 commits intomainfrom
copilot/improve-article-translations

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 19, 2026

Root Cause Analysis — PR #1835

PR #1835 (news/translate/2026-04-18/breaking-1705-batch1) translated only 4 files (da, no, de, fi) for 2026-04-18-breaking-1705-en.html. The agent produced 7 complete translations locally but only 4 shipped to main.

Why: 3 compounding bugs in .github/workflows/news-translate.md

  1. create-pull-request.max: 1 — agent could not open a second PR for the next batch.
  2. Misleading prompt text — no warning that safeoutputs___create_pull_request freezes the patch at call time.
  3. Speed target too optimistic — prompt said 2–3 min/language; Opus-4.7 actually takes 4–8 min on 400+ line HTML.

Checklist

  • Analyse PR feat(translations): translate 2026-04-18-breaking-1705 to da, no, de, fi #1835, identify 3 root causes in news-translate.md
  • create-pull-request.max: 5 — frontmatter + lock file propagation
  • Rewrite RULE 1, Time Budget, Batch Strategy, Step 5, Error Handling, Execution Summary for rolling batches
  • Recompile news-translate.lock.yml with gh aw compile
  • Round 1 review fixes (9ca1aee): nb standardisation; fail-loud git checkout main in two places; correct max-patch-size unit interpretation (KB → 4 MB).
  • Round 2 review fixes (81e59a5): remove // ... inline comment from example JSON payload; replace {n+1} arithmetic placeholder with plain prose; add "substitute every {placeholder}" warning.
  • Round 3 review fixes (commit pending):
    • Time-budget narrative said "3 rolling PR batches", mismatching create-pull-request.max: 5. Reworded to "typically 2–3 rolling PR batches (up to 5 allowed)" so agents don't artificially stop at 3 when more PRs are warranted.
    • Error-handling row for "Committed files on a branch that already has a PR" said "Those files are lost" — too absolute and would trigger unnecessary re-translation. Reworded to explain the files are merely not included in the existing PR and can usually be re-applied/cherry-picked from the previous branch or workspace onto main for a fresh batch PR.

@github-actions github-actions bot added documentation Documentation updates workflow GitHub Actions workflows ci-cd CI/CD pipeline changes news News articles and content generation agentic-workflow Agentic workflow changes size-m Medium change (50-250 lines) labels Apr 19, 2026
@github-actions
Copy link
Copy Markdown
Contributor

🏷️ Automatic Labeling Summary

This PR has been automatically labeled based on the files changed and PR metadata.

Applied Labels: documentation,workflow,ci-cd,size-m,news,agentic-workflow

Label Categories

  • 🗳️ Content: news, dashboard, visualization, intelligence
  • 💻 Technology: html-css, javascript, workflow, security
  • 📊 Data: cia-data, riksdag-data, data-pipeline, schema
  • 🌍 I18n: i18n, translation, rtl
  • 🔒 ISMS: isms, iso-27001, nist-csf, cis-controls
  • 🏗️ Infrastructure: ci-cd, deployment, performance, monitoring
  • 🔄 Quality: testing, accessibility, documentation, refactor
  • 🤖 AI: agent, skill, agentic-workflow

For more information, see .github/labeler.yml.

@github-actions
Copy link
Copy Markdown
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

@pethers pethers marked this pull request as ready for review April 19, 2026 08:28
Copilot AI review requested due to automatic review settings April 19, 2026 08:28
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates the news-translate agent prompt and safe-outputs configuration to prevent translations from being dropped after the first safeoutputs___create_pull_request call by enabling multiple PRs and guiding the agent to use rolling batches.

Changes:

  • Increased create-pull-request.max from 1 → 5 and surfaced that limit in the generated prompt/tools metadata.
  • Rewrote RULE 1 and the time/batch strategy to reflect that PR patches are snapshotted (“frozen”) at tool-call time and require multiple PRs per run.
  • Updated performance/time targets and PR template to support batched PR creation (batch numbering, checkout guidance).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
.github/workflows/news-translate.md Adjusts agent instructions to use rolling batches and multiple PRs; documents snapshot behavior and updated time budgets.
.github/workflows/news-translate.lock.yml Regenerates compiled workflow prompt/config so the agent sees create_pull_request(max:5) and safe-outputs allows up to 5 PRs.

Comment thread .github/workflows/news-translate.md Outdated
**The #1 cause of lost work was misunderstanding how `safeoutputs___create_pull_request` actually works.** The tool captures the patch from your **current commits at the moment you call it**. Any files you create or commit *after* that call on the same branch are **NOT** added to the PR — they are **silently lost** when the ephemeral agent workspace is discarded. Multiple commits to the same local branch after the first call do **NOT** update the PR.

> 🚨 **PRODUCTION INCIDENT**: On 2026-04-14, 10 translated files were committed locally but the agent delayed `safeoutputs___create_pull_request` until minute ~50. The safeoutputs session had expired. Every retry returned "session not found". The `detection` and `safe_outputs` jobs were SKIPPED. All 10 files were lost.
> 🚨 **PRODUCTION INCIDENT (2026-04-19, PR #1835)**: The agent translated 7 languages (`da`, `no`, `de`, `fi`, `fr`, `es`, `nl`) for `2026-04-18-breaking-1705`. After committing 4 languages at minute 21, it called `safeoutputs___create_pull_request` — patch frozen. It then translated `fr`, `es`, `nl` on the same branch believing "they'll be included in the PR". **They were not.** Only 4/7 translations reached `main`; 3 complete translations were discarded.
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This incident list uses no, but the rest of the workflow consistently uses nb (and batch definitions also reference nb). If file naming/path conventions are nb, this inconsistency can cause the agent to generate the wrong locale code in filenames or language lists. Standardize the language code in the incident narrative to match the repo’s actual locale naming (likely nb).

Suggested change
> 🚨 **PRODUCTION INCIDENT (2026-04-19, PR #1835)**: The agent translated 7 languages (`da`, `no`, `de`, `fi`, `fr`, `es`, `nl`) for `2026-04-18-breaking-1705`. After committing 4 languages at minute 21, it called `safeoutputs___create_pull_request` — patch frozen. It then translated `fr`, `es`, `nl` on the same branch believing "they'll be included in the PR". **They were not.** Only 4/7 translations reached `main`; 3 complete translations were discarded.
> 🚨 **PRODUCTION INCIDENT (2026-04-19, PR #1835)**: The agent translated 7 languages (`da`, `nb`, `de`, `fi`, `fr`, `es`, `nl`) for `2026-04-18-breaking-1705`. After committing 4 languages at minute 21, it called `safeoutputs___create_pull_request` — patch frozen. It then translated `fr`, `es`, `nl` on the same branch believing "they'll be included in the PR". **They were not.** Only 4/7 translations reached `main`; 3 complete translations were discarded.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/news-translate.md Outdated
**Do NOT put everything into one PR.** Opening a new PR for each 3–4 language batch is the only way to avoid losing work — `create-pull-request.max: 5` in the frontmatter supports this.

**Counting rule**: Before calling safe output, count your translated files. If fewer than 5, explain why in the PR body (e.g., "all today's articles already translated, working on backlog").
**Counting rule**: Before each `safeoutputs___create_pull_request` call, count the files staged in the *current* batch. Aim for 3–4 files per PR. Never let a single PR exceed 4 translated files (safe-output patch size: 4 KB headroom).
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guidance ties batch sizing to a “4 KB headroom” but the safe-outputs config limit shown in the compiled workflow is max_patch_size: 4096 (i.e., ~4 KB total, not ‘headroom’). With 400–500 line HTML (as referenced later), a single translation file’s diff is likely to exceed 4 KB, making 3–4 files per PR infeasible and potentially causing repeated PR creation failures. Recommend aligning the prompt with the actual constraint: either (a) increase the configured max_patch_size to a value that can accommodate the intended batch sizes, or (b) change the batching guidance to size by patch bytes (or to fewer files per PR) rather than a fixed 3–4 files.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/news-translate.md Outdated
# After safeoutputs___create_pull_request succeeds for batch N
git status --short news/
# Switch back to main so the next batch starts from a clean base
git checkout main 2>/dev/null || git checkout -
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback git checkout - is not guaranteed to land on a clean base (it just toggles to the previous branch, which could be the same PR branch depending on user history/commands). Since this snippet is meant to prevent stacking commits onto the already-snapshotted branch, it should be deterministic: check out the repo’s default branch explicitly (e.g., main only) and, if unavailable, instruct the agent to resolve the branch name rather than silently toggling.

Suggested change
git checkout main 2>/dev/null || git checkout -
git checkout main 2>/dev/null || { echo "Default branch 'main' is unavailable. Resolve the repository's default branch explicitly before continuing."; exit 1; }

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/news-translate.md Outdated
```bash
# Return to main so batch N+1 starts from a clean base.
# safeoutputs___create_pull_request will create a fresh branch for the next batch automatically.
git checkout main 2>/dev/null || true
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silently ignoring a failure to switch back to main defeats the safety goal of ensuring the next batch doesn’t build on the prior PR’s commit graph. If git checkout main fails, the agent may remain on the previous branch and continue committing, recreating the ‘stacking onto frozen patch’ failure mode you’re trying to eliminate. Prefer failing loudly (or using a reliable fallback that still guarantees a clean base) so the agent can correct course before starting the next batch.

Suggested change
git checkout main 2>/dev/null || true
git checkout main || { echo "ERROR: failed to switch back to main; aborting before next batch." >&2; exit 1; }
[ "$(git branch --show-current)" = "main" ] || { echo "ERROR: repository is not on main after checkout; aborting before next batch." >&2; exit 1; }

Copilot uses AI. Check for mistakes.
@pethers
Copy link
Copy Markdown
Member

pethers commented Apr 19, 2026

@copilot apply changes based on the comments in this thread

…ax-patch-size units

Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/233df121-7c44-4391-a7e1-b3d74b38beae

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 19, 2026

@copilot apply changes based on the comments in this thread

Applied all 4 review fixes in 9ca1aee:

  • Line 331 — standardised nonb in the PR feat(translations): translate 2026-04-18-breaking-1705 to da, no, de, fi #1835 incident narrative to match the rest of the file (batch definitions, lang="nb" HTML attribute, BCP-47).
  • Line 361 — replaced git checkout main 2>/dev/null || git checkout - with explicit git checkout main || exit 1 plus a git branch --show-current = main verification, so the agent fails loudly instead of silently toggling to an arbitrary branch.
  • Line 780 — same fail-loud pattern replacing git checkout main 2>/dev/null || true; silently ignoring this failure would regress the very "stacking onto frozen patch" bug the section is designed to prevent.
  • Line 446 — removed the incorrect "safe-output patch size: 4 KB headroom" wording. The max-patch-size: 4096 value is KB-denominated (= 4 MB); PR feat(translations): translate 2026-04-18-breaking-1705 to da, no, de, fi #1835 actually shipped ~160 KB with 4 files. Updated the counting rule to cite the real 4 MB limit and typical ~40 KB/file size so the agent reasons about the actual constraint.

gh aw compile news-translate passes with 0 errors / 0 warnings. Lock file is unchanged — these are all prompt-body edits imported at runtime via {{#runtime-import .github/workflows/news-translate.md}}.

Copilot AI requested a review from pethers April 19, 2026 08:39
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment on lines +771 to 774
"title": "🌐 Article Translations - {date} batch {n} ({count} files)",
"body": "## Summary\n\nTranslated {article_type} articles into {count} languages (batch {n} of up to 3).\n\n### Translations\n- Source: EN\n- Languages (this batch): {lang_list}\n- Files: {count}\n- Method: AI translation (create tool)\n\n### Quality\n- Section headings: ✅ Translated\n- Body paragraphs: ✅ Translated\n- English leakage: ✅ None\n- HTMLHint: {htmlhint_status}\n\n### Source\n- Workflow: `news-translate`\n- Follow-up: batches {n+1}+ will ship as separate PRs",
"labels": ["agentic-news", "translation"] // add "needs-review" if HTMLHINT_FAILED=true
})
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example safeoutputs___create_pull_request payload is shown as JSON, but it contains a // ... inline comment on the labels line. If the agent copies this as-is, the tool call will fail to parse. Move the note outside the JSON block or express it in a JSON-compatible way (e.g., in surrounding prose).

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/news-translate.md Outdated
"title": "🌐 Article Translations - {date} ({count} files)",
"body": "## Summary\n\nTranslated {article_type} articles into {count} languages.\n\n### Translations\n- Source: EN\n- Languages: {lang_list}\n- Files: {count}\n- Method: AI translation (create tool)\n\n### Quality\n- Section headings: ✅ Translated\n- Body paragraphs: ✅ Translated\n- English leakage: ✅ None\n- HTMLHint: {htmlhint_status}\n\n### Source\n- Workflow: `news-translate`",
"title": "🌐 Article Translations - {date} batch {n} ({count} files)",
"body": "## Summary\n\nTranslated {article_type} articles into {count} languages (batch {n} of up to 3).\n\n### Translations\n- Source: EN\n- Languages (this batch): {lang_list}\n- Files: {count}\n- Method: AI translation (create tool)\n\n### Quality\n- Section headings: ✅ Translated\n- Body paragraphs: ✅ Translated\n- English leakage: ✅ None\n- HTMLHint: {htmlhint_status}\n\n### Source\n- Workflow: `news-translate`\n- Follow-up: batches {n+1}+ will ship as separate PRs",
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example PR body text includes batches {n+1}+ which looks like an expression placeholder but won’t be automatically expanded. This is likely to end up literally in the PR description. Consider rewording to avoid arithmetic placeholders (e.g., "subsequent batches will ship as separate PRs") or use a plain placeholder name that can be manually substituted.

Suggested change
"body": "## Summary\n\nTranslated {article_type} articles into {count} languages (batch {n} of up to 3).\n\n### Translations\n- Source: EN\n- Languages (this batch): {lang_list}\n- Files: {count}\n- Method: AI translation (create tool)\n\n### Quality\n- Section headings: ✅ Translated\n- Body paragraphs: ✅ Translated\n- English leakage: ✅ None\n- HTMLHint: {htmlhint_status}\n\n### Source\n- Workflow: `news-translate`\n- Follow-up: batches {n+1}+ will ship as separate PRs",
"body": "## Summary\n\nTranslated {article_type} articles into {count} languages (batch {n} of up to 3).\n\n### Translations\n- Source: EN\n- Languages (this batch): {lang_list}\n- Files: {count}\n- Method: AI translation (create tool)\n\n### Quality\n- Section headings: ✅ Translated\n- Body paragraphs: ✅ Translated\n- English leakage: ✅ None\n- HTMLHint: {htmlhint_status}\n\n### Source\n- Workflow: `news-translate`\n- Follow-up: subsequent batches will ship as separate PRs",

Copilot uses AI. Check for mistakes.
@pethers
Copy link
Copy Markdown
Member

pethers commented Apr 19, 2026

@copilot apply changes based on the comments in this thread

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 19, 2026

@copilot apply changes based on the comments in this thread

Applied both review fixes in 81e59a5:

  • Line 773 — removed the // add "needs-review" if HTMLHINT_FAILED=true inline comment from the JSON payload (strict JSON has no comments). Moved the conditional-label instruction into the surrounding prose above the block.
  • Line 772 — replaced the batches {n+1}+ will ship as separate PRs arithmetic placeholder with plain prose (subsequent batches will ship as separate PRs); the substitution engine does not evaluate {n+1}, so it would have appeared literally in the PR description.
  • Added a note above the JSON block clarifying that every {placeholder} must be substituted before sending, and that comments / trailing commas / arithmetic are not allowed in the tool call.

gh aw compile news-translate → 0 errors / 0 warnings. Lock file unchanged (runtime import).

@github-actions
Copy link
Copy Markdown
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread .github/workflows/news-translate.md Outdated
Comment on lines +411 to +413
## ⏱️ Time Budget (55 minutes of work, hard stop at 55)

The budget is now organised around **3 rolling PR batches**, not one monolithic PR. This is the direct fix for PR #1835 where 3 completed translations (`fr`, `es`, `nl`) were lost because they were committed *after* the single `safeoutputs___create_pull_request` call had frozen the patch.
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hard-codes the workflow narrative to 3 rolling batches, but the workflow is configured for create-pull-request.max: 5 and elsewhere instructs agents they may open up to 5 PRs per run. This mismatch can cause agents to stop at 3 even when time remains (or when more than 3 PRs are needed for multiple articles/types). Consider changing wording to "typically 2–3 batches" while explicitly allowing up to 5, and align the rest of the document (including templates) with max: 5.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/news-translate.md Outdated
| HTMLHint errors | Fix with `edit` tool or run `npx tsx scripts/article-quality-enhancer.ts --fix` |
| safeoutputs "session not found" | Session expired — ALL work is LOST. **Prevention: call `safeoutputs___create_pull_request` by minute 25.** |
| safeoutputs "session not found" | Session expired — all uncreated PR intents are LOST. **Prevention: call `safeoutputs___create_pull_request` for the first batch by minute 22.** Later batches must also be called promptly (never more than 15 minutes between successive calls). |
| Committed files on a branch that already has a PR | Those files are lost — the patch was frozen at the first `safeoutputs___create_pull_request` call. Switch to `main`, re-create the translations fresh, and call `safeoutputs___create_pull_request` again for a new PR. |
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saying "Those files are lost" is a bit absolute and may prompt unnecessary re-translation. More precisely: those commits won't be added to the already-created PR, but they may still be recoverable within the current workspace by creating a new snapshot/PR containing only the additional files (e.g., by moving back to main and re-applying the changes via cherry-pick/recommit so the new PR doesn’t include files already shipped in prior batches). Recommend rewording to: "Not included in the existing PR; to ship them create a new batch PR from main containing only the new translations" and optionally mention a recovery approach that avoids redoing translation.

Suggested change
| Committed files on a branch that already has a PR | Those files are lost — the patch was frozen at the first `safeoutputs___create_pull_request` call. Switch to `main`, re-create the translations fresh, and call `safeoutputs___create_pull_request` again for a new PR. |
| Committed files on a branch that already has a PR | Not included in the existing PR — the patch was frozen at the first `safeoutputs___create_pull_request` call. Switch to `main` and create a new batch PR containing only the new translations; if the changes still exist locally, re-apply/cherry-pick or recommit them so you do not need to re-translate from scratch. |

Copilot uses AI. Check for mistakes.
@pethers
Copy link
Copy Markdown
Member

pethers commented Apr 19, 2026

@copilot apply changes based on the comments in this thread

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 19, 2026

@copilot apply changes based on the comments in this thread

Applied both review fixes in 347ca3c:

  • Line 413 — reworded the time-budget intro from "3 rolling PR batches" to "typically 2–3 rolling PR batches (up to 5 allowed by create-pull-request.max: 5)" and added a note that additional batch PRs can be opened up to the 5-per-run cap when workload warrants it. This aligns the narrative with the actual safe-outputs limit and prevents agents from artificially stopping at 3.
  • Line 852 — replaced the too-absolute "Those files are lost — … re-create the translations fresh" wording with "Not included in the existing PR … switch to main and create a new batch PR; if the files still exist in the workspace (or on the previous branch), re-apply/cherry-pick or recommit them onto main so you don't need to re-translate from scratch." This avoids triggering unnecessary re-translation when the files are still recoverable.

gh aw compile news-translate → 0 errors / 0 warnings. Lock file unchanged (runtime import).

@github-actions
Copy link
Copy Markdown
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

@pethers pethers merged commit a90b097 into main Apr 19, 2026
12 checks passed
@pethers pethers deleted the copilot/improve-article-translations branch April 19, 2026 09:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agentic-workflow Agentic workflow changes ci-cd CI/CD pipeline changes documentation Documentation updates news News articles and content generation size-m Medium change (50-250 lines) workflow GitHub Actions workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants