You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
No step may be skipped, reordered, or executed in parallel with its successor.
47
47
48
+
## Phase checkpoint — persist every phase to repo memory
49
+
50
+
Valuable analysis must never be lost. After each pipeline phase completes, snapshot its output to the gh-aw repo-memory mount at `$GH_AW_MEMORY_DIR` (runtime default `/tmp/gh-aw/repo-memory/default`). gh-aw pushes that directory to the `memory/news-generation` branch in a **separate post-job** — so checkpoints survive even if the content PR job fails, crashes, or times out.
51
+
52
+
### Mandatory checkpoint points
53
+
54
+
| After phase | Phase label | Source(s) |
55
+
|-------------|-------------|-----------|
56
+
| 03 Data download |`phase-03-download`|`$ANALYSIS_DIR` (manifest + fetched data summaries) |
| 07 Immediately before `create_pull_request`|`phase-07-final`|`$ANALYSIS_DIR` + articles from `news/${ARTICLE_DATE}-*.html`|
62
+
|`news-translate` per batch |`phase-translate-<lang>`| Translated `news/${ARTICLE_DATE}-*.html`|
63
+
64
+
Each checkpoint is mandatory. Skipping them forfeits the only cross-run safety net for analysis work.
65
+
66
+
### Reusable snippet
67
+
68
+
Run this bash block at the end of every phase (pass the phase label as `$1`). Article HTML is written directly under the flat `news/` directory, so checkpoint copies must use `news/${ARTICLE_DATE}-*.html` rather than `news/$YYYY/$MM/$DD/*.html`:
|**Never block on checkpoint failure** — always `exit 0`. | Repo-memory is a safety net, not a gate. |
99
+
| Do **not** copy `$ANALYSIS_DIR/documents/` or `$ANALYSIS_DIR/pass1/`. |`documents/` exceeds the 50-file push cap; `pass1/` is local gate evidence only. |
100
+
| Do **not** stage or commit anything under `$GH_AW_MEMORY_DIR`. | gh-aw's `push_repo_memory` post-job publishes it; see `07-commit-and-pr.md`. |
101
+
| Prefer small summary `.md` / `.json` files (≤ 50 KB each, ≤ 50 per push). | gh-aw silently drops files exceeding the push caps. |
102
+
| Re-run the snippet at every phase, even if earlier phases already snapshotted — it overwrites with the latest content. | Ensures the final state is always preserved, and earlier snapshots remain on the branch from prior runs. |
103
+
| For `news-translate`, use `SUBFOLDER=batch/<lang-or-batch-id>` so memory paths don't collide with analysis runs. | Keeps the branch organised by article type. |
104
+
48
105
## Output contract
49
106
50
107
- Commit real files on disk under `analysis/daily/` and/or `news/`.
Copy file name to clipboardExpand all lines: .github/prompts/01-bash-and-shell-safety.md
+16-1Lines changed: 16 additions & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -32,10 +32,25 @@ bash({
32
32
33
33
Parameter expansion (`${VAR}`, `${VAR:-x}`, `${VAR##*/}`, …) and command substitution (`$(cmd)`) are **safe** under the agentic-workflow firewall — the firewall inspects outbound network egress, not shell syntax. Process substitution `<(…)` is best avoided because some runners disable `/dev/fd`.
34
34
35
+
## Banned expansion patterns (sandbox blocklist)
36
+
37
+
The execution sandbox **rejects** commands containing any of the following patterns before they run. Rewrite using the safe equivalent instead of trying to work around the block.
|`${var@P}` / `${var@Q}` / `${var@E}` / `${var@A}` / `${var@a}`| These parameter transformations can produce shell-reparsable fragments or syntax-bearing representations, which can smuggle attacker-controlled content into later parsing steps — a known prompt-injection vector. | Expand the target explicitly: `printf '%s' "$var"`, or a plain `"$var"` substitution. |
42
+
|`${!var}`| Indirect expansion uses `$var`'s *value* as another variable name — lets an attacker-controlled string pick which variable is read. | Use an associative array: `declare -A MAP; MAP[foo]=bar; echo "${MAP[$key]}"`. |
43
+
| Nested `$(…$(…)…)`| Builds a command string dynamically from inner results — the classic staged injection shape. | Split into two lines with a temporary variable: `inner=$(cmd2); outer=$(cmd1 "$inner")`. |
44
+
| Chained builder assignments that progressively construct a command substitution (`a=foo; b="$a"bar; c=$($b)`) | Same staged-injection shape, just spread across multiple statements. | Construct commands as arrays, invoke via `"${cmd[@]}"`; never re-parse a string as a command. |
45
+
|`eval` on variable contents (or eval-like constructs such as `bash -c "$var"`, `source /dev/stdin <<<"$var"`) | Direct arbitrary-code execution from data. | Never required for our workflows — refuse and rewrite using arrays, `case`, or explicit branches. |
46
+
47
+
These rules apply equally to inline bash in prompts AND to bash commands the agent composes at runtime. The sandbox rejects matching commands before they run.
48
+
35
49
## Secret safety
36
50
37
51
- Never pass secrets through `$(…)` into a log-visible command — echoing `curl -H "Authorization: $(…)"` will leak if the step is rerun in debug.
38
-
- Use env blocks (`env:` on the step) or `${{ secrets.FOO }}` directly; the runner masks secret values in output.
52
+
- Expose secrets through the step's `env:` block (for example `env: { FOO: <GitHub-Actions secrets expression for FOO> }`) rather than inlining a raw secrets expression inside the prompt; the runner masks secret values in output.
53
+
- Note: this prompt file is loaded via `runtime-import`, and the gh-aw validator rejects any GitHub Actions template expression (the double-curly-brace syntax used for `secrets`, `env`, `inputs`, etc. in workflow YAML) that is not on the safe allow-list — so never embed such an expression in prompt modules, even inside code spans. Keep secret references in the workflow YAML only.
Copy file name to clipboardExpand all lines: .github/prompts/03-data-download.md
+2Lines changed: 2 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -69,3 +69,5 @@ Always produce `analysis/daily/$ARTICLE_DATE/$SUBFOLDER/data-download-manifest.m
69
69
## Next step
70
70
71
71
On success, proceed to `04-analysis-pipeline.md`. Never start analysis while `data-download-manifest.md` is missing or empty.
72
+
73
+
After the manifest is written, run the **phase checkpoint** from `00-base-contract.md` with label `phase-03-download` so the download provenance is persisted to repo memory before any analysis begins.
Copy file name to clipboardExpand all lines: .github/prompts/04-analysis-pipeline.md
+3-1Lines changed: 3 additions & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -66,7 +66,9 @@ For each article with charts, produce accompanying JSON under `analysis/daily/$A
66
66
67
67
## Next step
68
68
69
-
On completion, proceed to `05-analysis-gate.md`. Do not start article generation until the gate passes.
69
+
Proceed to `05-analysis-gate.md`. Do not start article generation until the gate passes.
70
+
71
+
After completing Pass 1 (before Pass 2), run the **phase checkpoint** from `00-base-contract.md` with label `phase-04-pass1`. After completing Pass 2 (before the gate), run it again with label `phase-04-pass2`. This guarantees both iterations survive even if the gate, article, or commit phase later fails.
Copy file name to clipboardExpand all lines: .github/prompts/05-analysis-gate.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -153,7 +153,7 @@ Exit code 0 = pass, non-zero = fail with per-check report. Precondition for chec
153
153
154
154
## Outcome
155
155
156
-
-**Pass** → proceed to `06-article-generation.md`.
156
+
-**Pass** → run the **phase checkpoint** from `00-base-contract.md` with label `phase-05-gate`, then proceed to `06-article-generation.md`.
157
157
-**Fail** → fix flagged files (never delete them), re-run the gate, then proceed.
158
158
-**Unrecoverable fail after fixes** → stage whatever analysis exists, commit with label `analysis-only`, call `safeoutputs___create_pull_request` once (see `07-commit-and-pr.md`). Do **not** generate articles.
Copy file name to clipboardExpand all lines: .github/prompts/06-article-generation.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -84,4 +84,4 @@ Each article ≥ 1000 words, minimum 3 of 5 mandatory analytical sections presen
84
84
85
85
## Next step
86
86
87
-
Stage all analysis + article + visualisation files, then call `07-commit-and-pr.md`.
87
+
Run the **phase checkpoint** from `00-base-contract.md` with label `phase-06-article` to persist the generated articles to repo memory. Then stage all analysis + article + visualisation files, and call `07-commit-and-pr.md`.
In every other case, commit whatever exists and call `create_pull_request` once.
91
91
92
+
## Final checkpoint — before the PR call
93
+
94
+
Immediately before calling `safeoutputs___create_pull_request`, run the **phase checkpoint** from `00-base-contract.md` with label `phase-07-final`. This snapshots the final authoritative analysis + article state to repo memory, so even if the PR call, the safe-outputs runner, or the post-job push fails, the last good state survives on the `memory/news-generation` branch.
95
+
96
+
For `news-translate`, run the checkpoint with label `phase-translate-<lang>` after each per-language batch succeeds (before the final PR call), so individual language translations are preserved even if later languages fail.
97
+
92
98
## Deadline enforcement
93
99
94
100
If the run exceeds 40 minutes with no safe-output call yet:
0 commit comments