|
1 | 1 | --- |
2 | 2 | description: | |
3 | 3 | Walks instrumentation modules one-at-a-time, processing exactly one |
4 | | - module per run. Each successful run's commit is appended to a |
5 | | - per-chain `module-cleanup-wip-<chain_id>` branch. When the wip branch |
6 | | - reaches FILE_THRESHOLD modified files (or when the unprocessed-module |
7 | | - queue empties), the finalize job promotes wip to a `module-cleanup-batch-<run_id>` |
8 | | - branch and opens a PR against main. Otherwise the workflow self-dispatches |
9 | | - to process the next module, threading the same wip branch. |
10 | | -
|
11 | | - Each chain (cron tick or manual workflow_dispatch) gets its own wip |
12 | | - branch named after the first run's id. If a chain dies mid-flight (e.g. |
13 | | - PR creation fails), its wip is simply abandoned — the next cron tick |
14 | | - starts a fresh chain on a fresh wip. Old wip and batch branches can be |
15 | | - garbage-collected manually. |
| 4 | + module per run. Each successful run's commit is appended to a per-chain |
| 5 | + `module-cleanup-wip-<run_id>` branch and the workflow self-dispatches |
| 6 | + one module at a time. When the wip branch reaches FILE_THRESHOLD |
| 7 | + modified files (or when the unprocessed-module queue empties), the |
| 8 | + finalize job promotes wip to a `module-cleanup-batch-<run_id>` branch, |
| 9 | + opens a PR against main, and self-dispatches a fresh chain (new wip). |
| 10 | + The chain terminates naturally once MAX_OPEN_PRS is reached: the |
| 11 | + matrix script returns has_work=false and no further runs are dispatched. |
| 12 | + Cron (every 1h) picks back up after PRs merge: when a PR merges and |
| 13 | + the open-PR count drops below MAX_OPEN_PRS, the next hourly tick |
| 14 | + starts a fresh chain. A cron tick that fires while a chain is alive |
| 15 | + is gated to a no-op so chains never fork in parallel. |
16 | 16 |
|
17 | 17 | State: |
18 | 18 | - `memory/module-cleanup` branch holds `processed.txt` (modules already |
|
31 | 31 | required: false |
32 | 32 | type: string |
33 | 33 | schedule: |
34 | | - # Walk-driver: each cron tick starts (or resumes) the chain. Inside a |
35 | | - # tick, the workflow self-dispatches one module at a time until either |
36 | | - # a PR is opened, the open-PR cap is hit, or the queue empties. |
37 | | - - cron: "every 6h" |
| 34 | + # Cron is the primary chain driver. Each tick attempts to start a new |
| 35 | + # chain, but the dispatch job exits if another module-cleanup run is |
| 36 | + # already queued or in progress, so we never fork parallel chains. |
| 37 | + # Inside a chain, the workflow self-dispatches one module at a time |
| 38 | + # until a PR is opened, the open-PR cap is hit, or the queue empties. |
| 39 | + - cron: "every 1h" |
38 | 40 |
|
39 | 41 | permissions: read-all |
40 | 42 |
|
@@ -109,8 +111,42 @@ jobs: |
109 | 111 | env: |
110 | 112 | GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} |
111 | 113 | MEMORY_BRANCH: memory/module-cleanup |
| 114 | + EVENT_NAME: ${{ github.event_name }} |
| 115 | + WORKFLOW_FILE: module-cleanup.lock.yml |
112 | 116 | run: | |
113 | 117 | set -euo pipefail |
| 118 | + # Limit concurrency to one chain alive at a time. Cron fires |
| 119 | + # hourly so that PRs merging mid-day promptly reopen the |
| 120 | + # MAX_OPEN_PRS slot, but we don't want an hourly cron tick to |
| 121 | + # fork a parallel chain on top of an already-running one. If |
| 122 | + # any other module-cleanup run is queued or in progress, the |
| 123 | + # cron tick exits cleanly. Self-dispatched runs (which carry |
| 124 | + # an inherited wip_branch) skip this gate so an in-flight |
| 125 | + # chain always makes forward progress. |
| 126 | + # |
| 127 | + # GitHub's run-state is the source of truth here, so there's |
| 128 | + # no stale-lock problem: a dead/cancelled run is reported as |
| 129 | + # completed and the next cron tick passes the gate. |
| 130 | + if [ "$EVENT_NAME" = "schedule" ]; then |
| 131 | + others=$(gh run list --repo "$GITHUB_REPOSITORY" \ |
| 132 | + --workflow "$WORKFLOW_FILE" \ |
| 133 | + --status queued --status in_progress \ |
| 134 | + --limit 50 --json databaseId \ |
| 135 | + | jq --arg me "$GITHUB_RUN_ID" \ |
| 136 | + '[.[] | select((.databaseId|tostring) != $me)] | length') |
| 137 | + echo "other queued/in_progress runs: $others" |
| 138 | + if [ "$others" -gt 0 ]; then |
| 139 | + echo "Chain already alive; cron exiting to avoid fork." |
| 140 | + { |
| 141 | + echo "has_work=false" |
| 142 | + echo "short_name=" |
| 143 | + echo "module_dir=" |
| 144 | + echo "queue_remaining=0" |
| 145 | + } >> "$GITHUB_OUTPUT" |
| 146 | + exit 0 |
| 147 | + fi |
| 148 | + fi |
| 149 | +
|
114 | 150 | # processed.txt lives at the root of the memory branch. |
115 | 151 | processed="" |
116 | 152 | if git fetch origin "$MEMORY_BRANCH" --depth=1 2>/dev/null; then |
|
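Review note: the gate added in this hunk reduces to two decisions, which the sketch below isolates so they can be exercised without `gh` or `jq`. Both helper names (`count_other_runs`, `gate_decision`) are hypothetical and do not appear in the workflow. The sketch assumes run IDs arrive one per line on stdin instead of as the JSON that `gh run list` emits. It also follows the `if` condition in the diff, which checks only the event name, so any run not triggered by `schedule` passes the gate.

```shell
set -euo pipefail

# Hypothetical helper: count the run IDs on stdin that differ from our own.
# Stands in for the jq filter `select((.databaseId|tostring) != $me)`.
count_other_runs() {
  local me="$1" id count=0
  # `|| [ -n "$id" ]` keeps a final line that has no trailing newline.
  while IFS= read -r id || [ -n "$id" ]; do
    [ -n "$id" ] || continue
    if [ "$id" != "$me" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Hypothetical helper: decide whether this run may pass the gate.
# Only schedule-triggered runs are gated; every other event proceeds.
gate_decision() {
  local event_name="$1" others="$2"
  if [ "$event_name" = "schedule" ] && [ "$others" -gt 0 ]; then
    echo "skip"
  else
    echo "proceed"
  fi
}
```

For example, `gate_decision schedule "$(printf '101\n102\n' | count_other_runs 101)"` prints `skip`, because run 102 is still alive, while the same call with `workflow_dispatch` as the event prints `proceed`.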