Commit 3395506

up
1 parent 434666a commit 3395506

3 files changed

Lines changed: 132 additions & 51 deletions

.github/scripts/module-cleanup/finalize.sh

Lines changed: 17 additions & 6 deletions
@@ -14,9 +14,11 @@
 # the queue is empty, cut a batch branch from wip and open the PR.
 # Wip is NOT reset: each chain has its own wip branch, so the next
 # chain starts fresh on a different wip.
-# 4. Self-dispatch the workflow unless we just opened a PR or the
-# queue is empty (cron will pick up later). Threads WIP_BRANCH so
-# the next run in the chain reuses the same wip.
+# 4. Self-dispatch the workflow unless the queue is empty. After a
+# PR is opened, dispatch WITHOUT inherited WIP_BRANCH so the next
+# run starts a fresh chain on a fresh wip. The chain stops on its
+# own once MAX_OPEN_PRS is reached (matrix script returns
+# has_work=false and finalize is skipped).
 #
 # No rebase-retry loops on push: the workflow uses
 # concurrency.group=module-cleanup with cancel-in-progress=false, so this
@@ -199,10 +201,19 @@ fi

 # ---- 4. Self-dispatch ----

-if [ "$OPENED_PR" = "true" ]; then
-  echo "Opened a PR; cron will resume the chain on its next tick."
-elif [ "$QUEUE_REMAINING" -le 0 ]; then
+# The chain auto-continues past PR open: after flushing, we self-dispatch
+# WITHOUT inheriting WIP_BRANCH so the next run starts on a fresh
+# per-chain wip (we don't want subsequent commits piling onto a wip that
+# already underlies an open PR). The chain naturally terminates when
+# build-cleanup-matrix.py sees MAX_OPEN_PRS reached and returns
+# has_work=false, at which point neither agent nor finalize runs and no
+# self-dispatch fires. Cron picks back up later.
+
+if [ "$QUEUE_REMAINING" -le 0 ] && [ "$OPENED_PR" != "true" ]; then
   echo "Queue empty; nothing to dispatch."
+elif [ "$OPENED_PR" = "true" ]; then
+  echo "Opened a PR; self-dispatching to start a fresh chain (new wip)."
+  gh workflow run "$WORKFLOW_FILE" --repo "$REPO" --ref main
 else
   echo "Self-dispatching workflow for next module on $WIP_BRANCH."
   gh workflow run "$WORKFLOW_FILE" --repo "$REPO" --ref main \
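
The three-way branch in this hunk can be read as a small decision table over `QUEUE_REMAINING` and `OPENED_PR`. The sketch below restates it as a pure function so each case is easy to check. `dispatch_action` and its return values (`none`, `fresh`, `continue`) are hypothetical names introduced for illustration, not part of finalize.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in finalize.sh): maps the two inputs used by
# step 4 to the action the script would take.
# Prints one of: none | fresh | continue
dispatch_action() {
  local queue_remaining="$1" opened_pr="$2"
  if [ "$queue_remaining" -le 0 ] && [ "$opened_pr" != "true" ]; then
    echo "none"       # queue empty and no PR opened: nothing to dispatch
  elif [ "$opened_pr" = "true" ]; then
    echo "fresh"      # PR opened: dispatch without WIP_BRANCH (new chain)
  else
    echo "continue"   # mid-chain: dispatch and thread the same WIP_BRANCH
  fi
}

dispatch_action 0 false
dispatch_action 0 true
dispatch_action 5 true
dispatch_action 5 false
```

One consequence worth noting: when a PR was just opened, the script dispatches a fresh run even if the queue is empty. As far as this diff shows, that extra run is expected to stop at the matrix step with has_work=false.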

.github/workflows/module-cleanup.lock.yml

Lines changed: 63 additions & 29 deletions
Some generated files are not rendered by default.

.github/workflows/module-cleanup.md

Lines changed: 52 additions & 16 deletions
@@ -1,18 +1,18 @@
 ---
 description: |
   Walks instrumentation modules one-at-a-time, processing exactly one
-  module per run. Each successful run's commit is appended to a
-  per-chain `module-cleanup-wip-<chain_id>` branch. When the wip branch
-  reaches FILE_THRESHOLD modified files (or when the unprocessed-module
-  queue empties), the finalize job promotes wip to a `module-cleanup-batch-<run_id>`
-  branch and opens a PR against main. Otherwise the workflow self-dispatches
-  to process the next module, threading the same wip branch.
-
-  Each chain (cron tick or manual workflow_dispatch) gets its own wip
-  branch named after the first run's id. If a chain dies mid-flight (e.g.
-  PR creation fails), its wip is simply abandoned; the next cron tick
-  starts a fresh chain on a fresh wip. Old wip and batch branches can be
-  garbage-collected manually.
+  module per run. Each successful run's commit is appended to a per-chain
+  `module-cleanup-wip-<run_id>` branch and the workflow self-dispatches
+  one module at a time. When the wip branch reaches FILE_THRESHOLD
+  modified files (or when the unprocessed-module queue empties), the
+  finalize job promotes wip to a `module-cleanup-batch-<run_id>` branch,
+  opens a PR against main, and self-dispatches a fresh chain (new wip).
+  The chain terminates naturally once MAX_OPEN_PRS is reached: the
+  matrix script returns has_work=false and no further runs are dispatched.
+  Cron (every 1h) picks back up after PRs merge: when a PR merges and
+  the open-PR count drops below MAX_OPEN_PRS, the next hourly tick
+  starts a fresh chain. A cron tick that fires while a chain is alive
+  is gated to a no-op so chains never fork in parallel.

   State:
   - `memory/module-cleanup` branch holds `processed.txt` (modules already
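
The description relies on the matrix script returning has_work=false once MAX_OPEN_PRS is reached. That check lives in build-cleanup-matrix.py, which this diff does not show, so the following is only a guess at its shape, written in shell for consistency with finalize.sh. The function name `has_work` and its three arguments are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the termination condition described above.
# The real logic is in build-cleanup-matrix.py and may differ.
# Args: <open_prs> <max_open_prs> <queue_remaining>
has_work() {
  local open_prs="$1" max_open_prs="$2" queue_remaining="$3"
  if [ "$open_prs" -ge "$max_open_prs" ] || [ "$queue_remaining" -le 0 ]; then
    echo "false"   # cap reached or nothing left: the chain stops here
  else
    echo "true"    # a slot is free and modules remain: process one more
  fi
}

has_work 3 3 10
has_work 2 3 10
has_work 0 3 0
```

Under this reading, merging a PR lowers the open-PR count, so the next hourly cron tick sees has_work=true again and starts a new chain.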
@@ -31,10 +31,12 @@ on:
         required: false
         type: string
   schedule:
-    # Walk-driver: each cron tick starts (or resumes) the chain. Inside a
-    # tick, the workflow self-dispatches one module at a time until either
-    # a PR is opened, the open-PR cap is hit, or the queue empties.
-    - cron: "every 6h"
+    # Cron is the primary chain driver. Each tick attempts to start a new
+    # chain, but the dispatch job exits if another module-cleanup run is
+    # already queued or in progress, so we never fork parallel chains.
+    # Inside a chain, the workflow self-dispatches one module at a time
+    # until a PR is opened, the open-PR cap is hit, or the queue empties.
+    - cron: "every 1h"

 permissions: read-all

@@ -109,8 +111,42 @@ jobs:
         env:
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           MEMORY_BRANCH: memory/module-cleanup
+          EVENT_NAME: ${{ github.event_name }}
+          WORKFLOW_FILE: module-cleanup.lock.yml
         run: |
           set -euo pipefail
+          # Limit concurrency to one chain alive at a time. Cron fires
+          # hourly so that PRs merging mid-day promptly reopen the
+          # MAX_OPEN_PRS slot, but we don't want an hourly cron tick to
+          # fork a parallel chain on top of an already-running one. If
+          # any other module-cleanup run is queued or in progress, the
+          # cron tick exits cleanly. Self-dispatched runs (which carry
+          # an inherited wip_branch) skip this gate so an in-flight
+          # chain always makes forward progress.
+          #
+          # GitHub's run-state is the source of truth here, so there's
+          # no stale-lock problem: a dead/cancelled run is reported as
+          # completed and the next cron tick passes the gate.
+          if [ "$EVENT_NAME" = "schedule" ]; then
+            others=$(gh run list --repo "$GITHUB_REPOSITORY" \
+              --workflow "$WORKFLOW_FILE" \
+              --limit 50 --json databaseId,status \
+              | jq --arg me "$GITHUB_RUN_ID" \
+              '[.[] | select(.status == "queued" or .status == "in_progress")
+                    | select((.databaseId|tostring) != $me)] | length')
+            echo "other queued/in_progress runs: $others"
+            if [ "$others" -gt 0 ]; then
+              echo "Chain already alive; cron exiting to avoid fork."
+              {
+                echo "has_work=false"
+                echo "short_name="
+                echo "module_dir="
+                echo "queue_remaining=0"
+              } >> "$GITHUB_OUTPUT"
+              exit 0
+            fi
+          fi
+
           # processed.txt lives at the root of the memory branch.
           processed=""
           if git fetch origin "$MEMORY_BRANCH" --depth=1 2>/dev/null; then
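
The gate added in this hunk has two parts: count the live runs other than the current one, then skip only when the trigger is `schedule`. The sketch below reproduces that logic offline, with no `gh` or `jq`, by feeding `<run_id> <status>` pairs through awk. `count_other_live_runs` and `should_cron_proceed` are hypothetical names, and the input format is an assumption made so the example runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical offline model of the cron gate; not the workflow's code.

# Count runs other than <me> whose status is queued or in_progress.
# Input on stdin: one "<run_id> <status>" pair per line.
count_other_live_runs() {
  local me="$1"
  awk -v me="$me" \
    '$1 != me && ($2 == "queued" || $2 == "in_progress") { n++ }
     END { print n + 0 }'
}

# Only scheduled runs are gated; dispatched runs always proceed.
should_cron_proceed() {
  local event="$1" others="$2"
  if [ "$event" = "schedule" ] && [ "$others" -gt 0 ]; then
    echo "skip"
  else
    echo "proceed"
  fi
}

runs="101 in_progress
102 queued
103 completed"
others=$(printf '%s\n' "$runs" | count_other_live_runs 101)
echo "others=$others"
should_cron_proceed schedule "$others"
should_cron_proceed workflow_dispatch "$others"
```

Because the gate keys on the event name rather than on the wip_branch input, the fresh-chain dispatch that finalize.sh fires after opening a PR also passes it, even though that dispatch carries no inherited wip branch.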
