Commit efb8328
* docs(updater): PR 2 (Tier 2 manual-click) implementation plan
20-task TDD plan for shipping the manual-click update flow on top of the
Tier 1 (notify) work merged in #7601. Covers UpdateExecutor, RollbackHandler,
SessionDrainer, lock + trustedKeys, four admin endpoints (apply / cancel /
acknowledge / log), admin UI updates, integration tests against a tmp git
repo, and a manual smoke runbook for the spec's "before each tier ships"
gate. Plan deliberately scopes signature verification to an opt-in stub
(updates.requireSignature: false default) to avoid blocking on a separate
release-signing project.
Plan: docs/superpowers/plans/2026-05-08-auto-update-pr2-manual-click.md
Spec: docs/superpowers/specs/2026-04-25-auto-update-design.md
Issue: #7607
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): extend state + settings for Tier 2 manual-click
Adds ExecutionStatus discriminated union, bootCount, and lastResult to
UpdateState, plus the preApplyGraceMinutes/drainSeconds/diskSpaceMinMB/
requireSignature/trustedKeysPath knobs that Tier 2's executor needs.
loadState backfills the new fields on Tier 1 state files so existing
installs keep working.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): PID-based update.lock with stale-pid reaping
Single-flight guard for Tier 2's UpdateExecutor. Atomic O_CREAT|O_EXCL
acquire; on EEXIST, sends signal 0 to the recorded PID and reaps if dead.
Unparseable / partially-written lock files are treated as stale rather
than fatal so a half-written lock from a SIGKILL'd parent doesn't lock
the install out forever.
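
The acquire-and-reap behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `acquireLock` and `isAlive` are hypothetical names, and the lock file is assumed to contain only the holder's PID.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: probe a PID without delivering a signal.
const isAlive = (pid: number): boolean => {
  try {
    process.kill(pid, 0); // signal 0 only checks that the process exists
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === 'EPERM';
  }
};

// Hypothetical acquire: returns true when this process now holds the lock.
const acquireLock = (lockPath: string): boolean => {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      // 'wx' gives O_CREAT | O_EXCL semantics: it fails with EEXIST if the file exists.
      const fd = fs.openSync(lockPath, 'wx');
      fs.writeSync(fd, String(process.pid));
      fs.closeSync(fd);
      return true;
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== 'EEXIST') throw err;
    }
    let raw = '';
    try {
      raw = fs.readFileSync(lockPath, 'utf8').trim();
    } catch {
      continue; // the holder released the lock in between; retry the create
    }
    const pid = /^\d+$/.test(raw) ? Number(raw) : NaN;
    if (Number.isInteger(pid) && pid > 0 && isAlive(pid)) return false; // genuinely held
    fs.rmSync(lockPath, {force: true}); // dead PID or unparseable content: reap, then retry
  }
  return false;
};

// Usage: a second acquire is refused while the first holder (this process) is alive.
const demoLock = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'updater-')), 'update.lock');
console.log(acquireLock(demoLock), acquireLock(demoLock)); // true false
```

Treating unparseable content the same as a dead PID is what keeps a half-written lock from blocking the install permanently.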
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): verifyReleaseTag — gpg-via-git stub for Tier 2 preflight
Default updates.requireSignature=false: log a warning and return ok with
reason=signature-not-required. Set true to make preflight refuse a tag
whose signature does not verify under the system keyring (or
trustedKeysPath via GNUPGHOME). Etherpad's release process does not yet
sign tags consistently; turning the check on by default would break
Tier 2 for every admin and forcing a release-signing change is out of
scope for this PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): preflight check pipeline for Tier 2
Pure orchestrator over injected probes for install-method, working tree,
disk space, pnpm presence, lock state, remote tag existence and
signature verification. Cheap-and-definitive checks run first; first
failure short-circuits with a typed reason that the route layer will
surface in the preflight-failed admin banner.
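
As a rough sketch of the short-circuiting orchestrator described above: the reason codes, the `Probe` shape, and `runPreflight` below are illustrative assumptions, and the probes are synchronous only to keep the example small (real probes that touch git, disk, or the network would presumably be async).

```typescript
// Hypothetical reason codes; the commit does not list the real ones verbatim.
type PreflightReason =
  | 'install-method-not-writable'
  | 'lock-held'
  | 'working-tree-dirty'
  | 'disk-space-low'
  | 'tag-missing';

type PreflightResult = {ok: true} | {ok: false; reason: PreflightReason};
type Probe = {reason: PreflightReason; passes: () => boolean};

// Runs probes in order and stops at the first failure, returning its typed reason.
const runPreflight = (probes: readonly Probe[]): PreflightResult => {
  for (const probe of probes) {
    if (!probe.passes()) return {ok: false, reason: probe.reason};
  }
  return {ok: true};
};

// Usage: the disk probe fails, so the remote-tag probe never runs.
const ran: PreflightReason[] = [];
const probe = (reason: PreflightReason, ok: boolean): Probe => ({
  reason,
  passes: () => {
    ran.push(reason);
    return ok;
  },
});
const preflightResult = runPreflight([
  probe('install-method-not-writable', true),
  probe('disk-space-low', false),
  probe('tag-missing', true),
]);
console.log(preflightResult, ran);
```

Because the probes are injected, a unit test can order them cheapest-first and assert that expensive checks are skipped after an early failure.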
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): rolling update.log helpers (appendLine + tailLines)
Direct file-append + size-based rotation rather than a log4js appender —
avoids re-configuring log4js on top of the user's existing logconfig.
appendLine creates parents, rotates at 10MB (configurable), keeps 5
backups by default. tailLines reads the last N lines for /admin/update/log.
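
A minimal sketch of size-based rotation plus tailing, under the defaults stated above (10 MB, 5 backups). The signatures are assumptions, the backup naming (`update.log.1` and so on) is hypothetical, and `backups` is assumed to be at least 1.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical signature: append one line, rotating first if the file is already too large.
const appendLine = (file: string, line: string, maxBytes = 10 * 1024 * 1024, backups = 5): void => {
  fs.mkdirSync(path.dirname(file), {recursive: true});
  const size = fs.existsSync(file) ? fs.statSync(file).size : 0;
  if (size >= maxBytes) {
    // Shift .4 -> .5, .3 -> .4, ... so the oldest backup is overwritten.
    for (let i = backups - 1; i >= 1; i--) {
      if (fs.existsSync(`${file}.${i}`)) fs.renameSync(`${file}.${i}`, `${file}.${i + 1}`);
    }
    fs.renameSync(file, `${file}.1`);
  }
  fs.appendFileSync(file, `${line}\n`);
};

// Hypothetical signature: return the last n lines, or an empty list if the file is missing.
const tailLines = (file: string, n: number): string[] => {
  if (!fs.existsSync(file)) return [];
  const lines = fs.readFileSync(file, 'utf8').split('\n');
  if (lines[lines.length - 1] === '') lines.pop(); // drop the entry after the final newline
  return n > 0 ? lines.slice(-n) : [];
};

// Usage with a tiny 4-byte limit so the third write triggers a rotation.
const logFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'updater-log-')), 'update.log');
for (const entry of ['a', 'b', 'c']) appendLine(logFile, entry, 4);
console.log(tailLines(logFile, 10), tailLines(`${logFile}.1`, 10)); // [ 'c' ] [ 'a', 'b' ]
```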
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): SessionDrainer + handshake guard
Drainer schedules T-60 / T-30 / T-10 broadcasts and resolves at T=0;
isAcceptingConnections() flips off for the duration. PadMessageHandler
consults the flag at the start of CLIENT_READY and disconnects new
joiners with reason "updateInProgress" — existing sockets are
unaffected. Drains shorter than 30s collapse the early timers to fire
ASAP rather than queue past the drain end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): UpdateExecutor — snapshot, fetch/checkout/install/build, exit 75
Pure-DI orchestrator: spawnFn, copyFile, readSha, saveState, exit are all
injected so unit tests run the full pipeline without spawning real
children or mutating the real install. Streams stdout/stderr to
update.log via the now-best-effort appendLine helper (swallows fs errors
so the executor itself never breaks on read-only / unwritable log dirs).
Failure paths transition to rolling-back and return — the route layer
hands off to RollbackHandler which owns the rollback exit, so we don't
double-exit and lose tail lines.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): RollbackHandler — health-check timer + crash-loop guard
checkPendingVerification arms a 60s timer at boot when state is
pending-verification and increments bootCount; bootCount>2 forces an
immediate rollback (crash-loop guard). markVerified persists the
verified state and stops the timer. performRollback restores the
backup lockfile, runs git checkout <fromSha> and pnpm install, and lands on
rolled-back, or on rollback-failed (terminal) if a sub-step fails. It exits 75
either way so the supervisor restart brings the new state up.
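
The boot-time decision described above can be expressed as a small pure function. This is a sketch under assumptions: `decideBootAction` and the `BootAction` shape are hypothetical, and the threshold is assumed to be compared after the increment.

```typescript
type BootAction =
  | {kind: 'none'}
  | {kind: 'arm-timer'; bootCount: number; timeoutMs: number}
  | {kind: 'rollback-now'; bootCount: number};

// Hypothetical decision helper: what to do at boot given the persisted execution status.
const decideBootAction = (status: string, previousBootCount: number): BootAction => {
  if (status !== 'pending-verification') return {kind: 'none'};
  const bootCount = previousBootCount + 1; // every boot while pending counts
  if (bootCount > 2) return {kind: 'rollback-now', bootCount}; // crash-loop guard
  return {kind: 'arm-timer', bootCount, timeoutMs: 60 * 1000}; // 60s health-check window
};

// Usage: the first two boots arm the timer, the third forces a rollback.
console.log([0, 1, 2].map((n) => decideBootAction('pending-verification', n).kind));
```

Keeping the decision separate from the timer and the rollback side effects makes the crash-loop threshold easy to unit test.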
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): wire RollbackHandler into boot + UpdatePolicy honours rollback-failed
- expressCreateServer now invokes checkPendingVerification before polling starts
so a previous boot's pending-verification either re-arms the health-check
timer or, when bootCount has climbed past the crash-loop threshold, forces
an immediate rollback.
- server.ts calls markBootHealthy after state hits RUNNING, so /health being up
is the implicit happy-path signal that cancels the rollback timer.
- /admin/update/status surfaces execution + lastResult + lockHeld so the admin
UI can render the right Apply / Cancel / Acknowledge state.
- UpdatePolicy gains an `executionStatus` input. While it equals 'rollback-failed',
canAuto / canAutonomous are denied (reason: rollback-failed-terminal); manual
stays on because clicking Apply IS the intervention the terminal state needs.
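
A sketch of how the `executionStatus` input might feed the policy result. The tier ranking (each tier implying the ones below it) and the `evaluatePolicy` name are assumptions made for illustration; only the rollback-failed behaviour and its reason string come from the description above.

```typescript
type Tier = 'off' | 'notify' | 'manual' | 'auto' | 'autonomous';
type PolicyResult = {canManual: boolean; canAuto: boolean; canAutonomous: boolean; reason?: string};

// Assumption: tiers are ordered and a higher tier includes the lower capabilities.
const RANK: Record<Tier, number> = {off: 0, notify: 1, manual: 2, auto: 3, autonomous: 4};

const evaluatePolicy = (tier: Tier, executionStatus: string): PolicyResult => {
  const base = {
    canManual: RANK[tier] >= RANK.manual,
    canAuto: RANK[tier] >= RANK.auto,
    canAutonomous: RANK[tier] >= RANK.autonomous,
  };
  if (executionStatus !== 'rollback-failed') return base;
  // Terminal state: unattended updates are denied, manual Apply stays available.
  return {...base, canAuto: false, canAutonomous: false, reason: 'rollback-failed-terminal'};
};

// Usage: an autonomous-tier install stuck in rollback-failed only allows manual Apply.
console.log(evaluatePolicy('autonomous', 'rollback-failed'));
```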
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): apply / cancel / acknowledge / log endpoints
Strict admin-only POSTs that drive Tier 2's manual-click flow:
- POST /admin/update/apply: acquire lock, persist preflight, run preflight,
drain $drainSeconds, executeUpdate (which exits 75 on success), or run
performRollback on a failure path (also exits 75).
- POST /admin/update/cancel: cancel a pre-execute drain/preflight, write
cancelled lastResult, release lock.
- POST /admin/update/acknowledge: clear terminal states (preflight-failed,
rolled-back, rollback-failed) back to idle. lastResult is preserved so
the admin still sees what happened.
- GET /admin/update/log: tail var/log/update.log (200 lines) for the in-
progress UI. Strict admin auth.
Also:
- socketio hook exports getIo() so the apply endpoint can broadcast the
drain shoutMessage outside the regular hook surface.
- ep.json registers updateActions after admin/updateStatus.
- 11 mocha integration tests cover auth, policy denial, execution-busy,
acknowledge-clears-terminal, log content-type.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): admin UI Apply/Cancel/Acknowledge + live log stream
UpdatePage renders the right action set based on execution.status:
Apply when idle/verified and policy allows, Cancel during
preflight/draining, Acknowledge on terminal preflight-failed /
rolled-back / rollback-failed. While the executor is in flight
(preflight/draining/executing/rolling-back) the page polls
/admin/update/log + /admin/update/status once a second and shows the
rolling tail; polling stops automatically when the run terminates.
lastResult and policy denial reasons surface localised copy. Buttons
disable themselves while a network round-trip is in flight to dodge
double-clicks. New i18n keys live under update.page.{apply,cancel,
acknowledge,log,execution,policy.*,last_result.*}, update.execution.*,
update.banner.terminal.rollback-failed, and update.drain.{t60,t30,t10}.
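
The button and polling rules above amount to a mapping from execution status to visible actions. The sketch below is illustrative only: `actionsFor` is a hypothetical helper, the status list is assembled from the statuses named in this commit message and may not match the real ExecutionStatus union, and hiding Apply while the lock is held is inferred from the status payload's lockHeld field.

```typescript
type ExecStatus =
  | 'idle' | 'verified' | 'pending-verification'
  | 'preflight' | 'draining' | 'executing' | 'rolling-back'
  | 'preflight-failed' | 'rolled-back' | 'rollback-failed';

type PageActions = {apply: boolean; cancel: boolean; acknowledge: boolean; pollLog: boolean};

const IN_FLIGHT: readonly ExecStatus[] = ['preflight', 'draining', 'executing', 'rolling-back'];
const TERMINAL: readonly ExecStatus[] = ['preflight-failed', 'rolled-back', 'rollback-failed'];

// Hypothetical helper: which controls the update page should show for a given state.
const actionsFor = (status: ExecStatus, canManual: boolean, lockHeld: boolean): PageActions => ({
  apply: (status === 'idle' || status === 'verified') && canManual && !lockHeld,
  cancel: status === 'preflight' || status === 'draining',
  acknowledge: TERMINAL.includes(status),
  pollLog: IN_FLIGHT.includes(status), // polling stops once the run leaves these states
});

// Usage: during a drain the admin can cancel and the log tail keeps refreshing.
console.log(actionsFor('draining', true, true));
```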
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): pad shoutMessage renders update.drain.* via html10n
broadcastShout now sends {messageKey, values, sticky} so the existing
pad-side shout pipeline can route through html10n.get(). The renderer
gains a values pass-through so update.drain.t60 etc. interpolate
{{seconds}}, and gives updater shouts a different gritter title (the
banner.title localised string) so users know it's a system event
rather than a generic admin message.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): rollback uses git checkout -f + integration suite over tmp git repo
RollbackHandler now does git checkout -f <fromSha> BEFORE overlaying the
backup lockfile. Without -f, git refuses checkout when there are
unstaged modifications to files it would overwrite — exactly the case
after a partial executor run that mutated the working tree. With -f the
partial mutation is discarded and the working tree returns to fromSha
cleanly. The backup-lockfile copy is still done (belt-and-braces) but
tolerates ENOENT since checkout already restored the right lockfile.
The new integration suite at src/tests/backend/specs/updater-integration.ts
exercises the full pipeline against a disposable git repo: happy path,
install-fail rollback, build-fail rollback, crash-loop guard, and a
target-sha-doesn't-exist rollback-failed terminal case. 5 mocha tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(updater): Playwright admin Apply / Cancel / Acknowledge flow
Stubs /admin/update/status (and /admin/update/apply for the apply path)
at the route level so we can assert UI transitions without actually
running an update. Four scenarios:
- Apply button POSTs and re-fetches status (>=2 status fetches total).
- install-method-not-writable hides the button and shows localised
denial copy.
- rollback-failed terminal state shows the Acknowledge button and the
"Manual intervention required" lastResult copy.
- lockHeld=true hides Apply even when policy.canManual is on.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(updater): admin banner shows rollback-failed terminal alert
When execution.status === 'rollback-failed' the banner switches to a
role=alert with the strong update.banner.terminal.rollback-failed copy
and overrides the regular "update available" framing — an admin who
left the system in this state needs to fix it before any other admin
work matters. Other terminal states (preflight-failed, rolled-back) are
informational and surface on the page itself, not the banner.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(updater): Tier 2 admin docs + manual smoke runbook + CHANGELOG
doc/admin/updates.md gains a full Tier 2 section: prerequisites
(git install + process supervisor with sample systemd unit), Apply
flow with timings, every failure mode and the resulting state, the
four endpoints, and the signature-verification opt-in. Settings
table picks up the new updates.* knobs.
docs/superpowers/specs/2026-04-25-auto-update-runbook.md is the
manual smoke runbook the design spec calls for: disposable VM,
systemd unit, every observable transition (happy path, install/
build-fail rollback, crash-loop guard, rollback-failed terminal,
cancel during drain) plus a sign-off checklist for the release cut.
CHANGELOG Unreleased section explains the supervisor requirement
and points readers at the runbook.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(updater): note docker-friendly update flows as follow-up work
Tier 2 refuses Apply on installMethod=docker because in-container
mutation doesn't survive a container restart. Adds a future-work note
covering the two reasonable paths for an in-product docker Apply
button (instructions-only vs deploy-webhook) and explicitly rules out
mounting /var/run/docker.sock as a footgun. Watchtower gets a pointer
for admins who want fully autonomous docker updates today.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(updater): address Qodo review (1-6) + Playwright strict-mode CI fix
1. Tier 2 endpoints now gate on tier in {manual, auto, autonomous} —
notify and off return 404 to match the prior PR-1 behaviour. Gate is
evaluated per-request via app.use middleware so a settings.json reload
takes effect without a full restart, and so integration tests can flip
the tier dynamically. Adds a regression test that exercises 404 at
tier=notify across all four endpoints.
2. cancel/apply race fixed: /admin/update/cancel no longer releases the
lock — apply's finally block owns it for the request's lifetime. Apply
now reloads state after preflight and aborts with 409
cancelled-during-preflight if execution.status is no longer 'preflight' for the same
targetTag. Prevents a second apply from sneaking in while the first is
still running its slow checks, and prevents the post-cancel apply from
continuing into drain/execute.
3. SessionDrainer now restores acceptingConnections=true at drain
completion (not just on cancel). The lock + persisted execution.status
prevent a fresh apply from racing in — the in-memory flag was redundant
safety that turned into a wedge if the executor threw post-drain. Adds
a unit test asserting the flag is restored after natural drain end.
4. PadMessageHandler drain guard switched from socket.json.send (a
socket.io v2/v3 API that may not exist on v4) to socket.emit('message',
...) for consistency with the other disconnect paths in the file.
5. Spawn 'error' handlers added to runStep helpers in UpdateExecutor and
RollbackHandler, plus the gpg verify-tag spawn in trustedKeys. Without
them, a missing/unexecutable binary leaves the promise hanging forever
and the update flow stuck in-flight. SpawnFn type extended to allow
on('error', ...) listeners cleanly. Spawn errors now resolve with code
1 + the error message in stderr, so the existing failure-detection
branches fire normally.
6. executeUpdate body wrapped in try/catch. An exception from readSha,
saveState, copyFile, or any step now lands in a rolling-back persist +
returns failed-checkout, so the route's post-executor rollback path
picks it up. State can no longer wedge at 'executing'. The catch's
inner saveState is itself try/catch-wrapped so a write-after-write failure
doesn't crash the route either.
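
Item 5 above (spawn errors resolving with code 1 and the message in stderr) might look roughly like this. The `ChildLike` and `SpawnFn` types and the `runStep` signature are assumptions modelled on the names in the commit message; production code would presumably inject a thin wrapper around `child_process.spawn`.

```typescript
import {EventEmitter} from 'node:events';

type StepResult = {code: number; stderr: string};

// Minimal, hypothetical shape of what runStep needs from a spawned child.
type Listener = (...args: any[]) => void;
type ChildLike = {
  stderr: {on(event: string, cb: Listener): unknown} | null;
  on(event: string, cb: Listener): unknown;
};
type SpawnFn = (cmd: string, args: string[]) => ChildLike;

const runStep = (spawnFn: SpawnFn, cmd: string, args: string[]): Promise<StepResult> =>
  new Promise((resolve) => {
    let stderr = '';
    const child = spawnFn(cmd, args);
    child.stderr?.on('data', (chunk: Buffer | string) => {
      stderr += chunk.toString();
    });
    // A failed spawn resolves like a failed step, so the normal failure branches handle it.
    child.on('error', (err: Error) => resolve({code: 1, stderr: `${stderr}${err.message}`}));
    child.on('close', (code: number | null) => resolve({code: code ?? 1, stderr}));
  });

// Usage with a fake child that fails to start, as a unit test might inject it.
const fakeChild = Object.assign(new EventEmitter(), {stderr: null});
const pending = runStep(() => fakeChild, 'pnpm', ['install']);
fakeChild.emit('error', new Error('spawn pnpm ENOENT'));
pending.then((result) => console.log(result)); // { code: 1, stderr: 'spawn pnpm ENOENT' }
```

Resolving instead of rejecting keeps the caller's control flow identical for "binary missing" and "binary exited non-zero".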
CI: Playwright update-page-actions strict-mode violation fixed. Both the
banner and the lastResult <p> contain "Manual intervention required";
selector now scopes to p.last-result-rollback-failed for the lastResult
assertion specifically.
129 vitest unit tests + 23 mocha integration tests passing; ts-check clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(updater): address Qodo #7 (status leak) + #8 (short-drain values)
#7. /admin/update/status now redacts diagnostic strings for unauth callers
even when requireAdminForStatus is left at its default (false). Status
enum + outcome enum are kept (the admin banner / pad-side badge need them
to render the right UI) but execution.reason / execution.fromSha /
execution.targetTag and the same fields on lastResult are stripped.
Authed admin sessions still get the full payload — they're looking at
their own server's diagnostics. Two new mocha tests cover both paths:
"redacts execution.reason / lastResult.reason for unauth callers" and
"returns full diagnostic payload to authed admin sessions".
#8. SessionDrainer no longer schedules T-30 / T-10 broadcasts when the
configured drainSeconds can't honour them. Previously, with drainSeconds
< 30 the T-30 timer fired at zero remaining but the broadcast still
claimed "30 seconds" — misleading. Now T-30 only schedules when
drainSeconds > 30 and T-10 only when > 10. Admins picking a short drain
get fewer announcements but each carries an accurate countdown. The
opening announcement now reports the configured drain length rather
than a hardcoded 60. Two updated unit tests: drainSeconds=15 (skips
T-30, still fires T-10) and drainSeconds=5 (skips both).
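
The scheduling rule from #8 can be written as a pure function that returns the broadcast plan. This is a sketch, not the SessionDrainer code: `drainSchedule` and the `Broadcast` shape are hypothetical, and reusing the `update.drain.t60` key for the opening announcement is an assumption.

```typescript
type Broadcast = {atSecond: number; key: string; seconds: number};

// Hypothetical planner: which countdown messages fire, and when, for a given drain length.
const drainSchedule = (drainSeconds: number): Broadcast[] => {
  // The opening announcement reports the configured length instead of a fixed 60.
  const plan: Broadcast[] = [{atSecond: 0, key: 'update.drain.t60', seconds: drainSeconds}];
  const marks = [[30, 'update.drain.t30'], [10, 'update.drain.t10']] as const;
  for (const [mark, key] of marks) {
    // Only schedule a mark that the drain is long enough to honour.
    if (drainSeconds > mark) plan.push({atSecond: drainSeconds - mark, key, seconds: mark});
  }
  return plan;
};

// Usage: a 15s drain skips T-30 but still announces T-10 five seconds in.
console.log(drainSchedule(15));
```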
131 vitest unit + 26 mocha integration tests passing; ts-check clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(updater): address Qodo follow-up — tag injection, rollback rejections, state validation
Qodo posted three new concerns after the first fix push.
1. Git tag option injection (security). The release tag from GitHub's
tag_name flowed into `git checkout` / `git verify-tag` as a positional
arg. A tag starting with '-' would be parsed as an option and could
bypass signature verification or change checkout semantics. Mitigated
in three layers:
- New refSafety helper (isValidTag / assertValidTag / refsTagsForm)
enforces a strict subset of git's check-ref-format spec: rejects
leading '-' or '.', whitespace, control chars, and ~ ^ : ? * [ \\
and the '..' sequence.
- VersionChecker validates tag_name before persisting to state, so a
malformed value from a misconfigured githubRepo never lands on disk.
- UpdateExecutor calls assertValidTag and uses the refs/tags/<tag>
form for git checkout. trustedKeys also validates and adds '--' to
git verify-tag for an end-of-options marker. updateActions does an
up-front isValidTag check on state.latest.tag so a corrupt state
file gets a clean 409 instead of a 500.
2. Unhandled rollback rejections. checkPendingVerification was firing
`void deps.saveState(...)` and `void performRollback(...)` without
.catch(), so an fs error during boot's rollback path would bubble out
as an unhandled rejection. Both callsites now go through fireSaveState
/ fireRollback helpers that catch and log; rollback rejections fall
through to a best-effort terminal-state write + exit 75 so the
supervisor can re-try the next boot with bootCount++.
3. Execution state under-validated. isValidExecution previously checked
only that `status` was a known enum value, so a hand-edited state file
with `{execution: {status: 'pending-verification'}}` (missing fromSha
/ targetTag / deadlineAt) would pass validation and reach
RollbackHandler with undefined refs. The validator now consults a
per-status required-fields map mirroring the ExecutionStatus union in
types.ts and rejects empty strings as well as missing fields. Same
tightening applied to lastResult.outcome (must be in the allowed enum,
not just any string). Six new unit tests cover hand-edited corruption.
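
To make the tag-validation layer in item 1 concrete, here is a minimal sketch of the three helpers it names. It is a re-implementation from the rules listed above, not the code in this commit, and the error message wording is an assumption.

```typescript
// Whitespace, control characters, and the characters ~ ^ : ? * [ \ are rejected.
const FORBIDDEN = /[\s\x00-\x1f\x7f~^:?*\[\\]/;

const isValidTag = (tag: unknown): tag is string =>
  typeof tag === 'string' &&
  tag.length > 0 &&
  !tag.startsWith('-') && // would be parsed by git as an option
  !tag.startsWith('.') &&
  !tag.includes('..') &&
  !FORBIDDEN.test(tag);

const assertValidTag = (tag: unknown): string => {
  if (!isValidTag(tag)) throw new Error(`invalid release tag: ${JSON.stringify(tag)}`);
  return tag;
};

// Fully qualified form, so git cannot mistake the value for an option or a branch.
const refsTagsForm = (tag: string): string => `refs/tags/${assertValidTag(tag)}`;

// Usage: a normal tag passes, an option-shaped tag is rejected.
console.log(refsTagsForm('v2.3.0')); // refs/tags/v2.3.0
console.log(isValidTag('--upload-pack=evil')); // false
```

Validating once at ingestion and again at the point of use means a corrupted state file is still caught before the value reaches a git command line.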
145 vitest + 26 mocha tests green; ts-check clean.
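
The per-status required-fields check from item 3 could be sketched as below. Only the pending-verification entry is taken from the description above; the `idle` entry and the use of a Map are assumptions, and the real map covers every member of the ExecutionStatus union.

```typescript
// Hypothetical, partial map: status -> fields that must be present and non-empty.
const REQUIRED_FIELDS = new Map<string, readonly string[]>([
  ['idle', []],
  ['pending-verification', ['fromSha', 'targetTag', 'deadlineAt']],
]);

const isValidExecution = (value: unknown): boolean => {
  if (typeof value !== 'object' || value === null) return false;
  const exec = value as Record<string, unknown>;
  if (typeof exec.status !== 'string') return false;
  const required = REQUIRED_FIELDS.get(exec.status);
  if (required === undefined) return false; // unknown status
  // Missing fields and empty strings are both treated as corruption.
  return required.every((field) => {
    const v = exec[field];
    return v !== undefined && v !== null && v !== '';
  });
};

// Usage: a hand-edited state file that only sets the status is rejected.
console.log(isValidExecution({status: 'pending-verification'})); // false
```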
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 4f1b524 commit efb8328
45 files changed
Lines changed: 6991 additions & 47 deletions
File tree
- admin/src
- components
- pages
- store
- docs/superpowers
- plans
- specs
- doc/admin
- src
- locales
- node
- handler
- hooks/express
- updater
- utils
- static/js
- tests
- backend-new/specs/updater
- backend/specs
- frontend-new/admin-spec