You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
scripts: forward ADMIN_* env vars to remote SSH heredoc (follow-up to #669) (#678)
## Summary
Follow-up to #669 (now merged). The admin-flag plumbing I added in
that PR has a real bug I caught while re-reading the script: every
rollout would crash on the first remote node because the `ADMIN_*`
variables were never forwarded across the SSH heredoc.
## What was broken
`build_admin_flags` lives inside the remote SSH heredoc in
`update_one_node` (`bash -s <<'REMOTE'`), but the `env` block that
seeds the remote shell only forwarded the existing `IMAGE` /
`RAFT_PORT` / `EXTRA_ENV` / etc. variables — no `ADMIN_*`. With
`set -u` active on the remote, the first access of
`${ADMIN_ENABLED}` inside `build_admin_flags` crashes the rollout
**regardless** of whether admin is enabled (the helper is invoked
unconditionally from `run_container`).
So an operator running this script after #669 with the default
`ADMIN_ENABLED=false` would have seen `ADMIN_ENABLED: unbound variable`
on the first node touched, leaving at most one node restarted but
otherwise the cluster intact (the per-node health check would
exit non-zero before moving on).
## Fix
1. **Forward all 9 `ADMIN_*` variables through `env`**, alongside
the existing forwarding pattern. Path-like values (`*_FILE`,
`*_KEYS`, `ADDRESS`) get `printf %q` quoting at the bottom of
the local script (matches the existing `RAFT_TO_REDIS_MAP_Q`
etc. pattern). The three boolean flags (`ADMIN_ENABLED`,
`ADMIN_ALLOW_PLAINTEXT_NON_LOOPBACK`, `ADMIN_ALLOW_INSECURE_DEV_COOKIE`)
are forwarded unquoted for readability — but only after a
local validation pass that rejects anything other than the
literal `true` / `false` so the unquoted forwarding stays
metacharacter-safe.
2. **Defense-in-depth `:-` defaults inside `build_admin_flags`.**
Every `ADMIN_*` reference inside the helper now reads through
`${VAR:-}` once at the top into a local. A future refactor
that ever drops one of the forwarded variables will produce
the targeted "ADMIN_* required" error instead of an opaque
`unbound variable` crash with no hint at which variable.
## Test plan
- [x] `bash -n scripts/rolling-update.sh` — passes
- [x] `ADMIN_ENABLED=invalid bash scripts/rolling-update.sh` →
"must be 'true' or 'false', got 'invalid'"
- [x] `ADMIN_ALLOW_PLAINTEXT_NON_LOOPBACK=yes` → same validator catches
it
- [x] `ADMIN_ENABLED=true` (no signing key) → reaches the remote branch
where build_admin_flags would 'aborting' on the missing key
- [ ] End-to-end rollout against a 3-node staging cluster with
`ADMIN_ENABLED=true` (operator to verify before merging into
a production deploy.env)
- [ ] End-to-end rollout with `ADMIN_ENABLED=false` (the previously
broken default path)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Improved validation and error reporting for deployment configuration
parameters.
* Enhanced environment variable forwarding and path handling in
deployment scripts to increase reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
0 commit comments