Skip to content

Commit 96bc2ca

Browse files
bpamiriPeter Amiri
andauthored
fix(cli): five deploy roll-up fixes from the 2957 wave-4 checklist (#3169)
- DEP-11a: same-version redeploy — force-remove a conflicting same-name container before docker run (exact-anchored name filter, xargs -r idempotent) and stop superseded versions best-effort after the kamal-proxy cutover (AppCommands.remove_conflicting / stop_old_versions, wired into DeployMainCli.$deploy). - DEP-11b: 'wheels deploy secrets extract' key match is now exact — CFML == is case-insensitive, so 'extract path' matched the PATH line. - DEP-11c: kamal-proxy config volume derives the remote home from the ssh user (/root for the default root user) instead of hardcoding /home/<user>. - SecretResolver: bounded stdout drain + deadline (default 60s, opts.timeoutSeconds) replaces the unbounded waitFor()/read-to-EOF — an interactively-blocking secrets command now kills bash and throws SecretResolver.ResolutionFailed with a clear message. - rollback() forwards {destination} into ConfigLoader.load — it was the last verb not applying the --destination overlay (same class as 3085). All five red-first specs plus full CLI suite green in the lucee7 docker harness (1015 pass / 0 fail; 2 known docker-env artifacts). Refs #2957 Signed-off-by: Peter Amiri <peter@alurium.com> Signed-off-by: Peter Amiri <petera@pai.com> Co-authored-by: Peter Amiri <petera@pai.com>
1 parent 5fb4450 commit 96bc2ca

11 files changed

Lines changed: 273 additions & 11 deletions

File tree

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
- `wheels deploy`: a same-version redeploy no longer hits a guaranteed `docker run --name` conflict — deploy now force-removes an existing same-name container (exact-anchored, idempotent) before each app run, and stops superseded versions best-effort after the kamal-proxy cutover instead of leaving every old container running forever (#2957 DEP-11a)
2+
- `wheels deploy secrets extract <key>` now matches the key case-sensitively, like Kamal — previously CFML's case-insensitive `==` let `extract path` return the `PATH` line's value (#2957 DEP-11b)
3+
- `wheels deploy`: the kamal-proxy config volume now derives the remote home from the ssh user (`/root` for the default root user, `/home/<user>` otherwise) instead of hardcoding the `/home/<user>` layout, which was wrong on every default config (#2957 DEP-11c)
4+
- `wheels deploy`: secret resolution from `.kamal/secrets` is now bounded by a timeout (default 60s, configurable via `timeoutSeconds`) — a command blocking on interactive input (e.g. `op read` prompting for sign-in) used to hang the deploy thread forever; it now kills the bash process and throws a clear `SecretResolver.ResolutionFailed` (#2957)
5+
- `wheels deploy rollback --destination=X` now applies the destination overlay — rollback was the last verb loading `deploy.yml` without forwarding `--destination`, so it targeted the base config's hosts (#2957; same defect class as the `config` fix in #3085)

cli/lucli/services/deploy/cli/DeployMainCli.cfc

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -142,6 +142,9 @@ component {
142142

143143
for (var role in cfg.roles()) {
144144
for (var host in role.hosts()) {
145+
// A same-version redeploy would hit a guaranteed
146+
// docker run --name conflict otherwise (#2957 DEP-11a).
147+
$dispatch([host], app.remove_conflicting(role, ver), dryRun);
145148
if (arrayLen(secretNames)) {
146149
$deliverEnvFile(
147150
[host],
@@ -163,6 +166,11 @@ component {
163166
dryRun
164167
);
165168
}
169+
// Post-cutover cleanup: stop superseded versions
170+
// (#2957 DEP-11a). Best-effort (allowFail) — a failed
171+
// stop of an OLD container must not fail an
172+
// otherwise-complete deploy of the NEW one.
173+
$dispatch([host], app.stop_old_versions(role, ver), dryRun, true);
166174
}
167175
}
168176
$dispatch(hosts, auditor.record("completed deploy of version " & ver), dryRun, true);
@@ -220,7 +228,13 @@ component {
220228
message = "rollback requires a version (pass opts.version)"
221229
);
222230
}
223-
var cfg = variables.loader.load(arguments.opts.configPath);
231+
// Forward the destination overlay — rollback() was the last
232+
// loader.load() call site not doing so (#2957; same defect class
233+
// as the config() fix in #3085).
234+
var cfg = variables.loader.load(
235+
arguments.opts.configPath,
236+
{destination: arguments.opts.destination ?: ""}
237+
);
224238
var app = new modules.wheels.services.deploy.commands.AppCommands(cfg);
225239
var proxy = new modules.wheels.services.deploy.commands.ProxyCommands(cfg);
226240
var dryRun = arguments.opts.dryRun ?: false;

cli/lucli/services/deploy/cli/DeploySecretsCli.cfc

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -67,7 +67,10 @@ component {
6767
var eq = find("=", line);
6868
// eq > 1, not > 0: a line starting with '=' has no key, and
6969
// left(line, 0) crashes Lucee 7 (Cross-Engine Invariant 8).
70-
if (eq > 1 && left(line, eq - 1) == key) {
70+
// compare(), not ==: CFML == is case-insensitive, but env var
71+
// names are case-sensitive and Kamal's extract is an exact
72+
// match (#2957 DEP-11b).
73+
if (eq > 1 && compare(left(line, eq - 1), key) == 0) {
7174
return mid(line, eq + 1, 99999);
7275
}
7376
}

cli/lucli/services/deploy/commands/AppCommands.cfc

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,43 @@ component extends="Base" {
2929
);
3030
}
3131

32+
/**
33+
* Idempotent same-name conflict guard (#2957 DEP-11a): `docker run
34+
* --name X` hard-fails when ANY container — running or stopped —
35+
* already holds the name, which is the guaranteed state on a
36+
* same-version redeploy. The name filter is regex-anchored for an
37+
* exact match and `xargs -r` makes the whole pipeline a no-op
38+
* (exit 0) when nothing matches.
39+
*/
40+
public string function remove_conflicting(required any role, required string version) {
41+
return pipe([
42+
docker("container", "ls", "--all",
43+
"--filter name=^#container_name(arguments.role, arguments.version)#$",
44+
"--quiet"),
45+
"xargs -r docker container rm --force"
46+
]);
47+
}
48+
49+
/**
50+
* Stop every superseded version of this role's container after the
51+
* proxy cutover (#2957 DEP-11a) — previously old versions kept
52+
* running (and auto-restarting) forever. Scoped by the same
53+
* service/role/destination labels $labelArgs stamps on at run time;
54+
* the just-deployed container is excluded by exact name. `xargs -r`
55+
* keeps a first deploy (no old versions) a no-op.
56+
*/
57+
public string function stop_old_versions(required any role, required string version) {
58+
return pipe([
59+
docker("ps",
60+
"--filter label=service=#variables.config.service()#",
61+
"--filter label=role=#arguments.role.name()#",
62+
"--filter label=destination=#variables.config.destination()#",
63+
"--format {{.Names}}"),
64+
"grep -v '^#container_name(arguments.role, arguments.version)#$'",
65+
"xargs -r docker stop"
66+
]);
67+
}
68+
3269
public string function start(required any role, required string version) {
3370
return docker("start", container_name(arguments.role, arguments.version));
3471
}

cli/lucli/services/deploy/commands/ProxyCommands.cfc

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ component extends="Base" {
2727
"--network kamal",
2828
"--publish 80:80",
2929
"--publish 443:443",
30-
"--volume /home/#variables.config.ssh().user()#/.config/kamal-proxy:/home/kamal-proxy/.config/kamal-proxy",
30+
"--volume #$remoteHome()#/.config/kamal-proxy:/home/kamal-proxy/.config/kamal-proxy",
3131
variables.PROXY_IMAGE
3232
);
3333
}
@@ -91,4 +91,15 @@ component extends="Base" {
9191
public string function restart() {
9292
return docker("restart", variables.PROXY_CONTAINER_NAME);
9393
}
94+
95+
/**
96+
* Home directory of the deploy user on the remote host, for the proxy
97+
* config volume. The previous hardcoded `/home/<user>` was wrong for
98+
* the DEFAULT ssh user (root's home is /root) — #2957 DEP-11c.
99+
* compare(), not ==: unix usernames are case-sensitive.
100+
*/
101+
private string function $remoteHome() {
102+
var sshUser = variables.config.ssh().user();
103+
return compare(sshUser, "root") == 0 ? "/root" : "/home/" & sshUser;
104+
}
94105
}

cli/lucli/services/deploy/lib/SecretResolver.cfc

Lines changed: 46 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,11 @@ component {
3838
// (e.g. a non-standard Git Bash install on Windows). Also the
3939
// seam the BashUnavailable spec uses.
4040
variables.bashCmd = arguments.opts.bashCmd ?: "bash";
41+
// Upper bound on resolution, in seconds (#2957): a command inside
42+
// the secrets file that blocks on interactive input — `op read`
43+
// prompting for sign-in, `bw get` waiting to unlock — used to hang
44+
// the deploy thread forever on an unbounded waitFor()/stdout read.
45+
variables.timeoutSeconds = arguments.opts.timeoutSeconds ?: 60;
4146
variables.resolved = $loadAll();
4247
return this;
4348
}
@@ -165,7 +170,7 @@ component {
165170
// the stderr write while we block on the stdout read. A file sink
166171
// never fills, so bash always runs to completion. Secret values
167172
// travel on stdout (read in-memory); only diagnostics touch disk,
168-
// and the file is deleted in the finally block even when waitFor()
173+
// and the file is deleted in the finally block even when the drain
169174
// or the throw paths interrupt the happy path.
170175
var errPath = getTempFile(getTempDirectory(), "wheels-secret-err");
171176
try {
@@ -181,7 +186,9 @@ component {
181186
detail = "Secret resolution requires a local bash for $(cmd) expansion. On Windows, run inside WSL or Git Bash."
182187
);
183188
}
184-
var out = $readStream(proc.getInputStream());
189+
var out = $drainWithDeadline(proc);
190+
// The drain only returns once the process has exited, so this
191+
// waitFor() is an immediate exit-code read, not a blocking wait.
185192
var exitCode = proc.waitFor();
186193
var err = fileExists(errPath) ? fileRead(errPath, "UTF-8") : "";
187194
return {exitCode: exitCode, out: out, err: err};
@@ -190,12 +197,43 @@ component {
190197
}
191198
}
192199

193-
private string function $readStream(required any inputStream) {
194-
var scanner = createObject("java", "java.util.Scanner").init(arguments.inputStream, "UTF-8");
195-
scanner.useDelimiter("\A");
196-
var content = scanner.hasNext() ? scanner.next() : "";
197-
scanner.close();
198-
return content;
200+
/**
201+
* Drain the process's stdout to completion, bounded by
202+
* variables.timeoutSeconds (#2957). The previous shape — a blocking
203+
* read-to-EOF followed by an unbounded waitFor() — hung the deploy
204+
* thread forever when a command inside the secrets file blocked on
205+
* interactive input while holding stdout open. Polling `available()`
206+
* against a deadline bounds both hang shapes (silent never-exits AND
207+
* endless streamers); on expiry the bash process is force-killed and
208+
* a clear ResolutionFailed surfaces.
209+
*/
210+
private string function $drainWithDeadline(required any proc) {
211+
var deadlineAt = getTickCount() + (variables.timeoutSeconds * 1000);
212+
var stdoutStream = arguments.proc.getInputStream();
213+
var buffer = createObject("java", "java.io.ByteArrayOutputStream").init();
214+
while (arguments.proc.isAlive()) {
215+
if (getTickCount() >= deadlineAt) {
216+
arguments.proc.destroyForcibly();
217+
throw(
218+
type = "SecretResolver.ResolutionFailed",
219+
message = "Resolving .kamal/secrets timed out after " & variables.timeoutSeconds
220+
& " second(s) — a command inside the secrets file may be waiting for"
221+
& " interactive input (e.g. a credential manager prompting for sign-in)."
222+
& " The bash process was killed.",
223+
detail = "Sign in to the secret manager in a terminal first, or raise the limit"
224+
& " via SecretResolver init opts.timeoutSeconds for legitimately slow commands."
225+
);
226+
}
227+
var avail = stdoutStream.available();
228+
if (avail > 0) {
229+
buffer.writeBytes(stdoutStream.readNBytes(avail));
230+
} else {
231+
sleep(25);
232+
}
233+
}
234+
// Process exited — anything left in the pipe reads to EOF instantly.
235+
buffer.writeBytes(stdoutStream.readAllBytes());
236+
return buffer.toString("UTF-8");
199237
}
200238

201239
private string function $shellEscape(required string path) {

cli/lucli/tests/specs/deploy/cli/DeployMainCliSpec.cfc

Lines changed: 81 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -961,6 +961,59 @@ component extends="wheels.wheelstest.system.BaseSpec" {
961961
expect(out).toInclude("--name demo-web-v1");
962962
});
963963

964+
// Regression suite for #2957 (Wave 4 roll-ups).
965+
// (DEP-11a) deploy() issued `docker run --name <service>-<role>-<version>`
966+
// with no prior conflict removal — a same-version redeploy hit a
967+
// guaranteed name conflict — and superseded containers were never
968+
// stopped after cutover. (rollback --destination) rollback() was the
969+
// last loader.load() call site not forwarding the destination overlay
970+
// (same defect class as the config() fix in #3085).
971+
972+
it("deploy removes a conflicting same-name container before docker run (##2957 DEP-11a)", () => {
973+
var fake = new cli.lucli.services.deploy.lib.FakeSshPool();
974+
var dc = new cli.lucli.services.deploy.cli.DeployMainCli(fake);
975+
dc.deploy({configPath: variables.fixture, version: "v1"});
976+
var cmds = $cmds(fake);
977+
var rmIdx = 0; var runIdx = 0;
978+
for (var i = 1; i <= arrayLen(cmds); i++) {
979+
if (!rmIdx && findNoCase("name=^demo-web-v1$", cmds[i])
980+
&& findNoCase("docker container rm --force", cmds[i])) rmIdx = i;
981+
// "--name demo-web-v1" pins the APP container run — the
982+
// proxy's start_or_run also embeds a "docker run --detach".
983+
if (!runIdx && findNoCase("docker run --detach", cmds[i])
984+
&& findNoCase("--name demo-web-v1", cmds[i])) runIdx = i;
985+
}
986+
expect(rmIdx).toBeGT(0);
987+
expect(runIdx).toBeGT(rmIdx);
988+
});
989+
990+
it("deploy stops superseded versions after the proxy cutover (##2957 DEP-11a)", () => {
991+
var fake = new cli.lucli.services.deploy.lib.FakeSshPool();
992+
var dc = new cli.lucli.services.deploy.cli.DeployMainCli(fake);
993+
dc.deploy({configPath: variables.fixture, version: "v1"});
994+
var cmds = $cmds(fake);
995+
var proxyIdx = 0; var stopOldIdx = 0;
996+
for (var i = 1; i <= arrayLen(cmds); i++) {
997+
if (!proxyIdx && findNoCase("kamal-proxy deploy", cmds[i])) proxyIdx = i;
998+
if (!stopOldIdx && findNoCase("grep -v '^demo-web-v1$'", cmds[i])
999+
&& findNoCase("xargs -r docker stop", cmds[i])) stopOldIdx = i;
1000+
}
1001+
expect(proxyIdx).toBeGT(0);
1002+
expect(stopOldIdx).toBeGT(proxyIdx);
1003+
});
1004+
1005+
it("a failing old-version stop does not fail an otherwise-complete deploy (##2957 DEP-11a)", () => {
1006+
var fake = new cli.lucli.services.deploy.lib.FakeSshPool();
1007+
var cfg = new cli.lucli.services.deploy.config.ConfigLoader().load(variables.fixture);
1008+
var app = new cli.lucli.services.deploy.commands.AppCommands(cfg);
1009+
fake.expect("1.2.3.4", app.stop_old_versions(cfg.roles()[1], "v1"), {
1010+
exitCode: 1, stdout: "", stderr: "Cannot connect to the Docker daemon"
1011+
});
1012+
var dc = new cli.lucli.services.deploy.cli.DeployMainCli(fake);
1013+
var out = dc.deploy({configPath: variables.fixture, version: "v1"});
1014+
expect(out).toInclude("Deployed");
1015+
});
1016+
9641017
// Regression suite for #2957 (Wave 3 — observability). (DEP-6a) the
9651018
// dispatch closures dropped every ssh.run() result, so read verbs
9661019
// (`audit`, `details`) returned only host-count summaries in live
@@ -1059,6 +1112,34 @@ component extends="wheels.wheelstest.system.BaseSpec" {
10591112
expect(out).toInclude("Deployed");
10601113
});
10611114

1115+
it("rollback --dry-run applies the destination overlay (##2957, same class as ##3085)", () => {
1116+
var base = getTempFile(getTempDirectory(), "yml");
1117+
fileWrite(
1118+
base,
1119+
"service: demo#chr(10)#image: acme/demo#chr(10)#servers: [192.0.2.10]"
1120+
& "#chr(10)#registry: {username: u, password: [X]}"
1121+
);
1122+
var overlay = new cli.lucli.services.deploy.config.ConfigLoader()
1123+
.$overlayPathFor(base, "staging");
1124+
fileWrite(overlay, "servers:#chr(10)# - 192.0.2.99");
1125+
try {
1126+
var dc = new cli.lucli.services.deploy.cli.DeployMainCli(
1127+
new cli.lucli.services.deploy.lib.FakeSshPool()
1128+
);
1129+
var out = dc.rollback({
1130+
configPath: base,
1131+
destination: "staging",
1132+
version: "v1",
1133+
dryRun: true
1134+
});
1135+
expect(out).toInclude("192.0.2.99");
1136+
expect(out).notToInclude("192.0.2.10");
1137+
} finally {
1138+
fileDelete(base);
1139+
fileDelete(overlay);
1140+
}
1141+
});
1142+
10621143
// Regression for #2671 — git's stderr ("fatal: not a git repository...") used to leak through as the version string.
10631144
it("$gitShortSha() returns 'unknown' when run outside a git repo", () => {
10641145
var nonGitDir = getTempDirectory() & "/wheels-2671-main-" & createUUID();

cli/lucli/tests/specs/deploy/cli/DeploySecretsCliSpec.cfc

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -56,6 +56,17 @@ component extends="wheels.wheelstest.system.BaseSpec" {
5656
expect(cli.extract({key: "", from: "FOO=bar"})).toBe("");
5757
});
5858

59+
// #2957 DEP-11b — CFML `==` is case-insensitive, so `extract path`
60+
// matched the PATH line. Kamal's extract is an exact (case-sensitive)
61+
// key match; env var names are case-sensitive in every shell.
62+
it("extract key match is case-sensitive (##2957 DEP-11b)", () => {
63+
var cli = new cli.lucli.services.deploy.cli.DeploySecretsCli();
64+
var block = "PATH=/usr/bin" & chr(10) & "path=lower-value";
65+
expect(cli.extract({key: "path", from: block})).toBe("lower-value");
66+
expect(cli.extract({key: "PATH", from: block})).toBe("/usr/bin");
67+
expect(cli.extract({key: "Path", from: block})).toBe("");
68+
});
69+
5970
it("extract skips malformed lines that start with '='", () => {
6071
// left(line, 0) on a keyless '=...' line crashes Lucee 7
6172
// (Cross-Engine Invariant 8); such lines must be skipped.

cli/lucli/tests/specs/deploy/commands/AppCommandsSpec.cfc

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -161,6 +161,32 @@ component extends="wheels.wheelstest.system.BaseSpec" {
161161
expect(state.message).notToInclude("PRESENT");
162162
});
163163

164+
// #2957 DEP-11a — `docker run --name X` hard-fails when ANY container
165+
// (running or stopped) already holds the name, which is the guaranteed
166+
// state on a same-version redeploy. The conflict guard must be exact-name
167+
// anchored and a no-op when nothing matches (xargs -r).
168+
it("remove_conflicting() force-removes an existing same-name container, idempotently (##2957 DEP-11a)", () => {
169+
var cmd = new cli.lucli.services.deploy.commands.AppCommands(variables.cfg)
170+
.remove_conflicting(variables.cfg.roles()[1], "v1");
171+
expect(cmd).toInclude("docker container ls --all");
172+
expect(cmd).toInclude("--filter name=^demo-web-v1$");
173+
expect(cmd).toInclude("--quiet");
174+
expect(cmd).toInclude("xargs -r docker container rm --force");
175+
});
176+
177+
// #2957 DEP-11a — superseded containers were never stopped after
178+
// cutover, so every old version kept running (and restarting) forever.
179+
it("stop_old_versions() stops same-role containers except the current version (##2957 DEP-11a)", () => {
180+
var cmd = new cli.lucli.services.deploy.commands.AppCommands(variables.cfg)
181+
.stop_old_versions(variables.cfg.roles()[1], "v2");
182+
expect(cmd).toInclude("docker ps");
183+
expect(cmd).toInclude("--filter label=service=demo");
184+
expect(cmd).toInclude("--filter label=role=web");
185+
expect(cmd).toInclude("--filter label=destination=");
186+
expect(cmd).toInclude("grep -v '^demo-web-v2$'");
187+
expect(cmd).toInclude("xargs -r docker stop");
188+
});
189+
164190
it("remove() chains docker stop and docker rm", () => {
165191
var cmd = new cli.lucli.services.deploy.commands.AppCommands(variables.cfg)
166192
.remove(variables.cfg.roles()[1], "v9");

0 commit comments

Comments
 (0)