Skip to content

Fix config and CLI correctness/safety bugs in pilotctl#313

Merged
TeoSlayer merged 2 commits into
mainfrom
fix/pilotctl-config-cli-safety
Jun 22, 2026
Merged

Fix config and CLI correctness/safety bugs in pilotctl#313
TeoSlayer merged 2 commits into
mainfrom
fix/pilotctl-config-cli-safety

Conversation

@TeoSlayer

Copy link
Copy Markdown
Collaborator

Fixes a batch of config-handling and CLI correctness/safety bugs in pilotctl, almost all in cmd/pilotctl/main.go.

Changes

  1. Atomic + 0600 config writessaveConfig serializes to a temp file in the config dir, chmod 0600, fsync, then rename over config.json. A crash mid-write can no longer leave a truncated or world-readable secrets file, and pre-existing loose permissions are not preserved.
  2. loadConfig surfaces errors — a missing config stays silent (defaults apply); a permission/I-O error or a corrupt/half-written file is logged via slog.Warn instead of silently vanishing settings.
  3. Honor PILOT_HOMEconfigDir checks PILOT_HOME before $HOME, so config/socket/PID state can be relocated.
  4. set-hostname / clear-hostname report save failures — a failed saveConfig now aborts with an error instead of letting persisted config diverge from the running daemon.
  5. Pipe-mode stdin scanner — buffer raised to 16 MiB and scanner.Err() is checked, so long lines and read errors no longer fail silently.
  6. send-message --json --wait single document — the reply is folded into the one JSON envelope (data.reply) instead of printing a second concatenated JSON doc that breaks machine parsers. Human (non-JSON) output is unchanged.
  7. bench size validation — rejects non-positive, NaN/Inf, and absurd sizes, caps at 4 GiB, and validates before dialing the daemon (fail-fast).
  8. traceroute checks conn.Write — a write failure mid-trace is recorded and breaks the loop instead of blocking in Read for an echo that will never arrive.
  9. parseFlags accepts dash-leading values — new isFlagValue lets --key -1, --key -3.14, --key - (stdin), and --key -3x pass through, while --name/-name tokens are still treated as the next flag.
  10. daemon stop without a PID file — when the socket is live but the PID file is missing, the owning PID is discovered via lsof -t -U (same-UID only) and stopped, instead of punting to a manual kill.
  11. Dial timeout/cancellationping, traceroute, connect, and bench now use DialAddrTimeout, so a timed-out command cancels the daemon-side dial rather than leaving a dangling connection or leaking a goroutine.

Tests

  • zz_config_safety_test.go (new): config 0600 perms + no loose-perm preservation, atomic write leaves no temp file, PILOT_HOME precedence + round-trip, bench bad-size rejection, send-message --json --wait single-document assertion.
  • zz_parsers_test.go: updated/added parseFlags cases for dash-leading values and "next flag is not a value".
  • Test home helpers now clear PILOT_HOME for hermeticity.

Validation

  • GOWORK=off go build ./..., go vet ./cmd/pilotctl/..., gofmt clean.
  • GOWORK=off go test -count=1 -parallel 4 ./cmd/pilotctl/... green.
  • No new dependencies; gating security CI (govulncheck, gitleaks) unaffected. cli-reference-check unaffected (no help-text changes).

config.json now writes atomically (temp file + fsync + rename) at 0600
and no longer inherits loose permissions from a pre-existing file, since
it holds admin_token and other secrets. loadConfig distinguishes a
missing config (silent, defaults apply) from an unreadable or corrupt one
(logged). configDir honors PILOT_HOME so state can be relocated without
rewriting $HOME.

set-hostname/clear-hostname now report saveConfig failures instead of
letting persisted config silently diverge from the running daemon.

Pipe-mode stdin reads raise the bufio.Scanner buffer to 16 MiB and check
scanner.Err() so long lines and read errors no longer fail silently.

send-message --json --wait emits a single JSON document (the reply is
folded into the envelope) so machine parsers don't choke on two
concatenated documents.

bench validates the size argument (rejects non-positive, NaN/Inf, and
absurd values, caps at 4 GiB) before dialing.

traceroute checks the conn.Write error before reading the response.

parseFlags accepts flag values that begin with "-" (negative numbers,
bare "-" for stdin, "-3x") via isFlagValue, while still treating
"--name"/"-name" tokens as the next flag.

daemon stop now discovers the daemon PID from the socket owner (lsof)
when the PID file is missing, instead of punting to a manual kill.

ping, traceroute, connect, and bench dials use DialAddrTimeout so a
timed-out command cancels the daemon-side dial rather than leaving a
dangling connection or leaking a goroutine.
Comment thread cmd/pilotctl/main.go Fixed
Comment thread cmd/pilotctl/main.go Fixed
Comment thread cmd/pilotctl/main.go Fixed
Comment thread cmd/pilotctl/main.go Fixed
Comment thread cmd/pilotctl/main.go Fixed
@TeoSlayer TeoSlayer merged commit 3a71dbb into main Jun 22, 2026
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants