Skip to content

fix(browse): wrap telemetry bash script with bash.exe on Windows#1119

Open
scarson wants to merge 1 commit intogarrytan:mainfrom
scarson:fix/windows-telemetry-spawn
Open

fix(browse): wrap telemetry bash script with bash.exe on Windows#1119
scarson wants to merge 1 commit intogarrytan:mainfrom
scarson:fix/windows-telemetry-spawn

Conversation

@scarson
Copy link
Copy Markdown

@scarson scarson commented Apr 21, 2026

Summary

reportAttemptTelemetry() in browse/src/security.ts calls spawn(bin, args) where bin is the gstack-telemetry-log bash script. On Windows this fails silently with ENOENT on every attack-attempt telemetry event, because Windows CreateProcess cannot launch shebang-scripted files. The error is swallowed by the fire-and-forget child.on('error', ...) handler, so the failure is invisible — telemetry drops on the floor, no log, no surfaced warning. The local attempts.jsonl write still lands, so there's a local audit trail, but the tier-aware upload path (community / anonymous / off, handled inside the bash binary) never runs on any Windows install.

Follow-up to #1118 (fix(make-pdf): Windows .exe resolution...): same family of bug — Node's Windows subprocess layer can't resolve the script layouts gstack actually ships — in a new call site.

Empirical reproduction

Windows 11, Node v25, bun 1.3.11. Stand-in script stands in for gstack-telemetry-log (same shebang shape):

$ cat /tmp/fake-telemetry
#!/usr/bin/env bash
echo "telemetry received: $@"
exit 0

$ node -e "
import('node:child_process').then(({spawn}) => {
  const p = spawn('/tmp/fake-telemetry', ['--event-type', 'attack_attempt']);
  p.on('error', e => console.log('err:', e.code));
  p.on('close', c => console.log('exit:', c));
});
"
err: ENOENT
exit: -4058

-4058 is UV_ENOENT / ERROR_FILE_NOT_FOUND — libuv trying to execute the file as a PE image and getting rejected. The file is present on disk and readable; Windows just can't dispatch on the shebang.

With the fix applied (spawn('bash', [script, ...args])):

exit: 0
stdout: telemetry received: --event-type attack_attempt

Root cause

Windows does not implement kernel-level shebang handling the way POSIX execve does. When Node's spawn() hands an absolute path (with no extension) to CreateProcess, Windows tries to load the file as a PE image. For gstack-telemetry-log — a 10-KB file starting with #!/usr/bin/env bash\n# gstack-telemetry-log ... — the PE header check fails and CreateProcess returns ERROR_FILE_NOT_FOUND (counterintuitively — the file exists, but "no loadable image" is the closest documented error).

The resolution helper findTelemetryBinary() (L316) already does the right thing: fs.accessSync(bin, X_OK) returns true because on Windows X_OK is degraded to existence-checking (same Node-docs quirk cited in #1118). So the binary is "found" and then spawned, and then the spawn silently fails.

Scope

Audited every spawn / execFile / Bun.spawn / Bun.$ call site in the repo TypeScript sources looking for this class of bug:

Target Call site In scope?
gstack-telemetry-log (bash) browse/src/security.ts:346 ✅ Fixed here
Bash preamble generators scripts/generate-preamble-*.ts No — those spawn bash explicitly already
Test helpers browse/test/helpers/eval-store.ts:26 No — uses spawnSync('bash', ['-c', ...]), already wrapped
spawn('claude', ...) x 3 sites browse/src/{security-classifier,sidebar-agent}.ts No — on Windows Claude Code's standalone installer puts claude.exe on PATH, which Node resolves natively. A subset of users (npm-global install) may have claude.cmd, which recent Node versions block spawning without shell: true (CVE-2024-27980 hardening). Verified empirically on this box with claude.exe — works. npm-install variant deserves its own follow-up with proper execFile semantics since args contain user-controlled prompt text.

This PR fixes the one site in the audit that's unconditionally broken for every Windows user.

Why this shipped unnoticed

Same story as #1118. .github/workflows/ runs every job on ubuntu-latest or macos-latest. On both POSIX platforms, execve reads the shebang and delegates to bash, so the spawn succeeds. Windows is not in any matrix, and make-pdf-gate.yml:25 explicitly comments that Windows-CI is on the TODO list.

The fix

Extract a pure function buildTelemetrySpawnCommand(bin, args) that returns the platform-correct { cmd, cmdArgs } tuple:

export function buildTelemetrySpawnCommand(
  bin: string,
  args: string[],
): { cmd: string; cmdArgs: string[] } {
  if (process.platform === 'win32') {
    return { cmd: 'bash', cmdArgs: [bin, ...args] };
  }
  return { cmd: bin, cmdArgs: args };
}

reportAttemptTelemetry consumes the tuple and hands it to spawn unchanged:

const { cmd, cmdArgs } = buildTelemetrySpawnCommand(bin, [...args]);
const child = spawn(cmd, cmdArgs, { stdio: 'ignore', detached: true });

The POSIX path is bit-identical to the old code. On Windows, spawn('bash', [bin, ...args]) works unconditionally for Git Bash installs (Git Bash puts bash.exe on the system PATH by default — that's how every Windows developer ends up with gstack installed in the first place, since ./setup is a bash script). If bash is genuinely missing, spawn fires the same 'error' event that's already swallowed — no new failure mode.

Alternatives considered and rejected

  • spawn(bin, args, { shell: true }). Routes through cmd.exe argument parsing. record.urlDomain flows from user-browsed URLs; record.payloadHash / record.layer / record.verdict all feed in from the security classifier's analysis of user-controlled input. A URL containing " or & would be an injection vector. Rejected.
  • No-op on Windows, document the gap. Possible but loses actual telemetry capability (not just the upload — the bash binary also handles tier gating, which would need to be re-implemented in TS). Larger change, worse outcome.
  • Ship a .ps1 or .cmd wrapper alongside gstack-telemetry-log. Requires changes outside browse/src/ and infrastructure to keep the Windows wrapper in sync. Worth doing eventually but out of scope for this PR.

Test plan

  • bun test browse/test/security.test.ts on win32 — 32 pass, 0 fail (includes 3 new assertions: POSIX pass-through, win32 bash wrap, arg-array immutability)
  • Empirical before/after on Windows 11 with a stand-in bash script — ENOENTexit 0 (stdout captured above)
  • Pure function is trivially testable on POSIX CI too — the Windows branch test early-returns on process.platform !== 'win32', and the POSIX pass-through test early-returns on win32, so both platforms exercise the branch they should
  • Ubuntu/macOS CI pass — relying on the existing matrix. POSIX branch returns unchanged { cmd: bin, cmdArgs: args }.
  • Real attack-attempt telemetry on Windows verified end-to-end — not tested; would require triggering the security classifier and confirming the server-side event was received, which depends on telemetry tier config. The unit test + empirical bash-script repro gives me high confidence without needing the full path.

Commits

  • d7098a3fix(browse): wrap telemetry bash script with bash.exe on Windows (source + test)

Single commit; the change is small and cohesive. Source and test go together.

What this PR doesn't do

  • Fix the spawn('claude', ...) sites in sidebar-agent.ts and security-classifier.ts. Conditional on install method (standalone .exe works; npm .cmd may fail on Node ≥20.12 per CVE-2024-27980 hardening). Args contain user prompt text, so shell: true isn't safe — the right fix is execFile (which does auto-shell .cmd on Windows without exposing args to shell parsing) or explicit extension resolution. Separate PR once scope is validated.
  • Enable Windows CI. Same reason as fix(make-pdf): Windows .exe resolution for browse + pdftotext #1118 — the matrix decision is independent of any single fix.

🤖 Generated with Claude Code

`reportAttemptTelemetry()` in browse/src/security.ts spawns
`gstack-telemetry-log` directly via Node's `child_process.spawn(bin, args)`.
On Windows, that fails with `ENOENT` / `-4058` (UV_ENOENT) even when the
script exists on disk, because `gstack-telemetry-log` is a shebang-scripted
bash file (`#!/usr/bin/env bash`) and Windows `CreateProcess` cannot launch
shebangs — it treats the file as a PE image and bails when the header
doesn't match. The `child.on('error', ...)` handler swallows the error per
the fire-and-forget semantics, so the failure is invisible: attack-attempt
telemetry drops on the floor on every Windows install.

Empirical reproduction on Windows 11, Node v25, bun 1.3.11:

    spawn('/path/to/gstack-telemetry-log', [...])
    → spawn error: ENOENT - spawn /path/to/gstack-telemetry-log ENOENT
    → exit: -4058

    spawn('bash', ['/path/to/gstack-telemetry-log', ...])
    → exit: 0 | stdout: telemetry received: ...

Fix: extract `buildTelemetrySpawnCommand(bin, args)` as a pure function of
(platform, bin, args) that returns `{ cmd: 'bash', cmdArgs: [bin, ...args] }`
on win32 and `{ cmd: bin, cmdArgs: args }` on POSIX. `reportAttemptTelemetry`
consumes the tuple and hands it to `spawn` unchanged. The POSIX path is
bit-identical to the old code.

Windows dev boxes running gstack invariably ship Git Bash (the installer
puts `bash.exe` on PATH), so in practice the wrapped invocation works
without further setup. If `bash` is missing, `spawn` fires an 'error' event
and the same existing swallow path kicks in — local `attempts.jsonl` keeps
the audit trail intact either way. No new failure modes introduced.

Why this shipped unnoticed: no gstack workflow runs on Windows — the
matrix in every CI file is `ubuntu-latest` and/or `macos-latest` only.
On both POSIX platforms the shebang is honored by `execve` and the
spawn succeeds, so the bug never triggered in CI.

Why `shell: true` is NOT an acceptable alternative fix: `record.urlDomain`
is extracted from user-browsed URLs, and `record.payloadHash`/`record.layer`/
`record.verdict` also flow through this call. Passing `shell: true` would
route those through `cmd.exe` arg parsing and open a shell-injection path
the moment a URL contains a quote or backslash. Explicit `spawn('bash',
[...])` keeps every arg in argv, so escaping stays honest.

Tests: three new assertions covering the POSIX pass-through path, the
win32 bash wrap, and arg-array immutability. Guarded by
`if (process.platform === 'win32') return;` / its inverse, so POSIX CI
exercises the platform-relevant branch automatically and the Windows
branch is covered whenever a contributor runs `bun test` on Windows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@scarson
Copy link
Copy Markdown
Author

scarson commented Apr 21, 2026

For context: opened #1122 today ("Boiling the Windows CI Lake") proposing a Phase-1 Windows smoke CI that would have caught this class of bug at PR time. This PR is cited in the RFC's coverage table as one of the five recent examples that motivated it — not asking for anything on this PR, it can merge on its own merits. Cross-linking because the "Why this shipped unnoticed" paragraph above reads differently with a systemic answer in flight.

Also a companion issue at #1123 for higher-level discussion if the PR's review thread gets crowded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant