Skip to content

feat(cli): native Windows support + cross-platform CI matrix#581

Open
JOhnsonKC201 wants to merge 2 commits into
recodeeai:mainfrom
JOhnsonKC201:feat/native-windows-support
Open

feat(cli): native Windows support + cross-platform CI matrix#581
JOhnsonKC201 wants to merge 2 commits into
recodeeai:mainfrom
JOhnsonKC201:feat/native-windows-support

Conversation

@JOhnsonKC201
Copy link
Copy Markdown

Summary

  • Replaces the POSIX shell bin shim (apps/cli/bin/colony.sh) with a cross-platform Node ES module (apps/cli/bin/colony.mjs). npm's generated .cmd / .ps1 wrappers now run colony natively on Windows (cmd, PowerShell, Git Bash) without WSL.
  • Preserves the daemon fast-path for colony bridge lifecycle --json using node:http + a node:net connect probe — same 1s connect / 2s request timeouts as the curl-based version. Stdin is buffered and replayed on fallback so rule feat(hooks): auto-claim edited files + cross-session overlap warnings #10 (a dead daemon must never lose or block a write) still holds.
  • Adds macos-latest and windows-latest to the build CI matrix across Node 20 and 22 with fail-fast: false, so this regression cannot recur. The Linux-only jobs (e2e-publish, scenarios) stay on ubuntu-latest because they run bash scripts.

Why

apps/cli/package.json#bin pointed at bin/colony.sh. On Windows, npm cannot execute a POSIX shell script as a CLI bin — colony was effectively broken for every Windows user without WSL. CI ran only on ubuntu-latest, so the regression had no chance of being caught.

The shim is now pure node:* builtins — no curl, no /bin/sh, no external deps. The fast-path's optimization is intact because we dispatch the non-fast-path through a dynamic import() of dist/index.js in the same Node process (no second Node fork). process.argv[1] is rewritten to point at the resolved CLI entry so isMainEntry() in apps/cli/src/index.ts still matches via realpath comparison.

Notable changes

  • apps/cli/bin/colony.mjs (new) — Node-based bin shim. colony.sh stays in the repo for reference but is no longer in package.json#files.
  • apps/cli/package.jsonbin and files updated to point at the .mjs.
  • apps/cli/test/bin-shim.test.ts — rewritten to spawn the shim via node directly rather than sh. Same six contract assertions (fallback preserves stdin + argv with spaces, COLONY_BRIDGE_FAST=0, passthrough variants). Uses a new COLONY_CLI_ENTRY env-var test seam to point at a stub instead of needing dist/.
  • apps/worker/src/server.ts, scripts/bench-bridge-fastpath.mjs, CLAUDE.md — updated live references from colony.sh to colony.mjs.
  • .github/workflows/ci.ymlbuild job now runs on [ubuntu-latest, macos-latest, windows-latest] × Node [20, 22].
  • .changeset/native-windows-support.md — patch-level changeset on colonyq.

Test plan

  • Smoke-tested the new shim locally on Windows (cmd / Git Bash):
    • --version passthrough → CLI receives correct argv.
    • bridge lifecycle --json --ide claude-code --cwd "/tmp/has spaces" with daemon down → falls back to CLI with stdin replayed and the cwd-with-spaces preserved.
    • COLONY_BRIDGE_FAST=0 → skips fast-path entirely.
    • bridge replay foo.json → falls through immediately (no fast-path eligibility).
    • bridge lifecycle without --json → falls through (humans want pretty output).
  • CI runs pnpm lint / typecheck / test / build across the new OS × Node matrix. The shim tests are the only ones I rewrote for cross-platform; if other tests in the suite have latent POSIX assumptions, they'll surface here. I'd suggest landing those follow-ups as separate PRs rather than letting this one balloon.
  • e2e-publish (bash scripts/e2e-publish.sh) stays on Linux. The bin shim it installs is now colony.mjs; the test -x "$BIN" check at step 5 will still pass because the file is committed with mode 100755.

Risk / scope notes

  1. Windows job may surface unrelated test failures. This PR fixes the bin shim and the shim's regression test. Other tests in the suite haven't been audited for Windows compatibility. If the Windows job is red because of POSIX assumptions elsewhere, a reasonable triage is to mark them .skip on process.platform === 'win32' in follow-up PRs and either accept the partial coverage or switch to continue-on-error: ${{ matrix.os == 'windows-latest' }} until the suite is fully ported.
  2. Performance. The fast-path latency budget is unchanged on POSIX: same one Node start, same HTTP timeouts. The fallback now skips one process fork (dynamic-import vs spawn), which is a small win, not a regression.
  3. No upstream behavior change for existing POSIX installs. The shim's contract — fast-path triage, fallback rules, stdin buffering — is identical; only the implementation language changed.

The bin entry was a POSIX shell script (bin/colony.sh) which npm could
not execute on Windows without WSL, breaking every Windows install of
the package. Replace it with a Node ES module (bin/colony.mjs) using
only node:* builtins so npm's generated .cmd / .ps1 wrappers run it
natively under cmd, PowerShell, and Git Bash.

The daemon fast-path for `colony bridge lifecycle --json` is preserved:
the HTTP POST to 127.0.0.1:$COLONY_WORKER_PORT/api/bridge/lifecycle now
goes through node:http, with a node:net connect probe (1s) before the
request (2s) so the fallback latency stays close to the curl-based
version when the daemon isn't running. Stdin is buffered and replayed
on fallback, preserving rule recodeeai#10 (a dead daemon must never lose or
block a write).

bin-shim.test.ts is rewritten to spawn the shim via `node` directly
rather than `sh`, so the regression contract runs on all three OSes.

CI now runs the build matrix on ubuntu-latest, macos-latest, and
windows-latest across Node 20 and 22 with fail-fast disabled, so this
regression cannot recur. The Linux-only jobs (e2e-publish, scenarios)
stay on ubuntu-latest.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f190cf4fa4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/cli/bin/colony.mjs

function probeDaemon() {
return new Promise((resolveOuter) => {
const socket = connect({ port: PORT, host: HOST });
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard daemon probe against invalid COLONY_WORKER_PORT

Parsing COLONY_WORKER_PORT with Number(...) allows NaN/out-of-range values to reach net.connect, which throws synchronously and aborts the shim before CLI fallback. In this commit that means colony bridge lifecycle --json can exit with an error instead of replaying stdin to the in-process CLI when the port env var is malformed (for example COLONY_WORKER_PORT=abc), breaking the documented "daemon failures must fall back" contract for write-path safety.

Useful? React with 👍 / 👎.

Comment thread apps/cli/bin/colony.mjs
Comment on lines +88 to +90
if (a === '--ide') {
out.ide = rest[i + 1] ?? '';
i += 2;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve malformed --ide/--cwd instead of dropping them

The new flag parser treats --ide/--cwd without a following value as empty strings and then rebuildSafeArgv removes them, so malformed invocations are silently rewritten (e.g. ... --json --ide becomes ... --json). Before this change, the shell shim failed fast on this invalid input; now the command can proceed with different semantics, hiding caller bugs and potentially running lifecycle writes with missing context.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant