Skip to content

Commit dfed162

Browse files
committed
test: add comprehensive docker test suite and update backlog
Introduces `test-in-docker.sh` to provide an exhaustive test suite for all CLI subcommands and code paths within a standard Ubuntu 24.04 environment. Additionally, populates the backlog with several new task definitions addressing bugs and improvements discovered during testing.
1 parent 92e76b9 commit dfed162

6 files changed

Lines changed: 1098 additions & 0 deletions
Lines changed: 101 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,101 @@
1+
# 0037 — mcp-bridge: Start and Respond to initialize When No Backends Are Found
2+
3+
**Priority:** P1
4+
**Type:** bugfix
5+
**Module:** `src/cli/mcp_bridge.rs`, `src/mcp/bridge/server.rs`
6+
**Status:** Backlog
7+
**Complexity:** S
8+
**Created:** 2026-03-02
9+
10+
## Context
11+
12+
`great mcp-bridge` currently calls `anyhow::bail!` if `discover_backends`
13+
returns an empty list (lines 127-131 of `src/cli/mcp_bridge.rs`). This causes
14+
the process to exit 1 before reading a single byte from stdin, so any MCP
15+
client that opens the bridge immediately loses its connection with no valid
16+
JSON-RPC response.
17+
18+
The problem is most acute during onboarding: a brand-new install of great.sh
19+
on a machine that has no AI CLIs yet (a common Docker / CI scenario) makes the
20+
bridge completely unusable. MCP clients (Claude Desktop, Cursor, etc.) see a
21+
dead process rather than a bridge in degraded mode.
22+
23+
The correct behavior is: the bridge is an MCP server. MCP servers MUST always
24+
respond to `initialize` and `tools/list`. When no backends are present the
25+
response to `tools/list` is an empty array; tool calls return a JSON-RPC error
26+
(`-32601 Method not found` or a structured tool error) explaining that no
27+
backends are installed.
28+
29+
**Reproduction** (Ubuntu 24.04, no AI CLIs on PATH):
30+
31+
```
32+
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"0.1"}}}' \
33+
| great mcp-bridge
34+
```
35+
36+
Actual output:
37+
```
38+
Error: no AI CLI backends found on PATH. Install at least one of: gemini, codex, claude, grok, ollama
39+
exit: 1
40+
```
41+
42+
Expected output: a valid JSON-RPC 2.0 `initialize` response on stdout, then
43+
the server reading further messages from stdin (returning empty `tools/list`,
44+
or a descriptive error on tool calls).
45+
46+
**Code location of the hard guard:**
47+
`src/cli/mcp_bridge.rs` lines 127-131:
48+
```rust
49+
let backends = discover_backends(&backend_filter);
50+
if backends.is_empty() {
51+
anyhow::bail!(
52+
"no AI CLI backends found on PATH. Install at least one of: gemini, codex, claude, grok, ollama"
53+
);
54+
}
55+
```
56+
57+
## Acceptance Criteria
58+
59+
- [ ] Piping an MCP `initialize` request to `great mcp-bridge` on a machine
60+
with no AI CLIs returns a well-formed JSON-RPC 2.0 response on stdout (i.e.
61+
the process does NOT exit 1 before responding). Verified with:
62+
`echo '{"jsonrpc":"2.0","id":1,"method":"initialize",...}' | great mcp-bridge`
63+
producing valid JSON on stdout.
64+
65+
- [ ] `tools/list` returns an empty `tools` array (not an error) when no
66+
backends are discovered, allowing MCP clients to connect without crashing.
67+
68+
- [ ] Any tool call made while no backends are available returns a JSON-RPC
69+
error response with a human-readable message such as
70+
`"No AI CLI backends found. Install at least one of: gemini, codex, claude, grok, ollama."`
71+
rather than silently crashing the bridge process.
72+
73+
- [ ] A `WARN`-level log line is emitted to stderr when the bridge starts with
74+
zero backends (e.g. `warn: no AI CLI backends found; bridge running in
75+
degraded mode`), so operators can diagnose the situation without reading
76+
client-side errors.
77+
78+
- [ ] `cargo test` continues to pass; existing unit tests in
79+
`src/mcp/bridge/backends.rs` are unaffected, and at least one new
80+
integration test (or `#[test]`) asserts that `run()` does not return an
81+
`Err` when no backends are present and an `initialize` message is sent.
82+
83+
## Files That Need to Change
84+
85+
- `src/cli/mcp_bridge.rs` — remove the hard `bail!` guard; pass an empty
86+
`Vec<BackendConfig>` to `start_bridge` with a warn-level log instead.
87+
- `src/mcp/bridge/server.rs` — ensure `GreatBridge::new` and the tool
88+
dispatch path handle an empty backend list gracefully (tool calls return
89+
a structured error rather than panicking or unwrapping).
90+
91+
## Dependencies
92+
93+
None. All changes are within the MCP bridge module.
94+
95+
## Out of Scope
96+
97+
- Auto-installing any backend CLI (belongs in `great doctor --fix` / task 0005).
98+
- Surfacing backend availability in `tools/list` metadata (future enhancement).
99+
- Changing the behavior when a non-empty `--backends` filter is specified but
100+
none of the named binaries are found (same fix applies, but keep as a
101+
follow-on to keep this change small).
Lines changed: 127 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,127 @@
1+
# 0038 — CLI: Handle SIGPIPE / EPIPE Gracefully Instead of Panicking
2+
3+
**Priority:** P2
4+
**Type:** bugfix
5+
**Module:** `src/main.rs`, all subcommands that write to stdout
6+
**Status:** Backlog
7+
**Complexity:** S
8+
**Created:** 2026-03-02
9+
10+
## Context
11+
12+
Any `great` subcommand that writes to stdout panics (exit code 101) when the
13+
read end of a pipe closes before the write is complete. This is a POSIX SIGPIPE
14+
/ EPIPE condition. Rust's standard library does not install a SIGPIPE handler
15+
by default on Linux; instead, `println!` / `write!` propagate an `io::Error`
16+
which — when it reaches `main() -> Result<()>` — causes anyhow to print a
17+
backtrace and exit 101 (Rust's panic exit code).
18+
19+
The failure is most reproducible via `great status --json` because it emits a
20+
large single `println!` call, but the same root cause affects every subcommand
21+
(`great doctor`, `great diff`, `great template list`, etc.).
22+
23+
**Reproduction** (Ubuntu 24.04, Docker container, or any Linux shell):
24+
25+
```bash
26+
# head reads 0 lines then closes the pipe — great exits 101
27+
great status --json | head -0
28+
29+
# process-substitution variant — stderr shows broken pipe message
30+
great status --json > >(head -0)
31+
```
32+
33+
**Actual behavior:**
34+
35+
- Exit code: 101 (Rust panic/anyhow error propagation)
36+
- stderr: `Error: write /dev/stdout: broken pipe` (or similar anyhow error)
37+
38+
**Expected behavior:**
39+
40+
- Exit code: 0 (or the process is terminated by SIGPIPE, which shells report
41+
as exit 141 — both are acceptable to downstream tooling)
42+
- No output to stderr
43+
44+
**Root cause (code):**
45+
46+
`src/cli/status.rs` line 415:
47+
48+
```rust
49+
println!("{}", serde_json::to_string_pretty(&report)?);
50+
```
51+
52+
The `?` propagates the `BrokenPipe` `io::Error` to `main()`, which calls the
53+
anyhow error handler and exits 101. The same pattern exists wherever `println!`
54+
or `print!` is used across all subcommands.
55+
56+
**Why this matters:**
57+
58+
- CI pipelines doing `great status --json | jq '.has_issues'` panic if jq
59+
exits early (e.g. after finding the first match with `jq -e`).
60+
- `great status --json | head -1` — a common "is the tool healthy?" one-liner
61+
— reliably panics.
62+
- Any pipeline using `| grep -q`, `| head`, or `| wc -l` may trigger this.
63+
- The panic message on stderr is alarming and confusing; users file bug reports
64+
thinking the CLI is broken, not that the pipe closed.
65+
- `great status --json` is documented to always exit 0 in JSON mode (by
66+
design); exiting 101 on EPIPE is an undocumented and inconsistent exception.
67+
68+
## Acceptance Criteria
69+
70+
- [ ] `great status --json | head -0` exits 0 (or 141) and produces no output
71+
on stderr. Verified by: `great status --json | head -0; echo $?` returning
72+
0 or 141, with stderr empty.
73+
74+
- [ ] `great status --json | head -1` exits 0 (or 141) with no stderr output,
75+
confirming the fix works when some output is consumed before the pipe closes.
76+
77+
- [ ] At least two additional subcommands that write to stdout (`great doctor`,
78+
`great diff`, or `great template list`) are verified to not panic on
79+
`| head -0` — confirming the fix is applied at the binary entry point rather
80+
than per-call-site.
81+
82+
- [ ] Normal operation is unaffected: `great status --json` with stdout open
83+
exits 0 and produces valid JSON; `great status` (human mode) with issues
84+
still exits 1 as before.
85+
86+
- [ ] `cargo test` passes with no regressions.
87+
88+
## Suggested Fix Approaches
89+
90+
Three options exist; exactly one should be chosen and documented in the commit:
91+
92+
**Option A — `unix_sigpipe` attribute (Rust 1.73+, simplest):**
93+
Add `#[unix_sigpipe = "sig_dfl"]` to `fn main()` in `src/main.rs`. This
94+
restores the default POSIX SIGPIPE disposition so the kernel kills the process
95+
cleanly (exit 141) rather than delivering an `io::Error`.
96+
97+
**Option B — Ignore SIGPIPE at startup, swallow BrokenPipe errors:**
98+
Call `unsafe { libc::signal(libc::SIGPIPE, libc::SIG_IGN) }` early in `main`,
99+
then wrap the top-level `Result` handler to treat `io::ErrorKind::BrokenPipe`
100+
as a clean exit 0.
101+
102+
**Option C — Wrap stdout writes per-call-site:**
103+
Replace `println!` calls with explicit `writeln!(std::io::stdout(), ...)` and
104+
match on `io::ErrorKind::BrokenPipe` to return `Ok(())` instead of
105+
propagating.
106+
107+
Option A is preferred: it is the least invasive, requires no unsafe code or
108+
extra dependencies, and matches the behavior users expect from a POSIX CLI tool.
109+
110+
## Files That Need to Change
111+
112+
- `src/main.rs` — primary fix location (Option A: one-line attribute)
113+
- Potentially `src/cli/status.rs`, `src/cli/doctor.rs`, `src/cli/diff.rs`,
114+
`src/cli/template.rs` if a per-call-site approach (Option C) is chosen
115+
116+
## Dependencies
117+
118+
None. This is a self-contained fix to the binary entry point (or individual
119+
write sites).
120+
121+
## Out of Scope
122+
123+
- Changing exit codes for non-EPIPE error paths (separate concern).
124+
- Handling SIGPIPE in the MCP bridge server (that process is long-lived and
125+
EPIPE there has different semantics — track separately if needed).
126+
- Windows behavior (Windows does not have SIGPIPE; the fix is no-op or
127+
conditional on `#[cfg(unix)]`).
Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,73 @@
1+
# 0039 — Docker-on-WSL2 container falsely detected as WSL
2+
3+
| Field | Value |
4+
|----------------------|------------------------------------|
5+
| Priority | P2 |
6+
| Type | bugfix |
7+
| Module | `src/platform/detection.rs` |
8+
| Status | backlog |
9+
| Estimated Complexity | Small |
10+
11+
## Context
12+
13+
`great status` (and any command that branches on platform) reports `WSL Ubuntu 24.04` when run inside a Docker container on a WSL2 host. The container should be identified as plain `Linux Ubuntu 24.04`.
14+
15+
**Reproduction**
16+
17+
```bash
18+
docker run --rm \
19+
-v /path/to/great:/usr/local/bin/great:ro \
20+
ubuntu:24.04 \
21+
great status 2>&1 | grep Platform
22+
# Actual: ℹ Platform: WSL Ubuntu 24.04 (x86_64)
23+
# Expected: ℹ Platform: Linux Ubuntu 24.04 (x86_64)
24+
```
25+
26+
**Root cause — three detection tiers, all reachable inside Docker-on-WSL2**
27+
28+
`is_wsl()` in `src/platform/detection.rs` (lines 169–181) runs three checks in order:
29+
30+
1. `WSL_DISTRO_NAME` env var — normally not set inside Docker, so this tier is safe.
31+
2. `/proc/sys/fs/binfmt_misc/WSLInterop` path check — Docker containers on WSL2 share the host kernel's binfmt_misc namespace; this file **is present** inside containers, causing a false positive.
32+
3. `/proc/version` contains "microsoft" — inherited from the WSL2 kernel (`6.6.87.2-microsoft-standard-WSL2`), also a false positive.
33+
34+
Both tiers 2 and 3 are hit. `is_wsl2()` (lines 187–189) has the same WSLInterop false positive, so `PlatformCapabilities.is_wsl2` is also wrong.
35+
36+
**Impact of false positive**
37+
38+
- `great apply` may attempt WSL-specific actions: copying fonts to `C:\Users\...\AppData\Local\Microsoft\Windows\Fonts`, invoking `cmd.exe` (fails with "No such file or directory").
39+
- `great doctor` may suggest WSL-specific tooling (`wslu`, etc.) to a container.
40+
- Install paths and generated scripts may be incorrect for container environments.
41+
42+
**Container indicators to check (before concluding WSL)**
43+
44+
| Indicator | Path / mechanism | Notes |
45+
|-----------|-----------------|-------|
46+
| `/.dockerenv` | File present in all Docker containers | Most reliable; Docker always creates it |
47+
| `DOCKER_CONTAINER` env var | Set by some base images / compose setups | Secondary |
48+
| `container` env var | Set by some OCI runtimes (podman, etc.) | Catches non-Docker containers too |
49+
| cgroup v1 `/proc/1/cgroup` | Contains "docker" in container | Fallback; cgroup v2 may not have it |
50+
51+
The recommended fix: before returning `true` from `is_wsl()`, check for any container indicator and return `false` if one is found. Same guard needed in `is_wsl2()`.
52+
53+
## Acceptance Criteria
54+
55+
1. `is_wsl_with_probe()` returns `false` when `/.dockerenv` exists, even if `/proc/sys/fs/binfmt_misc/WSLInterop` exists and `/proc/version` contains "microsoft".
56+
2. `is_wsl_with_probe()` returns `false` when the `container` or `DOCKER_CONTAINER` env var is set, regardless of other WSL indicators.
57+
3. `is_wsl_with_probe()` continues to return `true` on a genuine WSL2 environment (WSLInterop present, no container indicators).
58+
4. `is_wsl2_with_probe()` is subject to the same container-exclusion guard and returns `false` inside a detected container.
59+
5. `great status` run inside an `ubuntu:24.04` Docker container on a WSL2 host prints `Platform: Linux Ubuntu 24.04 (x86_64)`, not `WSL Ubuntu 24.04`.
60+
61+
## Files That Need to Change
62+
63+
- `src/platform/detection.rs``is_wsl()`, `is_wsl2()`, `is_wsl_with_probe()`, `is_wsl2_with_probe()`, `MockProbe`-based tests (add container scenarios).
64+
65+
## Dependencies
66+
67+
None — self-contained platform module change.
68+
69+
## Out of Scope
70+
71+
- Detecting other container runtimes (LXC, systemd-nspawn) beyond the indicators listed.
72+
- Changing the `Platform` enum to add a `Container` variant (separate task if needed).
73+
- Any changes to `great apply` WSL-specific logic (that is downstream; fixing detection unblocks it).

0 commit comments

Comments
 (0)