Commit 121f9aa

Phase 1: CI debugging refactor — structured output, discover, capability gating, log capture
- Add structured output envelope (--structured) with error taxonomy (transport/capability/protocol/application/validation), timing, and log capture for CI-consumable diagnostics
- Add discover pseudo-method that returns server shape: capabilities, tools, resources, and prompts in a single invocation
- Add capability gating: methods now fail fast with a clear category if the server lacks the required capability
- Add --fail-on-error flag to exit 1 on application-level tool errors
- Add ping method support
- Capture logging/message notifications from the server and include them in structured output or emit to stderr in raw mode
- Fix SSE transport in test server: add POST /messages route, remove double-start of SSEServerTransport (connect() already calls start() internally)
- Add 23 tests covering all new features across stdio, HTTP, and SSE transports
- Add CLI test step to main.yml workflow
- Write CI_DEBUGGING_REFACTOR.md design document
1 parent 9f3b4ff commit 121f9aa

10 files changed

Lines changed: 1258 additions & 44 deletions

.github/workflows/main.yml

Lines changed: 12 additions & 0 deletions
@@ -37,6 +37,18 @@ jobs:
         working-directory: ./client
         run: npm test
 
+      - name: Run CLI tests
+        working-directory: ./cli
+        run: |
+          cd ..
+          npm ci --ignore-scripts
+          cd cli
+          npm run build
+          npm test
+        env:
+          NPM_CONFIG_YES: true
+          CI: true
+
       - run: npm run build
 
   publish:

CI_DEBUGGING_REFACTOR.md

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
# CI Debugging Refactor: Inspector as an Automated MCP Server Debugging Tool

## Goal

Transform the MCP Inspector CLI into a CI-first debugging tool that AI agents (Claude, etc.) can use to programmatically test, validate, and diagnose MCP servers, without a browser UI.

This is **not** another general-purpose MCP client. For interactive command-line use of MCP servers, use [mcpc](https://github.com/apify/mcp-cli). Inspector's CLI is the **debugging companion**: structured diagnostics, batch debugging workflows, and CI-clean semantics.

---

## Current State

The Inspector CLI (`--cli` mode) supports single-method invocations across three transports (stdio, SSE, Streamable HTTP):

| Method                     | Implemented | Tested | Notes                                           |
| -------------------------- | ----------- | ------ | ----------------------------------------------- |
| `tools/list`               | ✓           | ✓      |                                                 |
| `tools/call`               | ✓           | ✓      | Fetches schema first for type coercion (2 RPCs) |
| `resources/list`           | ✓           | ✗      | Zero test coverage                              |
| `resources/read`           | ✓           | stdio  | Only tested over stdio                          |
| `resources/templates/list` | ✓           | ✗      | Zero test coverage                              |
| `prompts/list`             | ✓           | ✓      |                                                 |
| `prompts/get`              | ✓           | ✓      |                                                 |
| `logging/setLevel`         | ✓           | HTTP   | Sets level but discards all log notifications   |
| `ping`                     | ✗           |        |                                                 |
| `discover`                 | ✗           |        |                                                 |
| `completion/complete`      | ✗           |        |                                                 |

**Key architectural limitations:**

1. **One method per process**: each invocation connects, runs one call, and disconnects. Stdio servers respawn every time.
2. **No structured output envelope**: raw JSON on stdout, bare strings on stderr. No programmatic error categorization.
3. **Server logs discarded**: debug logging is enabled on connect via `logging/setLevel`, but the `logging/message` notifications are never captured.
4. **No capability gating**: methods are dispatched without checking server capabilities, so failures are ambiguous.
5. **Exit code ambiguity**: server-side errors (`isError: true`) exit 0; only client-side failures exit non-zero. Tool failures are therefore invisible to CI pipelines that rely on `set -e`.

---

## Differentiation vs. mcpc

| Feature                | mcpc (apify)              | Inspector CLI (this refactor)            |
| ---------------------- | ------------------------- | ---------------------------------------- |
| Primary audience       | Interactive CLI users     | AI agents and CI pipelines               |
| Output format          | Raw MCP JSON (`--json`)   | Structured envelope with diagnostics     |
| Error handling         | Undocumented exit codes   | Typed error taxonomy, `--fail-on-error`  |
| Session model          | Persistent named sessions | One-shot (default) + batch script mode   |
| Capability discovery   | Implicit per-method       | Explicit `discover` command              |
| Server log capture     | ✗                         | ✓ Buffer `logging/message` notifications |
| Sampling/elicitation   | ✗                         | ✓ Reject + capture for inspection        |
| Batch workflows        | Shell scripts             | JSON script with `onError` control flow  |
| CI exit code semantics | ✗                         | ✓ `--fail-on-error`                      |

---

## Design Decisions

### 1. Invocation Model: One-Shot + Batch Script

**Default** remains one-shot (backward compatible): connect, run one method, output result, disconnect.

**New**: `--script <file>` flag accepts a JSON array of operations executed sequentially on a single persistent connection.

```json
[
  { "method": "discover" },
  {
    "method": "tools/call",
    "toolName": "echo",
    "toolArgs": { "message": "hello" },
    "onError": "continue"
  },
  { "method": "resources/list", "onError": "stop" },
  {
    "method": "resources/read",
    "uri": "demo://example",
    "onError": "skip-to:4"
  },
  { "method": "ping" }
]
```

**`onError` control flow** (per step):

| Value         | Behavior                                      |
| ------------- | --------------------------------------------- |
| `"stop"`      | Abort script, return results so far (default) |
| `"continue"`  | Record the error, proceed to next step        |
| `"skip-to:N"` | Jump to step index N on error (0-based)       |

In the example above, `"skip-to:4"` jumps to the final `ping` step, since the five steps are indexed 0 through 4.

**Rationale**: Claude expresses the whole debugging plan declaratively in one tool call. No shell scripting, no jq parsing between steps.

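
The `onError` control flow can be sketched as a small sequential executor. This is a minimal illustration under stated assumptions, not the Phase 2 implementation: `ScriptStep`, `StepOutcome`, `StepRunner`, and `runScript` are hypothetical names, the `run` callback stands in for the real dispatch over the shared connection, and the forward-only rule for `skip-to:N` is an assumption added here to rule out infinite loops.

```typescript
// Hypothetical types: the design only fixes the JSON shape of a step.
type OnError = "stop" | "continue" | `skip-to:${number}`;

interface ScriptStep {
  method: string;
  onError?: OnError;
  [param: string]: unknown; // toolName, toolArgs, uri, ...
}

interface StepOutcome {
  index: number;
  method: string;
  success: boolean;
  error?: string;
}

// Stand-in for dispatching one step on the persistent connection.
type StepRunner = (step: ScriptStep) => Promise<void>;

async function runScript(
  steps: ScriptStep[],
  run: StepRunner,
): Promise<StepOutcome[]> {
  const outcomes: StepOutcome[] = [];
  let i = 0;
  while (i < steps.length) {
    const step = steps[i];
    try {
      await run(step);
      outcomes.push({ index: i, method: step.method, success: true });
      i += 1;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      outcomes.push({ index: i, method: step.method, success: false, error: message });

      const policy = step.onError ?? "stop"; // "stop" is the documented default
      if (policy === "stop") break;
      if (policy === "continue") {
        i += 1;
        continue;
      }
      // "skip-to:N": N is a 0-based step index.
      const target = Number(policy.slice("skip-to:".length));
      // Assumption: only forward jumps are honored; anything else stops the script.
      if (!Number.isInteger(target) || target <= i) break;
      i = target;
    }
  }
  return outcomes;
}
```

Steps that are jumped over produce no outcome entry in this sketch; whether the real envelope array should record them as skipped is left open by the design.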
### 2. Structured Output Envelope

Enabled via `--structured` flag. Raw JSON output remains the default for backward compatibility.

```json
{
  "structuredVersion": 1,
  "success": true,
  "method": "tools/call",
  "durationMs": 234,
  "result": { "content": [{ "type": "text", "text": "Echo: hello" }] },
  "error": null,
  "logs": [
    {
      "level": "debug",
      "message": "tool echo invoked",
      "timestamp": "2026-01-30T12:00:00.000Z"
    }
  ]
}
```

In script mode, the top level becomes an array of these envelopes (one per step).

**Error taxonomy** (mutually exclusive `error.category`):

| Category      | Meaning                                                              |
| ------------- | -------------------------------------------------------------------- |
| `transport`   | Could not connect: bad URL, subprocess crash, ECONNREFUSED, timeout  |
| `capability`  | Server does not support the requested method                         |
| `protocol`    | Malformed JSON-RPC, handshake failure                                |
| `application` | Tool/resource/prompt returned an error in its content                |
| `validation`  | Client-side failure: missing required arg, bad metadata              |

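
One way to type the envelope and the taxonomy is sketched below. The envelope field names and category names come from the JSON example and table above; everything else is an assumption made for illustration, in particular the `message` field on `StructuredError`, the `Outcome` union, and the `buildEnvelope` helper, none of which the design pins down.

```typescript
// Category names are taken from the taxonomy table.
type ErrorCategory =
  | "transport"
  | "capability"
  | "protocol"
  | "application"
  | "validation";

interface StructuredError {
  category: ErrorCategory;
  message: string; // assumed field; the design only specifies `category`
}

interface LogEntry {
  level: string;
  message: string;
  timestamp: string;
}

interface Envelope {
  structuredVersion: 1;
  success: boolean;
  method: string;
  durationMs: number;
  result: unknown;
  error: StructuredError | null;
  logs: LogEntry[];
}

// Hypothetical input: a call either produced a result or a categorized error.
type Outcome = { result: unknown } | { error: StructuredError };

function buildEnvelope(
  method: string,
  durationMs: number,
  outcome: Outcome,
  logs: LogEntry[] = [],
): Envelope {
  const base = { structuredVersion: 1 as const, method, durationMs, logs };
  if ("error" in outcome) {
    // Failure: `result` is null and `error` carries exactly one category.
    return { ...base, success: false, result: null, error: outcome.error };
  }
  return { ...base, success: true, result: outcome.result, error: null };
}
```

Keeping `error` as a single object with one `category` is what makes the taxonomy mutually exclusive at the type level.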
### 3. Server Log Capture

Register a `logging/message` notification handler on every connection. Buffer all log messages for the session lifetime. Include them in the structured output envelope.

In non-structured mode, emit captured logs to stderr.

**Rationale**: The CLI already enables debug-level logging on connect. Discarding the notifications is an existing bug, and fixing it is the single highest-value diagnostic change.

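
A minimal sketch of the buffering side follows, assuming the notification params carry `level`, an optional `logger`, and a `data` payload as described in the MCP logging specification. `LogBuffer` and its method names are hypothetical, and how the handler is registered on the connection is deliberately left out.

```typescript
// Params of a `logging/message` notification (assumed shape, per the MCP spec).
interface LoggingMessageParams {
  level: string;
  logger?: string;
  data: unknown;
}

interface CapturedLog {
  level: string;
  message: string;
  timestamp: string;
}

// `data` may be any JSON value, so normalize it to a string for the envelope.
function toMessage(data: unknown): string {
  if (typeof data === "string") return data;
  const json: string | undefined = JSON.stringify(data);
  return json ?? String(data);
}

class LogBuffer {
  private readonly entries: CapturedLog[] = [];

  // Intended to be passed as the notification handler; `now` is injectable for tests.
  handle = (params: LoggingMessageParams, now: Date = new Date()): void => {
    this.entries.push({
      level: params.level,
      message: toMessage(params.data),
      timestamp: now.toISOString(),
    });
  };

  // Structured mode: copy the buffer into the envelope's `logs` array.
  snapshot(): CapturedLog[] {
    return [...this.entries];
  }

  // Raw mode: emit one line per log to stderr (writer is injectable for tests).
  writeToStderr(write: (line: string) => void = (l) => console.error(l)): void {
    for (const e of this.entries) write(`[${e.level}] ${e.message}`);
  }
}
```

The timestamp here is the client's receive time, since the notification itself is not assumed to carry one.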
### 4. `discover` Command

A pseudo-method that connects once and returns the full server shape:

```json
{
  "serverInfo": { "name": "my-server", "version": "1.0.0" },
  "capabilities": {
    "tools": true,
    "resources": true,
    "prompts": false,
    "logging": true,
    "completions": false
  },
  "tools": [...],
  "resources": [...],
  "prompts": []
}
```

Runs: `initialize` → read capabilities → conditionally call `tools/list`, `resources/list`, `prompts/list`. One connection, one output.

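
The sequence above can be sketched against a narrow client interface. `DiscoverClient` is a hypothetical abstraction introduced for this example (its method names are assumptions and may not match the SDK client used by the CLI), and the sketch assumes `initialize` has already completed by the time `discover` is called.

```typescript
type CapabilityName = "tools" | "resources" | "prompts" | "logging" | "completions";
type ServerCapabilities = Partial<Record<CapabilityName, object>>;

// Hypothetical minimal view of an already-initialized client.
interface DiscoverClient {
  getServerVersion(): { name: string; version: string } | undefined;
  getServerCapabilities(): ServerCapabilities | undefined;
  listTools(): Promise<{ tools: unknown[] }>;
  listResources(): Promise<{ resources: unknown[] }>;
  listPrompts(): Promise<{ prompts: unknown[] }>;
}

interface DiscoverResult {
  serverInfo: { name: string; version: string } | null;
  capabilities: Record<CapabilityName, boolean>;
  tools: unknown[];
  resources: unknown[];
  prompts: unknown[];
}

async function discover(client: DiscoverClient): Promise<DiscoverResult> {
  const caps = client.getServerCapabilities() ?? {};
  const has = (name: CapabilityName): boolean => caps[name] !== undefined;

  return {
    serverInfo: client.getServerVersion() ?? null,
    capabilities: {
      tools: has("tools"),
      resources: has("resources"),
      prompts: has("prompts"),
      logging: has("logging"),
      completions: has("completions"),
    },
    // Only call a list method when the capability is advertised.
    tools: has("tools") ? (await client.listTools()).tools : [],
    resources: has("resources") ? (await client.listResources()).resources : [],
    prompts: has("prompts") ? (await client.listPrompts()).prompts : [],
  };
}
```

Skipping the list calls for missing capabilities keeps `discover` from producing spurious errors on servers that implement only part of the protocol.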
### 5. Exit Code Contract

| Flag              | Server `isError: true` | Client validation error | Transport error |
| ----------------- | ---------------------- | ----------------------- | --------------- |
| (default)         | exit 0                 | exit 1                  | exit 1          |
| `--fail-on-error` | exit 1                 | exit 1                  | exit 1          |

**Rationale**: Backward compatible by default. CI pipelines opt into strict semantics explicitly.

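
The contract reduces to a small pure function, sketched here under two assumptions: a server `isError: true` result is reported under the `application` category, and the `capability` and `protocol` categories, which the table does not list, exit 1 like the other client-detected failures. `exitCodeFor` is a hypothetical name.

```typescript
type ErrorCategory =
  | "transport"
  | "capability"
  | "protocol"
  | "application"
  | "validation";

function exitCodeFor(
  error: { category: ErrorCategory } | null,
  failOnError: boolean,
): 0 | 1 {
  if (error === null) return 0;
  // Server-side tool errors only fail the process when the caller opts in.
  if (error.category === "application") return failOnError ? 1 : 0;
  // Assumption: every other category is a hard failure regardless of flags.
  return 1;
}
```

A caller would typically assign the result to `process.exitCode` after writing the output, so that buffered stdout is flushed before the process ends.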
### 6. Sampling / Elicitation Policy

Default policy: **reject/decline all** server-initiated requests. The incoming request payloads are captured and included in the structured output envelope so the caller can inspect what the server attempted.

**Rationale**: No user to approve in a headless tool. Reject is safe. Capture provides visibility.

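
A rough sketch of the reject-and-capture policy is shown below. It assumes the MCP method names `elicitation/create` and `sampling/createMessage`, that an elicitation can be answered with a `decline` action, and that a sampling request is refused by raising an error that the transport layer turns into a JSON-RPC error response. `ServerRequestRecorder` and its members are hypothetical, and wiring the handlers into the client is omitted.

```typescript
interface CapturedRequest {
  method: string;
  params: unknown;
  receivedAt: string;
}

class ServerRequestRecorder {
  // Included in the structured envelope so callers can see what the server tried.
  readonly captured: CapturedRequest[] = [];

  private record(method: string, params: unknown): void {
    this.captured.push({ method, params, receivedAt: new Date().toISOString() });
  }

  // Elicitation: record the payload, then answer with an explicit decline.
  onElicitation = (params: unknown): { action: "decline" } => {
    this.record("elicitation/create", params);
    return { action: "decline" };
  };

  // Sampling: record the payload, then refuse by throwing (assumed to surface
  // to the server as an error response).
  onSampling = (params: unknown): never => {
    this.record("sampling/createMessage", params);
    throw new Error("Sampling rejected: headless mode has no user to approve");
  };
}
```

Recording before responding means the payload is captured even if building the response fails.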
### 7. Capability Gating

Before dispatching any method, check that the server's `initialize` response advertises the relevant capability. If not, fail immediately with a `capability` category error.

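
The gate can be expressed as a lookup from method to required capability, as in the sketch below. The mapping is an assumption inferred from the method names (for example, that `resources/templates/list` is covered by `resources`, and that `ping` and `discover` need no capability); `REQUIRED_CAPABILITY`, `checkCapability`, and the `message` field are hypothetical. Methods missing from the map pass the gate here and are left for the dispatcher to validate.

```typescript
type CapabilityName = "tools" | "resources" | "prompts" | "logging" | "completions";
type ServerCapabilities = Partial<Record<CapabilityName, object>>;

// Assumed mapping; `null` means the method needs no advertised capability.
const REQUIRED_CAPABILITY: Record<string, CapabilityName | null> = {
  "tools/list": "tools",
  "tools/call": "tools",
  "resources/list": "resources",
  "resources/read": "resources",
  "resources/templates/list": "resources",
  "prompts/list": "prompts",
  "prompts/get": "prompts",
  "logging/setLevel": "logging",
  "completion/complete": "completions",
  ping: null,
  discover: null,
};

interface CapabilityError {
  category: "capability";
  message: string; // assumed field
}

// Returns null when dispatch may proceed, or a categorized error to fail fast.
function checkCapability(
  method: string,
  caps: ServerCapabilities,
): CapabilityError | null {
  const required = REQUIRED_CAPABILITY[method] ?? null;
  if (required === null || caps[required] !== undefined) return null;
  return {
    category: "capability",
    message: `Server does not advertise the "${required}" capability required by ${method}`,
  };
}
```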

---

## Implementation Phases

### Phase 1: Debugging Primitives

**New files:**

- `cli/src/output.ts`: Output envelope formatting, error categorization
- `cli/src/discover.ts`: Capability discovery logic

**Modified files:**

- `cli/src/index.ts`: Add `discover` and `ping` methods; add `--structured` and `--fail-on-error` flags; add capability gating before dispatch
- `cli/src/client/connection.ts`: Register `logging/message` notification handler; expose captured logs
- `cli/src/error-handler.ts`: Produce categorized `StructuredError` objects
- `.github/workflows/main.yml`: Add CLI test step (currently run only by the narrower `cli_tests.yml` workflow)

**New tests:**

- `cli/__tests__/ci-debugging.test.ts`: Covers `discover`, structured output, log capture, exit codes, `resources/list` (zero coverage today), `resources/templates/list` (zero coverage today), and SSE transport success paths

### Phase 2: Batch Debugging Workflows

**New files:**

- `cli/src/script.ts`: Script parser, validator, and sequential executor with `onError` control flow

**Modified files:**

- `cli/src/index.ts`: Wire `--script` flag into the dispatch path

**New tests:**

- Multi-operation script over stdio
- `onError` control flow (stop, continue, skip-to)
- Malformed script validation

### Phase 3: Advanced Diagnostics

- Sampling/elicitation capture (reject + include in envelope)
- `completion/complete` subcommand
- `--watch <duration>` notification capture mode

---

## Out of Scope

- Browser-based OAuth flows (require human interaction; use mcpc for these)
- Full MCP server lifecycle management (we connect to servers, not manage them)
- Interactive shell (mcpc does this)
- Performance optimization of the double-RPC in `tools/call`
- Streaming output during long-running tool calls (delivered atomically on completion)

---

## File Layout After Refactor

```
cli/
├── src/
│   ├── cli.ts                  # Entry point (unchanged)
│   ├── index.ts                # Main dispatch: refactored with capability gating, new flags
│   ├── transport.ts            # Transport factory (unchanged)
│   ├── error-handler.ts        # Produces StructuredError with category
│   ├── output.ts               # NEW: envelope formatting, structured/raw modes
│   ├── discover.ts             # NEW: capability discovery + list enumeration
│   ├── script.ts               # NEW (Phase 2): script parser and executor
│   └── client/
│       ├── connection.ts       # Log capture via logging/message handler
│       ├── tools.ts            # Unchanged
│       ├── resources.ts        # Unchanged
│       └── prompts.ts          # Unchanged
└── __tests__/
    ├── ci-debugging.test.ts    # NEW: CI-focused coverage
    └── helpers/                # Unchanged
```
