Commit b48e32d

docs(design): add MCP Test Harness design document
- Comprehensive design for focused MCP client test harness
- 5-step process: test suite input, server launch, schema validation, protocol testing, reporting
- Component architecture with stdio transport focus
- API design with Rust structs and YAML configuration format
- Implementation plan with 3 phases (P0 compilation fixes → P1 protocol → P1 validation)
- Configuration examples for tool and prompt testing
- Clear success criteria and explicitly defined out-of-scope items
- Follows design-first development approach per project best practices

Addresses core implementation scope without feature creep.

closes #134
1 parent 8895a3e commit b48e32d

1 file changed

Lines changed: 203 additions & 0 deletions

docs/design/mcp-test-harness.md

@@ -0,0 +1,203 @@
# MCP Test Harness Design Document

## Problem Statement

We need a simple, focused MCP test harness that acts as an MCP client and validates MCP server functionality against test specifications. The current implementation suffers from scope creep, has compilation issues, and does not align with the official MCP protocol specification.

**Core Problem**: Test any MCP server for protocol compliance and functional correctness without performance monitoring or unnecessary complexity.

## Proposed Solution

### High-Level Approach

A focused MCP test harness that operates as an **MCP client**, connecting to servers under test via stdio transport, validating their capabilities, and testing their functionality against expected specifications.

### Component Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Test Suite    │    │   MCP Test       │    │   Test Report   │
│  Configuration  │───▶│   Harness        │───▶│   Generator     │
│     (YAML)      │    │   (Client)       │    │  (JSON/Table)   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌──────────────────┐
                       │   MCP Server     │
                       │  (Under Test)    │
                       │  stdio/JSON-RPC  │
                       └──────────────────┘
```

## Requirements (5-Step Process)

Based on the user-defined scope, the test harness must implement exactly these steps:

1. **Test Suite Input**: Accept a test suite containing the server spec and the expected capabilities
2. **Server Launch**: Launch the MCP server as a subprocess and communicate with it over the stdio transport
3. **Schema Validation**: Validate that all expected tools, prompts, and resources are present with the correct input/output schemas
4. **Protocol Testing**: Verify that all built-in, protocol-specified API calls work as expected
5. **Test Execution & Reporting**: Run all test suites and publish a comprehensive report

## API Design

### Core Components

```rust
use std::io::BufReader;
use std::process::{Child, ChildStdin, ChildStdout};
use std::sync::atomic::AtomicU64;

// `TestSuiteConfig` is defined in the next section; `JsonSchemaValidator`,
// `TestResult`, and `OutputFormat` are not yet defined in this document.

/// Main test harness orchestrator
pub struct McpTestHarness {
    config: TestSuiteConfig,
    client: McpClient,
    validator: SchemaValidator,
    reporter: TestReporter,
}

/// MCP client for stdio communication
pub struct McpClient {
    /// Server subprocess; `None` until the server has been launched
    process: Option<Child>,
    stdin: Option<ChildStdin>,
    /// Optional like `stdin`, because no pipe exists before launch
    stdout: Option<BufReader<ChildStdout>>,
    /// Counter used to assign unique JSON-RPC request IDs
    request_id: AtomicU64,
}

/// Schema validation for MCP capabilities
pub struct SchemaValidator {
    json_schema_validator: JsonSchemaValidator,
}

/// Test result reporting
pub struct TestReporter {
    results: Vec<TestResult>,
    output_format: OutputFormat,
}
```
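
To make the stdio side of `McpClient` concrete, here is a minimal sketch of launching a subprocess with piped stdin/stdout and exchanging newline-delimited messages, which is how the MCP stdio transport frames JSON-RPC. The `StdioTransport` type and its method names are hypothetical and not part of this design; startup timeouts and stderr handling are deliberately left out.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::process::{Child, ChildStdin, ChildStdout, Command, Stdio};

/// Hypothetical minimal stdio transport: one JSON-RPC message per line.
pub struct StdioTransport {
    child: Child,
    stdin: ChildStdin,
    stdout: BufReader<ChildStdout>,
}

impl StdioTransport {
    /// Spawn `command` with piped stdin/stdout so the harness can talk to it.
    pub fn spawn(command: &str, args: &[&str]) -> io::Result<Self> {
        let mut child = Command::new(command)
            .args(args)
            .stdin(Stdio::piped())
            .stdout(Stdio::piped())
            .spawn()?;
        let stdin = child
            .stdin
            .take()
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "child stdin was not piped"))?;
        let stdout = child
            .stdout
            .take()
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "child stdout was not piped"))?;
        Ok(Self { child, stdin, stdout: BufReader::new(stdout) })
    }

    /// Write one message followed by a newline, then flush the pipe.
    pub fn send_line(&mut self, message: &str) -> io::Result<()> {
        self.stdin.write_all(message.as_bytes())?;
        self.stdin.write_all(b"\n")?;
        self.stdin.flush()
    }

    /// Block until one full line arrives; trailing whitespace is trimmed.
    pub fn recv_line(&mut self) -> io::Result<String> {
        let mut line = String::new();
        if self.stdout.read_line(&mut line)? == 0 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "server closed stdout"));
        }
        Ok(line.trim_end().to_string())
    }

    /// Kill the child and reap it so no zombie process is left behind.
    pub fn shutdown(mut self) -> io::Result<()> {
        self.child.kill()?;
        self.child.wait().map(|_| ())
    }
}
```

On a Unix-like system the sketch can be exercised without a real MCP server by spawning `cat`, which echoes each line back: `StdioTransport::spawn("cat", &[])`, then `send_line` followed by `recv_line` returns the same message.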

### Test Suite Configuration Format

```rust
use std::collections::HashMap;

use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub struct TestSuiteConfig {
    /// Server configuration
    pub server: ServerConfig,

    /// Expected capabilities from the server
    pub expected_capabilities: ExpectedCapabilities,

    /// Individual test cases to execute
    pub test_cases: Vec<TestCase>,

    /// Global test configuration; may be omitted in YAML
    /// (assumes `TestSettings` implements `Default`)
    #[serde(default)]
    pub settings: TestSettings,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct ServerConfig {
    /// Command to launch the MCP server
    pub command: String,

    /// Arguments for the server command; defaults to none
    #[serde(default)]
    pub args: Vec<String>,

    /// Working directory for the server
    pub working_dir: Option<String>,

    /// Environment variables; defaults to none
    #[serde(default)]
    pub env: HashMap<String, String>,

    /// Startup timeout in seconds
    pub startup_timeout: u64,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct ExpectedCapabilities {
    /// Expected tools with their schemas
    #[serde(default)]
    pub tools: Vec<ExpectedTool>,

    /// Expected prompts with their schemas
    #[serde(default)]
    pub prompts: Vec<ExpectedPrompt>,

    /// Expected resources with their schemas
    #[serde(default)]
    pub resources: Vec<ExpectedResource>,
}
```

## Implementation Plan

### Phase 1: Core Infrastructure (P0)
1. Fix compilation issues
2. Basic MCP client implementation with stdio transport
3. JSON-RPC 2.0 message handling per MCP spec

### Phase 2: MCP Protocol Implementation (P1)
1. MCP initialization handshake (`initialize` request, followed by the `notifications/initialized` notification)
2. Capability discovery (`tools/list`, `prompts/list`, `resources/list`)
3. Functional testing (`tools/call`, `prompts/get`, `resources/read`)
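
The Phase 2 message flow can be sketched as plain JSON-RPC 2.0 framing. The `RequestBuilder` type and `discovery_sequence` function below are hypothetical helpers, the `clientInfo` values are placeholders, and `2024-11-05` is just one published MCP protocol version chosen for illustration. Parameters are passed as pre-serialized JSON so the sketch needs no external crates.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical request builder; IDs increase monotonically from 1.
pub struct RequestBuilder {
    next_id: AtomicU64,
}

impl RequestBuilder {
    pub fn new() -> Self {
        Self { next_id: AtomicU64::new(1) }
    }

    /// Frame a JSON-RPC 2.0 request. `params` must already be valid JSON.
    /// Returns the assigned ID so the response can be matched later.
    pub fn request(&self, method: &str, params: &str) -> (u64, String) {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let body = format!(
            r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
            id, method, params
        );
        (id, body)
    }

    /// Frame a notification: it carries no `id`, so no response is expected.
    pub fn notification(&self, method: &str) -> String {
        format!(r#"{{"jsonrpc":"2.0","method":"{}"}}"#, method)
    }
}

/// Messages for the handshake and capability discovery, in send order.
/// A real client waits for the `initialize` response before sending
/// `notifications/initialized`; this function only shows the ordering.
pub fn discovery_sequence(builder: &RequestBuilder) -> Vec<String> {
    // Placeholder client identity and protocol version (assumptions).
    let init_params = r#"{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"mcp-test-harness","version":"0.1.0"}}"#;
    vec![
        builder.request("initialize", init_params).1,
        builder.notification("notifications/initialized"),
        builder.request("tools/list", "{}").1,
        builder.request("prompts/list", "{}").1,
        builder.request("resources/list", "{}").1,
    ]
}
```

Keeping the ID counter in an `AtomicU64` mirrors the `request_id` field of `McpClient` and lets requests be framed through a shared reference.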

### Phase 3: Validation & Reporting (P1)
1. Schema validation for expected vs actual capabilities
2. Test execution engine
3. Comprehensive reporting (table, JSON, YAML)
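
As an illustration of the "expected vs actual" comparison in Phase 3, the sketch below checks only tool presence and required input parameters. It is a deliberately reduced stand-in for full JSON Schema validation, and `ToolSummary`, `Mismatch`, `compare_tools`, and `tool` are hypothetical names rather than part of this design.

```rust
use std::collections::BTreeSet;

/// Simplified, hypothetical view of a tool: its name plus required inputs.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolSummary {
    pub name: String,
    pub required_params: BTreeSet<String>,
}

/// One difference between the expected and the advertised capabilities.
#[derive(Debug, PartialEq)]
pub enum Mismatch {
    MissingTool(String),
    RequiredParamsDiffer {
        tool: String,
        expected: Vec<String>,
        actual: Vec<String>,
    },
}

/// Report every expected tool that is absent or whose required parameters
/// differ. Extra tools advertised by the server are not treated as errors.
pub fn compare_tools(expected: &[ToolSummary], actual: &[ToolSummary]) -> Vec<Mismatch> {
    let mut mismatches = Vec::new();
    for exp in expected {
        match actual.iter().find(|a| a.name == exp.name) {
            None => mismatches.push(Mismatch::MissingTool(exp.name.clone())),
            Some(act) if act.required_params != exp.required_params => {
                mismatches.push(Mismatch::RequiredParamsDiffer {
                    tool: exp.name.clone(),
                    // BTreeSet iterates in sorted order, so output is stable.
                    expected: exp.required_params.iter().cloned().collect(),
                    actual: act.required_params.iter().cloned().collect(),
                });
            }
            Some(_) => {}
        }
    }
    mismatches
}

/// Convenience constructor used to keep examples short.
pub fn tool(name: &str, required: &[&str]) -> ToolSummary {
    ToolSummary {
        name: name.to_string(),
        required_params: required.iter().map(|p| p.to_string()).collect(),
    }
}
```

Returning a list of mismatches rather than a single boolean lets the reporter show every discrepancy in one run instead of stopping at the first.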

## Configuration Examples

### Basic Tool Testing
```yaml
server:
  command: "node"
  args: ["filesystem-server.js", "/tmp"]
  working_dir: "/path/to/server"
  startup_timeout: 10

expected_capabilities:
  tools:
    - name: "read_file"
      description: "Read contents of a file"
      input_schema:
        type: "object"
        properties:
          path:
            type: "string"
            description: "Path to the file to read"
        required: ["path"]
      required: true

test_cases:
  - id: "test_read_existing_file"
    description: "Test reading an existing file"
    test_type:
      ToolCall:
        tool_name: "read_file"
        input:
          path: "/tmp/test.txt"
    expected_result:
      success: true
```
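
The `test_cases` entry above suggests a Rust shape along the following lines. This is a sketch under assumptions: the document does not define `TestCase`, `TestType`, or `ExpectedResult`, tool inputs are simplified to string values where a real harness would accept arbitrary JSON, and the serde derives are omitted so the block needs no external crates. The `ToolCall:` key in the YAML matches serde's default externally tagged enum representation, which is why `TestType` is modeled as an enum.

```rust
use std::collections::BTreeMap;

/// Hypothetical test kinds; only `ToolCall` appears in the example config.
#[derive(Debug, Clone, PartialEq)]
pub enum TestType {
    ToolCall {
        tool_name: String,
        /// Simplified to string values for this sketch.
        input: BTreeMap<String, String>,
    },
}

/// Hypothetical expectation: only pass/fail, per the stated scope.
#[derive(Debug, Clone, PartialEq)]
pub struct ExpectedResult {
    pub success: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct TestCase {
    pub id: String,
    pub description: String,
    pub test_type: TestType,
    pub expected_result: ExpectedResult,
}

/// A case passes when the observed outcome (`Ok` payload or `Err` message)
/// agrees with the expected `success` flag.
pub fn passes(case: &TestCase, outcome: &Result<String, String>) -> bool {
    case.expected_result.success == outcome.is_ok()
}

/// The YAML test case above, written out as a Rust value.
pub fn read_existing_file_case() -> TestCase {
    let mut input = BTreeMap::new();
    input.insert("path".to_string(), "/tmp/test.txt".to_string());
    TestCase {
        id: "test_read_existing_file".to_string(),
        description: "Test reading an existing file".to_string(),
        test_type: TestType::ToolCall {
            tool_name: "read_file".to_string(),
            input,
        },
        expected_result: ExpectedResult { success: true },
    }
}
```

Because `expected_result.success` is compared against the outcome rather than assumed to be `true`, the same shape also covers negative tests, such as expecting `read_file` to fail on a missing path.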

## Success Criteria

### Functional Requirements
- [ ] Can launch any MCP server via stdio transport
- [ ] Validates server capabilities match expected schemas
- [ ] Tests all MCP protocol APIs (`initialize`, `tools/list`, `prompts/list`, `resources/list`, `tools/call`, etc.)
- [ ] Generates comprehensive test reports
- [ ] Handles errors gracefully with detailed messages
- [ ] Simple YAML configuration format
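
For the reporting and error-message criteria, a table report restricted to pass/fail could look like the sketch below. The `Status` enum, the fields of `TestResult`, and `render_table` are assumptions made for illustration; the design only names `TestResult` and `OutputFormat` without defining them.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Passed,
    Failed,
}

/// Hypothetical result record: pass/fail plus an optional detail message.
#[derive(Debug, Clone)]
pub struct TestResult {
    pub id: String,
    pub status: Status,
    pub message: Option<String>,
}

/// Render results as a plain-text table followed by a summary line.
pub fn render_table(results: &[TestResult]) -> String {
    // Size the first column to the longest test ID (or the header).
    let width = results
        .iter()
        .map(|r| r.id.len())
        .max()
        .unwrap_or(0)
        .max("TEST".len());

    let mut out = format!("{:<width$}  STATUS  MESSAGE\n", "TEST", width = width);
    for r in results {
        let status = match r.status {
            Status::Passed => "PASS",
            Status::Failed => "FAIL",
        };
        out.push_str(&format!(
            "{:<width$}  {:<6}  {}\n",
            r.id,
            status,
            r.message.as_deref().unwrap_or("-"),
            width = width
        ));
    }

    let failed = results.iter().filter(|r| r.status == Status::Failed).count();
    out.push_str(&format!("{} passed, {} failed\n", results.len() - failed, failed));
    out
}
```

Carrying the failure detail in `message` keeps the table within the pass/fail scope while still surfacing why a case failed.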

### Out of Scope (Explicitly)
- Performance monitoring/benchmarking
- HTTP transport (stdio only for now)
- Custom validation scripts
- Advanced metrics beyond pass/fail

## Alternative Approaches Considered

### 1. Performance-First Approach
**Rejected**: Benchmarking adds unnecessary complexity when the goal is functional validation.

### 2. Multi-Transport Support
**Rejected**: HTTP transport adds complexity; stdio covers the majority of MCP servers and is simpler to implement reliably.

### 3. Plugin-Based Validation
**Rejected**: Custom validation scripts add security concerns and implementation complexity beyond the core goal.

The chosen approach focuses on **simplicity, reliability, and clear scope** for MCP protocol compliance testing without feature creep.
