Commit 1998a18

Port the end-to-end test suite to v2 (#2179)
1 parent 5fc42e9 commit 1998a18

46 files changed

Lines changed: 20108 additions & 1 deletion

.changeset/add-e2e-test-suite.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+'@modelcontextprotocol/test-e2e': patch
+---
+
+Add the end-to-end behavior test suite as a workspace package: a requirements manifest covering protocol-visible SDK behavior across the in-memory, stdio, and Streamable HTTP transports, ported from the v1.x branch and extended with coverage for v2 features.

.github/workflows/main.yml

Lines changed: 27 additions & 1 deletion
@@ -54,7 +54,33 @@ jobs:
 
             - run: pnpm install
 
-            - run: pnpm test:all
+            # The e2e suite has its own job below; everything else runs here.
+            - run: pnpm -r --filter '!@modelcontextprotocol/test-e2e' test
+
+    test-e2e:
+        runs-on: ubuntu-latest
+        strategy:
+            fail-fast: false
+            matrix:
+                node-version: [20, 22, 24]
+
+        steps:
+            - uses: actions/checkout@v6
+
+            - name: Install pnpm
+              uses: pnpm/action-setup@fc06bc1257f339d1d5d8b3a19a8cae5388b55320 # v5.0.0
+              id: pnpm-install
+              with:
+                  run_install: false
+            - uses: actions/setup-node@v6
+              with:
+                  node-version: ${{ matrix.node-version }}
+                  cache: pnpm
+                  cache-dependency-path: pnpm-lock.yaml
+
+            - run: pnpm install
+
+            - run: pnpm --filter @modelcontextprotocol/test-e2e test
 
     test-runtimes:
         runs-on: ubuntu-latest

pnpm-lock.yaml

Lines changed: 81 additions & 0 deletions
Some generated files are not rendered by default.

test/e2e/CLAUDE.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+# E2E test suite
+
+Conformance-style tests for the SDK's public surface. `requirements.ts` is a pure-data manifest: every behavior the SDK must satisfy, with its spec/source link. Test files in `scenarios/` cite the requirement id(s) they prove via `verifies()` (`helpers/verifies.ts`), which
+registers one cell per applicable (transport, spec version). `coverage.test.ts` statically checks that every non-deferred requirement is cited and that the manifest is internally consistent.
+
+## Writing a test
+
+Add a `verifies()` call with an anonymous async body to `scenarios/<area>.test.ts`:
+
+```ts
+verifies('tools:call:content:text', async ({ transport }) => {
+    const makeServer = () => {
+        const s = new McpServer({ name: 't', version: '0' });
+        s.registerTool('echo', { inputSchema: z.object({ text: z.string() }) }, ({ text }) => ({
+            content: [{ type: 'text', text }]
+        }));
+        return s;
+    };
+    const client = new Client({ name: 'c', version: '0' });
+
+    await using _ = await wire(transport, makeServer, client);
+
+    const r = await client.callTool({ name: 'echo', arguments: { text: 'hi' } });
+    expect(r.content).toEqual([{ type: 'text', text: 'hi' }]);
+});
+```
+
+Self-contained: build server inline (factory), build client inline, `wire()`, assert. No shared fixture files. Pass an array of ids when one body genuinely proves several requirements; pass `{ title: '...' }` as the third argument only when a requirement needs more than one body
+(the title is how knownFailures target a specific body).
+
+The corresponding manifest entry is pure data:
+
+```ts
+'tools:call:content:text': {
+    source: 'https://modelcontextprotocol.io/...',
+    behavior: 'tools/call returns content[] with type:text...'
+},
+```
+
+## knownFailures, deferred, and transport restrictions
+
+When a test asserts required behavior the SDK does not satisfy, keep the test exact and record it in the manifest:
+
+```ts
+knownFailures: [{ note: 'changed in v2: ...' /* optional: test: '<verifies title>', transport, specVersion */ }];
+```
+
+`verifies()` runs matching cells as `test.fails()` — they pass while the SDK misbehaves and fail once it is fixed (then remove the entry). When the behavior cannot be expressed against the public surface at all (e.g. an API removed in v2), mark the requirement
+`deferred: '<reason>'` instead — deferred ids must not be cited by any `verifies()` call.
+
+When a transport structurally cannot express the behavior (e.g. server→client roundtrip on stateless hosting), restrict the requirement itself rather than skipping tests:
+
+```ts
+transports: STATEFUL_TRANSPORTS, // or an explicit list
+note: 'stateless hosting has no server→client back-channel'
+```
+
+`addedInSpecVersion` / `removedInSpecVersion` bound the spec versions a requirement applies to; a behavior changed by a spec release gets a sibling entry linked via `supersedes`.
+
+## Running
+
+From the repo root (the suite is the `@modelcontextprotocol/test-e2e` workspace package):
+
+```bash
+pnpm --filter @modelcontextprotocol/test-e2e test # all
+pnpm --filter @modelcontextprotocol/test-e2e exec vitest run scenarios/tools.test.ts # one area
+pnpm --filter @modelcontextprotocol/test-e2e exec vitest run -t 'tools:' # one requirement-id prefix
+pnpm --filter @modelcontextprotocol/test-e2e exec vitest run coverage.test.ts # manifest gates
+pnpm --filter @modelcontextprotocol/test-e2e typecheck
+pnpm --filter @modelcontextprotocol/test-e2e lint
+```
+
+Slugs prefixed `typescript:` are TypeScript-SDK-specific requirements (they describe this SDK's own API surface and intentionally have no shared cross-SDK meaning); unprefixed slugs share their id and behavior wording with the Python interaction suite where both cover the
+behavior.
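Reviewer sketch (not part of the commit): CLAUDE.md says `verifies()` registers one cell per applicable (transport, spec version) and runs cells matched by a `knownFailures` entry as `test.fails()`. The real `helpers/verifies.ts` is not shown in this view, so the following is only a guess at how that expansion could work. `expandCells`, the type shapes, the transport labels, the spec-version list, and the choice to treat `removedInSpecVersion` as an exclusive upper bound are all assumptions made for illustration.

```typescript
// Hypothetical shapes that mirror the field names in CLAUDE.md, not the suite's real types.
type Transport = 'in-memory' | 'stdio' | 'streamable-http';

interface KnownFailure {
    note: string;
    test?: string;
    transport?: Transport;
    specVersion?: string;
}

interface Requirement {
    transports?: Transport[];
    addedInSpecVersion?: string;
    removedInSpecVersion?: string;
    knownFailures?: KnownFailure[];
}

interface Cell {
    transport: Transport;
    specVersion: string;
    expectFail: boolean;
}

const ALL_TRANSPORTS: Transport[] = ['in-memory', 'stdio', 'streamable-http'];
// Placeholder version labels; date-style strings compare correctly with < and >=.
const SPEC_VERSIONS = ['2025-03-26', '2025-06-18'];

function expandCells(req: Requirement, title?: string): Cell[] {
    const cells: Cell[] = [];
    for (const transport of req.transports ?? ALL_TRANSPORTS) {
        for (const specVersion of SPEC_VERSIONS) {
            // Assumption: added is inclusive, removed is exclusive.
            if (req.addedInSpecVersion !== undefined && specVersion < req.addedInSpecVersion) continue;
            if (req.removedInSpecVersion !== undefined && specVersion >= req.removedInSpecVersion) continue;
            // A knownFailure with no test/transport/specVersion filter matches every cell.
            const expectFail = (req.knownFailures ?? []).some(
                kf =>
                    (kf.test === undefined || kf.test === title) &&
                    (kf.transport === undefined || kf.transport === transport) &&
                    (kf.specVersion === undefined || kf.specVersion === specVersion)
            );
            cells.push({ transport, specVersion, expectFail });
        }
    }
    return cells;
}
```

In a runner built on this, each cell with `expectFail: true` would be registered through Vitest's `test.fails` and every other cell through `test`, which matches the pass-while-broken behavior CLAUDE.md describes.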

test/e2e/coverage.test.ts

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+/**
+ * Manifest gates for the e2e suite.
+ *
+ * The linkage is inverted: test files cite the requirement id(s) they prove via
+ * `verifies(...)` (helpers/verifies.ts) and requirements.ts is pure data. These
+ * tests statically scan test/e2e/scenarios/*.test.ts for the cited ids and check them
+ * against the manifest, plus the manifest's own internal consistency rules.
+ */
+
+import { readdirSync, readFileSync } from 'node:fs';
+import path from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+import { expect, test } from 'vitest';
+
+import { REQUIREMENTS } from './requirements.js';
+
+const E2E_DIR = path.dirname(fileURLToPath(import.meta.url));
+
+interface VerifiesCall {
+    file: string;
+    /** Explicit `{ title: '...' }` passed to verifies(), if any (undefined for an untitled body). */
+    title: string | undefined;
+    ids: string[];
+}
+
+/** Statically scan test/e2e/scenarios/*.test.ts for `verifies(<ids>, ...)` calls. */
+function scanVerifiesCalls(): VerifiesCall[] {
+    const calls: VerifiesCall[] = [];
+    const scenariosDir = path.join(E2E_DIR, 'scenarios');
+    const files = readdirSync(scenariosDir)
+        .filter(f => f.endsWith('.test.ts'))
+        .toSorted();
+    for (const file of files) {
+        const text = readFileSync(path.join(scenariosDir, file), 'utf8');
+        // Each call spans from its header to the first column-0 close (`});` for an
+        // untitled hugged call, `);` for a call expanded by an opts third argument).
+        for (const m of text.matchAll(/verifies\(\s*('[^']*'|\[[^\]]*\])\s*,\s*async\s*\([\s\S]*?\n(?:\}\);|\);)/g)) {
+            const ids = [...(m[1] ?? '').matchAll(/'([^']*)'/g)].map(x => x[1]).filter(id => id !== undefined);
+            const title = m[0].match(/\{\s*title:\s*'([^']*)'\s*\}\s*\n?\);$/)?.[1];
+            calls.push({ file, title, ids });
+        }
+    }
+    return calls;
+}
+
+const CALLS = scanVerifiesCalls();
+const CITED = new Set(CALLS.flatMap(c => c.ids));
+
+test('every non-deferred requirement id is cited by at least one verifies() call', () => {
+    const missing = Object.entries(REQUIREMENTS)
+        .filter(([id, r]) => !r.deferred && !CITED.has(id))
+        .map(([id]) => id);
+    expect(missing).toEqual([]);
+});
+
+test('every cited requirement id exists in the manifest and is not deferred', () => {
+    const bad: string[] = [];
+    for (const c of CALLS) {
+        for (const id of c.ids) {
+            const req = REQUIREMENTS[id];
+            if (!req) bad.push(`${c.file}: a verifies() call cites unknown requirement '${id}'`);
+            else if (req.deferred) bad.push(`${c.file}: a verifies() call cites deferred requirement '${id}'`);
+        }
+    }
+    expect(bad).toEqual([]);
+});
+
+test('every knownFailure with a test string names an explicit verifies() title that cites the requirement', () => {
+    const bad: string[] = [];
+    for (const [id, r] of Object.entries(REQUIREMENTS)) {
+        for (const kf of r.knownFailures ?? []) {
+            if (kf.test === undefined) continue;
+            const cited = CALLS.some(c => c.title === kf.test && c.ids.includes(id));
+            if (!cited)
+                bad.push(
+                    `${id}: knownFailure references title '${kf.test}', which is not an explicit verifies() title citing this requirement`
+                );
+        }
+    }
+    expect(bad).toEqual([]);
+});
+
+test('every transport-restricted requirement explains why in note', () => {
+    const missing = Object.entries(REQUIREMENTS)
+        .filter(([, r]) => r.transports !== undefined && !r.note)
+        .map(([id]) => id);
+    expect(missing).toEqual([]);
+});
+
+test('every supersedes reference points at an existing requirement id', () => {
+    for (const [id, req] of Object.entries(REQUIREMENTS)) {
+        if (req.supersedes !== undefined) {
+            expect(REQUIREMENTS[req.supersedes], `${id} supersedes unknown id '${req.supersedes}'`).toBeDefined();
+        }
+    }
+});
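Reviewer sketch (not part of the commit): the static scan in `scanVerifiesCalls()` depends on two regular expressions, and it may help to see what they extract from both call shapes. The snippet below copies those two patterns and runs them on an in-memory string instead of files on disk. `scanText`, the sample source, and the two `tools:list:*` ids are hypothetical and chosen only for illustration.

```typescript
interface ScannedCall {
    title: string | undefined;
    ids: string[];
}

// Patterns copied from scanVerifiesCalls(); everything else here is illustrative.
const CALL_RE = /verifies\(\s*('[^']*'|\[[^\]]*\])\s*,\s*async\s*\([\s\S]*?\n(?:\}\);|\);)/g;
const TITLE_RE = /\{\s*title:\s*'([^']*)'\s*\}\s*\n?\);$/;

function scanText(text: string): ScannedCall[] {
    const calls: ScannedCall[] = [];
    for (const m of text.matchAll(CALL_RE)) {
        // m[1] is either a single quoted id or a bracketed list of quoted ids.
        const ids = [...(m[1] ?? '').matchAll(/'([^']*)'/g)].map(x => x[1] ?? '');
        // Only a call that ends with `{ title: '...' }` followed by `);` has a title.
        const title = m[0].match(TITLE_RE)?.[1];
        calls.push({ title, ids });
    }
    return calls;
}

// One untitled "hugged" call and one call expanded by a third options argument.
const sample = [
    "verifies('tools:call:content:text', async ({ transport }) => {",
    '    expect(1).toBe(1);',
    '});',
    '',
    'verifies(',
    "    ['tools:list:basic', 'tools:list:pagination'],",
    '    async ({ transport }) => {',
    '        expect(2).toBe(2);',
    '    },',
    "    { title: 'lists tools across pages' }",
    ');'
].join('\n');

const found = scanText(sample);
console.log(found);
```

Because each match ends at the first column-0 `});` or `);`, the scan assumes every `verifies()` call sits at the top level of its file and is laid out consistently by the formatter.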

test/e2e/eslint.config.mjs

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+// @ts-check
+
+import baseConfig from '@modelcontextprotocol/eslint-config';
+
+export default [
+    ...baseConfig,
+    {
+        rules: {
+            // `await using _ = await wire(...)` holds the connection open for the test body; the binding is intentionally unused
+            '@typescript-eslint/no-unused-vars': ['error', { argsIgnorePattern: '^_', varsIgnorePattern: '^_$' }],
+            // scenario files keep the kebab-case names they share with the v1.x suite
+            'unicorn/filename-case': ['error', { cases: { camelCase: true, kebabCase: true } }]
+        }
+    }
+];

test/e2e/fixtures/stdio-server.ts

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+/**
+ * Runnable stdio MCP server fixture for the transport:stdio:* e2e tests.
+ *
+ * Spawned as a real child process by test/e2e/scenarios/stdio.ts. Registers an
+ * `echo` tool and an `env-report` tool, writes a readiness marker line to stderr
+ * once it is serving, and, when E2E_IGNORE_SIGTERM=1, keeps running after stdin
+ * EOF and swallows SIGTERM so the client transport's shutdown escalation
+ * (stdin EOF → SIGTERM → SIGKILL) is observable.
+ */
+
+/* eslint-disable unicorn/no-process-exit -- standalone spawned executable; exit codes are the behavior under test */
+
+import { McpServer } from '@modelcontextprotocol/server';
+import { StdioServerTransport } from '@modelcontextprotocol/server/stdio';
+import { z } from 'zod/v4';
+
+const server = new McpServer({ name: 'stdio-echo-server', version: '1.0.0' });
+
+server.registerTool(
+    'echo',
+    {
+        description: 'Echoes the input text back as a text content block, including multi-line text.',
+        inputSchema: z.object({ text: z.string() })
+    },
+    ({ text }) => ({ content: [{ type: 'text', text }] })
+);
+
+// env-report tool: returns JSON array of environment variable names (sorted) that
+// reached the child process. This allows tests to verify the env safelist behavior.
+server.registerTool(
+    'env-report',
+    {
+        description: 'Returns sorted array of environment variable names present in this process.',
+        inputSchema: z.object({})
+    },
+    () => {
+        const envKeys = Object.keys(process.env).toSorted();
+        return { content: [{ type: 'text', text: JSON.stringify(envKeys) }] };
+    }
+);
+
+if (process.env.E2E_IGNORE_SIGTERM === '1') {
+    // Misbehaving-server mode: keep alive after stdin EOF via interval (load-bearing — without it the child exits on stdin EOF and SIGTERM never arrives) and ignore SIGTERM, so only SIGKILL can end the process.
+    setInterval(() => {}, 1000);
+    setTimeout(() => process.exit(1), 30_000);
+    process.on('SIGTERM', () => {
+        process.stderr.write('[stdio-server] sigterm ignored\n');
+    });
+}
+
+if (process.env.E2E_GARBAGE_STDOUT === '1') {
+    // Broken-server mode: write non-JSON garbage to stdout before the server connects, simulating a broken or misconfigured server that pollutes the JSON-RPC channel.
+    process.stdout.write('GARBAGE LINE 1: not json\n');
+    process.stdout.write('GARBAGE LINE 2: {malformed json\n');
+    process.stdout.write('GARBAGE LINE 3: also not valid jsonrpc\n');
+    // Valid JSON but not a valid JSON-RPC message: v2 silently skips non-JSON noise, but schema-invalid messages must still surface via onerror.
+    process.stdout.write('{"jsonrpc":"1.0","bogus":true}\n');
+    process.stdin.resume();
+    process.stdin.on('end', () => process.exit(0));
+    setTimeout(() => process.exit(1), 30_000);
+} else {
+    await server.connect(new StdioServerTransport());
+    process.stderr.write('[stdio-server] ready\n');
+}
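Reviewer sketch (not part of the commit): the broken-server branch relies on a distinction between non-JSON noise, which v2 is said to skip silently, and JSON that parses but is not a valid JSON-RPC message, which must surface via `onerror`. The SDK's actual read path is not shown in this view. The classifier below is only a rough model of that three-way split; `classifyLine`, `LineKind`, and the simplified validity check are assumptions, not the SDK's schema.

```typescript
type LineKind = 'noise' | 'invalid' | 'message';

// Rough model only: the SDK validates against full message schemas, which this sketch does not reproduce.
function classifyLine(line: string): LineKind {
    let parsed: unknown;
    try {
        parsed = JSON.parse(line);
    } catch {
        return 'noise'; // not JSON at all: would be skipped silently
    }
    // Assumption: anything that is not a plain object (including arrays) counts as schema-invalid here.
    if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) return 'invalid';
    const msg = parsed as Record<string, unknown>;
    if (msg.jsonrpc !== '2.0') return 'invalid';
    const isRequestOrNotification = typeof msg.method === 'string';
    const isResponse = 'id' in msg && ('result' in msg || 'error' in msg);
    return isRequestOrNotification || isResponse ? 'message' : 'invalid';
}

// The four lines the fixture writes in E2E_GARBAGE_STDOUT mode.
const fixtureLines = [
    'GARBAGE LINE 1: not json',
    'GARBAGE LINE 2: {malformed json',
    'GARBAGE LINE 3: also not valid jsonrpc',
    '{"jsonrpc":"1.0","bogus":true}'
];
const kinds = fixtureLines.map(classifyLine);
console.log(kinds);
```

Under this model only the fourth fixture line would reach an error handler, which is consistent with the fixture's comment about why that line is included.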
