Skip to content

Commit ea46f4c

Browse files
add options F (ctx.once) and G (ToolBuilder) for footgun prevention
F and G address a different axis from A-E. A-E are about dual-path (old client vs new). F and G are about the MRTR footgun: code above the inputResponses guard runs on every retry, so a DB write there executes N times for N-round elicitation. F (ctx.once): idempotency guard inside the monolithic handler. Opt-in, one line per side-effect. Makes safe code visually distinct from unsafe code; doesn't eliminate the footgun, makes it reviewable. G (ToolBuilder): Marcelo's step decomposition. incompleteStep may return IncompleteResult or data; endStep runs exactly once when all steps complete. No 'above the guard' zone because the SDK's step-tracking is the guard. Boilerplate: two function defs + .build() to replace the 3-line check. Both depend on requestState integrity - real SDK MUST HMAC-sign the blob or the client can forge step-done markers. Demos use plain base64 for clarity.
1 parent b1da71c commit ea46f4c

4 files changed

Lines changed: 393 additions & 10 deletions

File tree

examples/README-mrtr-dual-path.md

Lines changed: 30 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -25,22 +25,39 @@ you choose to carry SSE infra." The rows collapse for E, which is the argument f
2525

2626
## Options
2727

28-
| | Author writes | SDK does | Hidden re-entry | Old client gets |
29-
| ---------------------------------------------------------------- | ------------------------------- | ------------------------------ | ------------------------------------------- | ---------------------------------------------------- |
30-
| [A](./server/src/mrtr-dual-path/optionAShimMrtrCanonical.ts) | MRTR-native only | Emulates retry loop over SSE | Yes, but safe (guard is explicit in source) | Full elicitation |
31-
| [B](./server/src/mrtr-dual-path/optionBShimAwaitCanonical.ts) | `await elicit()` only | Exception → `IncompleteResult` | Yes, **unsafe** (invisible in source) | Full elicitation |
32-
| [C](./server/src/mrtr-dual-path/optionCExplicitVersionBranch.ts) | One handler, `if (mrtr)` branch | Version accessor | No | Full elicitation |
33-
| [D](./server/src/mrtr-dual-path/optionDDualRegistration.ts) | Two handlers | Picks by version | No | Full elicitation |
34-
| [E](./server/src/mrtr-dual-path/optionEDegradeOnly.ts) | MRTR-native only | Nothing | No | Result with default, or error — tool author's choice |
28+
| | Author writes | SDK does | Hidden re-entry | Old client gets |
29+
| ---------------------------------------------------------------- | ------------------------------- | ------------------------------ | ------------------------------------------- | ------------------------------------------------------ |
30+
| [A](./server/src/mrtr-dual-path/optionAShimMrtrCanonical.ts) | MRTR-native only | Emulates retry loop over SSE | Yes, but safe (guard is explicit in source) | Full elicitation |
31+
| [B](./server/src/mrtr-dual-path/optionBShimAwaitCanonical.ts) | `await elicit()` only | Exception → `IncompleteResult` | Yes, **unsafe** (invisible in source) | Full elicitation |
32+
| [C](./server/src/mrtr-dual-path/optionCExplicitVersionBranch.ts) | One handler, `if (mrtr)` branch | Version accessor | No | Full elicitation |
33+
| [D](./server/src/mrtr-dual-path/optionDDualRegistration.ts) | Two handlers | Picks by version | No | Full elicitation |
34+
| [E](./server/src/mrtr-dual-path/optionEDegradeOnly.ts) | MRTR-native only | Nothing | No | Result with default, or error — tool author's choice |
35+
| [F](./server/src/mrtr-dual-path/optionFCtxOnce.ts) | MRTR-native + `ctx.once` wraps | `once()` guard in requestState | No | (same as E — F/G are orthogonal to the dual-path axis) |
36+
| [G](./server/src/mrtr-dual-path/optionGToolBuilder.ts) | Step functions + `.build()` | Step-tracking in requestState | No | (same as E) |
3537

3638
"Hidden re-entry" = the handler function is invoked more than once for a single logical tool call, and the author can't tell from the source text. A is safe because MRTR-native code has the re-entry guard (`if (!prefs) return`) visible in the source even though the _loop_ is
3739
hidden. B is unsafe because `await elicit()` looks like a suspension point but is actually a re-entry point on MRTR sessions — see the `auditLog` landmine in that file.
3840

41+
## Footgun prevention (F, G)
42+
43+
A–E are about the dual-path axis (old client vs new). F and G are about a different axis: even in a pure-MRTR world, the naive handler shape has a footgun. Code above the `if (!prefs)` guard runs on every retry. If that code is a DB write or HTTP POST, it executes N times for
44+
N-round elicitation. The guard is present in A/E but nothing _enforces_ putting side-effects below it — safety depends on the developer knowing the convention. The analogy raised in SDK-WG review: the naive MRTR handler is de-facto GOTO — re-entry jumps to the top, and the state
45+
machine progression is implicit in the `inputResponses` checks.
46+
47+
**F (`ctx.once`)** keeps the monolithic handler but wraps side-effects in an idempotency guard. `ctx.once('audit', () => auditLog(...))` checks `requestState` — if the key is already marked executed, skip. Opt-in: an unwrapped mutation still fires twice. The footgun isn't
48+
eliminated; it's made _visually distinct_ from safe code, which is reviewable.
49+
50+
**G (`ToolBuilder`)** decomposes the handler into named step functions. `incompleteStep` may return `IncompleteResult` or data; `endStep` receives everything and runs exactly once. There is no "above the guard" zone because there is no guard — the SDK's step-tracking is the
51+
guard. Side-effects go in `endStep`; it's structurally unreachable until all elicitations complete. Boilerplate: two function definitions + `.build()` to replace A/E's 3-line check. Worth it at 3+ rounds; overkill for single-question tools where F is lighter.
52+
53+
Both F and G depend on `requestState` integrity. The demos use plain base64 JSON; a real SDK MUST HMAC-sign the blob, because otherwise the client can forge step-done / once-executed markers and skip the guards. Per-session key derived from `initialize` keeps it stateless.
54+
Without signing, the safety story is advisory.
55+
3956
## Client impact
4057

41-
None. All five options present identical wire behaviour to each client version. A 2025-11 client sees either a standard `elicitation/create` over SSE (A/B/C/D) or a plain `CallToolResult` (E — either a real result with a default, or an error, tool author's choice). All vanilla
42-
2025-11 shapes. A 2026-06 client sees `IncompleteResult` in every case. The server's internal choice doesn't leak. This is the cleanest argument against per-feature `-mrtr` capability flags: there's nothing for them to signal, because the client's behaviour is already fully
43-
determined by `protocolVersion` plus the existing `elicitation`/`sampling` capabilities.
58+
None. All seven options present identical wire behaviour to each client version (F and G are the same as E on the wire — the footgun-prevention is server-internal). A 2025-11 client sees either a standard `elicitation/create` over SSE (A/B/C/D) or a plain `CallToolResult` (E —
59+
either a real result with a default, or an error, tool author's choice). All vanilla 2025-11 shapes. A 2026-06 client sees `IncompleteResult` in every case. The server's internal choice doesn't leak. This is the cleanest argument against per-feature `-mrtr` capability flags:
60+
there's nothing for them to signal, because the client's behaviour is already fully determined by `protocolVersion` plus the existing `elicitation`/`sampling` capabilities.
4461

4562
For the reverse direction — new client SDK connecting to an old server — see `examples/client/src/mrtr-dual-path/`. Split into two files to make the boundary explicit: [`clientDualPath.ts`](./client/src/mrtr-dual-path/clientDualPath.ts) is ~55 lines of what the app developer
4663
writes (one `handleElicitation` function, one registration, one tool call); [`sdkLib.ts`](./client/src/mrtr-dual-path/sdkLib.ts) is the retry loop + `IncompleteResult` parsing the SDK would ship. The app file is small on purpose — the delta from today's client code is zero.
@@ -62,6 +79,9 @@ the dual-path burden on the tool author rather than the SDK.
6279

6380
**A vs C/D** is about who owns the SSE fallback. A: SDK owns it, author writes once. C/D: author owns it, writes twice. A is less code for authors but more magic; C/D is more code for authors but no magic.
6481

82+
**F vs G** is the footgun-prevention trade. F is minimal — one line per side-effect, composes with any handler shape (A, E, or raw MRTR). G is structural — the step decomposition makes double-execution impossible for `endStep`, but costs two function definitions per tool. Neither
83+
replaces A–E; they layer on top. The likely SDK answer is: ship F as a primitive on the MRTR context, ship G as an opt-in builder, recommend G for multi-round tools and F for single-question tools with one side-effect.
84+
6585
## Running
6686

6787
All demos use `DEMO_PROTOCOL_VERSION` to simulate the negotiated version, since the real SDK doesn't surface it to handlers yet. Server demos run from `examples/server`:
Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
/**
2+
* Option F: ctx.once — idempotency guard inside the monolithic handler.
3+
*
4+
* Same MRTR-native shape as A/E, but side-effects get wrapped in
5+
* `ctx.once(key, fn)`. The guard lives in `requestState` — on retry,
6+
* keys marked executed skip their fn. Makes the hazard *visible* at
7+
* the call site without restructuring the handler.
8+
*
9+
* Opt-in: an unwrapped `db.write()` above the guard still fires twice.
10+
* The footgun isn't eliminated — it's made reviewable. `ctx.once('x', …)`
11+
* reads differently from a bare call; a reviewer can grep for effects
12+
* that aren't wrapped.
13+
*
14+
* When to reach for this over G (ToolBuilder): single elicitation, one
15+
* or two side-effects, handler fits in ten lines. When the step count
16+
* hits 3+, the ToolBuilder boilerplate pays for itself.
17+
*
18+
* Run: DEMO_PROTOCOL_VERSION=2026-06 pnpm tsx src/mrtr-dual-path/optionFCtxOnce.ts
19+
*/
20+
21+
import type { CallToolResult } from '@modelcontextprotocol/server';
22+
import { McpServer, StdioServerTransport } from '@modelcontextprotocol/server';
23+
import * as z from 'zod/v4';
24+
25+
import { acceptedContent, elicitForm, MrtrCtx, readMrtr, wrap } from './shims.js';
26+
27+
type Units = 'metric' | 'imperial';
28+
29+
function lookupWeather(location: string, units: Units): string {
30+
const temp = units === 'metric' ? '22°C' : '72°F';
31+
return `Weather in ${location}: ${temp}, partly cloudy.`;
32+
}
33+
34+
// The side-effect the footgun is about. In Option B this was commented
35+
// out; here it's live, because the guard makes it safe.
36+
let auditCount = 0;
37+
function auditLog(location: string): void {
38+
auditCount++;
39+
console.error(`[audit] lookup requested for ${location} (count=${auditCount})`);
40+
}
41+
42+
const server = new McpServer({ name: 'mrtr-option-f', version: '0.0.0' });
43+
44+
server.registerTool(
45+
'weather',
46+
{
47+
description: 'Weather lookup (Option F: ctx.once idempotency guard)',
48+
inputSchema: z.object({ location: z.string(), _mrtr: z.unknown().optional() })
49+
},
50+
async ({ location, _mrtr }): Promise<CallToolResult> => {
51+
const ctx = new MrtrCtx(readMrtr({ _mrtr }));
52+
53+
// ───────────────────────────────────────────────────────────────────
54+
// This is the hazard line. In A/E it would run on every retry.
55+
// Here it runs once — `ctx.once` checks requestState, skips on retry.
56+
// A reviewer sees `ctx.once` and knows the author considered
57+
// re-entry. A bare `auditLog(location)` would be the red flag.
58+
// ───────────────────────────────────────────────────────────────────
59+
ctx.once('audit', () => auditLog(location));
60+
61+
const prefs = acceptedContent<{ units: Units }>(ctx.inputResponses, 'units');
62+
if (!prefs) {
63+
// `ctx.incomplete()` encodes the executed-keys set into
64+
// requestState so the `once` guard holds across retry.
65+
return wrap(
66+
ctx.incomplete({
67+
units: elicitForm({
68+
message: 'Which units?',
69+
requestedSchema: {
70+
type: 'object',
71+
properties: { units: { type: 'string', enum: ['metric', 'imperial'], title: 'Units' } },
72+
required: ['units']
73+
}
74+
})
75+
})
76+
);
77+
}
78+
79+
return { content: [{ type: 'text', text: lookupWeather(location, prefs.units) }] };
80+
}
81+
);
82+
83+
const transport = new StdioServerTransport();
84+
await server.connect(transport);
85+
console.error('[option-F] ready');
Lines changed: 106 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,106 @@
1+
/**
2+
* Option G: ToolBuilder — Marcelo's explicit step decomposition.
3+
*
4+
* The monolithic handler becomes a sequence of named step functions.
5+
* `incompleteStep` may return `IncompleteResult` (needs more input) or
6+
* a data object (satisfied, pass to next step). `endStep` receives
7+
* everything and runs exactly once — it's structurally unreachable
8+
* until every prior step has returned data.
9+
*
10+
* The footgun is eliminated by code shape, not discipline. There is
11+
* no "above the guard" zone because there is no guard — the SDK's
12+
* step-tracking (via `requestState`) is the guard. Side-effects go
13+
* in `endStep`; anything in an `incompleteStep` is documented as
14+
* must-be-idempotent, and the return-type split makes the distinction
15+
* visible at the function signature level.
16+
*
17+
* Boilerplate: two function definitions + `.build()` to replace
18+
* A/E's 3-line `if (!prefs) return`. Worth it at 3+ rounds or when
19+
* the side-effect story matters. Overkill for a single-question tool.
20+
*
21+
* Run: DEMO_PROTOCOL_VERSION=2026-06 pnpm tsx src/mrtr-dual-path/optionGToolBuilder.ts
22+
*/
23+
24+
import type { CallToolResult } from '@modelcontextprotocol/server';
25+
import { McpServer, StdioServerTransport } from '@modelcontextprotocol/server';
26+
import * as z from 'zod/v4';
27+
28+
import { acceptedContent, elicitForm, readMrtr, ToolBuilder, wrap } from './shims.js';
29+
30+
type Units = 'metric' | 'imperial';
31+
32+
function lookupWeather(location: string, units: Units): string {
33+
const temp = units === 'metric' ? '22°C' : '72°F';
34+
return `Weather in ${location}: ${temp}, partly cloudy.`;
35+
}
36+
37+
let auditCount = 0;
38+
function auditLog(location: string): void {
39+
auditCount++;
40+
console.error(`[audit] lookup requested for ${location} (count=${auditCount})`);
41+
}
42+
43+
const server = new McpServer({ name: 'mrtr-option-g', version: '0.0.0' });
44+
45+
// ───────────────────────────────────────────────────────────────────────────
46+
// Step 1: ask for units. Returns IncompleteResult if not yet provided,
47+
// or `{ units }` to pass forward. MUST be idempotent — it can re-run
48+
// if requestState is tampered with (demo doesn't sign) or if the step
49+
// before it isn't the most-recently-completed one. No side-effects here.
50+
// ───────────────────────────────────────────────────────────────────────────
51+
52+
const askUnits = (_args: { location: string }, inputs: Parameters<typeof acceptedContent>[0]) => {
53+
const prefs = acceptedContent<{ units: Units }>(inputs, 'units');
54+
if (!prefs) {
55+
return {
56+
inputRequests: {
57+
units: elicitForm({
58+
message: 'Which units?',
59+
requestedSchema: {
60+
type: 'object',
61+
properties: { units: { type: 'string', enum: ['metric', 'imperial'], title: 'Units' } },
62+
required: ['units']
63+
}
64+
})
65+
}
66+
};
67+
}
68+
return { units: prefs.units };
69+
};
70+
71+
// ───────────────────────────────────────────────────────────────────────────
72+
// End step: has everything, does the work. Runs exactly once. This is
73+
// where side-effects live — the SDK guarantees this function is not
74+
// reached until `askUnits` (and any other incompleteSteps) have all
75+
// returned data. The `auditLog` call here fires once regardless of how
76+
// many MRTR rounds it took to collect the inputs.
77+
// ───────────────────────────────────────────────────────────────────────────
78+
79+
const fetchWeather = ({ location }: { location: string }, collected: Record<string, unknown>): CallToolResult => {
80+
auditLog(location);
81+
const units = collected.units as Units;
82+
return { content: [{ type: 'text', text: lookupWeather(location, units) }] };
83+
};
84+
85+
// ───────────────────────────────────────────────────────────────────────────
86+
// Assembly. Steps are named (not ordinal) so reordering during
87+
// development doesn't silently remap data. The builder is the
88+
// MRTR-native handler; everything from A/E's dual-path discussion
89+
// still applies (wrap in sseRetryShim for top-left, degrade for
90+
// bottom-left). The footgun-prevention is orthogonal to that axis.
91+
// ───────────────────────────────────────────────────────────────────────────
92+
93+
const weatherHandler = new ToolBuilder<{ location: string }>().incompleteStep('askUnits', askUnits).endStep(fetchWeather).build();
94+
95+
server.registerTool(
96+
'weather',
97+
{
98+
description: 'Weather lookup (Option G: ToolBuilder step decomposition)',
99+
inputSchema: z.object({ location: z.string(), _mrtr: z.unknown().optional() })
100+
},
101+
async ({ location, _mrtr }) => wrap(await weatherHandler({ location }, readMrtr({ _mrtr }), undefined as never))
102+
);
103+
104+
const transport = new StdioServerTransport();
105+
await server.connect(transport);
106+
console.error('[option-G] ready');

0 commit comments

Comments
 (0)