Skip to content

Commit fee585c

Browse files
authored
docs: add review-scenario skill for SEP scenario PR review (#276)
1 parent 39f34b2 commit fee585c

2 files changed

Lines changed: 54 additions & 0 deletions

File tree

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
---
2+
name: review-scenario
3+
description: Use when reviewing a conformance PR that adds or changes scenario .ts files for a SEP — before approving, before requesting changes, or as a self-check before opening one.
4+
---
5+
6+
# review-scenario
7+
8+
## What to check
9+
10+
**Spec diff is ground truth.** Pull the SEP's actual spec changes and read the RFC-2119 sentences yourself — don't trust the PR description or SEP summary for keyword levels:
11+
12+
```bash
13+
gh api "repos/modelcontextprotocol/modelcontextprotocol/pulls/<SEP>/files" \
14+
--jq '.[] | select(.filename | test("^docs/specification/draft/.*\\.mdx$")) | {filename, patch}'
15+
```
16+
17+
If the SEP includes a conformance-test-case table, that table is authoritative for the cases it lists. A table/prose mismatch is a spec gap to flag, not something to silently resolve either way.
18+
19+
**Traceability YAML.** `src/seps/sep-<SEP>.yaml` should exist (run `/new-sep <SEP>` first if not). Diff its rows against the spec sentences you extracted; flag rows that paraphrase rather than quote, claim a keyword level the spec doesn't, or assert something the spec never says. Check IDs follow `sep-<NNNN>-<kebab-slug>`.
20+
21+
**Per-scenario-file:**
22+
23+
- **Spec backing** — would a fully spec-compliant implementation FAIL this check? If yes — or if two compliant SDKs in different languages would get different results — the spec hasn't pinned the behavior; note it as a gap rather than enforce it.
24+
- **Dead checks** — emits FAILURE with no reachable SUCCESS counterpart, or sits behind an always-false guard.
25+
- **Logic** — does a missing/malformed input silently pass? Does the assertion distinguish "rejected for the right reason" from "rejected at all"?
26+
27+
**Coverage.** Count YAML `check:` rows vs how many the PR's scenarios actually exercise; list the gaps.
28+
29+
**Proof it runs.** The PR should reference at least one real implementation the scenario ran green against — the in-repo everything-client/server, or an external SDK via `npx https://pkg.pr.new/@modelcontextprotocol/conformance@<PR>`. No run referenced → ask for one before approving.
30+
31+
**Negative test.** Pins the specific failing slugs, not just `failures.length > 0` (AGENTS.md §Examples: prove it passes and fails).
32+
33+
## Output
34+
35+
This is a first pass for a human reviewer — give them what they need to verify each finding without re-deriving it.
36+
37+
**Open with a summary:** N scenarios added/changed, M distinct check IDs emitted, X/Y YAML `check:` rows covered, and which implementation it was run against.
38+
39+
**Then one bullet per finding.** Each bullet makes its own case — the reviewer should be able to confirm or refute it from the bullet alone:
40+
41+
> **`<check-id>`**[`file.ts:Lnn`](https://github.com/modelcontextprotocol/conformance/blob/<HEAD-SHA>/path/file.ts#Lnn) — claim. Spec: _"quoted normative sentence"_ ([page#anchor](https://modelcontextprotocol.io/specification/draft/...#anchor)). Consequence: what a compliant impl would do and how this check would mis-report it.
42+
43+
e.g.
44+
45+
> **`client-consistent-version`**[`stateless.ts:86`](…/blob/abc123/src/scenarios/client/stateless.ts#L86) — no spec backing. Spec: _"Servers MUST NOT rely on prior requests over the same connection to establish context (e.g., capabilities, protocol version)"_ ([lifecycle#stateless]()). A compliant client may change `protocolVersion` per request; this check FAILs it. The `flippingVersionClient` negative test enforces a non-requirement.
46+
47+
Get `<HEAD-SHA>` once via `gh pr view <PR> --json headRefOid -q .headRefOid` and use it for all permalinks so they don't drift on rebase.
48+
49+
Order: spec-backing → logic/dead → coverage (gap list) → conventions. Put spec gaps in a separate trailing list — those go upstream, not to the PR author.
50+
51+
**Self-review:** fix in place and re-run.
52+
53+
**If asked to push fixes** (stacked diff on top of the PR head): one commit per finding, commit message is the finding. Leave design-level items (scenario count, refactors) as prose.

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,3 +4,4 @@ lefthook-local.yml
44
dist/
55
.vscode/
66
.idea/
7+
.claude/settings.local.json

0 commit comments

Comments
 (0)