|
| 1 | +--- |
| 2 | +name: new-sep |
| 3 | +description: >- |
| 4 | + Scaffold a sep-NNNN.yaml requirement-traceability file for the MCP |
| 5 | + conformance repo from a SEP PR's spec diff. Runs the new-sep CLI, then |
| 6 | + parses the modelcontextprotocol/modelcontextprotocol spec diff to populate |
| 7 | + `requirements[]` with the RFC 2119 sentences and proposed check IDs. |
| 8 | +argument-hint: '<sep-number> [--pr <num>] [--target client|server|authorization-server]' |
| 9 | +--- |
| 10 | + |
| 11 | +# new-sep: SEP traceability YAML scaffolding |
| 12 | + |
| 13 | +You are bootstrapping a `sep-NNNN.yaml` file for a new SEP in the MCP conformance repo. The output is the requirement-traceability file specified by SEP-2484: a YAML file that maps each normative sentence from the SEP's spec diff to a `check:` ID (testable) or an `excluded:` reason (not testable). The CLI generates the skeleton; you fill in the rows by reading the spec diff. |
| 14 | + |
| 15 | +## Step 0: Pre-flight checks |
| 16 | + |
| 17 | +Before doing anything else, verify GitHub CLI authentication: |
| 18 | + |
| 19 | +```bash |
| 20 | +gh auth status 2>&1 |
| 21 | +``` |
| 22 | + |
| 23 | +If this fails, stop immediately and tell the user: |
| 24 | + |
| 25 | +> GitHub authentication is required for this skill. Please run `gh auth login` first, then re-run. |
| 26 | + |
| 27 | +Verify you're running inside the conformance repo: |
| 28 | + |
| 29 | +```bash |
| 30 | +test -f package.json && jq -r '.name' package.json |
| 31 | +``` |
| 32 | + |
| 33 | +The name should be `@modelcontextprotocol/conformance`. If not, stop and ask the user to `cd` into the conformance repo first. |
| 34 | + |
| 35 | +## Step 1: Parse arguments |
| 36 | + |
| 37 | +Extract from the user's input: |
| 38 | + |
| 39 | +- **sep-number** (required): the SEP number, e.g. `2164`. |
| 40 | +- **--pr <num>** (optional): the PR number in `modelcontextprotocol/modelcontextprotocol`. If omitted, the CLI searches for a PR titled `SEP-<NNNN>` and exits with an error if it finds zero or more than one match. |
| 41 | +- **--target client|server|authorization-server** (optional): which scenarios subdirectory to write to. Inferred from the spec path if omitted. |
| 42 | + |
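A minimal sketch of the parsing rules above, assuming the arguments arrive as separate shell words. `parse_new_sep_args` and the variables `SEP`, `PR`, and `TARGET` are hypothetical names used for illustration; they are not part of the conformance CLI.

```bash
# Hypothetical helper: split the skill's arguments into SEP, PR, and TARGET.
parse_new_sep_args() {
  SEP="" PR="" TARGET=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --pr)
        [ "$#" -ge 2 ] || { echo "--pr needs a value" >&2; return 1; }
        PR="$2"; shift 2 ;;
      --target)
        [ "$#" -ge 2 ] || { echo "--target needs a value" >&2; return 1; }
        # Only the three documented targets are accepted.
        case "$2" in
          client|server|authorization-server) TARGET="$2" ;;
          *) echo "unknown target: $2" >&2; return 1 ;;
        esac
        shift 2 ;;
      -*) echo "unknown flag: $1" >&2; return 1 ;;
      *)
        [ -z "$SEP" ] || { echo "unexpected argument: $1" >&2; return 1; }
        SEP="$1"; shift ;;
    esac
  done
  # The SEP number is the only required argument and must be numeric.
  case "$SEP" in
    ''|*[!0-9]*) echo "sep-number is required and must be numeric" >&2; return 1 ;;
  esac
}
```

After `parse_new_sep_args 2164 --pr 99 --target server`, `SEP` holds `2164`, `PR` holds `99`, and `TARGET` holds `server`. An empty `PR` or `TARGET` means the flag was omitted and the CLI should infer the value.
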
| 43 | +## Step 2: Generate the skeleton |
| 44 | + |
| 45 | +Run the CLI: |
| 46 | + |
| 47 | +```bash |
| 48 | +npm run --silent build |
| 49 | +node dist/index.js new-sep <NNNN> [--pr <num>] [--target <target>] |
| 50 | +``` |
| 51 | + |
| 52 | +(For development against a non-built source tree: `npx tsx src/index.ts new-sep ...`.) |
| 53 | + |
| 54 | +The CLI writes `src/scenarios/<target>/sep-<NNNN>.yaml` with `sep`, `spec_url`, and two TODO `requirements[]` rows. Capture the output path from the CLI's `Wrote …` line and remember it as `$YAML`. |
| 55 | + |
| 56 | +If the CLI errors with "No PRs match" or "Multiple PRs match", read the message, ask the user for the right `--pr <num>`, and rerun. Do not guess. |
| 57 | + |
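One way to capture `$YAML` from the CLI output in Step 2 is sketched below. It assumes the line has the form `Wrote <path>` with the path ending in `.yaml`; the exact wording of the real CLI output is not confirmed here, and `yaml_path_from_output` is a hypothetical helper name.

```bash
# Hypothetical helper: read CLI output on stdin and print the path from the
# first line that looks like "Wrote <path>.yaml".
yaml_path_from_output() {
  sed -n 's/^Wrote \(.*\.yaml\).*$/\1/p' | head -n 1
}

# Sample input, made up for illustration (not real CLI output).
printf '%s\n' 'some other log line' 'Wrote src/scenarios/server/sep-2164.yaml' |
  yaml_path_from_output
# prints: src/scenarios/server/sep-2164.yaml
```

In practice the CLI's stdout would be piped into the helper and the result stored, for example `YAML=$(node dist/index.js new-sep <NNNN> | yaml_path_from_output)`.
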
| 58 | +## Step 3: Fetch the spec diff |
| 59 | + |
| 60 | +`AGENTS.md` (lines 64–72) is explicit that severity must come from the spec text itself, not the SEP markdown or the conformance PR description, so fetch the spec files that the PR changes: |
| 61 | + |
| 62 | +```bash |
| 63 | +PR=<pr-from-step-2>  # substitute the PR number used in Step 2 |
| 64 | +gh api "repos/modelcontextprotocol/modelcontextprotocol/pulls/$PR/files" \ |
| 65 | + --jq '.[] | select(.filename | test("^docs/specification/draft/.*\\.mdx$")) | {filename, patch}' |
| 66 | +``` |
| 67 | + |
| 68 | +For each file, pull the added (`+`-prefixed) lines from `patch`. If `patch` is missing or truncated for a large file, fall back to fetching the whole file at the PR's head ref: |
| 69 | + |
| 70 | +```bash |
| 71 | +gh api "repos/modelcontextprotocol/modelcontextprotocol/contents/<path>?ref=<sep-branch>" \ |
| 72 | + --jq '.content' | base64 -d |
| 73 | +``` |
| 74 | + |
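Pulling the added lines out of a `patch` value, as Step 3 describes, can be sketched as follows. This assumes the input is the bare hunk text from the `patch` field, with no `+++` file headers. `added_lines` is a hypothetical helper name.

```bash
# Hypothetical helper: print only the added lines of a patch read from stdin,
# with the leading "+" removed. Context (" ") and removed ("-") lines and
# "@@" hunk headers are dropped.
added_lines() {
  sed -n 's/^+//p'
}

# Small made-up patch for illustration.
printf '%s\n' '@@ -1,2 +1,2 @@' ' context' '-old line' '+Servers MUST reply.' |
  added_lines
# prints: Servers MUST reply.
```

The `patch` text for one file can be piped straight into the helper, for example by selecting `.patch` in the `--jq` expression instead of `{filename, patch}`.
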
| 75 | +## Step 4: Extract RFC 2119 requirements |
| 76 | + |
| 77 | +Walk the added lines and identify sentences containing the keywords: **MUST**, **MUST NOT**, **SHOULD**, **SHOULD NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **MAY**, **OPTIONAL**. |
| 78 | + |
| 79 | +**Quote the whole sentence**, not just the matched line. The obligation may sit in a bullet point that contains no keyword of its own and inherits one from its lead-in sentence, e.g.: |
| 80 | + |
| 81 | +> Servers SHOULD return standard JSON-RPC errors for common failure cases: |
| 82 | +> |
| 83 | +> - Resource not found: -32602 (Invalid Params) |
| 84 | + |
| 85 | +The bullet inherits `SHOULD`. The YAML row should quote the _combined_ obligation: `'Servers SHOULD return standard JSON-RPC errors for common failure cases: Resource not found: -32602 (Invalid Params)'`. See `src/scenarios/server/sep-2164.yaml` for the canonical example. |
| 86 | + |
| 87 | +**Regex alone is insufficient** (this is called out in Issue #243). Read for context: pronouns, "the server", and "such cases" all refer back to the lead-in. |
| 88 | + |
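A keyword scan is still a useful first pass before reading for context. The sketch below flags candidate lines and also shows the limitation described in Step 4: the inherited bullet is not flagged. `flag_keywords` is a hypothetical helper name.

```bash
# Hypothetical helper: list lines on stdin that contain an RFC 2119 keyword,
# each prefixed with its line number. Matching is case-sensitive and
# whole-word, so lowercase "may" or "must" in ordinary prose is ignored.
# "MUST NOT", "SHOULD NOT", and "SHALL NOT" are caught by their first word.
flag_keywords() {
  grep -nwE 'MUST|SHALL|SHOULD|REQUIRED|MAY|OPTIONAL' || true
}

printf '%s\n' \
  'Servers SHOULD return standard JSON-RPC errors for common failure cases:' \
  '' \
  '- Resource not found: -32602 (Invalid Params)' |
  flag_keywords
# prints: 1:Servers SHOULD return standard JSON-RPC errors for common failure cases:
```

Only line 1 is reported, so the bullet on line 3 has to be joined to its lead-in by reading the surrounding text, not by the scan.
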
| 89 | +## Step 5: Map severity → check vs. excluded |
| 90 | + |
| 91 | +From `AGENTS.md:50-56`: |
| 92 | + |
| 93 | +| Keyword | Severity | YAML field | |
| 94 | +| ---------------------------------------------- | ------------------------- | -------------------------- | |
| 95 | +| MUST / MUST NOT / SHALL / SHALL NOT / REQUIRED | FAILURE | `check: sep-<NNNN>-<slug>` | |
| 96 | +| SHOULD / SHOULD NOT | WARNING | `check: sep-<NNNN>-<slug>` | |
| 97 | +| MAY / OPTIONAL | (not enforced as a check) | `excluded: '<reason>'` | |
| 98 | + |
| 99 | +If a requirement is testable in principle but you can't see how to drive it from the harness, write a `check:` row anyway and leave it for the human to wire up — do **not** silently demote to `excluded:`. |
| 100 | + |
| 101 | +Use `excluded:` only when the requirement genuinely can't be protocol-observed (e.g. "clients SHOULD also accept -32002": a scenario that exercises a server cannot observe client-side acceptance). When you use `excluded:`, state the reason explicitly and add an `issue:` URL if there's a tracking issue. |
| 102 | + |
| 103 | +Slug convention: lowercase-kebab, derived from the verb phrase. Examples from `sep-2164.yaml`: `no-empty-contents`, `error-code`. The same `id` is used for SUCCESS and FAILURE (`AGENTS.md:52`). |
| 104 | + |
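The table and slug convention in Step 5 can be sketched as two small helpers. `severity_for` and `check_id` are hypothetical names, and `EXCLUDED` is a placeholder label chosen here to mean "write an `excluded:` row"; it is not a severity defined by the repo.

```bash
# Hypothetical helper: map an RFC 2119 keyword to the severity in the table.
severity_for() {
  case "$1" in
    MUST|"MUST NOT"|SHALL|"SHALL NOT"|REQUIRED) echo FAILURE ;;
    SHOULD|"SHOULD NOT")                        echo WARNING ;;
    MAY|OPTIONAL)                               echo EXCLUDED ;;  # placeholder label
    *) echo "unknown keyword: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build a check ID from a SEP number and a short phrase,
# lowercasing it and collapsing every non-alphanumeric run into one hyphen.
check_id() {
  slug=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' |
    sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  printf 'sep-%s-%s\n' "$1" "$slug"
}

severity_for "MUST NOT"               # prints: FAILURE
check_id 2164 "No empty contents"     # prints: sep-2164-no-empty-contents
```

Choosing the phrase that goes into the slug remains a judgment call; the helper only normalizes its spelling.
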
| 105 | +## Step 6: Rewrite the YAML |
| 106 | + |
| 107 | +Replace the two TODO rows the CLI generated with one row per extracted requirement. Preserve the CLI's quoting style (single quotes, two-space indent — see `src/scenarios/server/sep-2164.yaml`). |
| 108 | + |
| 109 | +If a requirement is ambiguous or you're not confident, leave it as a `TODO:` row rather than guessing — humans review this yaml before scenarios get written. |
| 110 | + |
| 111 | +Also fix the `spec_url`: the CLI emits the page URL with no anchor. If the requirements you extracted live under a specific spec subsection (e.g. `#error-handling`), append it. |
| 112 | + |
| 113 | +Write the result back to `$YAML`. |
| 114 | + |
| 115 | +## Step 7: Hand-off |
| 116 | + |
| 117 | +Report to the user, in this order: |
| 118 | + |
| 119 | +1. Path to the generated yaml. |
| 120 | +2. Number of rows extracted (e.g. "3 `check:` rows, 1 `excluded:` row"). |
| 121 | +3. Any requirements you marked TODO and why. |
| 122 | +4. Reminder of the next steps the user still owns: |
| 123 | + - implement the TypeScript scenario under `src/scenarios/<target>/`, |
| 124 | + - register it in the appropriate suite list in `src/scenarios/index.ts` (`AGENTS.md:48`), |
| 125 | + - add a passing example to the everything-client/server and a negative test, per `AGENTS.md:74-81`. |
| 126 | + |
| 127 | +Do **not** generate the scenario `.ts` file or touch `src/scenarios/index.ts`. The skill's scope ends at the yaml. |