
feat: per-command MCP outputSchema — Phase 4 #941

Merged
thymikee merged 2 commits into main from feat/phase4-mcp-output-schema on Jun 30, 2026

Conversation

@thymikee

Member

Phase 4 (agent-cost): per-command MCP outputSchema

MCP agents currently have to re-parse the text content to learn a command's result shape. This slice advertises a per-command outputSchema so agents can trust structuredContent directly.

What this does

  • Adds src/mcp/command-output-schemas.ts: a hand-authored, partial-coverage COMMAND_OUTPUT_SCHEMAS registry keyed by daemon command name. It mirrors the typed-result spine CommandResultMap (src/core/command-descriptor/command-result.ts) one-for-one — schemas authored by hand from the matching src/contracts/* types (there is no type→JSON-Schema generator in the repo).
  • Wires it into listCommandTools() so tools/list now returns outputSchema for the typed commands. No router/server changes (the protocol version supports outputSchema).
  • Tests in src/mcp/__tests__/command-tools.test.ts covering a typed command's discriminant, the byte-identical untyped path, and structuredContent↔schema consistency.
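
The wiring described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `CommandTool` type, the signature of `listCommandTools`, and the placeholder `home` schema are all assumptions made for the example, and the registry is typed loosely for brevity.

```typescript
// Simplified stand-ins; the real types live in the repo and are not shown in this thread.
type JsonSchema = Record<string, unknown>;

interface CommandTool {
  name: string;
  inputSchema: JsonSchema;
  outputSchema?: JsonSchema;
}

// Loosely typed for brevity. The `home` shape is a placeholder, not the real contract.
const COMMAND_OUTPUT_SCHEMAS: Partial<Record<string, JsonSchema>> = {
  home: { type: "object", properties: {}, required: [] },
};

// Hypothetical signature: the real listCommandTools() may take different arguments.
function listCommandTools(commandNames: readonly string[]): CommandTool[] {
  return commandNames.map((name) => {
    const tool: CommandTool = { name, inputSchema: { type: "object" } };
    const outputSchema = COMMAND_OUTPUT_SCHEMAS[name];
    // Attach the key only when a schema exists, so untyped tools serialize as before.
    return outputSchema === undefined ? tool : { ...tool, outputSchema };
  });
}

const tools = listCommandTools(["home", "snapshot"]);
const typedTool = tools.find((tool) => tool.name === "home");
const untypedTool = tools.find((tool) => tool.name === "snapshot");
```

Here `typedTool` carries an `outputSchema` key while `untypedTool` has no such key at all (rather than `outputSchema: undefined`), which is what keeps the serialized untyped entry identical to its previous form.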

The 13 typed commands

press, fill, longpress, boot, shutdown, viewport, home, back, rotate, app-switcher, clipboard, appstate, keyboard.

The genuinely-dynamic commands (snapshot overlays, gestures, perf, logs, …) are intentionally absent — exactly as CommandResultMap omits them rather than inventing a shape.

Design notes

  • Additive only. Untyped tools carry no outputSchema key and stay byte-identical to today.
  • Non-strict. No additionalProperties: false anywhere, so the additive cost object (opted in via --cost / includeCost) and any other additive fields ride into structuredContent and still validate.
  • Accurate, never invented. Required-vs-optional, enums, const discriminants, and discriminated-union branches (clipboard on action, appstate on platform, the interaction trio on kind) mirror the source contract types. Union shapes use oneOf with mutually-exclusive const discriminants so additive fields never break the exactly-one-of contract.
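
To illustrate why mutually exclusive const discriminants keep oneOf stable when additive fields appear, here is a small sketch. The clipboard branch names (`read`, `write`), their fields, and the shape of the cost object are invented for the example and do not come from the real contract. The checker is deliberately tiny and only understands `required` and `const`; it is not a JSON Schema validator.

```typescript
interface PropertyRule {
  type?: string;
  const?: string;
}

interface ObjectBranch {
  type: "object";
  properties: Record<string, PropertyRule>;
  required: string[];
  // Note: no additionalProperties: false, so extra fields are tolerated.
}

interface UnionSchema {
  oneOf: ObjectBranch[];
}

// Hypothetical clipboard result schema, discriminated on `action`.
const clipboardOutputSchema: UnionSchema = {
  oneOf: [
    {
      type: "object",
      properties: { action: { const: "read" }, text: { type: "string" } },
      required: ["action", "text"],
    },
    {
      type: "object",
      properties: { action: { const: "write" } },
      required: ["action"],
    },
  ],
};

// Checks only required keys and const values for one branch.
function matchesBranch(value: Record<string, unknown>, branch: ObjectBranch): boolean {
  const hasRequired = branch.required.every((key) => key in value);
  const constsMatch = Object.entries(branch.properties).every(
    ([key, rule]) => rule.const === undefined || value[key] === rule.const,
  );
  return hasRequired && constsMatch;
}

function countMatchingBranches(value: Record<string, unknown>, schema: UnionSchema): number {
  return schema.oneOf.filter((branch) => matchesBranch(value, branch)).length;
}

// A result with an additive cost object still matches exactly one branch.
const withCost = { action: "read", text: "hello", cost: { nodeCount: 0 } };
const matchCount = countMatchingBranches(withCost, clipboardOutputSchema);
```

Because each branch pins `action` to a different const, an extra field such as `cost` cannot make a value match a second branch, so the exactly-one-of requirement of oneOf holds.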

Verification

  • tsc --noEmit: exit 0
  • oxfmt --write + oxlint --deny-warnings: exit 0
  • fallow audit --base origin/main: CLEAN
  • vitest run src/mcp: 22 passed
  • Layering guard (daemon/platforms/kernel ↛ commands/): empty

Hand-author per-command MCP outputSchemas for the 13 typed commands whose
closed result shapes live in the contracts layer (mirroring CommandResultMap):
press, fill, longpress, boot, shutdown, viewport, home, back, rotate,
app-switcher, clipboard, appstate, keyboard.

The new COMMAND_OUTPUT_SCHEMAS registry is injected into tools/list via
listCommandTools(). It is additive-only: untyped/dynamic tools (snapshot,
gestures, perf, logs, …) carry no outputSchema key and stay byte-identical.
Schemas are non-strict (no additionalProperties:false) so the additive cost
object rides into structuredContent and still validates. MCP agents can now
trust structuredContent against the advertised schema instead of re-parsing
text.
@github-actions

github-actions Bot commented Jun 29, 2026


Size Report

| Metric | Base | Current | Diff |
| --- | --- | --- | --- |
| JS raw | 1.4 MB | 1.4 MB | +3.8 kB |
| JS gzip | 452.3 kB | 453.8 kB | +1.5 kB |
| npm tarball | 556.5 kB | 558.1 kB | +1.6 kB |
| npm unpacked | 2.0 MB | 2.0 MB | +3.8 kB |

Startup median (7 runs, lower is better):

| Scenario | Base | Current | Diff |
| --- | --- | --- | --- |
| CLI --version | 27.9 ms | 26.6 ms | -1.3 ms |
| CLI --help | 48.3 ms | 48.5 ms | +0.2 ms |

Top changed chunks:

| Chunk | Raw diff | Gzip diff |
| --- | --- | --- |
| dist/src/server.js | +8.7 kB | +3.5 kB |
| dist/src/2948.js | +142 B | +76 B |

@thymikee

Member Author

Checked current head 27079e5. The concrete schema branches I sampled line up with the current typed result contracts, and all 21 checks are green now.

One cleanup before I would call this ready: please type-tie COMMAND_OUTPUT_SCHEMAS to the typed-result spine instead of Partial<Record<string, JsonSchema>>.

Right now a misspelled key or a new CommandResultMap entry without a matching MCP outputSchema would compile and simply omit the schema from tools/list. That undercuts the PR’s stated one-for-one invariant with CommandResultMap. A small change like importing type CommandResultMap and declaring the literal with satisfies Record<keyof CommandResultMap, JsonSchema> would make coverage drift a type error while still keeping the schemas hand-authored.

Replace Partial<Record<string, JsonSchema>> with
`satisfies Record<keyof CommandResultMap, JsonSchema>`, so the one-for-one
invariant with the typed-result spine is compiler-enforced: a new
CommandResultMap entry without an output schema is now a missing-key error, and
a misspelled/extra key is an excess-property error (previously both compiled and
silently omitted the schema). The lookup in listCommandTools guards with an `in`
check since the registry is keyed by the typed commands only.
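
A reduced sketch of the type-tie described in this commit message, assuming a two-entry stand-in for CommandResultMap with the result types elided. The `outputSchemaFor` helper is hypothetical and only illustrates the `in` guard. The `satisfies` operator requires TypeScript 4.9 or later.

```typescript
type JsonSchema = Record<string, unknown>;

// Hypothetical stand-in: the real CommandResultMap has 13 entries with concrete result types.
interface CommandResultMap {
  home: unknown;
  back: unknown;
}

const COMMAND_OUTPUT_SCHEMAS = {
  home: { type: "object" },
  back: { type: "object" },
  // Deleting `back` here would be a missing-key error;
  // adding a misspelled `bakc` would be an excess-property error.
} satisfies Record<keyof CommandResultMap, JsonSchema>;

// Hypothetical helper showing the guarded lookup for arbitrary command names.
function outputSchemaFor(command: string): JsonSchema | undefined {
  return command in COMMAND_OUTPUT_SCHEMAS
    ? COMMAND_OUTPUT_SCHEMAS[command as keyof typeof COMMAND_OUTPUT_SCHEMAS]
    : undefined;
}

const homeSchema = outputSchemaFor("home");
const snapshotSchema = outputSchemaFor("snapshot");
```

Unlike a type annotation, `satisfies` checks the literal against the record type without widening it, so each entry keeps its own inferred type. One caveat with a bare `in` check is that it also sees inherited keys such as `toString`; that is harmless as long as no command shares a name with an Object.prototype member.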
@thymikee

Member Author

Done, pushed ec3f908. The COMMAND_OUTPUT_SCHEMAS literal now closes with } satisfies Record<keyof CommandResultMap, JsonSchema> (with a type-only import of CommandResultMap). Coverage drift is now a compile error:

  • a new CommandResultMap entry without a matching schema → missing-key error
  • a misspelled/extra key → excess-property error

The listCommandTools lookup guards with an in check (the registry is keyed by the typed commands only, so untyped tools resolve to no outputSchema). The schema VALUE types stay the precise hand-authored literals. tsc/oxlint/fallow clean, 22 MCP tests pass.

@thymikee

Member Author

Re-reviewed current head ec3f908 after the schema registry update. The earlier blocker is resolved: COMMAND_OUTPUT_SCHEMAS now satisfies Record<keyof CommandResultMap, JsonSchema>, so missing or misspelled typed-result schemas fail at compile time instead of silently disappearing from tools/list.

I rechecked the scoped diff against Phase 4 in plans/perfect-shape.md, CommandResultMap, and the MCP tool projection. Typed commands advertise outputSchema, untyped/dynamic tools still omit the key, the default response path remains additive-only, and git diff --check is clean. All 21 PR checks are passing. No actionable blockers from my side; labeling ready-for-human for maintainer judgment.

thymikee added the ready-for-human label (Valid work that needs human implementation, judgment, or maintainer merge) on Jun 29, 2026
thymikee merged commit 2e41c81 into main on Jun 30, 2026
21 of 22 checks passed
thymikee deleted the feat/phase4-mcp-output-schema branch on June 30, 2026 05:24
@github-actions

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-30 05:24 UTC

thymikee added a commit that referenced this pull request Jun 30, 2026
The kernel move (#940) deleted src/utils/device.ts; #941's
command-output-schemas.ts (merged after #940's codemod ran) still imported the
old path. Same one-line fix as #943; de-dups once that lands.
thymikee added a commit that referenced this pull request Jun 30, 2026
…943)

#941 (MCP outputSchema) and #940 (errors/redaction/device -> src/kernel) merged
in an order where #940's import codemod never saw #941's new
command-output-schemas.ts, so it still imports '../utils/device.ts' — which no
longer exists. tsc/build on main is red. Repoint the import at
'../kernel/device.ts'. The only stale reference on main.
thymikee added a commit that referenced this pull request Jun 30, 2026
…ase 4 (#942)

* feat: leveled response views + --level knob, with a snapshot digest — Phase 4

Add the agent-cost leveled-response system: a responseLevel knob
(digest | default | full) plumbed end to end behind a global --level flag
(mirroring --cost), and a per-command ResponseView registry applied in the
router on the success path.

- contracts: RESPONSE_LEVELS/ResponseLevel + meta.responseLevel + boundary
  schema whitelist. Plumbing mirrors --cost: cli-flags FlagDefinition +
  GLOBAL_FLAG_KEYS, AgentDeviceClientConfig + overrides, buildClientConfig,
  buildMeta. ResponseLevel exported from the public root.
- src/daemon/response-views.ts: the ResponseView registry. Seeds the snapshot
  digest — the full node tree (the dominant token sink) collapses to
  { nodeCount, refs: first 12 hittable/non-occluded refs with labels } plus the
  cheap top-level signals (truncated/visibility/snapshotQuality). full returns
  today's shape (nothing richer is computed yet).
- router graft (applyResponseLevelView + applyAgentCostGrafts): composes with
  the existing cost block. With responseLevel default (or unset) AND no
  registered view AND no --cost, the original response is returned UNCHANGED —
  byte-identical to today (Maestro .ad recompare safe). cost.nodeCount reads the
  original node tree so it stays accurate even after a digest.
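
The snapshot digest described in the list above might look roughly like this. The node and result shapes, the `snapshotView` name, and the `MAX_DIGEST_REFS` constant are assumptions for illustration; the real ResponseView registry and snapshot contract are not shown in this thread.

```typescript
type ResponseLevel = "digest" | "default" | "full";

// Hypothetical shapes; the real snapshot contract carries more fields.
interface SnapshotNode {
  ref: string;
  label: string;
  hittable: boolean;
  occluded: boolean;
}

interface SnapshotResult {
  nodes: SnapshotNode[];
  truncated: boolean;
}

interface SnapshotDigest {
  nodeCount: number;
  refs: { ref: string; label: string }[];
  truncated: boolean;
}

const MAX_DIGEST_REFS = 12;

function snapshotView(result: SnapshotResult, level: ResponseLevel): SnapshotResult | SnapshotDigest {
  // default and full return the original object untouched.
  if (level !== "digest") return result;
  const refs = result.nodes
    .filter((node) => node.hittable && !node.occluded)
    .slice(0, MAX_DIGEST_REFS)
    .map((node) => ({ ref: node.ref, label: node.label }));
  // Drop the node tree, keep the cheap top-level signal.
  return { nodeCount: result.nodes.length, refs, truncated: result.truncated };
}

// 30 nodes: even indexes are hittable, and node 0 is occluded.
const sample: SnapshotResult = {
  truncated: false,
  nodes: Array.from({ length: 30 }, (_, i) => ({
    ref: `e${i}`,
    label: `Item ${i}`,
    hittable: i % 2 === 0,
    occluded: i === 0,
  })),
};

const digest = snapshotView(sample, "digest");
const passthrough = snapshotView(sample, "default");
```

In this sketch the digest reports the node count of the original tree (30) while listing at most 12 refs, and the default level returns the very same object, which mirrors the unchanged-response guarantee stated above.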

Tests: snapshot view unit test (digest filters hittable/occluded, drops the
tree, keeps cheap signals; default/full passthrough); router graft test via an
injected view (default identity byte-identical, digest applies, full passthrough,
digest+cost composition, unregistered-command passthrough, boundary parse).

Verified: tsc, oxfmt + oxlint --deny-warnings, fallow audit clean, rslib build,
Layering Guard empty, 1106 daemon/contracts/client tests pass (incl. the
existing cost/typed-error grafts after the restructure).

* fix: repoint MCP output-schemas import to kernel/device (rebase fixup)

The kernel move (#940) deleted src/utils/device.ts; #941's
command-output-schemas.ts (merged after #940's codemod ran) still imported the
old path. Same one-line fix as #943; de-dups once that lands.

* fix: re-classify responseLevel flag in integration-progress model

The --level/responseLevel flag is a diagnostics/output flag (not device-
observable), classified in the exclusion bucket alongside --cost. (Lost in an
earlier rebase; re-applying.)

Labels

ready-for-human Valid work that needs human implementation, judgment, or maintainer merge


1 participant