Feature Specification: Output-Schema Validation for Proxied Tool Calls

Feature Branch: 056-output-schema-validation
Created: 2026-05-25
Status: Draft
Input: Spec 054 Track A (carved into its own feature). When an upstream tool declares an outputSchema, mcpproxy verifies the tool's structured response conforms to that schema before it reaches the agent, so a buggy or compromised server cannot inject malformed, oversized, or unexpected data into the agent's context.

Scope note: this feature is Track A only of the Spec 054 umbrella ("MCP Security Gateway Hardening"). It deliberately excludes Track B (output sanitisation), Track C (per-tool ACLs), Track D (TOFU pinning hardening), and Track E (audit hash chain). Those ship separately.

User Scenarios & Testing (mandatory)

User Story 1 - Structured output is validated against its declared schema (Priority: P1)

As an operator running AI agents through mcpproxy, when an upstream tool declares an outputSchema, I want mcpproxy to verify that the tool's structured response actually conforms to that schema before it reaches my agent, so that a buggy or compromised server cannot inject malformed, oversized, or unexpected data into the agent's context.

Why this priority: This is the least-covered area, since mcpproxy performs no output validation today. It is also the highest-leverage new capability, because it completes the "validated data out" half of the security story. It is mostly additive with safe defaults, so it can ship as a standalone MVP.

Independent Test: Configure a stub upstream tool with an outputSchema. Return a conforming response (must pass through unchanged) and a non-conforming response (blocked in strict mode / forwarded-and-tagged in warn mode, with an activity record emitted). Verify structuredContent is preserved verbatim on the pass path (no strip-then-validate).

Acceptance Scenarios:

Given a tool with a declared output schema and output_validation.mode=strict, When it returns structuredContent violating the schema, Then the call is blocked with a clear error and a policy_decision activity record is written.
Given the same tool in mode=warn, When it returns a violating response, Then the response is forwarded but tagged as schema-violating and a policy_decision activity record is written.
Given a tool with NO declared output schema, When it returns any response, Then validation is a no-op and behaviour is unchanged (backward compatible).
Given a valid structured response, When it is validated, Then the original structuredContent reaches the agent unmodified (byte-for-byte).
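
The four scenarios above reduce to a small decision table. The following Go sketch is illustrative only: `Mode`, `Outcome`, `decide`, and `errMissingID` are hypothetical names introduced here, not existing mcpproxy identifiers, and the real handler may model the result differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode mirrors the three output_validation.mode values named in this spec.
type Mode string

const (
	ModeOff    Mode = "off"
	ModeWarn   Mode = "warn"
	ModeStrict Mode = "strict"
)

// Outcome is a hypothetical result type, not an existing mcpproxy type.
type Outcome struct {
	Forward bool   // deliver the upstream response to the agent
	Record  bool   // write a policy_decision activity record
	Detail  string // human-readable violation description for the record
}

// errMissingID is a sample violation used by the demo below.
var errMissingID = errors.New(`missing required property "id"`)

// decide maps (mode, schema presence, validation result) to an outcome.
// A nil violation means the response conformed to the schema.
func decide(mode Mode, hasSchema bool, violation error) Outcome {
	if mode == ModeOff || !hasSchema || violation == nil {
		return Outcome{Forward: true} // pass through unchanged, no record
	}
	if mode == ModeStrict {
		return Outcome{Forward: false, Record: true, Detail: violation.Error()}
	}
	// warn (and, in this sketch, any unrecognised mode): forward the payload
	// as-is and tag the violation through the activity record only.
	return Outcome{Forward: true, Record: true, Detail: violation.Error()}
}

func main() {
	for _, m := range []Mode{ModeOff, ModeWarn, ModeStrict} {
		fmt.Printf("%-6s -> %+v\n", m, decide(m, true, errMissingID))
	}
}
```

Keeping the decision a pure function of mode, schema presence, and the validation result makes each acceptance scenario a one-line table test.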

User Story 2 - Oversized / pathological output is bounded before validation (Priority: P2)

As an operator, I want extremely large or deeply-nested structured payloads from a tool to be bounded by configurable guards before validation runs, so a single response cannot exhaust memory, blow the agent's context window, or DoS the proxy through schema-validation cost.

Why this priority: A guard is cheap, protects the proxy itself, and is a prerequisite for safely running schema validation (which can be expensive on adversarial nested input). It builds on the same response chokepoint as Story 1.

Independent Test: Configure a byte-size guard and a nesting-depth guard. Return a structured payload exceeding the byte size, and one exceeding the depth. Verify each is treated as a validation failure (blocked in strict, tagged in warn) with an activity record, and that the guard check happens before full schema validation.

Acceptance Scenarios:

Given a configured max structured-output byte size, When a response's structured payload exceeds it, Then the call is treated as a guard violation (blocked in strict / tagged in warn) and a policy_decision activity record is written.
Given a configured max nesting depth, When a response's structured payload exceeds it, Then the same guard-violation handling applies.
Given a payload within both guards, When it is validated, Then guards add negligible overhead and schema validation proceeds.
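
One way the two guards could run ahead of schema validation is sketched below in Go. It assumes the structured payload is available as raw JSON bytes; `checkGuards`, `errTooLarge`, and `errTooDeep` are hypothetical names. The depth scan streams tokens with the standard library's `json.Decoder.Token`, so it never builds the full document tree and stops as soon as the limit is crossed.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// Hypothetical sentinel errors for the two guard breaches.
var (
	errTooLarge = errors.New("structured output exceeds max_bytes")
	errTooDeep  = errors.New("structured output exceeds max_depth")
)

// checkGuards runs the cheap checks first: byte length, then a streaming
// nesting-depth scan. It returns nil when the payload is within both limits.
func checkGuards(raw []byte, maxBytes, maxDepth int) error {
	if len(raw) > maxBytes {
		return errTooLarge
	}
	dec := json.NewDecoder(bytes.NewReader(raw))
	depth := 0
	for {
		tok, err := dec.Token()
		if errors.Is(err, io.EOF) {
			return nil // whole payload scanned without breaching the limit
		}
		if err != nil {
			return fmt.Errorf("structured output is not valid JSON: %w", err)
		}
		if d, ok := tok.(json.Delim); ok {
			switch d {
			case '{', '[':
				depth++
				if depth > maxDepth {
					return errTooDeep // short-circuit before schema validation
				}
			case '}', ']':
				depth--
			}
		}
	}
}

func main() {
	payload := []byte(`{"a":{"b":[1,2,3]}}`)
	fmt.Println("generous limits:", checkGuards(payload, 1<<20, 32))
	fmt.Println("depth limit 2:  ", checkGuards(payload, 1<<20, 2))
	fmt.Println("byte limit 4:   ", checkGuards(payload, 4, 32))
}
```

Checking the byte length before touching the decoder keeps the cost of rejecting an oversized payload constant, which is the property Story 2 relies on.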

User Story 3 - Operator can observe and tune validation behaviour (Priority: P3)

As an operator, I want to configure validation mode and guard limits, and to see validation failures in the activity log alongside other policy decisions, so I can roll the feature out safely (warn first, then strict) and audit what was caught.

Why this priority: Configurability and observability turn a binary feature into one operators trust enough to enable. It reuses the existing activity-log and config plumbing.

Independent Test: Set output_validation.mode to off, warn, and strict and confirm behaviour matches each. Confirm mcpproxy activity list surfaces the validation policy_decision records and that mcpproxy activity show <id> reveals the tool, mode, and violation detail.

Acceptance Scenarios:

Given output_validation.mode=off, When any tool returns structured output, Then no validation runs and no validation activity records are written.
Given a validation failure was recorded, When the operator runs mcpproxy activity list, Then the failure appears as a policy_decision record filterable by status.
Given a validation failure record, When the operator inspects it, Then it includes the server, tool, mode, and a human-readable description of the violation.

Edge Cases

Legacy text-only response (ContextForge #4042 trap): a tool declares an outputSchema but returns only legacy text content with no structuredContent. This MUST NOT hard-fail in warn mode — it is treated as "no structured output to validate" (no-op). In strict mode the configured missing_structured_content posture decides (default: allow, to preserve backward compatibility with tools that under-declare).
Oversized / deeply-nested payload: bounded by the size/depth guards (Story 2) before schema validation, consistent with existing payload caps; the guard verdict short-circuits expensive validation.
Malformed / unparseable schema on the tool: if the captured outputSchema is itself invalid JSON Schema, validation cannot run; treat it as a no-op and emit a one-time warning per tool (do not block traffic on the proxy's inability to compile a schema).
IsError upstream result: when the upstream tool returns an error result, output validation is skipped (there is no successful structured payload to validate).
Streaming / multiple content blocks: validation targets the single structuredContent field of the result; text/image/audio/embedded blocks are out of scope for Track A and pass through untouched.
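
The edge cases above mostly decide whether validation runs at all. A minimal Go sketch of that pre-check follows. `Call`, `Action`, and `plan` are hypothetical names, the order of the checks is an assumption, and `"deny"` is an assumed name for the non-default missing_structured_content posture, since this spec names only `allow`.

```go
package main

import "fmt"

// Call is a hypothetical summary of one proxied tool-call result.
type Call struct {
	Mode           string // "off", "warn", or "strict"
	HasSchema      bool   // tool declared an outputSchema at discovery
	SchemaCompiles bool   // captured schema is valid JSON Schema
	IsError        bool   // upstream returned an error result
	HasStructured  bool   // result carries structuredContent
	MissingPosture string // "allow" (default); "deny" is an assumed name
}

// Action says what the validation hook should do for a call.
type Action int

const (
	Skip     Action = iota // no-op: forward unchanged, no activity record
	Validate               // run guards and schema validation
	Fail                   // treat as a violation without validating
)

// plan applies the edge-case rules in a fixed order and explains the result.
func plan(c Call) (Action, string) {
	switch {
	case c.Mode == "off":
		return Skip, "validation disabled"
	case !c.HasSchema:
		return Skip, "no declared output schema"
	case c.IsError:
		return Skip, "upstream error result"
	case !c.SchemaCompiles:
		return Skip, "schema does not compile (warn once per tool)"
	case !c.HasStructured:
		if c.Mode == "strict" && c.MissingPosture == "deny" {
			return Fail, "structuredContent missing"
		}
		return Skip, "legacy text-only response"
	}
	return Validate, "validate structuredContent"
}

func main() {
	legacy := Call{Mode: "strict", HasSchema: true, SchemaCompiles: true, MissingPosture: "allow"}
	action, why := plan(legacy)
	fmt.Println(action == Skip, why)
}
```

Text, image, audio, and embedded blocks never appear in this sketch because they pass through untouched regardless of the chosen action.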

Requirements (mandatory)

Functional Requirements

FR-A1: System MUST capture each upstream tool's declared output schema during tool discovery/indexing and persist it alongside the existing input schema, so the schema is available at call time without re-querying the upstream.
FR-A2: System MUST, on every proxied tool-call return, validate the structured portion of the response against the tool's captured output schema when one exists.
FR-A3: System MUST preserve the original structured response unchanged on the success path — validation operates on a copy/read-only view; it MUST never strip-then-validate or otherwise mutate the forwarded payload on success.
FR-A4: System MUST support strict (block on violation), warn (forward + tag), and off (disabled) modes, configurable globally and defaulting to a backward-compatible setting (warn).
FR-A5: System MUST emit a policy_decision activity record on every validation failure, including the server, tool, mode, and a description of the violation.
FR-A6: System MUST enforce configurable byte-size and nesting-depth guards on structured output, evaluated before full schema validation; a guard breach is handled as a validation failure under the active mode.
FR-A7: When a tool declares no output schema, validation MUST be a no-op (behaviour unchanged, no activity records).
FR-A8: When a tool declares an output schema but the response carries no structuredContent (legacy text-only), the system MUST NOT hard-fail in warn mode (treat as nothing-to-validate); behaviour in strict mode is governed by a configurable missing_structured_content posture defaulting to allow.
FR-A9: When the captured output schema is itself not a compilable JSON Schema, the system MUST treat validation as a no-op for that tool and surface a single diagnostic warning rather than blocking traffic.
FR-A10: Validation MUST be skipped when the upstream result is an error result (IsError), since there is no successful structured payload to validate.
FR-A11: In warn mode, a forwarded-but-violating response MUST be tagged in a way observable to the operator (activity record) without altering the payload delivered to the agent.
FR-A12: Behaviour MUST be identical across personal and server editions (no build-tag-specific logic).
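
FR-A3 is the requirement most easily broken by accident, because decoding and re-encoding JSON can change whitespace and key order. The Go sketch below shows one way to satisfy it, assuming the structured payload is held as raw bytes: validate a decoded throwaway copy and forward the original bytes. `validator`, `requireKey`, and `checkAndForward` are hypothetical names, and `requireKey` is a toy check standing in for real schema validation. Returning an error on violation models strict mode only.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// validator is a hypothetical stand-in for a compiled output schema.
type validator func(decoded any) error

// requireKey builds a toy validator that only checks for one top-level key.
func requireKey(key string) validator {
	return func(decoded any) error {
		obj, ok := decoded.(map[string]any)
		if !ok {
			return fmt.Errorf("expected a JSON object")
		}
		if _, found := obj[key]; !found {
			return fmt.Errorf("missing required property %q", key)
		}
		return nil
	}
}

// checkAndForward validates a decoded copy and returns the original bytes.
// It never re-marshals, so whitespace and key order survive untouched.
func checkAndForward(raw []byte, validate validator) ([]byte, error) {
	var decoded any
	if err := json.Unmarshal(raw, &decoded); err != nil {
		return nil, fmt.Errorf("decode structuredContent: %w", err)
	}
	if err := validate(decoded); err != nil {
		return nil, err
	}
	return raw, nil
}

func main() {
	// Unusual spacing and key order would be lost by a decode/encode round trip.
	raw := []byte("{ \"name\": \"x\",\n  \"id\": 7 }")
	out, err := checkAndForward(raw, requireKey("id"))
	fmt.Println(err == nil, bytes.Equal(out, raw))
}
```

A test for SC-002 can compare the forwarded bytes against the upstream bytes directly, as the demo does with `bytes.Equal`.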

Key Entities (include if feature involves data)

Captured Output Schema: the JSON Schema a tool declares for its structured output, captured at discovery and persisted with the tool's existing metadata (alongside input schema). Absent for most tools today.
Output Validation Config: operator-facing settings — mode (off/warn/strict), max_bytes, max_depth, missing_structured_content posture — with backward-compatible defaults.
Validation Failure Record: a policy_decision activity entry capturing server, tool, mode, and violation description; reuses the existing activity-log entity.
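
The Output Validation Config entity could be modelled roughly as follows. This Go sketch assumes a JSON config fragment whose keys match the field names in this spec; the struct name, the loader, and the nesting are hypothetical. The numeric defaults are placeholders only, because exact guard limits are finalised in planning.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OutputValidationConfig is a hypothetical shape for the operator settings.
type OutputValidationConfig struct {
	Mode                     string `json:"mode"`
	MaxBytes                 int    `json:"max_bytes"`
	MaxDepth                 int    `json:"max_depth"`
	MissingStructuredContent string `json:"missing_structured_content"`
}

// defaultConfig returns the backward-compatible defaults from this spec.
func defaultConfig() OutputValidationConfig {
	return OutputValidationConfig{
		Mode:                     "warn",
		MaxBytes:                 4 << 20, // placeholder, not a decided default
		MaxDepth:                 32,      // placeholder, not a decided default
		MissingStructuredContent: "allow",
	}
}

// loadConfig overlays operator settings on the defaults and rejects bad values.
func loadConfig(raw []byte) (OutputValidationConfig, error) {
	cfg := defaultConfig() // fields absent from raw keep their defaults
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return OutputValidationConfig{}, fmt.Errorf("parse output_validation: %w", err)
	}
	switch cfg.Mode {
	case "off", "warn", "strict":
	default:
		return OutputValidationConfig{}, fmt.Errorf("unknown mode %q", cfg.Mode)
	}
	if cfg.MaxBytes <= 0 || cfg.MaxDepth <= 0 {
		return OutputValidationConfig{}, fmt.Errorf("max_bytes and max_depth must be positive")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig([]byte(`{"mode": "strict", "max_depth": 16}`))
	fmt.Printf("%+v %v\n", cfg, err)
}
```

Unmarshalling into a struct pre-filled with defaults means an operator who sets only `mode` still gets the guard limits and the `allow` posture.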

Success Criteria (mandatory)

Measurable Outcomes

SC-001: For a tool with a declared output schema returning a non-conforming structured response, mcpproxy blocks the call in strict mode and forwards-with-an-audit-record in warn mode — 100% of the time in tests.
SC-002: For a conforming response, the structuredContent delivered to the agent is byte-for-byte identical to what the upstream returned (zero mutation on the success path).
SC-003: For a tool with no declared output schema, end-to-end behaviour (latency, payload, content blocks) is indistinguishable from the pre-feature baseline (no observable change).
SC-004: A structured payload exceeding the configured byte-size or nesting-depth guard is caught before full schema validation, in 100% of guard-breach tests.
SC-005: Every validation failure produces exactly one policy_decision activity record discoverable via mcpproxy activity list, with the tool, mode, and violation present in activity show.
SC-006: With mode=off (or a tool without a schema), added per-call overhead is negligible (no measurable regression in the E2E tool-call latency baseline).

Assumptions

Default mode is warn (not strict), so enabling the feature never breaks an existing working agent on day one; operators opt into strict.
missing_structured_content defaults to allow, because many real-world tools declare an output schema yet still return only text content; hard-failing those would break compatibility (the ContextForge #4042 lesson).
Validation targets structuredContent only. Text/image/audio/embedded content blocks are out of scope for Track A; their handling (sanitisation/spotlighting) is Track B.
Schema compilation is cached per tool so repeated calls don't recompile; an uncompilable schema degrades to no-op (FR-A9).
The existing forwardContentResult chokepoint (internal/server/content_forward.go) is where validation hooks in, since it is the single response path for proxied calls; emitActivityPolicyDecision is reused for FR-A5/FR-A11 records.
Default guard limits are chosen to be generous enough never to trip on legitimate tool output (e.g. a multi-MB byte cap and a depth limit in the tens) while still bounding pathological adversarial input; exact defaults will be finalised in planning.
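
The per-tool compilation cache and the degrade-to-no-op rule (FR-A9) can be sketched together in Go. Everything named here is hypothetical: `schemaCache`, `checkFunc`, `compileFunc`, and `errBadSchema` are illustrative, `compileFunc` stands in for whichever JSON Schema library is selected in planning, and the tool keys in the demo are made up. The sketch does not handle a tool changing its schema after discovery.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// checkFunc validates one decoded payload against a compiled schema.
type checkFunc func(decoded any) error

// compileFunc is a hypothetical stand-in for a JSON Schema compiler.
type compileFunc func(schema []byte) (checkFunc, error)

var errBadSchema = errors.New("schema is not valid JSON Schema")

// schemaCache memoises compilation per tool, including failed attempts.
type schemaCache struct {
	mu      sync.Mutex
	entries map[string]checkFunc // nil value: uncompilable, validation is a no-op
	compile compileFunc
	warn    func(tool string, err error)
}

func newSchemaCache(compile compileFunc, warn func(string, error)) *schemaCache {
	return &schemaCache{entries: map[string]checkFunc{}, compile: compile, warn: warn}
}

// get returns the cached checker for a tool, compiling on first use.
// A nil result means "skip validation for this tool".
func (c *schemaCache) get(tool string, schema []byte) checkFunc {
	c.mu.Lock()
	defer c.mu.Unlock()
	if check, ok := c.entries[tool]; ok {
		return check
	}
	check, err := c.compile(schema)
	if err != nil {
		c.warn(tool, err) // reached once per tool because the nil entry is cached
		check = nil
	}
	c.entries[tool] = check
	return check
}

func main() {
	compiles := 0
	cache := newSchemaCache(
		func(schema []byte) (checkFunc, error) {
			compiles++
			if len(schema) == 0 {
				return nil, errBadSchema
			}
			return func(any) error { return nil }, nil
		},
		func(tool string, err error) { fmt.Printf("warning: %s: %v\n", tool, err) },
	)
	for i := 0; i < 2; i++ {
		cache.get("github:create_issue", []byte(`{"type":"object"}`))
		cache.get("broken:tool", nil)
	}
	fmt.Println("compile calls:", compiles)
}
```

Caching the failed compilation as a nil entry is what keeps the diagnostic to a single warning per tool. Holding the lock during compilation is a simplification that a real implementation might relax.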

Out of Scope

Output sanitisation / redaction / spotlighting of untrusted content (Track B).
Per-tool / per-argument access control (Track C).
TOFU pinning of output schemas / annotations and provenance binding (Track D) — note Track A only captures the schema for validation; pinning its changes is Track D.
Tamper-evident audit hash chain and retention floors (Track E).
Validation of input arguments (already covered by upstream tools / existing intent flow).

Commit Message Conventions (mandatory)

Use Related #<issue> (never Fixes/Closes/Resolves).
Do NOT include Co-Authored-By: Claude or "Generated with Claude Code" (per repo policy / memory feedback_no_claude_git_attribution).
Conventional Commit prefixes enforced by commitlint (Spec 053 WP-C5): feat(056): ..., test(056): ..., etc.
