Add FileInput/FileOutput well-known types for file transfer between MCP tools #42
Open
birdayz wants to merge 2 commits into
… rewriting

MCP tools that move files need different schemas depending on whether the deployment shares a filesystem between aigw and the agent (sandbox) or not (hosted). Rather than each MCP reinventing this, define it once.

The generator recognizes mcp.v1.FileInput and mcp.v1.FileOutput by FQN (same mechanism as google.protobuf.Timestamp) and emits schemas with an x-mcp-file-mode marker. At registration time, WithFileMode(mode) rewrites the schema:

- FileModeInline: strips file_path, requires content (hosted)
- FileModePath: strips content, requires file_path (sandbox)
- FileModeAll: keeps both, agent picks (hybrid)

When no mode is set, the marker stays and all fields are exposed.

The google/* testdata changes are just buf pulling newer BSR versions during regeneration -- no functional change there.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…iant

The flat field layout forced runtime schema rewriting to strip individual fields. The oneof pattern is cleaner -- each transport (inline bytes, filesystem path, S3 presigned URL) is a distinct variant, and the runtime exposes only the one matching the deployment mode.

The S3Reference type carries a presigned URL. The framework (not the MCP handler) is responsible for uploading/downloading from S3 and minting the URLs. This means MCP handlers always produce raw bytes and never touch object storage directly -- the framework intercepts and rewrites to the configured transport.

Four modes:

- FileModeInline: only content variant (hosted, small files)
- FileModePath: only path variant (sandbox, shared filesystem)
- FileModeS3: only s3 variant (cloud-native, presigned URLs)
- FileModeAll: all variants exposed, agent picks

S3 wiring is not yet implemented in any consumer -- this commit defines the proto shape and schema machinery so the framework layer in cloudv2 can wire it when ready.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
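
The mode-to-variant filtering described in this commit could look roughly like the sketch below. It is not code from the PR: rewriteFileSchema, keptVariant, and isVariant are hypothetical names, the marker value `true` is a placeholder, and the sketch assumes the generator emits the oneof variants as sibling properties named content, path, and s3 next to the common metadata.

```go
package main

import (
	"fmt"
	"sort"
)

// FileMode mirrors the mode names in the commit message; the numeric
// values are illustrative only.
type FileMode int

const (
	FileModeAll FileMode = iota
	FileModeInline
	FileModePath
	FileModeS3
)

// fileModeMarker is the extension key the generator is described as emitting.
const fileModeMarker = "x-mcp-file-mode"

// keptVariant maps each single-transport mode to the variant it exposes.
// FileModeAll is absent on purpose: it keeps every variant.
var keptVariant = map[FileMode]string{
	FileModeInline: "content",
	FileModePath:   "path",
	FileModeS3:     "s3",
}

// isVariant reports whether a property name is one of the oneof variants
// (assumed property names, see the lead-in).
func isVariant(name string) bool {
	return name == "content" || name == "path" || name == "s3"
}

// rewriteFileSchema (hypothetical) returns a copy of a file-typed schema
// without the marker and without the variants the mode does not expose.
// The input map is left unmodified.
func rewriteFileSchema(schema map[string]any, mode FileMode) map[string]any {
	out := make(map[string]any, len(schema))
	for k, v := range schema {
		if k != fileModeMarker {
			out[k] = v
		}
	}
	props, ok := schema["properties"].(map[string]any)
	if !ok {
		return out
	}
	keep, single := keptVariant[mode]
	kept := make(map[string]any, len(props))
	for name, p := range props {
		if single && isVariant(name) && name != keep {
			continue // variant not offered in this deployment mode
		}
		kept[name] = p
	}
	out["properties"] = kept
	return out
}

func main() {
	schema := map[string]any{
		"type":         "object",
		fileModeMarker: true, // placeholder value
		"properties": map[string]any{
			"filename": map[string]any{"type": "string"},
			"content":  map[string]any{"type": "string"},
			"path":     map[string]any{"type": "string"},
			"s3":       map[string]any{"type": "object"},
		},
	}
	rewritten := rewriteFileSchema(schema, FileModePath)
	names := make([]string, 0)
	for name := range rewritten["properties"].(map[string]any) {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names) // prints [filename path]
}
```

Returning a copy rather than mutating in place keeps the generated schema reusable if the same tool is registered under more than one mode.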
Comment on lines +45 to +46
    // Path on the shared filesystem (sandbox mode).
    string path = 2;
Do we need to validate the path so that nobody can reference the wrong location, such as the local filesystem?
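
One possible shape for such a check, assuming the sandbox mounts the shared filesystem under a single absolute root directory. The root path and the resolveWithinRoot helper are hypothetical and not part of this PR. The check is purely lexical: it does not resolve symlinks, so a real implementation would likely need an extra step for that.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errOutsideRoot = errors.New("path escapes the shared root")

// resolveWithinRoot (hypothetical) cleans p, interprets relative paths as
// relative to root, and rejects any result that lies outside root.
// root is expected to be an absolute path.
func resolveWithinRoot(root, p string) (string, error) {
	if !filepath.IsAbs(p) {
		p = filepath.Join(root, p)
	}
	clean := filepath.Clean(p)
	rel, err := filepath.Rel(root, clean)
	if err != nil {
		return "", err
	}
	// A relative path that starts by climbing out of root is an escape.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errOutsideRoot
	}
	return clean, nil
}

func main() {
	root := "/mnt/shared" // hypothetical sandbox mount point
	candidates := []string{
		"reports/q1.pdf",
		"/mnt/shared/a/../b.txt",
		"../etc/passwd",
		"/etc/passwd",
	}
	for _, p := range candidates {
		resolved, err := resolveWithinRoot(root, p)
		fmt.Printf("%q -> %q, err=%v\n", p, resolved, err)
	}
}
```

Comparing via filepath.Rel instead of a plain string prefix avoids accepting sibling directories such as /mnt/shared-other.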
What
Add mcp.v1.FileInput and mcp.v1.FileOutput as well-known protobuf types with oneof source/destination for deployment-aware file transfer in MCP tools.
Why
MCP tools that move files need different transports depending on deployment topology. Inline bytes work for small files in hosted mode. Filesystem paths avoid context-window bloat in sandbox deployments. And S3 presigned URLs solve the MCP-to-MCP file transfer problem in production without a shared filesystem -- MCP A uploads to S3, emits a presigned GET URL, MCP B downloads directly, and the LLM just passes a URL string instead of carrying megabytes through its context window.
Without a shared abstraction, every file-moving MCP reinvents this, and MCP-to-MCP file passing doesn't work at all in pure MCP (no sandbox) because the protocol has no mechanism for direct server-to-server transfer.
Implementation details
Proto shape
The oneof makes the transport explicit at the proto level. Common metadata (filename, mime_type, size_bytes) sits outside the oneof -- always present regardless of transport.
Generator (pkg/gen/schema.go)
Recognizes mcp.v1.FileInput, mcp.v1.FileOutput, and mcp.v1.S3Reference by FQN in messageFieldSchema() -- same mechanism as google.protobuf.Timestamp. Emits JSON Schemas with an x-mcp-file-mode extension marker that the runtime uses to locate file-typed properties.
Runtime (pkg/runtime/file_mode.go)
Four modes via WithFileMode(mode) registration option:
- FileModeInline -- exposes only the content oneof variant (hosted)
- FileModePath -- exposes only path (sandbox)
- FileModeS3 -- exposes only s3 (cloud-native)
- FileModeAll -- all variants visible, agent picks (hybrid)
Schema rewriting strips non-matching oneof variants at registration time. The x-mcp-file-mode marker is removed from the final schema.
Framework contract
MCP handlers always produce/consume raw bytes. The framework layer (in the consumer repo) intercepts and rewrites to the configured transport:
Handlers never touch S3 or the filesystem directly.
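
A minimal sketch of what the output side of this contract might look like in a consumer, assuming a simplified FileOutput struct in place of the generated mcp.v1.FileOutput type. rewriteOutput, the struct fields, and the two-mode enum are hypothetical names for illustration; the handler only ever fills Content, and the framework decides what leaves the process.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// FileMode is reduced to the two transports this sketch covers.
type FileMode int

const (
	FileModeInline FileMode = iota
	FileModePath
)

// FileOutput is a simplified stand-in for mcp.v1.FileOutput.
type FileOutput struct {
	Filename string
	Content  []byte // inline variant, always what the handler produces
	Path     string // path variant, only ever set by the framework
}

// rewriteOutput (hypothetical) converts the handler's raw bytes into the
// transport configured for the deployment.
func rewriteOutput(out FileOutput, mode FileMode, sharedDir string) (FileOutput, error) {
	switch mode {
	case FileModeInline:
		return out, nil // bytes travel inline, nothing to rewrite
	case FileModePath:
		// filepath.Base drops any directory components from the filename.
		dst := filepath.Join(sharedDir, filepath.Base(out.Filename))
		if err := os.WriteFile(dst, out.Content, 0o600); err != nil {
			return FileOutput{}, err
		}
		return FileOutput{Filename: out.Filename, Path: dst}, nil
	default:
		return FileOutput{}, fmt.Errorf("unsupported file mode %d", mode)
	}
}

func main() {
	dir, err := os.MkdirTemp("", "mcp-files")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	handlerOut := FileOutput{Filename: "report.txt", Content: []byte("hello")}
	rewritten, err := rewriteOutput(handlerOut, FileModePath, dir)
	if err != nil {
		panic(err)
	}
	fmt.Println(rewritten.Path != "", len(rewritten.Content)) // prints true 0
}
```

The input side would presumably mirror this: the framework reads the path (or downloads the object) and hands the handler plain bytes.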
S3 presigned URL flow
S3Reference carries a presigned URL. The framework (not the handler) mints it:
Presigned URLs are time-limited and scoped to a single object -- no IAM cross-wiring between MCPs. This makes MCP-to-MCP file transfer work without the LLM carrying bytes through its context window.
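
The round trip can be sketched with an in-memory stand-in for the bucket. Everything here is an assumption for illustration: ObjectStore, emit, resolve, and fakeStore are hypothetical names, S3Reference is reduced to a single URL field, and the fake "presigned" URL carries a plain expiry timestamp instead of a real signature. A real consumer would back ObjectStore with an S3 client.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
	"sync"
	"time"
)

// S3Reference is a simplified stand-in for mcp.v1.S3Reference.
type S3Reference struct{ URL string }

// ObjectStore is a hypothetical view of the bucket as the framework sees it.
type ObjectStore interface {
	Put(key string, data []byte) error
	PresignGet(key string, ttl time.Duration) (string, error)
}

// emit is the producing side: upload the handler's bytes, return a reference.
func emit(store ObjectStore, key string, data []byte, ttl time.Duration) (S3Reference, error) {
	if err := store.Put(key, data); err != nil {
		return S3Reference{}, err
	}
	url, err := store.PresignGet(key, ttl)
	if err != nil {
		return S3Reference{}, err
	}
	return S3Reference{URL: url}, nil
}

// resolve is the consuming side: download the bytes for the next handler.
func resolve(ref S3Reference) ([]byte, error) {
	resp, err := http.Get(ref.URL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("download failed: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// fakeStore is an in-memory stand-in for S3 served over a local test server.
type fakeStore struct {
	mu      sync.Mutex
	objects map[string][]byte
	srv     *httptest.Server
}

func newFakeStore() *fakeStore {
	s := &fakeStore{objects: map[string][]byte{}}
	s.srv = httptest.NewServer(http.HandlerFunc(s.serve))
	return s
}

// serve rejects requests whose expiry timestamp is missing or in the past.
func (s *fakeStore) serve(w http.ResponseWriter, r *http.Request) {
	exp, err := strconv.ParseInt(r.URL.Query().Get("expires"), 10, 64)
	if err != nil || time.Now().Unix() > exp {
		http.Error(w, "expired", http.StatusForbidden)
		return
	}
	s.mu.Lock()
	data, ok := s.objects[strings.TrimPrefix(r.URL.Path, "/")]
	s.mu.Unlock()
	if !ok {
		http.NotFound(w, r)
		return
	}
	w.Write(data)
}

func (s *fakeStore) Put(key string, data []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.objects[key] = append([]byte(nil), data...)
	return nil
}

func (s *fakeStore) PresignGet(key string, ttl time.Duration) (string, error) {
	exp := time.Now().Add(ttl).Unix()
	return fmt.Sprintf("%s/%s?expires=%d", s.srv.URL, key, exp), nil
}

func main() {
	store := newFakeStore()
	defer store.srv.Close()
	ref, err := emit(store, "report.pdf", []byte("pdf bytes"), time.Minute)
	if err != nil {
		panic(err)
	}
	data, err := resolve(ref) // only ref.URL would pass through the LLM
	fmt.Println(string(data), err)
}
```

Keeping emit and resolve behind an interface means handlers on both sides stay unaware of object storage, which matches the contract that only the framework touches S3.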
S3 wiring is not yet implemented in any consumer -- this PR defines the proto shape and schema machinery so the framework can wire it when ready.
References
First consumer: SharePoint managed MCP in cloudv2 (https://github.com/redpanda-data/cloudv2/pull/26291).