fix(core): sniff MCP image MIME types #27850
📊 PR Size: size/M
Summary of Changes
This pull request adds a mechanism to sniff and validate the MIME types of image data within MCP blocks. By inspecting the binary signatures of base64-encoded payloads, the system can now automatically correct cases where the declared MIME type does not match the actual image format, which improves compatibility when the data is passed to models.
Code Review
This pull request introduces automatic MIME type detection for base64-encoded image data and embedded resources in MCP tools, correcting mislabeled MIME types (such as WebP) based on their file signatures. Feedback highlights a performance concern in detectImageMimeTypeFromBase64, where decoding the entire base64 string of potentially large images is inefficient, and suggests slicing and decoding only the first 32 characters instead.
    }

    function detectImageMimeTypeFromBase64(data: string): string | undefined {
      const buffer = Buffer.from(data, 'base64');
Decoding the entire base64 string of a potentially large image (which can be several megabytes) into a Buffer just to check the first few bytes is highly inefficient. This function is also called multiple times for the same image block (once during transformation and once during display stringification), leading to redundant CPU and memory overhead.
Since we only need at most 12 bytes to detect all supported image signatures (PNG, JPEG, GIF, WebP), we can slice the first 32 characters of the base64 string (which decodes to 24 bytes) and decode only that prefix.
Suggested change:

    - const buffer = Buffer.from(data, 'base64');
    + const prefix = data.slice(0, 32);
    + const buffer = Buffer.from(prefix, 'base64');
Updated to decode only the base64 prefix before MIME signature checks.
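For readers following the thread, here is a minimal sketch of what prefix-only signature sniffing could look like. Only the function name `detectImageMimeTypeFromBase64` and the 32-character prefix come from the review above; the `matchesAt` helper, the constant name, and the exact checks are assumptions for illustration and are not the code merged in this PR.

```typescript
import { Buffer } from 'node:buffer';

// 32 base64 characters are a whole number of 4-character groups and decode
// to 24 bytes, which covers the 12 bytes needed for the WebP check.
const BASE64_PREFIX_LENGTH = 32; // assumed constant name

// Hypothetical helper: true when `buffer` contains `signature` at `offset`.
function matchesAt(buffer: Buffer, signature: number[], offset = 0): boolean {
  if (buffer.length < offset + signature.length) return false;
  return signature.every((byte, i) => buffer[offset + i] === byte);
}

function detectImageMimeTypeFromBase64(data: string): string | undefined {
  // Decode only the prefix, never the whole (possibly multi-megabyte) payload.
  const buffer = Buffer.from(data.slice(0, BASE64_PREFIX_LENGTH), 'base64');

  // PNG: 89 50 4E 47 0D 0A 1A 0A
  if (matchesAt(buffer, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) {
    return 'image/png';
  }
  // JPEG: FF D8 FF
  if (matchesAt(buffer, [0xff, 0xd8, 0xff])) {
    return 'image/jpeg';
  }
  // GIF: ASCII "GIF87a" or "GIF89a"
  const gifHeader = buffer.subarray(0, 6).toString('latin1');
  if (gifHeader === 'GIF87a' || gifHeader === 'GIF89a') {
    return 'image/gif';
  }
  // WebP: ASCII "RIFF" at offset 0 and "WEBP" at offset 8
  if (
    buffer.subarray(0, 4).toString('latin1') === 'RIFF' &&
    buffer.subarray(8, 12).toString('latin1') === 'WEBP'
  ) {
    return 'image/webp';
  }
  // Unknown signature: the caller decides what to do with the declared type.
  return undefined;
}
```

This sketch assumes `data` is raw base64 with no data-URL prefix or leading whitespace; input like that would need to be normalized first.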
Ran:

    npm run test --workspace @google/gemini-cli-core -- mcp-tool.test.ts
    npm run typecheck --workspace @google/gemini-cli-core
    npm run lint --workspace @google/gemini-cli-core
    git diff --check
Testing & Validation
I also independently identified and fixed this issue locally using the same approach (magic byte signature detection). I can confirm this solution works well for the Figma MCP integration with WebP images. I've tested the signature detection for:

Performance note: The optimization to decode only the base64 prefix (first 32 chars = 24 bytes) is excellent: it avoids decoding potentially multi-megabyte images just to check a few bytes. This matches industry best practices for MIME sniffing.

For issue #27731: This fix resolves the Figma MCP integration failure where WebP images were mislabeled.

Great work on this fix! Ready to help test if needed.
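The "32 chars = 24 bytes" figure follows from base64 encoding every 3 bytes as 4 characters. A small sketch of that arithmetic is below; the helper names are hypothetical and exist only for this illustration.

```typescript
import { Buffer } from 'node:buffer';

// Smallest base64 prefix, in characters, that fully covers `byteCount` bytes.
function base64CharsForBytes(byteCount: number): number {
  return Math.ceil(byteCount / 3) * 4;
}

// Bytes produced by decoding `charCount` unpadded base64 characters,
// assuming `charCount` is a multiple of 4.
function bytesForBase64Chars(charCount: number): number {
  return (charCount / 4) * 3;
}

// The WebP check reads bytes 0..11, so 12 bytes is the longest signature.
const webpSignatureBytes = 12;

console.log(base64CharsForBytes(webpSignatureBytes)); // 16
console.log(bytesForBase64Chars(32)); // 24

// Cross-check against an actual decode of 32 valid base64 characters.
console.log(Buffer.from('A'.repeat(32), 'base64').length); // 24
```

So 16 characters would already be enough for the signatures discussed here. A 32-character slice leaves headroom and, being a multiple of 4, never cuts a base64 group in half.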
Hi there! Thank you for your interest in contributing to Gemini CLI. To ensure we maintain high code quality and focus on our prioritized roadmap, we only guarantee review and consideration of pull requests for issues that are explicitly labeled as 'help wanted'. This PR will be closed in 7 days if it remains without that designation. We encourage you to find and contribute to existing 'help wanted' issues in our backlog! Thank you for your understanding.
Summary
Fixes #27731.
Correct MCP image payloads whose declared MIME type does not match their base64 bytes, so WebP data reported as image/png is sent to the model as image/webp instead.
Details
Adds local image signature sniffing for PNG, JPEG, GIF, and WebP data when transforming MCP image blocks and embedded binary resource blocks into inline data. Audio blocks continue to use their declared MIME type.
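As a rough illustration of the transform described above, the sketch below applies a sniffer to image blocks only and leaves audio untouched. The type names, `toInlineData`, and the fallback to the declared MIME type when no signature is recognized are all assumptions made for this example. The real MCP block types and the code in mcp-tool.ts may differ, and embedded resource blocks are omitted for brevity.

```typescript
// Hypothetical block shape for illustration; not the real MCP SDK type.
type McpMediaBlock = {
  type: 'image' | 'audio';
  data: string; // base64 payload
  mimeType: string; // MIME type declared by the MCP server
};

// Hypothetical output shape for inline data sent to the model.
type InlineDataPart = { inlineData: { mimeType: string; data: string } };

// Any function that maps base64 data to a detected MIME type, or undefined.
type ImageSniffer = (base64: string) => string | undefined;

function toInlineData(
  block: McpMediaBlock,
  sniffImage: ImageSniffer,
): InlineDataPart {
  // Only image blocks are sniffed; audio keeps its declared MIME type.
  const sniffed = block.type === 'image' ? sniffImage(block.data) : undefined;
  return {
    inlineData: {
      // Assumption: fall back to the declared type for unknown signatures.
      mimeType: sniffed ?? block.mimeType,
      data: block.data,
    },
  };
}
```

Passing the sniffer in as a parameter keeps the example self-contained and makes the image-versus-audio branch easy to test with a stub.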
Related Issues
Fixes #27731.
How to Validate
packages/core:

    npm run test -- src/tools/mcp-tool.test.ts
    npx eslint packages/core/src/tools/mcp-tool.ts packages/core/src/tools/mcp-tool.test.ts
    npx prettier --check packages/core/src/tools/mcp-tool.ts packages/core/src/tools/mcp-tool.test.ts
    npm run typecheck --workspace @google/gemini-cli-core
    git diff --check

I did not run the full npm run preflight; the checks above cover the touched core transform path and types.
Pre-Merge Checklist
Note: I used Codex while preparing this change, reviewed the final diff, and ran the listed checks locally.