fix(mcp/mcp_proxy): pin file IO encoding to UTF-8 #2826
mcp_proxy.py rewrote the generated main.py (post-fastapi-codegen patch) and saved the rendered server-configuration template through bare locale-default open()/Path.open() calls. On Windows that resolves to cp1252 and any non-cp1252 glyph in an OpenAPI spec — internationalized model names, smart quotes in descriptions, emoji in example payloads — raised UnicodeEncodeError mid-write, killing the proxy generation step. Pin encoding="utf-8" on the four call sites (main_path read/write, rendered config write, config_file read). Adds source-level regression coverage in test/mcp/test_mcp_proxy_utf8.py that runs on every CI lane (no MCP extras required) so the kwarg cannot silently regress.
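The failure mode described above can be reproduced on any platform by requesting cp1252 explicitly instead of relying on the Windows locale default. This is a minimal sketch, not the code from mcp_proxy.py: `TEXT`, `try_write`, and the temporary `main.py` path are hypothetical names chosen for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical generated content with characters that cp1252 cannot encode.
TEXT = "# モデル名: ユーザー 🚀\n"


def try_write(path: Path, text: str, encoding: str) -> bool:
    """Write text with the given encoding; return False on UnicodeEncodeError."""
    try:
        with path.open("w", encoding=encoding) as f:
            f.write(text)
        return True
    except UnicodeEncodeError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "main.py"

    # Stand-in for the Windows locale default: the write fails on these glyphs.
    cp1252_ok = try_write(target, TEXT, "cp1252")

    # Pinned encoding, as in this PR: the same write succeeds everywhere.
    utf8_ok = try_write(target, TEXT, "utf-8")

    # Reading back with the same pinned encoding round-trips the text.
    with target.open("r", encoding="utf-8") as f:
        round_trip = f.read()
```

Passing the encoding explicitly on both the write and the read keeps the two sides symmetric, which is why the read call sites are pinned along with the writes.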
marklysze
left a comment
Correct — four open() / Path.open() calls in mcp_proxy.py now carry encoding="utf-8". Especially important for OpenAPI spec payloads that commonly include non-ASCII characters in descriptions and examples. LGTM.
marklysze
left a comment
Three open(path, "w") → open(path, "w", encoding="utf-8") patches in the MCP proxy code generator. The files touched (main.py read/write and the Jinja config dump) all handle code or config text that can contain Unicode. No separate test, but these are code-gen paths where an encoding mock test would add significant scaffolding for minimal coverage gain — the pattern is simple enough to approve on code review alone. Approved.
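For context on the testing trade-off, a source-level check (the approach the PR description names) needs no encoding mocks. The sketch below is an assumption about how such a check could be written, not the contents of test/mcp/test_mcp_proxy_utf8.py: `open_calls_missing_encoding` and `SAMPLE` are hypothetical, and the helper would also flag binary-mode opens, which a real test would need to exclude.

```python
import ast


def open_calls_missing_encoding(source: str) -> list[int]:
    """Return line numbers of open()/.open() calls that lack an encoding keyword."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both the builtin open(...) and method calls such as path.open(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "open":
            continue
        if not any(kw.arg == "encoding" for kw in node.keywords):
            missing.append(node.lineno)
    return sorted(missing)


# Hypothetical source: one pinned call and one bare call.
SAMPLE = '''
from pathlib import Path

def patch(main_path, output_file):
    with main_path.open("r", encoding="utf-8") as f:
        data = f.read()
    with open(output_file, "w") as f:
        f.write(data)
'''

unpinned_lines = open_calls_missing_encoding(SAMPLE)
```

A test built on this idea would read the module's source text and assert that the returned list is empty, so it runs without importing the module or its optional dependencies.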
Removed outdated regression comment regarding MCP proxy file writes and UTF-8 handling.
Codecov Report
❌ Patch coverage is ... and 553 files with indirect coverage changes
Description
autogen/mcp/mcp_proxy/mcp_proxy.py performs four file IO operations through the bare locale default:
- main_path.open("r") then main_path.open("w"), rewriting the post-fastapi-codegen main.py (mcp_proxy.py:295,302).
- open(output_file, "w"), saving the rendered server-configuration template (mcp_proxy.py:438).
- Path(config_file).open("r"), reading the saved server configuration (mcp_proxy.py:442).
open(...) honors locale.getpreferredencoding(False). On Windows that resolves to cp1252, and any non-cp1252 glyph in an OpenAPI spec (internationalized model names, smart quotes in descriptions, emoji in example payloads) raises UnicodeEncodeError mid-write, killing the proxy generation step. Same class of bug as #1731 / PRs #2818 / #2819 / #2825.
This change pins encoding="utf-8" on all four call sites.
Tests
Added test/mcp/test_mcp_proxy_utf8.py, a source-level regression check that asserts the kwarg appears on the four call sites. Runs on every CI lane (no MCP extras required).
Checklist
AI Disclosure