magentic-one-cli: pin utf-8 encoding when loading YAML config (refs #5566) #7722
Open
adv0r wants to merge 1 commit into
Refs microsoft#5566.

The original report (microsoft#5566) was that `m1` crashed with `UnicodeDecodeError: 'cp950' codec can't decode byte ...` when loading `page_script.js` on a non-UTF-8 default locale (Traditional Chinese Windows, cp950). That specific call site was fixed in microsoft#6094. The reporter noted at the time: *"there will be some similar issues in the codebase while using open function"*, and the issue stayed open explicitly to track that follow-up.

This PR fixes the next call sites on the same code path the user actually hits when they type `m1 ...` on Windows. `magentic_one_cli/_m1.py` opens the YAML config file in two places:

- the default `~/.magentic_one_config.yaml` (line 100)
- the user-supplied `--config <path>` (line 105)

Both used `open(..., "r")` with no `encoding=`, so on a non-UTF-8 locale (cp950, cp1252, etc.) a config containing any non-ASCII byte (comments in CJK, accented paths, …) would raise `UnicodeDecodeError` before the CLI even got to construct an agent.

This change adds `encoding="utf-8"` to both `open()` calls. YAML 1.2 specifies UTF-8 as the default encoding for YAML streams, so pinning UTF-8 on the reader matches what users are already writing.

Why only two call sites, and not a repo-wide sweep:

- Keeps the diff reviewable.
- Same package as the original crash (`magentic-one-cli`), same code path the bug report hits.
- A blanket sweep across `python/packages/` would touch >40 files (incl. test fixtures and benchmark scenario scripts that read JSONL produced by the agents themselves, where forcing UTF-8 could in theory mask issues). Better to land focused, then iterate.

No behaviour change for already-UTF-8-locale users.

AI-assisted via Cursor (Claude Opus 4.7). Personal token-burn initiative by @adv0r to use up an expiring Cursor subscription budget on small, useful upstream contributions.

Co-authored-by: Cursor <cursoragent@cursor.com>
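A minimal sketch of the failure mode and the fix, using only the standard library. It is not the code in `_m1.py`: the file content, `read_config_text`, and `demo` are hypothetical names for illustration, and the legacy locale is simulated by passing `encoding="cp1252"` explicitly rather than by changing the system locale.

```python
import os
import tempfile

# Hypothetical config text. "Á" is stored as the bytes C3 81 in UTF-8,
# and byte 0x81 is unassigned in Python's cp1252 codec.
CONFIG_TEXT = "# Ángel's config\nclient: demo\n"


def read_config_text(path, encoding):
    """Read a config file as text with an explicit encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def demo():
    """Write a UTF-8 config, then read it back two ways.

    Returns (legacy_failed, utf8_text): whether decoding as cp1252 raised
    UnicodeDecodeError, and the text obtained with encoding="utf-8".
    """
    fd, path = tempfile.mkstemp(suffix=".yaml")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(CONFIG_TEXT)
        try:
            # Stand-in for open(path, "r") on a machine whose default
            # locale encoding is cp1252.
            read_config_text(path, "cp1252")
            legacy_failed = False
        except UnicodeDecodeError:
            legacy_failed = True
        # The pinned form: independent of the machine's locale.
        utf8_text = read_config_text(path, "utf-8")
    finally:
        os.remove(path)
    return legacy_failed, utf8_text
```

Calling `demo()` shows the same bytes failing under the simulated legacy encoding and round-tripping cleanly once the encoding is pinned, which is the behaviour the two `open()` changes rely on.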
Why

Refs #5566.

#5566 reported `m1` crashing with `UnicodeDecodeError: 'cp950' codec can't decode byte ...` on a non-UTF-8 default locale (Traditional Chinese Windows). The original call site (`playwright_controller.py`) was fixed in #6094, but the reporter flagged that there were "similar issues in the codebase while using open function", and the issue was kept open to track the follow-up sweep. This PR fixes the next two call sites on the same code path the user actually hits when they type `m1 ...` on Windows.

What

`python/packages/magentic-one-cli/src/magentic_one_cli/_m1.py` loads its YAML config in two places, both via `open(..., "r")` with no `encoding=`:

- `DEFAULT_CONFIG_FILE`: `~/.magentic_one_config.yaml`
- `args.config[…]`: `--config <path>`

On non-UTF-8 locales (cp950, cp1252, …) any non-ASCII byte in the config (comments in CJK, accented paths, …) would raise `UnicodeDecodeError` before the CLI gets to build an agent.

This PR pins `encoding="utf-8"` on both, matching YAML 1.2's default stream encoding.
Scope

Intentionally only these two call sites:

- Keeps the diff reviewable.
- Same package as the original crash (`magentic-one-cli`), same code path the bug report hits.
- A blanket sweep across `python/packages/` would touch >40 files, including test fixtures and benchmark scenario scripts that read JSONL produced by the agents themselves; forcing UTF-8 there could in theory mask issues. Better to land focused, then iterate.
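For the follow-up sweep, the remaining call sites could be located mechanically before deciding which ones to change. The sketch below is an assumption about how such an audit might look, not tooling from this repository: `find_open_without_encoding` and `SAMPLE` are hypothetical, and the check only recognises direct calls to the builtin name `open` with literal mode strings.

```python
import ast

# Hypothetical source snippet to audit.
SAMPLE = (
    'f = open("a.yaml", "r")\n'
    'g = open("b.yaml", "r", encoding="utf-8")\n'
    'h = open("c.bin", "rb")\n'
)


def find_open_without_encoding(source, filename="<string>"):
    """Return line numbers of text-mode open() calls that omit encoding."""
    hits = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        # Only direct calls to the name `open`; aliases and methods such as
        # Path.open are out of scope for this sketch.
        if not (isinstance(node.func, ast.Name) and node.func.id == "open"):
            continue
        # encoding is the 4th positional parameter of open(), or a keyword.
        has_encoding = len(node.args) >= 4 or any(
            kw.arg == "encoding" for kw in node.keywords
        )
        # mode is the 2nd positional parameter, or the mode= keyword.
        mode = None
        if len(node.args) >= 2 and isinstance(node.args[1], ast.Constant):
            mode = node.args[1].value
        for kw in node.keywords:
            if kw.arg == "mode" and isinstance(kw.value, ast.Constant):
                mode = kw.value.value
        is_binary = isinstance(mode, str) and "b" in mode
        if not has_encoding and not is_binary:
            hits.append(node.lineno)
    return sorted(hits)
```

On `SAMPLE`, only the first line is flagged: the second call already pins an encoding and the third is binary. On Python 3.10 and later, running with `-X warn_default_encoding` is a complementary runtime check, since it emits `EncodingWarning` when a text file is opened with the locale default.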
Verification

- No behaviour change for already-UTF-8-locale users.
AI-assisted via Cursor (Claude Opus 4.7). Personal token-burn
initiative by @adv0r to use up an expiring Cursor subscription budget on
small, useful upstream contributions.
Made with Cursor