Commit 1400a5c

genisis0x, qingyun-wu, and vvlrff authored
fix(environments): pin PythonEnvironment._write_to_file encoding to UTF-8 (#2827)
* fix(environments): pin PythonEnvironment._write_to_file encoding to UTF-8

  PythonEnvironment._write_to_file is the shared helper every concrete PythonEnvironment subclass uses to land an agent-generated script on disk before invoking subprocess. The bare 'open(script_path, "w")' inherits 'locale.getpreferredencoding(False)' (cp1252 on default Windows installs), so the first non-cp1252 character (CJK string literal, emoji, smart quote from copy-pasted documentation, PEP 3131 identifier) raises 'UnicodeEncodeError' mid-write. The script is then partially written, the subprocess runs an invalid file, and the agent sees a syntax error or truncated output instead of the intended behavior.

  Pin encoding="utf-8" on the open call. Adds source-level regression coverage in test/environments/test_python_environment_utf8.py that runs on every CI lane (no PythonEnvironment subclass extras required) so the kwarg cannot silently regress.

* chore: add license header to test/environments/__init__.py

  `check-license-headers` pre-commit hook flagged the new test package init as missing the standard SPDX header; add the 2023-2026 ag2 header matching the sibling test file to clear the pre-commit-check / pr-check failures.

* Remove outdated UTF-8 regression comment

  Remove regression comment regarding UTF-8 encoding handling in PythonEnvironment._write_to_file.

---------

Co-authored-by: Qingyun Wu <qingyun0327@gmail.com>
Co-authored-by: Semen Frolov <148821259+vvlrff@users.noreply.github.com>
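The failure mode described in the commit message can be reproduced on any platform by passing the encoding explicitly instead of relying on the locale default. The following is a minimal sketch, not code from the repository: `write_script`, `try_write`, and `SCRIPT` are hypothetical names, and `encoding="cp1252"` stands in for what a bare `open(path, "w")` would pick up on a default Windows install.

```python
import os
import tempfile

# Hypothetical agent-generated script containing CJK characters,
# which have no representation in cp1252.
SCRIPT = 'print("你好, world")\n'


def write_script(path: str, content: str, encoding: str) -> None:
    """Write content to path using an explicit encoding."""
    with open(path, "w", encoding=encoding) as f:
        f.write(content)


def try_write(encoding: str) -> str:
    """Return 'ok' if the script round-trips, or the name of the failure."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.py")
        try:
            write_script(path, SCRIPT, encoding)
        except UnicodeEncodeError:
            return "UnicodeEncodeError"
        with open(path, encoding=encoding) as f:
            return "ok" if f.read() == SCRIPT else "mismatch"


if __name__ == "__main__":
    for enc in ("cp1252", "utf-8"):
        print(enc, "->", try_write(enc))
```

Simulating the Windows default with an explicit `cp1252` keeps the sketch deterministic on Linux and macOS runners, where the locale encoding is usually already UTF-8 and the bug would not show up.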
1 parent f50307a commit 1400a5c

3 files changed

Lines changed: 26 additions & 1 deletion

autogen/environments/python_environment.py

Lines changed: 7 additions & 1 deletion
@@ -87,7 +87,13 @@ def _write_to_file(self, script_path: str, content: str) -> None:
             script_path: Path to the file to write.
             content: Content to write to the file.
         """
-        with open(script_path, "w") as f:
+        # Pin UTF-8 so any non-ASCII glyph in agent-supplied script content
+        # (string literals, comments, docstrings, identifiers) round-trips on
+        # platforms whose locale.getpreferredencoding() is not UTF-8 (cp1252
+        # on default Windows installs). Without it `open(path, "w")` raises
+        # UnicodeEncodeError mid-write on the first non-cp1252 character,
+        # turning a successful agent code-gen into a runtime failure.
+        with open(script_path, "w", encoding="utf-8") as f:
             f.write(content)
 
     # Utility method for subclasses to wrap (for async support)
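Pinning UTF-8 on the write side lines up with the read side, since Python 3 decodes source files as UTF-8 by default (PEP 3120). The sketch below illustrates that pairing under stated assumptions: `run_utf8_script` and `SCRIPT` are hypothetical, and the script prints only ASCII so the result does not depend on the console encoding.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical script with a PEP 3131 identifier and non-ASCII literals.
# It prints only an ASCII digit, so stdout decoding is not a factor.
SCRIPT = '名前 = "日本語🎉"\nprint(len(名前))\n'


def run_utf8_script() -> str:
    """Write SCRIPT as UTF-8, run it with the current interpreter, return stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.py")
        # Same open() call shape as the patched helper.
        with open(path, "w", encoding="utf-8") as f:
            f.write(SCRIPT)
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_utf8_script())
```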

test/environments/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+# Copyright (c) 2023 - 2026, AG2ai, Inc., AG2ai open-source projects maintainers and core contributors
+#
+# SPDX-License-Identifier: Apache-2.0
test/environments/test_python_environment_utf8.py

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# Copyright (c) 2023 - 2026, AG2ai, Inc., AG2ai open-source projects maintainers and core contributors
+#
+# SPDX-License-Identifier: Apache-2.0
+
+from pathlib import Path
+
+REPO_ROOT = Path(__file__).resolve().parents[2]
+
+
+def test_python_environment_write_to_file_pins_utf8() -> None:
+    source = (REPO_ROOT / "autogen" / "environments" / "python_environment.py").read_text(encoding="utf-8")
+    assert 'open(script_path, "w", encoding="utf-8") as f' in source, (
+        "PythonEnvironment._write_to_file must pin encoding='utf-8' on its "
+        "open() call so non-cp1252 script content (CJK string literals, emoji, "
+        "smart quotes, PEP 3131 identifiers) does not crash on Windows."
+    )
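The test above checks the source text rather than behavior, which keeps it free of PythonEnvironment extras. A behavioral round-trip check could complement it. The sketch below is an assumption-laden illustration, not part of this commit: `write_to_file` is a hypothetical stand-in that mirrors the patched helper body instead of importing it from autogen, and `SAMPLES` is an invented list covering the character classes named in the commit message.

```python
import tempfile
from pathlib import Path

# Hypothetical samples: CJK literal, emoji, smart quotes, PEP 3131 identifier.
SAMPLES = [
    'print("你好")\n',
    'print("🎉")\n',
    "# \u201csmart quotes\u201d pasted from docs\n",
    "名前 = 1\n",
]


def write_to_file(script_path: str, content: str) -> None:
    """Stand-in mirroring the patched helper body (not imported from autogen)."""
    with open(script_path, "w", encoding="utf-8") as f:
        f.write(content)


def roundtrip(content: str) -> str:
    """Write content with the stand-in helper and read it back as UTF-8."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "script.py"
        write_to_file(str(path), content)
        return path.read_text(encoding="utf-8")


if __name__ == "__main__":
    for sample in SAMPLES:
        assert roundtrip(sample) == sample
    print("all samples round-trip")
```

A check like this only exercises the regression on a runner whose locale encoding is not UTF-8, which may be why the commit opts for a source-level assertion that fails on every CI lane.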
