fix: make console output resilient to terminal encoding #5159
Open
ATOM00blue wants to merge 1 commit into
On consoles whose codepage cannot encode every character (for example Windows cp1252), printing a character the codec does not support raised an uncaught UnicodeEncodeError that crashed the whole session. The existing per-message ASCII fallback could not recover because rich leaves the unencodable text in its output buffer after a failed write, so the retry hit the same error. Relax the console output stream's error handler to "replace" when it is strict, so unencodable characters become a placeholder and output keeps working. An explicit non-strict handler is left untouched. Fixes Aider-AI#5128
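A minimal sketch of the idea in this commit message, not the actual patch. The helper names `relax_strict_errors` and `make_cp1252_stream` are hypothetical, and an in-memory stream stands in for a legacy Windows console. It relies only on the standard library's `io.TextIOWrapper.reconfigure` (Python 3.7+).

```python
import io


def relax_strict_errors(stream):
    """Switch a text stream's error handler from "strict" to "replace".

    Hypothetical helper for illustration. Streams with an explicit
    non-strict handler, or without reconfigure(), are left untouched.
    """
    errors = getattr(stream, "errors", None)
    reconfigure = getattr(stream, "reconfigure", None)
    if errors == "strict" and callable(reconfigure):
        reconfigure(errors="replace")
    return stream


def make_cp1252_stream(errors="strict"):
    """Build an in-memory text stream that encodes like a cp1252 console."""
    return io.TextIOWrapper(
        io.BytesIO(), encoding="cp1252", errors=errors, newline="\n"
    )


# Before: a strict cp1252 stream cannot encode a check mark (U+2713).
strict_stream = make_cp1252_stream()
try:
    strict_stream.write("done \u2713\n")
    strict_stream.flush()
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# After: the same text goes through, with "?" in place of the check mark.
relaxed_stream = relax_strict_errors(make_cp1252_stream())
relaxed_stream.write("done \u2713\n")
relaxed_stream.flush()
relaxed_bytes = relaxed_stream.buffer.getvalue()
```

Changing the handler on the stream itself, rather than catching the exception at each call site, means the write never fails, so no unencodable text is left behind for a retry to trip over.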
Why
Fixes #5128.
On consoles whose codepage isn't UTF-8 (for example Windows cp1252), aider crashes with an uncaught UnicodeEncodeError as soon as any output contains a character the codepage can't encode: checkmarks, arrows, box-drawing characters, emoji, or non-Latin text. The traceback bottoms out in aider/io.py _tool_message -> console.print -> rich's _write_buffer. _tool_message already tries an ASCII fallback, but it can't recover: when rich's file.write raises, the unencodable text stays in rich's internal buffer (the del self._buffer[:] after the write never runs), so the retry re-flushes the same bad buffer and raises again. tool_output had no guard at all. The result is that essentially any decorated output takes down the whole session.
What
Relax the console output stream's error handler from strict to replace when the console is constructed. Unencodable characters are then emitted as a placeholder instead of raising, which fixes every console-backed output path at once rather than guarding individual call sites.
Test
Added regression tests that drive the real output paths against a strict cp1252 stream (a stand-in for a legacy Windows console) and assert that nothing crashes and that unencodable characters are replaced, plus a test that an explicit error handler is preserved.
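The shape of such tests can be sketched with the standard library alone. This is an illustration under assumptions, not the tests added in this PR: `prepare_console_stream` is a hypothetical stand-in for the console setup under test, and `legacy_console` is a hypothetical in-memory substitute for a cp1252 console.

```python
import io
import unittest


def prepare_console_stream(stream):
    # Hypothetical stand-in for the console setup under test:
    # relax "strict" to "replace", leave any other handler alone.
    if getattr(stream, "errors", None) == "strict":
        stream.reconfigure(errors="replace")
    return stream


def legacy_console(errors="strict"):
    # In-memory stand-in for a legacy Windows console using cp1252.
    return io.TextIOWrapper(
        io.BytesIO(), encoding="cp1252", errors=errors, newline="\n"
    )


class ConsoleEncodingTest(unittest.TestCase):
    def test_unencodable_characters_are_replaced(self):
        stream = prepare_console_stream(legacy_console())
        # An arrow plus non-Latin text, none of which cp1252 can encode.
        stream.write("\u2192 \u4f60\u597d\n")
        stream.flush()
        self.assertEqual(stream.buffer.getvalue(), b"? ??\n")

    def test_explicit_handler_is_preserved(self):
        stream = prepare_console_stream(legacy_console("backslashreplace"))
        self.assertEqual(stream.errors, "backslashreplace")
```

Saved as a module, these run with `python -m unittest`. Building the stream with an explicit strict handler keeps the test independent of the host machine's real console encoding.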