Commit 00b1923

Render thinking blocks as markdown
1 parent 75d6238 commit 00b1923

3 files changed

Lines changed: 62 additions & 12 deletions

AGENTS.md

Lines changed: 2 additions & 1 deletion
@@ -436,8 +436,9 @@ Gitignored scratchpad for helper files the user asks to be created there — typ
 - `cargo clippy --all-targets -- -D warnings`
 - `cargo test --all`
 - Before finishing, review the change for bugs and corner cases.
-- After each final modification, provide a clear, human-readable one-line commit message.
 - Use international English. Avoid regional idioms (whether American or British), clever shorthand, and compressed phrases; prefer wording that a non-native English reader can understand on the first read. This applies to chat replies, commit messages, code comments, documentation, and error messages.
+- After you finish cross-checking against the Non-Negotiable Rules and fixing the code, if needed, do another pass for bugs and regressions.
+- After each final modification, provide a clear, human-readable one-line commit message.
 
 ---

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ All notable changes to Sofos are documented in this file.
 ### Changed
 
 - **File-edit tool results are now fixed-size summaries.** `edit_file`, `write_file` (when it overwrites an existing file), and `morph_edit_file` previously returned the full syntax-highlighted diff to the model as the tool result. The colored diff carried truecolor ANSI escape sequences that roughly multiplied the byte count per line, and the tool result stayed in conversation history for every subsequent turn — so a session with many edits paid that bloated cost again on each later turn, and a single large rewrite could push the response into the hundreds of thousands of tokens. The model now sees a fixed two-line summary (`Success. Updated the following files:` followed by `M <path>`) regardless of edit size, while the terminal still renders the full colored diff exactly as before. If the model needs to verify the post-edit state it can re-read a range of the file.
+- **Reasoning output renders markdown and is separated from what follows.** The dim `Thinking:` section that streams before the assistant's reply or before a tool call used to print as raw dim text and ran straight into the next `Using tool:` or `Assistant:` header. Prose that contained inline code, bold, or list markers showed the source characters instead of rendering them. The body now flows through the same markdown stream renderer the assistant text uses, with the rendered output wrapped in a faint terminal style so the muted look is preserved, and a blank line separates the thinking section from whatever follows it.
 
 ## [0.2.11] - 2026-05-16

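The changelog entry above describes thinking deltas flowing through a stream renderer that holds back incomplete input until it can be rendered. The following is a minimal sketch of that push/commit/finalize pattern, not the actual `MarkdownStreamRenderer`: the `LineBuffer` type is hypothetical, it does no markdown rendering at all, and the `Option<String>` return types are an assumption inferred from the `unwrap_or_default()` calls in the diff.

```rust
/// Hypothetical stand-in for a streaming renderer. It only buffers
/// text by line; a real renderer would also apply markdown styling.
#[derive(Default)]
struct LineBuffer {
    pending: String,
}

impl LineBuffer {
    /// Append a streamed chunk. A chunk may end in the middle of a line.
    fn push_delta(&mut self, delta: &str) {
        self.pending.push_str(delta);
    }

    /// Return every complete line received so far and keep the partial
    /// last line buffered. Returns `None` when no line is complete yet.
    fn commit(&mut self) -> Option<String> {
        let end = self.pending.rfind('\n')? + 1;
        let rest = self.pending.split_off(end);
        Some(std::mem::replace(&mut self.pending, rest))
    }

    /// Drain whatever is still buffered and guarantee a trailing
    /// newline, so the caller can print a separator right after it.
    fn finalize(&mut self) -> Option<String> {
        if self.pending.is_empty() {
            return None;
        }
        let mut tail = std::mem::take(&mut self.pending);
        if !tail.ends_with('\n') {
            tail.push('\n');
        }
        Some(tail)
    }
}
```

In this sketch, a delta that stops in the middle of an inline code span stays buffered until the rest of the line arrives, which is why the end of a section needs an explicit `finalize` call before the blank separator line is printed.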
src/ui/mod.rs

Lines changed: 59 additions & 11 deletions
@@ -217,13 +217,15 @@ impl UI {
 /// Handles real-time output during response streaming. Visible
 /// assistant text is fed through a [`MarkdownStreamRenderer`] so
 /// headings, lists, emphasis, and code fences render with ANSI styling
-/// instead of leaking raw markdown to the terminal. Thinking deltas
-/// stay plain-dim because they're typically free-form prose and the
-/// extra rendering pass would only delay their display.
+/// instead of leaking raw markdown to the terminal. Thinking deltas go
+/// through a separate renderer of the same type, with the rendered
+/// output wrapped in a faint SGR pair so the body keeps the dim
+/// "thinking" look without losing markdown formatting.
 pub struct StreamPrinter {
     thinking_started: AtomicBool,
     text_started: AtomicBool,
     text_renderer: Mutex<MarkdownStreamRenderer>,
+    thinking_renderer: Mutex<MarkdownStreamRenderer>,
 }
 
 impl StreamPrinter {
@@ -232,6 +234,7 @@ impl StreamPrinter {
             thinking_started: AtomicBool::new(false),
             text_started: AtomicBool::new(false),
             text_renderer: Mutex::new(MarkdownStreamRenderer::new()),
+            thinking_renderer: Mutex::new(MarkdownStreamRenderer::new()),
         }
     }

@@ -247,13 +250,25 @@ impl StreamPrinter {
             let (tr, tg, tb) = THINKING_RGB;
             print!("\n{}\n", "Thinking:".truecolor(tr, tg, tb).bold().dimmed());
         }
-        print!("{}", delta.dimmed());
-        let _ = stdout().flush();
+        let to_print = {
+            let mut renderer = self.lock_thinking_renderer();
+            renderer.push_delta(delta);
+            renderer.commit().unwrap_or_default()
+        };
+        if !to_print.is_empty() {
+            print_dim(&to_print);
+            let _ = stdout().flush();
+        }
     }
 
     pub fn on_text_delta(&self, delta: &str) {
         if !self.text_started.swap(true, Ordering::SeqCst) {
             if self.thinking_started.load(Ordering::SeqCst) {
+                self.flush_thinking_tail();
+                // Blank line between the thinking block and the
+                // assistant text header. `finalize` guarantees a
+                // trailing newline, so one extra `println!()` puts a
+                // visible blank line between the two sections.
                 println!();
             }
             println!("{}", "Assistant:".bright_blue().bold());
@@ -277,21 +292,54 @@ impl StreamPrinter {
             }
             // The finalised buffer ends with a newline, so the cursor
             // is already at column 0 — no extra println! needed for
-            // text. Thinking-only finishes still want the trailing
-            // separator.
+            // text.
             let _ = stdout().flush();
         } else if self.thinking_started.load(Ordering::SeqCst) {
+            self.flush_thinking_tail();
+            // `finalize` ends with a newline. One extra blank line
+            // separates the thinking body from whatever the turn
+            // renders next (a tool call header, the input prompt).
             println!();
+            let _ = stdout().flush();
         }
     }
 
-    /// Acquire the renderer lock, recovering from poison so a panic in
-    /// one delta callback doesn't kill subsequent streaming output. A
-    /// partial markdown buffer is recoverable; the worst case is one
-    /// mid-stream paragraph rendering as plain text.
+    /// Drain the thinking renderer's residual buffer (a partial last
+    /// line, a still-open code fence, anything held back by `commit`)
+    /// and emit it under the same dim wrap the streaming path uses.
+    /// Shared between the thinking-to-text transition in `on_text_delta`
+    /// and the thinking-only path in `finish`.
+    fn flush_thinking_tail(&self) {
+        let tail = self.lock_thinking_renderer().finalize().unwrap_or_default();
+        if !tail.is_empty() {
+            print_dim(&tail);
+        }
+    }
+
+    /// Acquire the text renderer lock, recovering from poison so a
+    /// panic in one delta callback doesn't kill subsequent streaming
+    /// output. A partial markdown buffer is recoverable; the worst
+    /// case is one mid-stream paragraph rendering as plain text.
     fn lock_text_renderer(&self) -> std::sync::MutexGuard<'_, MarkdownStreamRenderer> {
         self.text_renderer.lock().unwrap_or_else(|e| e.into_inner())
     }
+
+    /// Same poison-recovery contract as [`Self::lock_text_renderer`],
+    /// for the parallel thinking-side renderer.
+    fn lock_thinking_renderer(&self) -> std::sync::MutexGuard<'_, MarkdownStreamRenderer> {
+        self.thinking_renderer
+            .lock()
+            .unwrap_or_else(|e| e.into_inner())
+    }
+}
+
+/// Emit `text` wrapped in the faint SGR pair so the thinking body
+/// keeps its dim look. The renderer may have embedded its own ANSI
+/// for markdown emphasis or fenced-code highlighting; the wrap lets
+/// the prose dim cleanly and leaves those inner sequences intact
+/// where they apply.
+fn print_dim(text: &str) {
+    print!("\x1b[2m{text}\x1b[22m");
 }
 
 fn set_cursor_style(style: SetCursorStyle) -> io::Result<()> {
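
The `print_dim` doc comment above notes that the renderer may embed its own ANSI sequences inside the wrapped text. In SGR terms, `22` (normal intensity) cancels both bold and faint, and `0` resets every attribute, so an inner sequence of either kind also ends the outer dim. The sketch below shows one possible way to keep the dim active across such sequences. It is not what the commit does: `wrap_dim` is a hypothetical helper, and it assumes the inner renderer emits exactly the byte sequences `\x1b[0m` and `\x1b[22m` rather than the short form `\x1b[m` or combined forms such as `\x1b[0;1m`.

```rust
const FAINT_ON: &str = "\x1b[2m";
const INTENSITY_RESET: &str = "\x1b[22m";
const FULL_RESET: &str = "\x1b[0m";

/// Hypothetical helper: wrap `text` in the faint SGR pair and switch
/// faint back on after every inner sequence that would cancel it.
fn wrap_dim(text: &str) -> String {
    let body = text
        // A full reset clears faint, so re-enable it right after.
        .replace(FULL_RESET, "\x1b[0m\x1b[2m")
        // SGR 22 ends bold and faint together, so re-enable faint.
        .replace(INTENSITY_RESET, "\x1b[22m\x1b[2m");
    format!("{FAINT_ON}{body}{INTENSITY_RESET}")
}
```

Returning a `String` instead of printing directly keeps the helper easy to test; a caller would pass the result to `print!` and flush as the streaming path already does.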

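The two `lock_*_renderer` helpers in the diff recover from a poisoned mutex instead of propagating the panic. The standalone sketch below reproduces that pattern with standard library types only. The names `lock_recovering` and `poisoned_buffer` are hypothetical and exist only for this illustration; a `String` stands in for the renderer state.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Lock `m`, and if an earlier panic poisoned it, take the guard out
/// of the error so the protected data stays usable.
fn lock_recovering<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|e| e.into_inner())
}

/// Build a mutex and poison it by panicking in another thread while
/// that thread holds the guard.
fn poisoned_buffer() -> Arc<Mutex<String>> {
    let shared = Arc::new(Mutex::new(String::from("partial")));
    let worker = Arc::clone(&shared);
    let _ = thread::spawn(move || {
        let _guard = worker.lock().unwrap();
        panic!("simulated panic in a delta callback");
    })
    .join();
    shared
}
```

After the worker thread panics, `Mutex::lock` returns an error on every later call, but the data inside is still reachable through `PoisonError::into_inner`. This matches the trade-off described in the doc comment: the buffer may be in a partial state, but later output can continue.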