strip "Thought" prefix even when polish skipped

mios-dev · claude · mios-dev · commit 8b6de563e0a5 · 2026-05-18T00:58:51.000-04:00
Operator-pasted chat 2026-05-18: "Hey there!" produced the answer
"Thought\n\nHello! How can I assist you today? 😊" -- the literal
"Thought" prefix leaked because:
* polish_can_skip returned True (short clean output, no narration)
* the skip path returned `raw_output` AS-IS without running
  _strip_reasoning_leaks
* Hermes uses qwen3.5:4b, a reasoning-mode model that emits
  "Thought\n\n<answer>" -- _LEADING_THOUGHT_RE was designed
  to catch this but only ran on the polish output, not the skip
  path

Fix: the skip-polish branch now runs _strip_outer_md_fence +
_strip_reasoning_leaks before returning the raw text, the same two
post-processors the polish branch already applies. Both are
idempotent on already-clean text, and the branch falls back to
`raw_output` if stripping leaves an empty string.
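
For illustration only, a minimal standalone sketch of the skip-path
behaviour. The regexes and function bodies here are assumptions (the
real _LEADING_THOUGHT_RE, _strip_outer_md_fence and
_strip_reasoning_leaks are not shown in this commit); only the call
order and the `cleaned or raw_output` fallback mirror the diff.

```python
import re

# Hypothetical patterns, NOT the ones in mios_agent_pipe.py.
FENCE = "`" * 3
LEADING_THOUGHT_RE = re.compile(r"\A\s*Thought\s*\n+")
THINK_BLOCK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
OUTER_FENCE_RE = re.compile(
    r"\A\s*" + FENCE + r"[^\n]*\n(.*?)\n" + FENCE + r"\s*\Z", re.DOTALL
)


def strip_outer_md_fence(text: str) -> str:
    """Unwrap the text if it is entirely wrapped in one markdown fence."""
    m = OUTER_FENCE_RE.match(text)
    return m.group(1) if m else text


def strip_reasoning_leaks(text: str) -> str:
    """Drop <think> blocks and a leading bare "Thought" line."""
    text = THINK_BLOCK_RE.sub("", text)
    text = LEADING_THOUGHT_RE.sub("", text)
    return text.strip()


def finish_skip_path(raw_output: str) -> str:
    """Same order as the patched branch: fence first, then leaks."""
    cleaned = strip_outer_md_fence(raw_output)
    cleaned = strip_reasoning_leaks(cleaned)
    # Fall back to the raw text if stripping removed everything.
    return cleaned or raw_output


once = finish_skip_path("Thought\n\nHello! How can I assist?")
twice = finish_skip_path(once)  # second pass should change nothing
```

In this sketch the pattern requires a newline after "Thought", so an
answer that merely begins with a word like "Thoughtful" is left
untouched.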

Live verified end-to-end: input "Thought\n\nHello! How can I
assist?" -> output "Hello! How can I assist?" via the
skip-polish path. has_Thought_prefix=False.

Pipe re-installed into OWUI db; OWUI restarted clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/usr/share/mios/owui/pipes/mios_agent_pipe.py b/usr/share/mios/owui/pipes/mios_agent_pipe.py
@@ -949,7 +949,16 @@ async def _polish_via_cpu(
             return raw_output
         if self._polish_can_skip(raw_output):
             await self._emit(emitter, "✓ clean → skip polish")
-            return raw_output
+            # Even on skip-polish, strip reasoning leaks ("Thought\n\n",
+            # <think>...</think>, <details>...</details>) from the
+            # raw text. Hermes uses reasoning-mode models that emit
+            # these prefixes; the operator should never see them.
+            # Operator-flagged 2026-05-18: "Hey there!" returned
+            # "Thought\n\nHello! How can I assist you today?" because
+            # polish skipped and the leading "Thought" passed through.
+            cleaned = self._strip_outer_md_fence(raw_output)
+            cleaned = self._strip_reasoning_leaks(cleaned)
+            return cleaned or raw_output
 
         # Try to load the structured tool history from Hermes session
         # JSON (OpenAI-format messages with tool_calls + tool_result).