MiOS-Agent pipe: stderr-suppress dispatch, tool_calls envelope reply, zero topic exclusions

mios-dev · claude · mios-dev · commit 3264f0ff847e · 2026-05-18T11:34:28.000-04:00
Five fixes from operator-driven verification this session: 1. shlex import (was the immediate NameError crash on every open_app / launch_app dispatch -- _dispatch_mios_verb used shlex.quote heavily but the module was never imported). 2. Router prompt clarity (was: router picked mios_apps for "open notepad" because both verbs contained "app"). Now each verb is tagged [WRITE] vs [READ] and a Verb-pick priority section lists the literal patterns (open X / close X / focus X / move X / list apps / list windows / go to URL). Explicit rule: NEVER pick a READ verb when the user clearly wants a system effect. 3. mios-find narrative leak suppressed at dispatch (operator-observed: "✅ open_app · (picked best of 4; 3 alternatives:)"). mios-find writes its picker narrative to stderr in human-mode; the broker's CAPTURE concatenates stdout+stderr, so the agent-facing tool_result picked up English noise. Fix: 2>/dev/null on the dispatch bash command for open_app so only the resolution stdout is piped to bash. 4. Tool-result reply is now OpenAI-tool-result-native. Previous form yielded a hardcoded English line ("✅ `tool` · <first line of stderr>") which leaked downstream-tool English into the chat AND failed gracefully on failure with "failed". New form yields a <details type="tool_calls" done="true"> block OWUI renders natively, containing the structured tool_call + tool_result envelope shaped to the OpenAI streaming protocol (function.name, function.arguments, tool_result.success, tool_result.output, tool_result.stderr). The only literal characters at this layer are universal status symbols (✅ / ⚠️) and the cross-locale tool name. 5. NO HARDCODED TOPIC EXCLUSIONS in the router or followup templates (operator-corrected mid-session: an earlier draft of this commit baked "decline weather/time/news/ etc." into _ROUTER_SYSTEM and the OWUI followup template -- operator clarified that MiOS-Agent is BOTH an Agentic-OS AND a generalized AI agent, so no per-topic deny-lists belong in source). Followup template now reads: "grounded in the CURRENT chat content and the operator's CURRENT system state" -- pure context-grounded suggestion, no static filter. Pipe deploy chain reminder (saved as agent memory this session): edit /usr/share/mios/owui/pipes/<file>.py -> mios-owui-install-pipe (pushes into webui.db function row) -> systemctl restart mios-open-webui. Skipping the installer step = container restart with no observable code change because OWUI loads pipes from the DB, not the filesystem. Quadlet deploy chain reminder: edit C:\MiOS\etc\containers\ systemd\<name>.container -> cp to live /etc/containers/ systemd/ -> automation/15-render-quadlets.sh to resolve ${MIOS_*:-default} placeholders to literals -> systemctl daemon-reload + restart. Raw cp of source-with-placeholders fails the unit at load time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/etc/containers/systemd/mios-open-webui.container b/etc/containers/systemd/mios-open-webui.container
@@ -207,7 +207,7 @@ Environment=ENABLE_AUTOCOMPLETE_GENERATION=True
 # for emphasis instead of double-hyphen.
 Environment="TITLE_GENERATION_PROMPT_TEMPLATE=Output a 3 to 6 word title for this chat, in the same language the chat used. Title only, no quotes, no preamble, no thinking. Chat:\n{{MESSAGES:END:2}}"
 Environment="TAGS_GENERATION_PROMPT_TEMPLATE=Output 1 to 3 short topic tags for this chat (single words or hyphenated, in the chat language). JSON only: {\"tags\":[\"\",\"\"]}. Chat:\n{{MESSAGES:END:4}}"
-Environment="FOLLOW_UP_GENERATION_PROMPT_TEMPLATE=Suggest 3 short follow up actions the user might want next, in the same language and tone the chat used. Each is a single phrase or short imperative the user could click. JSON only: {\"follow_ups\":[\"\",\"\",\"\"]}. Chat:\n{{MESSAGES:END:6}}"
+Environment="FOLLOW_UP_GENERATION_PROMPT_TEMPLATE=Suggest 3 short follow up actions the user might want next, grounded in the CURRENT chat content and the operator's CURRENT system state. Each is a single phrase or short imperative the user could click, in the same language and tone the chat used. JSON only: {\"follow_ups\":[\"\",\"\",\"\"]}. Chat:\n{{MESSAGES:END:6}}"
 Environment="AUTOCOMPLETE_GENERATION_PROMPT_TEMPLATE=Continue the user partial input naturally, in their language. Output ONLY the completion text, no quotes, no preamble. Input so far: {{PROMPT}}"
 
 # Identity. UID 817 (mios-open-webui)
diff --git a/usr/share/mios/owui/pipes/mios_agent_pipe.py b/usr/share/mios/owui/pipes/mios_agent_pipe.py
@@ -38,6 +38,7 @@
 import json
 import os
 import re
+import shlex
 import asyncio
 import time
 from typing import AsyncGenerator, Awaitable, Callable, Optional
@@ -1015,31 +1016,46 @@ def _render_tool_history_for_compose(self, msgs: list[dict]) -> str:
         '              several tools. Emit\n'
         '              {"action":"agent","reason":"<short>"}\n'
         "\n"
-        "MiOS verbs available for dispatch (use EXACT name + args shape):\n"
-        '  open_app(name:str, position:str="default", args:list[str]?, monitor:int=0)\n'
-        '    position enum: default / as-is / center / left / right /\n'
-        '    top / bottom / top-left / top-right / bottom-left /\n'
-        '    bottom-right / maximize\n'
-        '  focus_window(title:str)\n'
-        '  move_window(title:str, position:str, monitor:int=0)\n'
-        '  close_window(title:str, mode:str="graceful")    # mode: graceful|force\n'
-        '  list_windows()\n'
-        '  screen_layout()\n'
-        '  open_url(url:str, browser:str?)\n'
-        '  launch_app(name:str)            # simple, no position\n'
-        '  mios_find(name:str)             # READ-ONLY resolve\n'
-        '  mios_apps(filter:str?)          # inventory\n'
-        '  everything_search(query:str, limit:int=10, ext:str?)\n'
-        '  system_status()\n'
+        "MiOS verbs available for dispatch. Each verb is tagged WRITE\n"
+        "(causes a visible system effect) or READ (returns info only):\n"
+        '  [WRITE] open_app(name, position="default", args?, monitor=0)\n'
+        '          -- LAUNCH an app/program. Use for "open X" / "launch X" /\n'
+        '             "start X" / "run X". position enum:\n'
+        '             default / as-is / center / left / right / top /\n'
+        '             bottom / top-left / top-right / bottom-left /\n'
+        '             bottom-right / maximize\n'
+        '  [WRITE] launch_app(name)                -- simpler launch, no position arg\n'
+        '  [WRITE] focus_window(title)             -- bring an OPEN window to front\n'
+        '  [WRITE] move_window(title, position, monitor=0)\n'
+        '  [WRITE] close_window(title, mode="graceful")   -- mode: graceful|force\n'
+        '  [WRITE] open_url(url, browser?)         -- open a URL in a browser\n'
+        '  [READ ] list_windows()                  -- list currently OPEN windows\n'
+        '  [READ ] screen_layout()                 -- monitor geometry\n'
+        '  [READ ] mios_find(name)                 -- resolve name -> path, no launch\n'
+        '  [READ ] mios_apps(filter?)              -- INVENTORY of installed apps, no launch\n'
+        '  [READ ] everything_search(query, limit=10, ext?)\n'
+        '  [READ ] system_status()\n'
+        "\n"
+        "Verb-pick priority (most common cases first):\n"
+        '  "open X" / "launch X" / "start X" / "run X"  -> open_app(name=X)\n'
+        '  "close X"                                    -> close_window(title=X)\n'
+        '  "focus X" / "bring X to front" / "switch to X" -> focus_window(title=X)\n'
+        '  "move X to <pos>"                            -> move_window(title=X, position=<pos>)\n'
+        '  "what apps are installed" / "list apps"      -> mios_apps()\n'
+        '  "what windows are open"                      -> list_windows()\n'
+        '  "go to <url>" / "visit <url>"                -> open_url(url=<url>)\n'
         "\n"
         "Rules:\n"
-        "- Pick `dispatch` ONLY when ONE call solves it. Position\n"
-        "  defaults to \"default\" (golden+16:10 centered); set\n"
+        "- `dispatch` only when ONE verb solves it.\n"
+        "- A WRITE verb is the right pick whenever the user asks for a\n"
+        "  system effect (open/close/focus/move). NEVER pick a READ verb\n"
+        "  when the user clearly wants an effect.\n"
+        "- Position defaults to \"default\" (golden+16:10 centered); set\n"
         "  explicitly only when the user named a side.\n"
-        "- Pick `chat` for greetings, thanks, follow-up clarification\n"
-        "  the agent can answer in one sentence.\n"
-        "- Pick `agent` for anything that needs N>1 tools, web\n"
-        "  research, install, file editing, multi-step plans.\n"
+        "- `chat` for greetings, thanks, one-sentence clarification.\n"
+        "- `agent` for N>1 tools, web research, install, file editing,\n"
+        "  general knowledge questions, conversational follow-through.\n"
+        "  MiOS-Agent is both an Agentic-OS AND a generalized AI agent.\n"
         "- Mirror the user's language in `reply` fields.\n"
         "- Output JSON ONLY -- no preamble, no markdown, no commentary."
     )
@@ -1129,7 +1145,14 @@ async def _dispatch_mios_verb(
                 ea = " ".join(shlex.quote(str(a)) for a in extra_args)
                 cmd = f"{env_prefix}mios-windows launch {shlex.quote(name)} {ea}"
             else:
-                cmd = f"{env_prefix}mios-find {shlex.quote(name)} | bash"
+                # 2>/dev/null suppresses mios-find's interactive narrative
+                # ("(picked best of N; M alternatives:)" + "  1. [...] ..."),
+                # which is English noise written to stderr for human
+                # operators. Agent-side dispatch only needs the stdout
+                # resolution line (which is piped to bash). Without this
+                # filter the broker's combined stdout+stderr capture leaks
+                # the narrative into the user-facing tool_result.
+                cmd = f"{env_prefix}mios-find {shlex.quote(name)} 2>/dev/null | bash"
         elif tool == "launch_app":
             cmd = f"mios-launch {shlex.quote(str(args.get('name', '')))}"
         elif tool == "focus_window":
@@ -1925,20 +1948,42 @@ async def pipe(
                         result = json.loads(result_json)
                     except json.JSONDecodeError:
                         result = {"output": result_json}
-                    out = result.get("output") or result.get("stderr") or ""
-                    # Thin operator-facing reply: just confirm the
-                    # action + cite the broker output one-liner. No
-                    # polish (the verb already ran; nothing to polish).
-                    if result.get("success") is False:
-                        await self._emit(__event_emitter__, "✅", done=True)
-                        yield f"_⚠️ `{tool}` → {(out or 'failed')[:240]}_"
-                        return
-                    await self._emit(__event_emitter__, "✅", done=True)
-                    # Surface the first non-empty line of output for context.
-                    first_line = next((ln for ln in out.splitlines()
-                                       if ln.strip()), "")
-                    yield f"✅ `{tool}` · {first_line[:240]}" if first_line \
-                          else f"✅ `{tool}`"
+                    ok = bool(result.get("success"))
+                    await self._emit(__event_emitter__,
+                                     "✅" if ok else "⚠️", done=True)
+                    # OpenAI-tool-result-native propagation: emit the
+                    # structured tool_call + tool_result envelope inside
+                    # a <details type="tool_calls"> block OWUI renders
+                    # natively. The only literal characters at this layer
+                    # are cross-locale identifiers (the tool name) +
+                    # universal symbols (✅ / ⚠️). The structured JSON
+                    # matches the OpenAI tool_calls / role:tool shape,
+                    # so any downstream agent reading chat history sees
+                    # the canonical tool_result content (not a prose
+                    # render) and the operator gets a collapsible block
+                    # with the raw data inside (zero English narrative).
+                    envelope = {
+                        "tool_call": {
+                            "id": f"call_{int(time.time()*1000)}",
+                            "type": "function",
+                            "function": {
+                                "name": tool,
+                                "arguments": args if isinstance(args, dict) else {},
+                            },
+                        },
+                        "tool_result": {
+                            "success": ok,
+                            "output": (result.get("output") or "")[:2000],
+                            "stderr": (result.get("stderr") or "")[:2000],
+                        },
+                    }
+                    symbol = "✅" if ok else "⚠️"
+                    yield (
+                        f"<details type=\"tool_calls\" done=\"true\">\n"
+                        f"<summary>{symbol} `{tool}`</summary>\n\n"
+                        f"```json\n{json.dumps(envelope, indent=2, default=str)}\n```\n"
+                        f"</details>"
+                    )
                     return
             elif action == "chat":
                 reply = str(verdict.get("reply", "")).strip()