Commit e3b17a0
mios-daemon + agent-pipe router: repoint at the iGPU lane (:11435)
With mios-ollama-igpu standing up the micro-LLM lane in the
preceding commit, repoint the two known micro-LLM clients at it
so dGPU/CUDA traffic (qwen2.5-coder:7b polish, big Hermes
inference) stops competing with classifier + nudger requests
for queue slots.
usr/libexec/mios/mios-daemon
MIOS_DAEMON_ENDPOINT default flips from :11434 -> :11435. The
daemon IS the iGPU micro-LLM agent per the operator's stated
architecture ("iGPU micro-llm(s)/mios-daemon agent collects
Linux systems logs, journals, relevant AI files--etc-etc"); the
prior default had it talking to the dGPU/CUDA lane instead.
usr/lib/mios/agent-pipe/server.py
MIOS_AGENT_PIPE_ROUTER_ENDPOINT default flips from :11434 ->
:11435. The Layer-1 router classifier (qwen3:1.7b) now lives
exclusively on the iGPU lane; router latency stays bounded under
dGPU saturation because the classifier never queues behind
big-model inference.
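A minimal sketch of how such an environment-backed default can be resolved, assuming the endpoint is read from the process environment. The helper name `resolve_router_endpoint` and the constant are hypothetical and are not taken from server.py; the default value is the one reported by /health in this commit.

```python
import os

# Assumed default: the value /health reports after this commit.
DEFAULT_ROUTER_ENDPOINT = "http://localhost:11435"


def resolve_router_endpoint(env=None):
    """Return the router endpoint, preferring an operator override.

    Hypothetical helper: an unset or blank variable falls back to the
    iGPU lane; a trailing slash is dropped so paths can be appended.
    """
    env = os.environ if env is None else env
    value = env.get("MIOS_AGENT_PIPE_ROUTER_ENDPOINT", "").strip()
    return (value or DEFAULT_ROUTER_ENDPOINT).rstrip("/")


if __name__ == "__main__":
    print(resolve_router_endpoint())
```

Passing a mapping instead of reading `os.environ` directly keeps the fallback logic testable without touching the real environment.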
usr/lib/systemd/system/mios-agent-pipe.service
Adds explicit Environment=MIOS_AGENT_PIPE_ROUTER_ENDPOINT line
so an operator inspecting the unit (`systemctl cat
mios-agent-pipe.service`) sees the routing decision without
having to read the Python defaults.
Live-verified on podman-MiOS-DEV:
/health.router.endpoint -> http://localhost:11435 ✓
Chat fast-path end-to-end -> 7.2s first call (cold ollama-rocm
runner; CPU fallback because the WSL kernel does not expose AMD
/dev/kfd + /dev/dri) -> "Hello!" returned cleanly ✓
podman logs mios-ollama-igpu shows the POST landed at the new
lane: [GIN] 200 | 7.195948393s | POST "/v1/chat/completions" ✓
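The first check above could be scripted roughly as follows. This is a sketch under assumptions: it treats `/health.router.endpoint` as a JSON path into the /health response body, and the agent-pipe base URL is left as a parameter because its address is not stated here. Both function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Expected value, as reported in the verification notes above.
EXPECTED_ROUTER_ENDPOINT = "http://localhost:11435"


def router_endpoint(health):
    """Extract router.endpoint from a parsed /health payload (assumed shape)."""
    router = health.get("router") or {}
    return router.get("endpoint")


def check_health(base_url, timeout=5.0):
    """Fetch <base_url>/health and compare the reported router endpoint."""
    url = f"{base_url.rstrip('/')}/health"
    with urlopen(url, timeout=timeout) as resp:
        health = json.load(resp)
    return router_endpoint(health) == EXPECTED_ROUTER_ENDPOINT
```

Calling `check_health` with the agent-pipe base URL returns True only when the service reports the iGPU lane; a dGPU-only deployment that overrides the endpoint would need a different expected value.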
On bare-metal MiOS-bootc with the AMD iGPU exposed in the host
kernel, the same setup is expected to activate ROCm on the iGPU
and bring router / nudger latency down further. The code path is
the same; the difference is a deployment-time capability.
Operator overrides remain available -- a deployment with only an
NVIDIA dGPU (no AMD iGPU lane) can point both endpoints back at
:11434 via /etc/mios/agent-pipe.env + the matching mios-daemon
env override.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 5462671 commit e3b17a0
3 files changed
Lines changed: 19 additions & 2 deletions