Problem
The agent is slow to reply. Need a systematic latency investigation + fixes.
Suspected contributors (from triage):
- Every user message goes through the full pipeline serially ("goes through XX and stop") — needs more async/concurrency.
- Reasoning runs after every response, adding latency on simple turns.
- Use of -p (print/headless) mode: evaluate alternatives.
- Messages DB + growing context slow things down.
- Memory file reads are slow; consider refreshing or removing old data.
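To illustrate the serial-pipeline suspicion, here is a minimal sketch of running two independent per-turn steps concurrently instead of one after the other. The stage names (`load_memory`, `load_history`) and the sleep durations are hypothetical stand-ins, not the agent's actual code; whether the real stages are independent enough to overlap still has to be confirmed by profiling.

```python
import asyncio
import time

# Hypothetical stand-ins for two independent per-turn steps. The real
# pipeline's stages are not named in this issue; sleeps simulate I/O.
async def load_memory() -> str:
    await asyncio.sleep(0.05)
    return "memory"

async def load_history() -> str:
    await asyncio.sleep(0.05)
    return "history"

async def prepare_turn_serial():
    # Each step waits for the previous one to finish.
    memory = await load_memory()
    history = await load_history()
    return memory, history

async def prepare_turn_concurrent():
    # Both steps are started together; total time is roughly the slower one.
    memory, history = await asyncio.gather(load_memory(), load_history())
    return memory, history

def timed(coro_fn):
    """Run a coroutine function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start
```

Calling `timed(prepare_turn_serial)` and `timed(prepare_turn_concurrent)` returns the same data in both cases; only the elapsed time should differ. This helps only for steps that do not depend on each other's output.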
Expected
- Profile the per-turn path end to end; identify the dominant cost.
- Concrete fixes (async, conditional reasoning, transport change, DB/context/memory optimizations).
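One possible starting point for the profiling step is a small per-stage timer wrapped around each part of the turn. This is a sketch under assumptions: `TurnProfiler` and the stage names in the usage note are hypothetical and would need to be mapped onto the real pipeline.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class TurnProfiler:
    """Accumulates wall-clock time per named stage of a single turn."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        # Time the enclosed block and add it to this stage's running total,
        # even if the block raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def dominant(self):
        # Name of the stage with the largest accumulated time.
        return max(self.totals, key=self.totals.get)

    def report(self):
        # Stages sorted from slowest to fastest, as (name, seconds) pairs.
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
```

Usage would look like `with profiler.stage("db_read"): ...` around each suspected contributor (DB read, memory file read, model call, reasoning pass), followed by `profiler.report()` at the end of the turn to see which stage dominates before choosing a fix.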
Related: #54 (context compaction), #55 (memory RAG), #62 (effort level), #64 (SQL cache).
From triage dump: "The model is slow, understand why" + the whole "What slows down" section.
Labels: enhancement