
Commit 76f9180

feat(trace): unified per-conversation forensic recorder for chat + search (#139)
* feat(trace): unified per-conversation forensic recorder for chat + search
* feat(trace): hot-swap recorder when [debug] trace_enabled toggles
* fix(trace): suppress empty conversation_end when no chat fired
* feat(trace): mirror /search turn into chat-domain trace file
* fix(trace): repoint stale rustdoc links to crate::trace
* fix(trace): qualify private intra-doc link in FileRecorder::record
* fix(trace): defer is_first_turn flip until backend confirms turn
* fix(trace): sanitize ConversationId to block path traversal
* refactor(trace): extract per-chunk and start-gate helpers from coverage-off commands
* fix(trace): evict search-domain registry entries on TurnEnd
* fix(trace): add TurnAccepted handshake so cancel-mid-first-turn cannot duplicate ConversationStart
* test(trace): cover search cancel-mid-first-turn TurnAccepted parity
* docs(trace): drop legacy search_trace_enabled alias and trim trace docs
* refactor(settings): move trace toggle to AI tab and tidy timeout labels

Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>
1 parent 5adc86d commit 76f9180

40 files changed

Lines changed: 3973 additions & 1148 deletions
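One of the squashed commits above sanitizes `ConversationId` to block path traversal, which matters because the id is frontend-supplied yet names a file under `traces/{chat,search}/`. A minimal sketch of that kind of guard, with an allow-list of filename-safe characters (the helper name and exact allow-list are illustrative assumptions, not Thuki's actual code):

```rust
/// Illustrative sanitizer: keep only characters that are safe in a file
/// name, so an id like "../../etc/passwd" cannot escape the traces dir.
/// (Hypothetical helper; Thuki's real sanitization lives in crate::trace.)
fn sanitize_conversation_id(raw: &str) -> String {
    raw.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '-' || c == '_' {
                c
            } else {
                '_' // neutralize path separators, dots, and anything exotic
            }
        })
        .collect()
}

fn main() {
    // A well-formed id passes through unchanged.
    assert_eq!(sanitize_conversation_id("conv-123"), "conv-123");
    // Traversal attempts are flattened into harmless underscores.
    assert_eq!(sanitize_conversation_id("../../etc/passwd"), "______etc_passwd");
}
```

Mapping (rather than rejecting) keeps the recorder infallible on the hot path; an alternative design would return an error on suspicious ids.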

CHANGELOG.md

Lines changed: 6 additions & 0 deletions

```diff
@@ -7,8 +7,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## Unreleased
 
+### Added
+
+- **Unified trace recorder.** Records every chat conversation and `/search` session as JSON-Lines under `app_data_dir/traces/{chat,search}/<conversation_id>.jsonl`. Off by default; toggle from Settings or set `[debug] trace_enabled = true` in `config.toml`.
+
 ### Changed
 
+- **BREAKING**: Renamed `[debug] search_trace_enabled` to `trace_enabled` (now covers both chat and search). Rename the field in your `config.toml` after upgrading. Trace file layout also changed to `traces/{chat,search}/<conversation_id>.jsonl`.
 - The `ask_ollama`, `search_pipeline`, and `capture_full_screen_command` Tauri commands now require a `conversationId: String` argument (and `ask_ollama` additionally requires `isFirstTurn: bool` and `slashCommand: Option<String>`). The frontend's `useOllama` hook generates a stable trace id per session and threads it transparently. External callers that invoked these commands directly must update their `invoke()` calls. A new fire-and-forget `record_conversation_end` command lets the frontend signal end-of-conversation (used by `useOllama.reset()` and `useOllama.loadMessages()`) so the chat-domain trace file gets a clean closing line.
 - **BREAKING**: Renamed the `[model]` section in `config.toml` to `[inference]`. The section still contains a single field, `ollama_url`, but the name now reflects what it actually configures (the inference daemon endpoint, not a model). There is no backward-compatibility shim: if you had a custom `[model]` section, rename it to `[inference]` after upgrading.
 - Active model selection is now strictly Option-typed end to end. Ollama's `/api/tags` is the single source of truth: when nothing is installed and nothing is persisted, Thuki refuses to dispatch requests and surfaces a "Pick a model" prompt instead of falling back to a hardcoded slug. The previous `DEFAULT_MODEL_NAME` constant has been removed.
```
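The changelog entry above says traces are JSON-Lines: one self-contained JSON object per line, appended as events occur, which keeps the files greppable and `tail -f`-able mid-conversation. A minimal sketch of that append shape (the field names and escaping-free formatting here are illustrative only; the recorder's real event schema lives in `crate::trace` and would use a proper JSON serializer):

```rust
use std::io::Write;

/// Append one illustrative event as a JSON-Lines row. Escaping is omitted
/// for brevity; a real implementation would serialize with something like
/// serde_json. Field names are assumptions, not Thuki's schema.
fn append_event_line(out: &mut impl Write, event_type: &str, payload: &str) -> std::io::Result<()> {
    writeln!(out, r#"{{"type":"{}","payload":"{}"}}"#, event_type, payload)
}

fn main() -> std::io::Result<()> {
    let mut buf = Vec::new();
    append_event_line(&mut buf, "user_message", "hello")?;
    append_event_line(&mut buf, "assistant_complete", "done")?;
    let text = String::from_utf8(buf).unwrap();
    // Two events -> two lines, each a standalone JSON object.
    assert_eq!(text.lines().count(), 2);
    Ok(())
}
```

Because every line stands alone, a trace file truncated by a crash mid-write loses at most its final line.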

docs/configurations.md

Lines changed: 6 additions & 8 deletions

````diff
@@ -75,10 +75,8 @@ judge_timeout_s = 30
 router_timeout_s = 45
 
 [debug]
-# When true, writes a forensic JSON-Lines trace file for every /search turn to
-# ~/Library/Application Support/com.quietnode.thuki/traces/.
-# Also toggleable from the Settings panel (Web tab, Diagnostics section).
-search_trace_enabled = false
+# Records every chat conversation and /search session to disk for later inspection.
+trace_enabled = false
 ```
@@ -180,11 +178,11 @@ For security, both URLs default to your local machine (`127.0.0.1`) and should s
 
 ### `[debug]`
 
-Diagnostics toggles. `search_trace_enabled` is exposed in the Settings panel (Web tab, Diagnostics section) so you can flip it without editing `config.toml`.
+Records every chat conversation and `/search` session as JSON-Lines under `app_data_dir/traces/{chat,search}/<conversation_id>.jsonl`. Off by default; toggleable from Settings. Trace files stay on your disk and are never uploaded.
 
-| Field | Default | Tunable? | Why not tunable | Bounds | Description |
-| :--------------------- | :------ | :------- | :-------------- | :----- | :---------- |
-| `search_trace_enabled` | `false` | Yes ||| When on, Thuki writes a forensic JSON-Lines trace file for every `/search` turn to `~/Library/Application Support/com.quietnode.thuki/traces/`. Each file records every query sent to SearXNG, every page the reader fetched, and every AI decision in that turn. Useful for diagnosing why a search went wrong; leave off for normal use. |
+| Field | Default | Tunable? | Why not tunable | Bounds | Description |
+| :-------------- | :------ | :------- | :-------------- | :----- | :--------------------------------------------------------------------------- |
+| `trace_enabled` | `false` | Yes ||| Records every chat conversation and `/search` session to disk for debugging. |
 
 ### `[activation]` (not in TOML)
````
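Since the toggle is flippable from Settings at runtime (the commit message calls this hot-swapping the recorder), `record()` has to degrade to a cheap no-op while tracing is off. One common shape for that, sketched with an in-memory sink standing in for a file writer (a hypothetical type under stated assumptions, not Thuki's actual `LiveTraceRecorder`):

```rust
use std::sync::RwLock;

/// Sketch of a hot-swappable recorder: `None` while tracing is disabled,
/// so disabled `record()` calls are one lock acquisition and nothing else.
struct ToggleRecorder {
    sink: RwLock<Option<Vec<String>>>, // stands in for a file writer
}

impl ToggleRecorder {
    fn new() -> Self {
        Self { sink: RwLock::new(None) }
    }

    /// Flipping the Settings toggle swaps the sink in or out at runtime.
    fn set_enabled(&self, enabled: bool) {
        *self.sink.write().unwrap() = if enabled { Some(Vec::new()) } else { None };
    }

    fn record(&self, line: &str) {
        // No-op while disabled: the Option is None, nothing is allocated.
        if let Some(sink) = self.sink.write().unwrap().as_mut() {
            sink.push(line.to_string());
        }
    }

    fn len(&self) -> usize {
        self.sink.read().unwrap().as_ref().map_or(0, |s| s.len())
    }
}

fn main() {
    let rec = ToggleRecorder::new();
    rec.record("dropped while off");
    assert_eq!(rec.len(), 0);
    rec.set_enabled(true);
    rec.record("kept");
    assert_eq!(rec.len(), 1);
}
```

A production version would likely prefer a lock-free pointer swap on the read path; the `RwLock<Option<_>>` form is just the simplest way to show the enable/disable semantics.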

src-tauri/src/commands.rs

Lines changed: 261 additions & 0 deletions

```diff
@@ -209,6 +209,13 @@ pub enum StreamChunk {
     Cancelled,
     /// A structured, user-friendly error occurred during processing.
     Error(OllamaError),
+    /// Emitted exactly once per turn, after the backend has cleared every
+    /// pre-`ConversationStart` gate (no-model bail, model lookup, etc.) and
+    /// committed to opening the trace for this `conversation_id`. Carries
+    /// no payload; the frontend uses it as the unambiguous signal to
+    /// retire its `is_first_turn` flag without relying on token-arrival
+    /// ordering. Does not appear in the trace itself.
+    TurnAccepted,
 }
 
 /// A single message in the Ollama `/api/chat` conversation format.
```
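The `TurnAccepted` variant above exists so the frontend retires its `is_first_turn` flag only when the backend has actually committed to the trace, never on optimistic send or first-token arrival. The frontend-side gate it enables can be sketched like this (hypothetical names, modeled in Rust for consistency with the surrounding code; the real logic lives in the TypeScript `useOllama` hook):

```rust
/// Sketch of the per-conversation first-turn gate the frontend keeps.
struct FirstTurnGate {
    is_first_turn: bool,
}

impl FirstTurnGate {
    fn new() -> Self {
        Self { is_first_turn: true }
    }

    /// Retire the flag only on TurnAccepted — never on token arrival or
    /// cancellation — so a cancel-mid-first-turn retry still reports
    /// `is_first_turn = true` and `ConversationStart` is emitted exactly once.
    fn on_turn_accepted(&mut self) {
        self.is_first_turn = false;
    }
}

fn main() {
    let mut gate = FirstTurnGate::new();
    // First attempt cancelled before any TurnAccepted: the flag survives,
    // so the retry will still ask the backend to open the conversation.
    assert!(gate.is_first_turn);
    // Retry reaches the backend and TurnAccepted arrives: flag retires.
    gate.on_turn_accepted();
    assert!(!gate.is_first_turn);
}
```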
```diff
@@ -463,10 +470,65 @@ pub async fn stream_ollama_chat(
     accumulated
 }
 
+/// Mirrors a streaming chunk into the chat-domain trace recorder. Pulled out
+/// of [`ask_ollama`] so the per-token routing logic and the token-count
+/// increment are exercised by the unit-test suite rather than the
+/// coverage-off Tauri command body. `Done`, `Cancelled`, and `Error` chunks
+/// are intentionally noops here: those terminal events are summarized by
+/// `AssistantComplete` after the stream returns.
+pub(crate) fn record_chunk_to_trace(
+    chunk: &StreamChunk,
+    recorder: &std::sync::Arc<crate::trace::BoundRecorder>,
+    token_count: &AtomicU64,
+) {
+    match chunk {
+        StreamChunk::Token(text) => {
+            token_count.fetch_add(1, Ordering::Relaxed);
+            recorder.record(crate::trace::RecorderEvent::AssistantTokens {
+                chunk: text.clone(),
+            });
+        }
+        StreamChunk::ThinkingToken(text) => {
+            recorder.record(crate::trace::RecorderEvent::AssistantThinking {
+                chunk: text.clone(),
+            });
+        }
+        StreamChunk::Done
+        | StreamChunk::Cancelled
+        | StreamChunk::Error(_)
+        | StreamChunk::TurnAccepted => {}
+    }
+}
+
+/// Emits `ConversationStart` to the trace recorder iff this is the first
+/// turn of the conversation. Pulled out of [`ask_ollama`] and the search
+/// pipeline so the gate is covered by tests instead of the coverage-off
+/// Tauri command body.
+pub(crate) fn record_conversation_start_if_first_turn(
+    recorder: &std::sync::Arc<crate::trace::BoundRecorder>,
+    is_first_turn: bool,
+    model: String,
+    system_prompt: String,
+) {
+    if is_first_turn {
+        recorder.record(crate::trace::RecorderEvent::ConversationStart {
+            model,
+            system_prompt,
+        });
+    }
+}
+
 /// Streams a chat response from the local Ollama backend. Appends the user
 /// message and assistant response to conversation history after completion
 /// or cancellation (retaining context for follow-up requests). Uses an epoch
 /// counter to prevent stale writes after a reset.
+///
+/// `conversation_id` flows from the frontend (`useConversationHistory.ts`).
+/// `is_first_turn` lets the frontend tell the backend "emit
+/// `ConversationStart` before this turn's `UserMessage`" without the backend
+/// needing to track per-conversation state. Both feed the unified trace
+/// recorder when `[debug] trace_enabled = true`; off by default they collapse
+/// to noop calls.
 #[cfg_attr(coverage_nightly, coverage(off))]
 #[cfg_attr(not(coverage), tauri::command)]
 #[allow(clippy::too_many_arguments)]
@@ -475,13 +537,17 @@ pub async fn ask_ollama(
     quoted_text: Option<String>,
     image_paths: Option<Vec<String>>,
     think: bool,
+    conversation_id: String,
+    is_first_turn: bool,
+    slash_command: Option<String>,
     on_event: Channel<StreamChunk>,
     client: State<'_, reqwest::Client>,
     generation: State<'_, GenerationState>,
     history: State<'_, ConversationHistory>,
     config: State<'_, parking_lot::RwLock<AppConfig>>,
     active_model: State<'_, crate::models::ActiveModelState>,
     capabilities_cache: State<'_, ModelCapabilitiesCache>,
+    trace_recorder: State<'_, std::sync::Arc<crate::trace::LiveTraceRecorder>>,
 ) -> Result<(), String> {
     // Snapshot the config once so all downstream reads (endpoint, prompt, model)
     // see a consistent view even if the user edits Settings mid-stream.
@@ -507,6 +573,35 @@ pub async fn ask_ollama(
     let cancel_token = CancellationToken::new();
     generation.set_token(cancel_token.clone());
 
+    // Bind the trace recorder to this conversation. When tracing is on,
+    // every event for this turn flows to
+    // `traces/chat/<conversation_id>.jsonl` via the registry. When off,
+    // each `record()` is a constant-time noop. The bound recorder is
+    // cheap to clone and is captured by the streaming-pump closure so
+    // per-token emits skip the registry lookup on the hot path.
+    let live: std::sync::Arc<crate::trace::LiveTraceRecorder> =
+        std::sync::Arc::clone(trace_recorder.inner());
+    let live_inner: std::sync::Arc<dyn crate::trace::TraceRecorder> = live;
+    let bound_recorder = std::sync::Arc::new(crate::trace::BoundRecorder::new(
+        live_inner,
+        crate::trace::ConversationId::new(conversation_id),
+    ));
+
+    // Emit ConversationStart at the moment we know the model + resolved
+    // system prompt. The frontend's `is_first_turn` flag prevents this
+    // event from firing on subsequent turns of the same conversation.
+    record_conversation_start_if_first_turn(
+        &bound_recorder,
+        is_first_turn,
+        model_name.clone(),
+        config.prompt.resolved_system.clone(),
+    );
+    // Tell the frontend the trace was opened for this conversation_id.
+    // Sent unconditionally (regardless of `is_first_turn`) so the hook
+    // can retire its flag the moment ANY turn lands, even if a previous
+    // first-turn attempt was cancelled before any token arrived.
+    let _ = on_event.send(StreamChunk::TurnAccepted);
+
     // Build user message content. When quoted text is present, label it
     // explicitly so the model knows the highlighted text is the primary
     // subject and any attached images provide surrounding context.
@@ -517,6 +612,16 @@ pub async fn ask_ollama(
         _ => message,
     };
 
+    // Emit UserMessage before any image base64 work, so the trace
+    // captures the user's intent even if encoding fails. Image paths
+    // are recorded as strings (matching the IPC contract); image bytes
+    // never enter the JSONL.
+    bound_recorder.record(crate::trace::RecorderEvent::UserMessage {
+        content: content.clone(),
+        attached_images: image_paths.clone().unwrap_or_default(),
+        slash_command: slash_command.clone(),
+    });
+
     // Base64-encode attached images for the Ollama multimodal API.
     let images = match image_paths {
         Some(ref paths) if !paths.is_empty() => {
@@ -580,6 +685,14 @@ pub async fn ask_ollama(
         ))
     };
 
+    let stream_started_ms = std::time::SystemTime::now()
+        .duration_since(std::time::UNIX_EPOCH)
+        .map(|d| d.as_millis() as u64)
+        .unwrap_or(0);
+    let token_count_atomic = std::sync::Arc::new(AtomicU64::new(0));
+    let token_count_for_pump = std::sync::Arc::clone(&token_count_atomic);
+    let recorder_for_pump = std::sync::Arc::clone(&bound_recorder);
+
     let accumulated = stream_ollama_chat(
         OllamaChatParams {
             endpoint,
```
```diff
@@ -592,11 +705,25 @@ pub async fn ask_ollama(
         &client,
         cancel_token.clone(),
         |chunk| {
+            // Mirror the user-visible chunk into the trace before
+            // forwarding it to the frontend. Token / ThinkingToken
+            // chunks land as discrete trace events; terminal chunks are
+            // summarized below by `AssistantComplete`.
+            record_chunk_to_trace(&chunk, &recorder_for_pump, &token_count_for_pump);
             let _ = on_event.send(chunk);
         },
     )
     .await;
 
+    let stream_ended_ms = std::time::SystemTime::now()
+        .duration_since(std::time::UNIX_EPOCH)
+        .map(|d| d.as_millis() as u64)
+        .unwrap_or(0);
+    bound_recorder.record(crate::trace::RecorderEvent::AssistantComplete {
+        total_tokens: token_count_atomic.load(Ordering::Relaxed),
+        latency_ms: stream_ended_ms.saturating_sub(stream_started_ms),
+    });
+
     // Persist user + assistant messages to in-memory history when the epoch
     // has not changed (no reset during streaming) and we received content.
     // This includes cancelled generations so that subsequent requests retain
```
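The hunk above timestamps both ends of the stream with `SystemTime` and subtracts with `saturating_sub`, so a wall clock that steps backwards mid-stream yields a latency of 0 instead of a huge underflowed value. The same pattern in isolation (a sketch mirroring the diff's `unwrap_or(0)` handling, not additional Thuki code):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Millisecond wall-clock timestamp, defaulting to 0 if the clock somehow
/// reads before the Unix epoch — mirrors the diff's `unwrap_or(0)`.
fn now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}

fn main() {
    let started = now_ms();
    let ended = now_ms();
    // saturating_sub clamps at zero if the wall clock stepped backwards
    // between the two reads, so latency_ms can never underflow.
    let latency_ms = ended.saturating_sub(started);
    assert!(latency_ms < 10_000);
}
```

`Instant` would be the usual choice for pure durations; using `SystemTime` here also gives the trace an absolute timestamp, at the cost of needing the saturating subtraction.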
```diff
@@ -658,6 +785,34 @@ pub fn reset_conversation(history: State<'_, ConversationHistory>) {
     history.messages.lock().unwrap().clear();
 }
 
+/// Frontend-driven `ConversationEnd` emission.
+///
+/// The chat-domain trace lifecycle is owned by the frontend because
+/// Thuki's window-close intercept hides instead of quits, and the same
+/// conversation can resume on the next hotkey activation. Emitting
+/// `ConversationEnd` from the backend on window-hide would falsely mark
+/// every still-open conversation ended on every dismiss. The frontend
+/// invokes this command exactly when the user-perceived conversation
+/// terminates: clicking "New conversation", loading a different
+/// conversation from history, or quitting from the tray.
+///
+/// The command is a thin trace-only signal; it does NOT mutate
+/// `ConversationHistory` (that is `reset_conversation`'s job) and does
+/// NOT touch the SQLite-backed history UI.
+#[cfg_attr(coverage_nightly, coverage(off))]
+#[cfg_attr(not(coverage), tauri::command)]
+pub fn record_conversation_end(
+    conversation_id: String,
+    reason: String,
+    trace_recorder: State<'_, std::sync::Arc<crate::trace::LiveTraceRecorder>>,
+) {
+    use crate::trace::TraceRecorder;
+    trace_recorder.record(
+        &crate::trace::ConversationId::new(conversation_id),
+        crate::trace::RecorderEvent::ConversationEnd { reason },
+    );
+}
+
 #[cfg(test)]
 mod tests {
     use super::*;
@@ -2122,4 +2277,110 @@ mod tests {
         assert_eq!(err.kind, OllamaErrorKind::ModelNotFound);
         assert!(!err.message.contains("picker chip"));
     }
+
+    // ─── Trace orchestration helpers ────────────────────────────────────────
+
+    /// Builds a `BoundRecorder` over a `MockRecorder` so each helper test
+    /// can inspect what got recorded without going through the file system.
+    fn mock_bound_recorder(
+        conv_id: &str,
+    ) -> (
+        Arc<crate::trace::BoundRecorder>,
+        Arc<crate::trace::recorder::MockRecorder>,
+    ) {
+        let mock = Arc::new(crate::trace::recorder::MockRecorder::new());
+        let inner: Arc<dyn crate::trace::TraceRecorder> = mock.clone();
+        let bound = Arc::new(crate::trace::BoundRecorder::new(
+            inner,
+            crate::trace::ConversationId::new(conv_id),
+        ));
+        (bound, mock)
+    }
+
+    #[test]
+    fn record_chunk_to_trace_emits_assistant_tokens_and_increments_count() {
+        let (bound, mock) = mock_bound_recorder("conv-token");
+        let counter = AtomicU64::new(0);
+        record_chunk_to_trace(&StreamChunk::Token("hi".to_string()), &bound, &counter);
+        record_chunk_to_trace(&StreamChunk::Token(" there".to_string()), &bound, &counter);
+        assert_eq!(counter.load(Ordering::Relaxed), 2);
+        let snapshot = mock.snapshot();
+        assert_eq!(snapshot.len(), 2);
+        for (id, _) in &snapshot {
+            assert_eq!(id.as_str(), "conv-token");
+        }
+        assert!(matches!(
+            snapshot[0].1,
+            crate::trace::RecorderEvent::AssistantTokens { ref chunk } if chunk == "hi"
+        ));
+        assert!(matches!(
+            snapshot[1].1,
+            crate::trace::RecorderEvent::AssistantTokens { ref chunk } if chunk == " there"
+        ));
+    }
+
+    #[test]
+    fn record_chunk_to_trace_emits_assistant_thinking_without_increment() {
+        let (bound, mock) = mock_bound_recorder("conv-think");
+        let counter = AtomicU64::new(0);
+        record_chunk_to_trace(
+            &StreamChunk::ThinkingToken("planning".to_string()),
+            &bound,
+            &counter,
+        );
+        assert_eq!(counter.load(Ordering::Relaxed), 0);
+        let snapshot = mock.snapshot();
+        assert_eq!(snapshot.len(), 1);
+        assert!(matches!(
+            snapshot[0].1,
+            crate::trace::RecorderEvent::AssistantThinking { ref chunk } if chunk == "planning"
+        ));
+    }
+
+    #[test]
+    fn record_chunk_to_trace_skips_terminal_chunks() {
+        let (bound, mock) = mock_bound_recorder("conv-term");
+        let counter = AtomicU64::new(0);
+        record_chunk_to_trace(&StreamChunk::Done, &bound, &counter);
+        record_chunk_to_trace(&StreamChunk::Cancelled, &bound, &counter);
+        record_chunk_to_trace(
+            &StreamChunk::Error(no_model_selected_error()),
+            &bound,
+            &counter,
+        );
+        assert_eq!(counter.load(Ordering::Relaxed), 0);
+        assert_eq!(mock.snapshot().len(), 0);
+    }
+
+    #[test]
+    fn record_conversation_start_if_first_turn_emits_when_true() {
+        let (bound, mock) = mock_bound_recorder("conv-start");
+        record_conversation_start_if_first_turn(
+            &bound,
+            true,
+            "model-a".to_string(),
+            "you are helpful".to_string(),
+        );
+        let snapshot = mock.snapshot();
+        assert_eq!(snapshot.len(), 1);
+        assert!(matches!(
+            snapshot[0].1,
+            crate::trace::RecorderEvent::ConversationStart {
+                ref model,
+                ref system_prompt,
+            } if model == "model-a" && system_prompt == "you are helpful"
+        ));
+    }
+
+    #[test]
+    fn record_conversation_start_if_first_turn_skips_when_false() {
+        let (bound, mock) = mock_bound_recorder("conv-skip");
+        record_conversation_start_if_first_turn(
+            &bound,
+            false,
+            "model-a".to_string(),
+            "ignored".to_string(),
+        );
+        assert_eq!(mock.snapshot().len(), 0);
+    }
 }
```
