Skip to content

fix(trace-store): merge _update span patches on FileSystemTraceStore.load#48

Merged
drewstone merged 1 commit into
mainfrom
fix/trace-store-load-merges-span-update-patches
May 14, 2026
Merged

fix(trace-store): merge _update span patches on FileSystemTraceStore.load#48
drewstone merged 1 commit into
mainfrom
fix/trace-store-load-merges-span-update-patches

Conversation

@tangletools
Copy link
Copy Markdown
Contributor

Summary

  • FileSystemTraceStore.load() was appending every NDJSON row through appendSpan, including the { ...patch, _update: true } fragments that updateSpan writes. Cross-process readers ended up with two spans per id (original full row + patch fragment with no kind/runId/name).
  • Mirrors the existing runs branch (which falls back to updateRun on duplicate-id throw): the spans branch now detects _update and routes to updateSpan, merging into the prior span.
  • Live (same-process) writers were unaffected because their in-memory index is populated directly by appendSpan/updateSpan; only the lazy-reload path was broken.

Why this matters

Tax-agent's canonical eval shards a FileSystemTraceStore per (variant × persona) cell, then a cross-process step (convertTraceStoresToOtlp) re-opens each cell dir and projects it into OTLP-flat JSONL for the trace analyst. The duplicate-span behavior meant the OTLP corpus was malformed (patch fragments were emitted as standalone OTLP spans with no model/messages/kind), and the analyst's downstream consumers either rejected the corpus or saw garbage spans. Combined with a separate extension bug in tax-agent's converter, the analyst was always skipping with OTLP corpus empty (0 runs, 0 spans).

Test plan

  • New regression in tests/trace-store.test.ts: writer process appends a span + updateSpan, fresh reader process re-opens the dir and asserts spans.length === 1 with all fields merged (original kind/runId/name/model survive, patch endedAt/status/output applied).
  • Existing trace-store tests (updateRun/updateSpan after index-loaded, round-trip, rotation) still pass.
  • Full suite: pnpm test → 1020/1020 pass.
  • pnpm typecheck clean.

…load

When a fresh FileSystemTraceStore opens a directory written by a prior
process, load() calls appendSpan() for every NDJSON row — including the
{ ...patch, _update: true } fragments that updateSpan writes. Those
fragments lack kind / runId / name / model, so the reader ended up with
two spans per id (full row + fragment), breaking any consumer that
re-opens the store cross-process (e.g. tax-agent's canonical eval OTLP
converter, which then reported 0 spans to the trace analyst).

The runs branch already handled this via appendRun-then-updateRun
fallback. Mirror the same pattern for spans: detect \_update and route
to updateSpan, merging into the prior span. New regression test reads
the dir from a second FileSystemTraceStore instance and asserts the
patch fields are applied without losing kind / runId / name.
@tangletools
Copy link
Copy Markdown
Contributor Author

tangletools commented May 14, 2026

✅ No Blockers — 6f7cf13a

Readiness 85/100 · Confidence 85/100 · 3 findings (3 low)

kimi-code verifier:deepseek aggregate
Readiness 88 85 85
Confidence 98 85 85
Correctness 88 85 85
Security 100 95 95
Testing 95 80 80
Architecture 90 90 90

I read both changed files in full, ran the trace-store test suite (15/15 pass), and traced the callers/callees through InMemoryTraceStore and FileSystemTraceStore. The PR correctly fixes the cross-process span duplication bug by collapsing _update patch rows into their original spans on reload, mirroring the existing run-reload pattern. One minor data-pollution issue prevents a higher score. | The PR correctly fixes duplicate-span-on-reload by collapsing _update patches in the load path. However, the fix passes the raw record (including the _update sentinel) to InMemoryTraceStore.updateSpan at

🟡 LOW Internal _update flag leaks into reloaded Span objects — src/trace/store.ts

At line 280, the reload path passes the raw NDJSON record directly to store.updateSpan(record.spanId, record). Since record includes the internal _update: true field written by FileSystemTraceStore.updateSpan (line 320), that flag gets spread onto the merged Span object by InMemoryTraceStore.updateSpan (line 100). Same-process spans never carry _update because FileSystemTraceStore.updateSpan strips it before touching the in-memory index ([lin

🟡 LOW Same _update flag leak affects Run objects on reload — src/trace/store.ts

The runs reload path at lines 263-268 uses append-fallback-to-update. When an _update: true run row is processed, appendRun fails (duplicate runId), then updateRun at line 268 is called with the raw record containing _update: true. InMemoryTraceStore.updateRun at line 90 spreads it via { ...existing, ...patch } into the stored Run. Fix: destructure _update out of the record before calling store.updateRun, same as the span fix.

🟡 LOW Cross-process reload regression test doesn't assert _update is absent from merged spans — tests/trace-store.test.ts

The test at line 143-171 validates that span fields merge correctly after cross-process reload but never asserts that the internal _update flag is NOT present on the reloaded Span object. This means the _update leak bug (kimi-code#0) would pass CI silently. Fix: add expect((s as Record<string, unknown>)._update).toBeUndefined() after line 171.


tangletools · 2026-05-14T11:56:25Z · trace

Copy link
Copy Markdown
Contributor Author

@tangletools tangletools left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Approved — 3 non-blocking findings — 6f7cf13a

I read both changed files in full, ran the trace-store test suite (15/15 pass), and traced the callers/callees through InMemoryTraceStore and FileSystemTraceStore. The PR correctly fixes the cross-process span duplication bug by collapsing _update patch rows into their original spans on reload, mirroring the existing run-reload pattern. One minor data-pollution issue prevents a higher score. | The

Full findings and scores: review summary


tangletools · 2026-05-14T11:56:25Z · trace

@drewstone drewstone merged commit 5e47a2a into main May 14, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants