fix(stt): track transcript timing provenance #5285

Merged: ComputelessComputer merged 1 commit into main from fix/transcript-timing-provenance on May 20, 2026
Conversation

ComputelessComputer (Collaborator) commented May 20, 2026

Preserve provider word timing metadata through batch persistence and transcript rendering, request OpenAI Whisper word timestamps, and keep transcript-only synthetic timings from driving seek clicks.


Note

Medium Risk
Touches STT ingestion/persistence and transcript rendering behavior, including provider adapter output and click-to-seek gating; mistakes could impact transcript playback seeking or metadata compatibility across stored transcripts.

Overview
Adds per-word timing provenance via new ~/stt/timing helpers and threads this metadata through batch ingestion (transformWordEntries/batch listeners), Tinybase persistence (useRunBatch), and transcript rendering types (SegmentWord).
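As a rough illustration of what recording per-word provenance at ingestion time could look like, here is a minimal sketch. It is not the PR's code: the `ProviderWord` and `WordEntry` shapes, the `toWordEntries` function, and the `provider_word` and `segment_interpolated` values are assumptions made for the example; only `timing_source` and `synthetic_text` are named in this description.

```typescript
// Hypothetical provenance values; only "synthetic_text" is named in the PR text.
type TimingSource = "provider_word" | "segment_interpolated" | "synthetic_text";

// Assumed shape of a word as returned by a batch provider (seconds).
interface ProviderWord {
  text: string;
  start?: number;
  end?: number;
}

// Assumed shape of a word as persisted (milliseconds plus provenance).
interface WordEntry {
  text: string;
  start_ms: number;
  end_ms: number;
  timing_source: TimingSource;
}

// Convert provider words into stored entries, recording where each timing came from.
// Words without provider timings get zeroed timings and the caller-supplied fallback source.
function toWordEntries(words: ProviderWord[], fallback: TimingSource): WordEntry[] {
  return words.map((word) => {
    const { start, end } = word;
    if (typeof start === "number" && typeof end === "number") {
      return {
        text: word.text,
        start_ms: Math.round(start * 1000),
        end_ms: Math.round(end * 1000),
        timing_source: "provider_word",
      };
    }
    return { text: word.text, start_ms: 0, end_ms: 0, timing_source: fallback };
  });
}
```

Keeping the source next to the timing values means later stages (persistence, rendering) can carry it along without re-deriving it.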

Batch providers now emit or propagate a timing_source: OpenAI requests word timestamps for Whisper; OpenAI text-only and streaming results no longer synthesize words and are marked synthetic_text; Mistral marks segment-interpolated timings. The UI disables word click-to-seek and hover affordances for non-seekable (synthetic_text) words, and segment stability comparisons now account for timing source.
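The seek gating and stability comparison described above might be sketched as follows. This is an illustration under assumptions, not the merged implementation: the fields on `SegmentWord`, the helper names (`isSeekable`, `seekTargetMs`, `sameWords`), and the choice to treat a missing `timing_source` as seekable are all hypothetical.

```typescript
// Hypothetical provenance values; only "synthetic_text" is named in the PR text.
type TimingSource = "provider_word" | "segment_interpolated" | "synthetic_text";

// Assumed fields; the real SegmentWord type may differ.
interface SegmentWord {
  text: string;
  start_ms: number;
  end_ms: number;
  timing_source?: TimingSource; // assumed absent on previously stored transcripts
}

// A word is seekable unless its timing was synthesized from text alone.
function isSeekable(word: SegmentWord): boolean {
  return word.timing_source !== "synthetic_text";
}

// Returns the playback position for a click, or null when the click should be ignored.
function seekTargetMs(word: SegmentWord): number | null {
  return isSeekable(word) ? word.start_ms : null;
}

// Stability check that treats a change in timing source as a real change,
// even when text and timings are otherwise identical.
function sameWords(a: SegmentWord[], b: SegmentWord[]): boolean {
  if (a.length !== b.length) return false;
  return a.every((word, i) => {
    const other = b[i];
    return (
      other !== undefined &&
      word.text === other.text &&
      word.start_ms === other.start_ms &&
      word.end_ms === other.end_ms &&
      word.timing_source === other.timing_source
    );
  });
}
```

A click handler could call `seekTargetMs` and return early on `null`, and the same `isSeekable` result could decide whether hover styling is applied.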

Rust-rendered transcript segments now reattach the original word metadata after rendering (render-transcript.ts), with updated tests covering the new metadata expectations.
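One way to picture the reattachment step is a lookup that copies each original word's metadata back onto the rendered word. The sketch below assumes words carry a stable `id` that survives rendering and that metadata is a plain object; both are assumptions for illustration, and the actual matching logic in render-transcript.ts is not shown in this description.

```typescript
// Assumed shape of a word returned by the renderer (metadata dropped).
interface RenderedWord {
  id: string;
  text: string;
  start_ms: number;
  end_ms: number;
}

// Assumed shape of a word as stored before rendering.
interface StoredWord extends RenderedWord {
  metadata?: Record<string, unknown>;
}

// Copy metadata from the original words back onto rendered words, matched by id.
// Rendered words with no matching original are returned without metadata.
function reattachMetadata(rendered: RenderedWord[], originals: StoredWord[]): StoredWord[] {
  const metadataById = new Map<string, Record<string, unknown>>();
  for (const word of originals) {
    if (word.metadata !== undefined) {
      metadataById.set(word.id, word.metadata);
    }
  }
  return rendered.map((word) => {
    const metadata = metadataById.get(word.id);
    return metadata === undefined ? { ...word } : { ...word, metadata };
  });
}
```

Matching by id rather than by array position keeps the result correct if the renderer merges, drops, or reorders words, at the cost of requiring ids to round-trip.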

Reviewed by Cursor Bugbot for commit 96c6d7f. Bugbot is set up for automated code reviews on this repo.

netlify Bot commented May 20, 2026

Deploy Preview for old-char canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 96c6d7f |
| 🔍 Latest deploy log | https://app.netlify.com/projects/old-char/deploys/6a0d3fc83022030008a89f80 |

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 1c4fdf6.

Comment thread apps/desktop/src/store/zustand/listener/batch.ts
ComputelessComputer force-pushed the fix/transcript-timing-provenance branch from 1c4fdf6 to 96c6d7f on May 20, 2026 04:59
ComputelessComputer merged commit a9c41c0 into main on May 20, 2026
12 checks passed
ComputelessComputer deleted the fix/transcript-timing-provenance branch on May 20, 2026 05:32