Optimize translation costs: default off, per-conversation toggle, batch mode, screen-aware deferral #6837

@beastoin

Description

Translation via the /v4/listen API is the largest translation cost driver. Currently, every user with a primary language set triggers real-time Google Cloud Translation API calls on every conversation, even if they are a monolingual speaker. This issue proposes five changes to cut costs while keeping translation seamless for users who need it.

Current Behavior

  • Default ON: single_language_mode defaults to false. Any user who sets a primary language during onboarding gets auto-translation active on every listen session.
  • Global toggle only: Translation is all-or-nothing via Settings → Language → Automatic Translation. No per-conversation control.
  • Real-time per-segment: TranslationCoordinator runs every segment through language detection → classification → GCP Translation API ($15 per million characters). The monolingual gate skips most calls after 4 consecutive same-language detections, but the coordinator is still instantiated for every session.
  • No screen awareness: Translation runs even when the phone screen is off and the user isn't viewing the transcript.
  • No time limit: Nothing restricts how far back old segments can be translated retroactively.
  • Desktop waste: The desktop app sends the language param to the backend, which translates and persists the result, but the desktop app never displays translations, so those API calls are wasted.
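
To make the monolingual gate described above concrete, here is a minimal sketch of a "skip after N consecutive same-language detections" counter. The class name, fields, and the rule that any different language resets the streak are assumptions for illustration; the real logic lives in backend/utils/translation_cache.py and may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonolingualGate:
    """Hypothetical stand-in for the gate in translation_cache.py."""

    threshold: int = 4  # "4 consecutive same-language detections" from the issue
    last_language: Optional[str] = None
    streak: int = 0

    def observe(self, detected_language: str) -> None:
        # Extend the streak on a repeat; restart it when the language changes.
        if detected_language == self.last_language:
            self.streak += 1
        else:
            self.last_language = detected_language
            self.streak = 1

    @property
    def should_skip(self) -> bool:
        # Once the streak reaches the threshold, translation calls can be skipped.
        return self.streak >= self.threshold
```

Even with a gate like this, the first few segments of every session still pay for detection and the coordinator is still constructed, which is the overhead the proposal targets.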

Expected Behavior

Smart, cost-efficient translation that's seamless when users need it but doesn't burn API calls when they don't.

Affected Areas

| File | Description |
| --- | --- |
| backend/routers/transcribe.py:318-328 | Translation language determination: currently enabled whenever single_language_mode=false and a language preference exists |
| backend/utils/translation_coordinator.py | Real-time coordinator: instantiated every session, even for monolingual users |
| backend/utils/translation.py | GCP Translation API client: _client.translate_text() calls |
| backend/utils/translation_cache.py | Monolingual gate + caching (already good, but gate activates too late) |
| backend/routers/users.py:525-534 | Language preference endpoint: auto-sets single_language_mode based on language support, not user choice |
| app/lib/pages/settings/language_settings_page.dart | Global translation toggle UI |
| app/lib/services/sockets/transcription_service.dart | WebSocket language param |
| app/lib/providers/capture_provider.dart | TranslationEvent handler |
| app/lib/widgets/transcript.dart | Translation display in conversation view |

Solution

1. Default auto-translate OFF

  • Change single_language_mode default to true for new users
  • Existing users with translation enabled keep their setting (no migration needed)
  • Users who want translation explicitly enable it in Settings
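
One way to read "new default without migration" is sketched below: an explicitly stored value always wins, and only the fallback differs between new and existing accounts. The function name and the is_new_user flag are hypothetical; how the backend would tell new accounts from existing ones is not specified in this issue.

```python
from typing import Optional

LEGACY_DEFAULT = False   # current behavior: auto-translation on by default
NEW_USER_DEFAULT = True  # proposed behavior: auto-translation off by default


def resolve_single_language_mode(stored: Optional[bool], is_new_user: bool) -> bool:
    """Return the effective single_language_mode for a user.

    `stored` is the persisted preference, or None if the user never set one.
    `is_new_user` is an assumed flag (for example, account created after rollout).
    """
    if stored is not None:
        return stored  # an explicit choice is never overridden
    return NEW_USER_DEFAULT if is_new_user else LEGACY_DEFAULT
```

If the backend instead applies a single default to everyone with no stored value, existing users who never touched the setting would be switched off too, so the choice here determines who is affected by the change.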

2. Per-conversation toggle in live conversation UI

  • Add a translate button/toggle in the live conversation capturing screen
  • When the user turns it on mid-conversation, the backend starts translating from that point
  • Once enabled for a conversation, keep auto-translate on for that conversation permanently (sticky per-conversation)
  • Settings → Language still has the global option to turn it off entirely
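
The sticky per-conversation behavior could be modeled roughly as follows. This is a sketch under assumptions: the class, its fields, and the idea of recording the segment index at which translation was enabled are illustrative and not taken from the codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversationTranslation:
    """Hypothetical per-conversation translation state."""

    globally_disabled: bool = False           # Settings → Language hard off
    enabled_from_index: Optional[int] = None  # None means never enabled

    def enable(self, current_segment_index: int) -> None:
        # Sticky: the first enable wins, and there is deliberately no disable().
        if self.enabled_from_index is None:
            self.enabled_from_index = current_segment_index

    def should_translate(self, segment_index: int) -> bool:
        # The global setting overrides any per-conversation choice.
        if self.globally_disabled or self.enabled_from_index is None:
            return False
        return segment_index >= self.enabled_from_index
```

Segments before enabled_from_index are left to a separate backfill step rather than the live path, which keeps the real-time code unaware of history.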

3. Batch translation for on-demand activation

  • When the user enables translation mid-conversation, batch-translate existing segments (only segments created in the last 24h)
  • Use translate_units_batch() (already exists) for efficient batching instead of per-segment real-time calls
  • Cap retroactive translation to 24h window to prevent unbounded cost on long histories
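
A rough sketch of the 24h-capped backfill follows. The Segment shape and the translate_batch callable are assumptions: translate_batch stands in for translate_units_batch(), whose real signature is not shown in this issue.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List, Sequence

RETROACTIVE_WINDOW = timedelta(hours=24)  # cap from the proposal


@dataclass(frozen=True)
class Segment:
    """Hypothetical minimal segment record."""

    id: str
    text: str
    created_at: datetime


def backfill_translations(
    segments: Sequence[Segment],
    now: datetime,
    translate_batch: Callable[[List[str]], List[str]],  # assumed batch interface
) -> Dict[str, str]:
    """Translate only segments inside the window, using a single batch call."""
    cutoff = now - RETROACTIVE_WINDOW
    eligible = [s for s in segments if s.created_at >= cutoff]
    if not eligible:
        return {}  # nothing recent enough: no API call at all
    translated = translate_batch([s.text for s in eligible])
    return {s.id: t for s, t in zip(eligible, translated)}
```

Passing the batch function in as a parameter keeps the window logic testable without touching the GCP client.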

4. Screen-aware deferral

  • Track screen on/off state from the mobile app (send as WebSocket metadata or periodic signal)
  • When screen is off: defer translation, accumulate segments
  • When screen turns on: batch-translate accumulated segments in one API call
  • Net effect: same UX (translations appear when user looks), far fewer API calls (1 batch vs N real-time)
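
The deferral described above can be sketched as a small buffer that either translates immediately or accumulates until the screen comes back on. All names are hypothetical, the batch callable is the same assumed stand-in for translate_units_batch(), and starting with the screen on is an arbitrary choice.

```python
from typing import Callable, List


class ScreenAwareTranslator:
    """Hypothetical buffer: translate live while visible, defer while hidden."""

    def __init__(self, translate_batch: Callable[[List[str]], List[str]]) -> None:
        self._translate_batch = translate_batch
        self._pending: List[str] = []
        self._screen_on = True  # assumption: sessions start with the screen on

    def on_segment(self, text: str) -> List[str]:
        # Screen on: translate right away. Screen off: queue for later.
        if self._screen_on:
            return self._translate_batch([text])
        self._pending.append(text)
        return []

    def on_screen_state(self, screen_on: bool) -> List[str]:
        # Called when the app reports a screen on/off change.
        self._screen_on = screen_on
        if screen_on and self._pending:
            batch, self._pending = self._pending, []
            return self._translate_batch(batch)  # one call for all deferred text
        return []
```

A real implementation would likely also need a cap on the pending buffer and a flush when the conversation ends, which this sketch leaves out.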

5. Desktop: skip translation entirely

  • The desktop app doesn't display translations, so it should not request them
  • Either: desktop sends single_language_mode=true override in WebSocket params, or backend checks source=desktop and skips translation
  • Remove unused google-cloud-translate from pusher/requirements.txt (dead dependency)
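
The backend-side option (check source=desktop and skip) might look like the sketch below. The function name and parameters are illustrative; only the source=desktop check and the existing "single_language_mode=false and language preference exists" rule come from this issue.

```python
from typing import Optional


def translation_language_for_session(
    language: Optional[str],
    single_language_mode: bool,
    source: Optional[str],
) -> Optional[str]:
    """Return the target language, or None when translation should be skipped."""
    if source == "desktop":
        return None  # desktop never renders translations
    if single_language_mode or not language:
        return None  # translation off, or no language preference set
    return language
```

Returning None early would also let the caller avoid constructing the coordinator for that session.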

Impact

  • Cost: Major reduction in GCP Translation API spend — most users are monolingual and currently trigger the coordinator unnecessarily
  • UX: No degradation — users who need translation get it on-demand with the same seamless experience
  • Risk: If the new default also reaches existing users (for example, accounts with no explicitly stored single_language_mode value), those who rely on always-on translation will need to re-enable it once

by AI for @beastoin

Metadata

Labels: p3 (Priority: Backlog, score <14)
