Skip to content

fix(tts): unify voice resolution — dashboard /tts settings now affect the tts tool#1175

Open
vanducng wants to merge 1 commit into
nextlevelbuilder:devfrom
dataplanelabs:upstream-fix/tts-system-configs-fallback
Open

fix(tts): unify voice resolution — dashboard /tts settings now affect the tts tool#1175
vanducng wants to merge 1 commit into
nextlevelbuilder:devfrom
dataplanelabs:upstream-fix/tts-system-configs-fallback

Conversation

@vanducng
Copy link
Copy Markdown
Contributor

Real-world bug

A user configured Vietnamese voice "HoaiMy" (`vi-VN-HoaiMyNeural`) on the dashboard `/tts` page. The Test Playground worked. But when the LLM invoked the `tts` tool, Edge synthesized English instead.

Root cause — two TTS storage locations

Location Read by Written by
`system_configs[tts..voice]` Test Playground, auto-apply at construction-time Dashboard `/tts` page
`builtin_tool_tenant_configs[tts].default_voice_id` `tts` tool (LLM-invoked) (not exposed in any UI)

When the LLM calls `tts` without an explicit `voice` arg, the resolution chain (`args > agent OtherConfig > tenant builtin`) finds no override → empty voice → Edge falls back to its built-in default (English).

Fix

Add `system_configs` as a 4th-level fallback so the dashboard is the single source of truth:

```
args > agent OtherConfig > tenant builtin > system_configs[tts..voice]
```

  • New `TtsTool.SetSystemConfigStore(s)` setter
  • `resolveVoiceAndModel` now takes `providerName` and reads the right key
  • Effective provider resolved BEFORE voice/model so the right lookup key is used
  • Wired at gateway boot in `cmd/gateway.go` after `pgStores` is ready

Tests

4 new unit tests in `internal/tools/tts_systemconfigs_fallback_test.go`:

  • `TestResolveVoiceAndModel_SystemConfigsFallback` — voice + model resolve from system_configs
  • `TestResolveVoiceAndModel_ArgWinsOverSystemConfigs` — explicit arg precedence preserved
  • `TestResolveVoiceAndModel_NoStoreNoFallback` — graceful when store unwired
  • `TestResolveVoiceAndModel_EmptyProviderSkipsFallback` — no key lookup without provider

All existing `tts` tests still pass.

Verification

  • `go build ./...` (PG) clean
  • `go vet ./...` clean
  • `go test ./internal/tools/...` green
  • Already running on the fork's prod cluster (v3.23.43) — verified TTS tool now uses dashboard-configured voice.

Cherry-picked from fork's PR dataplanelabs#172.

… the tts tool (#172)

The dashboard /tts page writes to system_configs[tts.<provider>.voice],
but the LLM-invoked tts tool was only checking args > agent OtherConfig
> builtin_tool_tenant_configs[tts].default_voice_id. Two different
storage locations → the user's chosen voice was ignored, Edge defaulted
to en-US-AriaNeural even when "HoaiMy" (vi-VN-HoaiMyNeural) was set
correctly in the dashboard.

Add system_configs as a 4th-level fallback so the dashboard becomes
the single source of truth.

- TtsTool gains SetSystemConfigStore(s) setter
- resolveVoiceAndModel takes providerName + looks up
  tts.<provider>.voice/model when no higher-precedence source set
- effectiveProvider resolved BEFORE voice/model so the right key is hit
- Wired at gateway boot in cmd/gateway.go (after pgStores ready)
- 4 unit tests covering: fallback, arg precedence, no-store, empty provider

Verified against production trace 019e6036-44bb-703b-85fa-dee34f7ab2c0
where the tts tool was called with provider=edge, no voice arg, and
defaulted to English instead of the configured Vietnamese voice.
@mrgoonie
Copy link
Copy Markdown
Contributor

Backlog review: merge-candidate. The PR fixes the dashboard-vs-tool TTS voice/model drift with explicit precedence and regression coverage for system_configs fallback. go and web checks are passing, and the scope is focused enough for merge review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants