[Together] Default text-to-speech voice #2184

Merged: hanouticelina merged 1 commit into main from together-tts-default-voice on May 21, 2026

@hanouticelina hanouticelina commented May 21, 2026

Comment from @hanouticelina: this PR should fix the text-to-speech widget for Together.

Summary

Together's /v1/audio/speech requires a voice field, but the SDK didn't set one — so any TTS call that didn't pass parameters.voice failed with HTTP 400 "voice is required". This also caused the periodic HF mapping validator to flip hexgrad/Kokoro-82M to status: "error".

Defaults voice to af_alloy (a valid Kokoro voice) only when the target model is Kokoro — the only TTS model currently registered for Together. User-supplied parameters always override. Other model families (Orpheus, Cartesia, …) get no default and continue to surface Together's clear "voice is required" error if the caller omits one.

Behavior matrix

| Call | Before | After |
| --- | --- | --- |
| Kokoro, no voice | 400 "voice is required" | `voice: "af_alloy"` |
| Kokoro, `parameters: { voice: undefined }` | 400 | `voice: "af_alloy"` |
| Kokoro, `parameters: { voice: "af_bella" }` | `af_bella` | `af_bella` ✓ (user override) |
| Future non-Kokoro, no voice | 400 | 400 "voice is required" (no wrong default) |
| Future non-Kokoro, `parameters: { voice: "tara" }` | `tara` | `tara` |
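
The behavior matrix above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the SDK's actual code: `buildSpeechBody`, `isKokoro`, `KOKORO_DEFAULT_VOICE`, and the two interfaces are hypothetical names, matching Kokoro by a substring of the model ID is an assumption, and the `model`/`input` body fields are assumed from the OpenAI-style shape of `/v1/audio/speech`.

```typescript
// Hypothetical sketch: these names are illustrative, not the SDK's internals.
const KOKORO_DEFAULT_VOICE = "af_alloy";

interface SpeechParameters {
  voice?: string | undefined;
  [key: string]: unknown;
}

interface SpeechBody {
  model: string;
  input: string;
  voice?: string | undefined;
  [key: string]: unknown;
}

// Assumption: a Kokoro model is recognized by "kokoro" appearing in its model ID.
function isKokoro(model: string): boolean {
  return model.toLowerCase().includes("kokoro");
}

function buildSpeechBody(
  model: string,
  input: string,
  parameters?: SpeechParameters,
): SpeechBody {
  const params: SpeechParameters = parameters ?? {};
  const { voice, ...rest } = params;

  // `??` treats an explicit `voice: undefined` the same as an omitted voice,
  // so the default cannot be clobbered by an undefined user value.
  const resolvedVoice = voice ?? (isKokoro(model) ? KOKORO_DEFAULT_VOICE : undefined);

  const body: SpeechBody = { ...rest, model, input };
  if (resolvedVoice !== undefined) {
    // Non-Kokoro models with no voice skip this branch, so the provider's
    // own "voice is required" error still reaches the caller.
    body.voice = resolvedVoice;
  }
  return body;
}
```

Resolving the voice with `??` rather than spreading `parameters` over a default object is what keeps the `parameters: { voice: undefined }` row working: a plain spread would copy the `undefined` over the default and reproduce the 400.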

Test plan

  • pnpm --filter @huggingface/inference run check (tsc) — clean
  • pnpm --filter @huggingface/inference run lint:check — clean
  • pnpm --filter @huggingface/inference run format — clean
  • Live against api.together.xyz with hexgrad/Kokoro-82M:
    • no parameters → 108.9 KB WAV (default af_alloy)
    • parameters: {} → 108.5 KB WAV (default af_alloy)
    • parameters: { voice: undefined } → 108.6 KB WAV (default af_alloy)
    • parameters: { voice: "af_bella" } → 92.9 KB WAV (user override)
  • Mock-fetch on a synthetic non-Kokoro model:
    • no voice → body.voice absent (no wrong default)
    • voice: "tara" → body.voice: "tara" (user value passes through)
  • Reproduced the failure on main for direct comparison
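
The mock-fetch checks in the test plan can be approximated with a recording fetch that captures the request body instead of calling the network. The sketch below is an assumption-laden stand-in, not the PR's test code: `requestSpeech`, `createRecordingFetch`, `runMockChecks`, the `FetchLike` type, and the model ID `example-org/synthetic-tts` are all hypothetical, and the mock always answers 200 where the real endpoint would return 400 for a missing voice.

```typescript
// Minimal fetch-like signature so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number }>;

// Hypothetical stand-in for the SDK's Together text-to-speech request path.
async function requestSpeech(
  fetchImpl: FetchLike,
  model: string,
  input: string,
  voice?: string,
): Promise<number> {
  // Assumption: Kokoro is detected by "kokoro" appearing in the model ID.
  const resolved = voice ?? (model.toLowerCase().includes("kokoro") ? "af_alloy" : undefined);
  const body: Record<string, unknown> = { model, input };
  if (resolved !== undefined) {
    body["voice"] = resolved;
  }
  const res = await fetchImpl("https://api.together.xyz/v1/audio/speech", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return res.status;
}

// Mock fetch that records each parsed request body and never touches the network.
function createRecordingFetch(): { fetchImpl: FetchLike; bodies: Record<string, unknown>[] } {
  const bodies: Record<string, unknown>[] = [];
  const fetchImpl: FetchLike = async (_url, init) => {
    bodies.push(JSON.parse(init.body) as Record<string, unknown>);
    return { ok: true, status: 200 };
  };
  return { fetchImpl, bodies };
}

// Sends one request without a voice and one with "tara" for a synthetic non-Kokoro model.
async function runMockChecks(): Promise<Record<string, unknown>[]> {
  const { fetchImpl, bodies } = createRecordingFetch();
  await requestSpeech(fetchImpl, "example-org/synthetic-tts", "hello");
  await requestSpeech(fetchImpl, "example-org/synthetic-tts", "hello", "tara");
  return bodies;
}
```

Inspecting the recorded bodies, the first should have no `voice` key and the second should carry `voice: "tara"`, which mirrors the two mock-fetch bullets above.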

@hanouticelina hanouticelina changed the title from "[Together] Default text-to-speech voice to af_alloy for Kokoro" to "[Together] Default text-to-speech voice" May 21, 2026
@hanouticelina hanouticelina merged commit da1107d into main May 21, 2026
6 checks passed
@hanouticelina hanouticelina deleted the together-tts-default-voice branch May 21, 2026 13:15