
Short version: there's no language or accent flag you can pass per request. The TTS request in koboldcpp only carries the text, a speaker seed, an audio seed, the speaker voice/instruction, and an optional reference audio clip. There is no language field. (The language/langcode param you may have seen is for Whisper transcription, not TTS.)
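To make the shape of that request concrete, here is a minimal sketch of a payload builder. The key names (`text`, `speaker_seed`, `audio_seed`, `voice`, `reference_audio`) and the function `build_tts_payload` are hypothetical, chosen only to mirror the fields listed above; they are not koboldcpp's actual JSON keys, so check the API documentation for the real names before sending anything.

```python
def build_tts_payload(text, voice, speaker_seed=0, audio_seed=0,
                      reference_audio=None, **extra):
    """Build a TTS request body from the fields described above.

    All key names here are illustrative assumptions, not koboldcpp's real keys.
    """
    # Anything outside the listed fields (for example a language code)
    # has no slot in the request, so reject it instead of silently dropping it.
    if extra:
        raise ValueError(f"unsupported TTS fields: {sorted(extra)}")

    payload = {
        "text": text,                  # the text to synthesize
        "speaker_seed": speaker_seed,  # seed for the speaker identity
        "audio_seed": audio_seed,      # seed for the audio sampling
        "voice": voice,                # speaker voice or instruction
    }
    # The reference clip is optional, so only include it when supplied.
    if reference_audio is not None:
        payload["reference_audio"] = reference_audio
    return payload


if __name__ == "__main__":
    print(build_tts_payload("Bonjour tout le monde", voice="narrator"))
```

Calling `build_tts_payload("Bonjour", voice="narrator", language="fr")` raises `ValueError` in this sketch, which mirrors the point above: a language hint has nowhere to go in the TTS request.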

The English-accent leak you're hitting is a model-level thing, not a missing koboldcpp knob. Qwen3-TTS tends to fall back to its dominant English distribution unless it's conditioned on a real example of the target language, which is why you sometimes get the right accent only after a few retries. Upstream has the same report: QwenLM/Qwen3-TTS#134 ("Wh…

Answer selected by LostRuins