[Feature Request] Explicit pt-BR (Brazilian Portuguese) language tag, separate from pt-PT #167

Description

@jonathansousa87

Problem

Currently, Brazilian Portuguese and European Portuguese are merged under a
single pt language code. This makes the dialect of the output unpredictable
at inference time: even with a strongly Brazilian Portuguese reference audio,
the model sometimes generates European Portuguese phonetics instead.

The two variants are phonetically distinct in ways that are immediately
noticeable to native speakers:

  • Palatalization of /t/ and /d/ before /i/ (e.g., "tia" → [tʃia] in BR,
    [tia] in PT)
  • Pre-consonant /r/ realization (often guttural in BR, tapped in PT)
  • Open vowels and vocalic rhythm (fuller in BR, reduced in PT)
  • Syllable-final /l/ → [w] vocalisation (BR only)
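
To make two of these differences concrete, here is a toy sketch (not part of MOSS-TTS, and not a real phonemizer) that applies the BR-only palatalization and /l/-vocalisation rules to a broad phonemic string. The function name `br_surface_form` and the reduced vowel set are assumptions made purely for illustration.

```python
import re

# Simplified oral vowel inventory (assumption; nasal vowels are ignored).
VOWELS = "aeiouɛɔɐ"

def br_surface_form(phonemic: str) -> str:
    """Apply two Brazilian-only rules to a broad phonemic string."""
    # /t/ -> [tʃ] and /d/ -> [dʒ] when the next segment is /i/.
    out = re.sub(r"t(?=i)", "tʃ", phonemic)
    out = re.sub(r"d(?=i)", "dʒ", out)
    # Syllable-final /l/ (an /l/ not followed by a vowel) -> [w].
    out = re.sub(rf"l(?![{VOWELS}])", "w", out)
    return out

print(br_surface_form("tia"))   # tʃia
print(br_surface_form("alto"))  # awto
```

A European Portuguese rendering would leave both inputs without these changes, which is why a mixed-dialect output is so easy for native speakers to spot.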

Use Case

I am building a video dubbing pipeline (English → Brazilian Portuguese)
where consistent pt-BR output per segment is a hard requirement.
The current ambiguity makes the model unreliable for this use case
without additional post-processing workarounds.

Brazil has 200M+ Portuguese speakers and is the largest Portuguese-speaking
market in the world. A dedicated pt-BR tag would make MOSS-TTS significantly
more useful for a large and underserved developer audience.

Request

For MOSS-TTS 2.0, please consider:

  1. Separate language codes: pt-BR for Brazilian Portuguese and pt-PT
    for European Portuguese, following the BCP 47 convention (an ISO 639-1
    language code plus an ISO 3166-1 region code) already used by most
    TTS systems.
  2. Dedicated training data per variant: ensuring the model has sufficient
    pt-BR speech data so the dialect is reliably reproduced without depending
    solely on reference audio to infer the variant.
  3. Reference audio + language tag combination: when both are provided,
    the explicit tag should take precedence over the dialect inferred from
    the audio.
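
The precedence requested in item 3 could look roughly like the sketch below. This is only an illustration of the desired behaviour, not MOSS-TTS code: `normalize_tag`, `resolve_variant`, and the idea that the model exposes a dialect inferred from the reference audio are all hypothetical.

```python
from typing import Optional

# Region-qualified tags this sketch accepts (assumption for illustration).
SUPPORTED = {"pt-BR", "pt-PT"}

def normalize_tag(tag: str) -> str:
    """Normalize inputs such as 'pt_br' or 'PT-br' to the form 'pt-BR'."""
    parts = tag.replace("_", "-").split("-")
    lang = parts[0].lower()
    if len(parts) == 1:
        return lang  # bare language code, e.g. 'pt'
    return f"{lang}-{parts[1].upper()}"

def resolve_variant(explicit_tag: Optional[str],
                    inferred_tag: Optional[str]) -> str:
    """Pick the dialect: the explicit tag wins over the audio-inferred one."""
    for candidate in (explicit_tag, inferred_tag):
        if candidate is None:
            continue
        tag = normalize_tag(candidate)
        if tag in SUPPORTED:
            return tag
        # A bare 'pt' carries no region, so fall through to the next source.
    raise ValueError("no region-qualified Portuguese variant available")

# Explicit tag overrides what the reference audio suggests.
print(resolve_variant("pt-BR", "pt-PT"))  # pt-BR
# A bare 'pt' defers to the dialect inferred from the audio.
print(resolve_variant("pt", "pt-PT"))     # pt-PT
```

Raising an error when neither source names a region is one possible design choice; falling back to a documented default variant would work as well.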

Current Workaround (insufficient)

Using a reference audio with strong BR phonetics helps but does not guarantee
consistent output. The model still occasionally falls back to European
Portuguese, which is not acceptable for production dubbing workflows.

Thank you for the great work on MOSS-TTS — the token-level duration control
and Apache 2.0 license make it the most promising open-source TTS for this
use case. Looking forward to 2.0!
