Issues with Hindi Language Voice Cloning in VoxCPM

When using the VoxCPM2 model for voice cloning on an Ubuntu GPU server, I encountered a few issues while testing Hindi voice generation.

**1. Hindi generation using English reference audio**
When I provide Hindi input text with an English reference audio sample (e.g., a speech by Barack Obama), the model does not produce usable Hindi speech: the output is either unintelligible or contains no meaningful audio at all.

**2. Extra word/artifact with Hindi reference audio**
When I use a Hindi reference audio sample to generate Hindi speech, the output consistently begins with an unexpected extra word or artifact. This is reproducible across multiple test samples.
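To help pin down where the artifact sits, a small helper like the one below could report the signal energy over the first second of a generated clip. This is only a sketch using the standard library: the names `leading_rms_profile` and `read_wav_mono16`, the 50 ms window, and the one-second limit are my own choices for illustration and are not part of VoxCPM. It also assumes the output file is 16-bit mono PCM, which I believe is what `soundfile` writes for `.wav` by default.

```python
import math
import struct
import wave


def leading_rms_profile(samples, sample_rate, window_ms=50, max_seconds=1.0):
    """Return (start_time_s, rms) pairs for consecutive windows at the start of a clip."""
    window = max(1, int(sample_rate * window_ms / 1000))
    limit = min(len(samples), int(sample_rate * max_seconds))
    profile = []
    for start in range(0, limit, window):
        chunk = samples[start:min(start + window, limit)]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        profile.append((start / sample_rate, rms))
    return profile


def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file into floats in [-1, 1) plus its sample rate."""
    with wave.open(path, "rb") as f:
        if f.getsampwidth() != 2 or f.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        rate = f.getframerate()
        frames = f.readframes(f.getnframes())
    ints = struct.unpack("<%dh" % (len(frames) // 2), frames)
    return [v / 32768.0 for v in ints], rate


# Example (assumes a generated file named output_hindi.wav exists):
# samples, rate = read_wav_mono16("output_hindi.wav")
# for t, rms in leading_rms_profile(samples, rate):
#     print(f"{t:.2f}s  rms={rms:.4f}")
```

A burst of energy in the first few windows, followed by a dip before the expected sentence starts, would suggest the extra word is a separate leading segment rather than a distortion of the first real word.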

**3. Hi-Fi cloning observation**
I also tested the Hi-Fi cloning setup (with prompt audio and transcript). The issues above still persist, especially the initial artifact in Hindi output.

**Minimal reproducible example:**

```python
# -*- coding: utf-8 -*-

from voxcpm import VoxCPM
import soundfile as sf

# Load model
model = VoxCPM.from_pretrained(
    "openbmb/VoxCPM2",
    load_denoiser=False
)

# INPUTS
text = "हम बैकग्राउंड म्यूज़िक के साथ आवाज़ की स्पष्टता की जाँच कर रहे हैं।"
reference_audio = "Obama_reference.wav"

# Generate cloned speech
wav = model.generate(
    text=text,
    reference_wav_path=reference_audio,
    cfg_value=2.0,
    inference_timesteps=10,
)

# Save output
sf.write("output_hindi.wav", wav, model.tts_model.sample_rate)

print("Output saved as output_hindi.wav")
```
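To check whether the behavior depends on the sampling settings, the call above could be wrapped in a small sweep over `cfg_value` and `inference_timesteps`. The sketch below is hypothetical: `sweep_generation` is a helper name I made up, and the grid values are arbitrary starting points rather than recommended settings. It takes the generate function as an argument, so it only assumes that function accepts the same keyword arguments used in the example above.

```python
from itertools import product


def sweep_generation(generate_fn, text, reference_wav_path,
                     cfg_values=(1.5, 2.0, 3.0), timesteps=(10, 20)):
    """Call generate_fn once per (cfg_value, inference_timesteps) pair.

    generate_fn is expected to accept the same keyword arguments as the
    model.generate call in the example above. Returns a dict keyed by
    (cfg_value, inference_timesteps).
    """
    results = {}
    for cfg, steps in product(cfg_values, timesteps):
        results[(cfg, steps)] = generate_fn(
            text=text,
            reference_wav_path=reference_wav_path,
            cfg_value=cfg,
            inference_timesteps=steps,
        )
    return results


# Usage with the model above (commented out; needs voxcpm and a GPU):
# outputs = sweep_generation(model.generate, text, reference_audio)
# for (cfg, steps), wav in outputs.items():
#     name = f"output_hindi_cfg{cfg}_steps{steps}.wav"
#     sf.write(name, wav, model.tts_model.sample_rate)
```

Listening to the resulting files side by side should show whether the unintelligible output or the leading artifact changes with these two parameters or is independent of them.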

**Environment details:**

* OS: Ubuntu
* GPU: H100
* Python: 3.10
* Installation: Cloned from the official VoxCPM repository

**Questions:**

1. Is cross-language voice cloning (e.g., English reference → Hindi output) currently supported or recommended?
2. What could be causing the extra word or artifact at the beginning of Hindi outputs?
3. Are there any best practices or parameter tuning recommendations for Hindi or multilingual voice cloning?

Any guidance would be greatly appreciated. Thank you!


Issues with Hindi Language Voice Cloning in VoxCPM #288
