Unclear voice cloning instructions / documentation #9

Hey, I tested Moss-TTS-Nano, mainly the voice cloning part.

What I can't work out from the docs:
- whether a transcription of the source audio can be passed along somewhere (it seems not?)
- what the expected source audio length is. I tried 2s, 3s, 6s, 10s, and 30s clips; only the 3s clip gave somewhat decent results, and all the others produced garbage.
- how to cache a voice profile so it can be reused across multiple generations.

Thanks!

