Overview

Built for PiperTTS (custom model training) with a simple file structure:

dataset.zip/
├── metadata.csv
└── wavs/
    ├── <name>_processed<index>.wav
    └── ...

Works with LJSpeech-compatible TTS engines too!

metadata.csv looks like:

wav_filename|transcript
wavs/<name>_processed<index>.wav|<transcript>
...
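
If you want to sanity-check a metadata.csv by hand, here is a minimal sketch of a parser for the layout shown above. It is not part of this repo: `parse_metadata` and `HEADER` are made-up names, and it assumes the `wav_filename|transcript` line may appear as a header row and that the first `|` on each line separates the path from the transcript.

```python
from typing import List, Tuple

# Assumption: the first line shown above may be present as a header row.
HEADER = "wav_filename|transcript"


def parse_metadata(text: str) -> List[Tuple[str, str]]:
    """Split pipe-delimited metadata into (wav_path, transcript) pairs."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line == HEADER:
            continue  # skip blank lines and the optional header
        # Split on the first "|" only, so a transcript may contain "|".
        wav_path, sep, transcript = line.partition("|")
        if not sep or not wav_path or not transcript:
            raise ValueError(f"line {line_no}: expected '<wav path>|<transcript>'")
        rows.append((wav_path, transcript))
    return rows
```

Feed it the file contents, e.g. `parse_metadata(open("metadata.csv", encoding="utf-8").read())`, and it raises on the first malformed line.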

Setup

Tested on Ubuntu Server 22.04 (Python 3.10.12) and Ubuntu Desktop 24.04 (Python 3.12.3). Should run fine on Debian-based systems.
No (official) Windows support, sorry!

Clone it:

git clone https://github.com/DominicTWHV/LJSpeech_Dataset_Generator.git

Make the pipeline script executable:

cd LJSpeech_Dataset_Generator
chmod +x pipeline.sh

Run it:
```
./pipeline.sh
```
Then hop onto the Gradio WebUI on port 7860. The server listens on 0.0.0.0:7860 by default, so it is reachable from other machines on your network. For local use, connect at http://127.0.0.1:7860/.
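
Not sure whether the WebUI actually came up? A quick sketch like the one below checks whether anything accepts TCP connections on the port. `webui_is_up` is a hypothetical helper, not something shipped in this repo, and it only tells you the port is open, not that Gradio is the thing answering.

```python
import socket


def webui_is_up(host: str = "127.0.0.1", port: int = 7860, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Call `webui_is_up()` after starting `./pipeline.sh`; a `False` result usually means the server has not finished starting or is bound to a different port.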

Post-Processing

Move dataset.zip to your training directory and unzip it there:

mv /output/dataset.zip /path/to/training/dir
cd /path/to/training/dir
unzip dataset.zip

Or just download it via the WebUI.
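
Before kicking off training, it can help to confirm the archive matches the layout from the Overview. The sketch below is one way to do that, assuming metadata.csv sits at the top level of the zip and that its first column holds paths relative to the archive root. `missing_wavs` is a hypothetical helper, not part of this repo.

```python
import zipfile


def missing_wavs(zip_path: str) -> list[str]:
    """List wav paths named in metadata.csv that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
        if "metadata.csv" not in names:
            raise FileNotFoundError("metadata.csv not found at the top level of the archive")
        text = archive.read("metadata.csv").decode("utf-8")

    missing = []
    for line in text.splitlines():
        # The first pipe-delimited field is the wav path; header rows are skipped
        # because they do not end in ".wav".
        wav_path = line.partition("|")[0].strip()
        if wav_path.endswith(".wav") and wav_path not in names:
            missing.append(wav_path)
    return missing
```

An empty list from `missing_wavs("dataset.zip")` means every clip listed in metadata.csv is present in the archive.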

Troubleshooting

File permission issues? Missing files? Check that the scripts are executable (e.g. chmod +x pipeline.sh) and whether a background process is still running.
Still stuck? Feel free to drop a note in the issues tab.
