fix: fix_loudness to -12 dB #221

Open
zgldh wants to merge 1 commit into myshell-ai:main from zgldh:main

Conversation

@zgldh zgldh commented Dec 8, 2024

Fixes the loudness issue using the idea from #45 (comment).

Original output:
https://zgldh.github.io/temp/original.wav

Fixed output:
https://zgldh.github.io/temp/fixed_loudness.wav

@nwhitehead

This is a good idea!

In the example output in the comment, the audio sounds too loud throughout to me. The waveform looks visibly clipped and it sounds distorted. I think pyloudnorm is filtering to avoid clipping artifacts as much as it can, but I can still hear them.

I tried other values, and -18 seemed good. It sounded clean to me and matched the volume level of other spoken-text sources.
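To make the clipping concern concrete, here is a minimal sketch of why a higher target can push peaks past full scale. It is not the PR's code: it uses plain RMS level in dBFS as a rough stand-in for the loudness measurement pyloudnorm performs (which is K-weighted and gated, so real numbers will differ), and the `signal` below is a made-up, speech-like test vector with a few loud peaks, not the audio linked above. All function names are hypothetical.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0).

    Assumption: this is a simplified stand-in for a real loudness
    measurement. It applies no K-weighting and no gating, so it is
    NOT LUFS. It also assumes the signal is not pure silence.
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    return 20.0 * math.log10(math.sqrt(mean_square))


def gain_for_target(measured_db, target_db):
    """Linear gain that moves a signal from measured_db to target_db."""
    return 10.0 ** ((target_db - measured_db) / 20.0)


def peak_after_normalizing(samples, target_db):
    """Largest absolute sample value after applying the normalizing gain."""
    gain = gain_for_target(rms_dbfs(samples), target_db)
    return max(abs(s) for s in samples) * gain


# Hypothetical test signal: quiet content at +/-0.05 with one loud peak
# at 0.5, which gives a large peak-to-RMS ratio (crest factor).
signal = [0.05 if i % 2 == 0 else -0.05 for i in range(99)] + [0.5]

for target in (-12.0, -18.0):
    peak = peak_after_normalizing(signal, target)
    verdict = "exceeds full scale, would clip" if peak > 1.0 else "fits below full scale"
    print(f"target {target:+.0f} dB: peak {peak:.2f} ({verdict})")
```

In dB terms, the peak after normalization is the target level plus the signal's crest factor. This test signal has a crest factor of about 17 dB, so a -12 target puts the peak roughly 5 dB above full scale, while a -18 target leaves it roughly 1 dB below. Real speech and a real LUFS meter will give different figures, but the same relationship explains why a lower target leaves more headroom.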

@alastorid alastorid mentioned this pull request Jan 21, 2025
not-hanjo-mei added a commit to not-hanjo-mei/MeloTTS that referenced this pull request May 5, 2025
Implements a new FastAPI-based web API in the `webapi/` directory,
providing an OpenAI-compatible endpoint for TTS generation. Includes:
- API implementation and dependencies.
- Unit tests (`test/test_webapi.py`).
- Documentation (`docs/webapi.md`) and updates to main docs.

Also integrates loudness normalization (based on myshell-ai#221)
to improve audio output consistency (`melo/api.py`, `melo/utils.py`).

Additional updates include:
- New Android deployment documentation.
- Training guide and script adjustments.
- Updated requirements.

2 participants