[FEATURE] Audio normalization #268

Merged: mathieudpnt merged 32 commits into Project-OSmOSE:main from Gautzilla:feature/audio-normalization on Sep 15, 2025

Gautzilla commented Aug 29, 2025

⚠️ Obsolete description, see the updated comment below

🐳 What's new?

The fetched audio data can now be normalized according to 3 presets:

| Preset | Definition |
| --- | --- |
| raw | $x$ |
| dc_reject | $x-\overline{x}$ |
| zscore | $\frac{x-\overline{x}}{\sigma (x)}$ |

The automatic DC rejection in the AudioData.get_value() method was removed, as it is now controllable through the normalization property.

🐳 How to use it?

The normalization property on AudioData, AudioDataset and Analysis can be set to either "raw", "dc_reject" or "zscore".
The fetched audio data will then be normalized accordingly when AudioData.get_value() is called.

🐬 Core API

Simply set the normalization property of AudioData or AudioDataset objects:

from osekit.core_api.audio_dataset import AudioDataset

ads = AudioDataset(
    ...,
    normalization="zscore"
)

ads.write(...) # The written audio files will be normalized z-scores.

🐬 Public API

Simply set the normalization property of the Analysis object:

from osekit.public_api.dataset import Dataset
from osekit.public_api.analysis import Analysis

dataset = Dataset(...)

analysis = Analysis(
    ...,
    normalization="zscore",
)

dataset.run_analysis(analysis=analysis) # The audio data is turned to z-score during the analysis

@Gautzilla
Contributor Author

The PR is still in draft mode as I have to include normalization in the docs.

@mathieudpnt
Contributor

As we discussed, this might benefit from a "full-scale" normalization mode.

@Gautzilla
Contributor Author

🐳 What's new?

The fetched audio data can now be normalized according to 4 presets given by the osekit.utils.audio_utils.Normalization flag:

| Preset | Definition |
| --- | --- |
| Normalization.RAW | $x$ |
| Normalization.DC_REJECT | $x-\overline{x}$ |
| Normalization.PEAK | $\frac{x}{x_\text{max}}$ |
| Normalization.ZSCORE | $\frac{x-\overline{x}}{\sigma (x)}$ |
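
For readers who want to check the formulas by hand, here is a minimal sketch of the preset definitions above. It is not the OSEkit implementation: the function names are hypothetical, plain Python lists are used to stay dependency-free, $x_\text{max}$ is assumed to mean the largest absolute sample value, and $\sigma(x)$ is assumed to be the population standard deviation. A constant or all-zero signal would cause a division by zero, which the sketch does not handle.

```python
from statistics import fmean, pstdev


def dc_reject(x):
    """Subtract the mean from every sample: x - mean(x)."""
    mean = fmean(x)
    return [v - mean for v in x]


def peak(x):
    """Divide every sample by the peak: x / x_max.

    Assumption: x_max is taken as max(|x|).
    """
    x_max = max(abs(v) for v in x)
    return [v / x_max for v in x]


def zscore(x):
    """Centre and scale: (x - mean(x)) / sigma(x).

    Assumption: sigma is the population standard deviation.
    """
    mean = fmean(x)
    sigma = pstdev(x)
    return [(v - mean) / sigma for v in x]


samples = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dc_reject(samples))  # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(peak(samples))       # [0.2, 0.4, 0.6, 0.8, 1.0]
print(zscore(samples))     # zero mean, unit standard deviation
```

The raw preset is the identity, so it needs no function of its own.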

The automatic DC rejection in the AudioData.get_value() method was removed, as it is now controllable through the normalization property.

🐳 How to use it?

The normalization property on AudioData, AudioDataset and Analysis can be set to any Normalization flag.
The fetched audio data will then be normalized accordingly when AudioData.get_value() is called.

⚠️ Normalization.DC_REJECT can be combined with any other (single) normalization, but any other combination will raise a ValueError:

from osekit.utils.audio_utils import Normalization

n = Normalization.DC_REJECT | Normalization.PEAK # OK
n = Normalization.DC_REJECT | Normalization.ZSCORE # OK
n = Normalization.ZSCORE | Normalization.PEAK # raises a ValueError
n = Normalization.DC_REJECT | Normalization.ZSCORE | Normalization.PEAK # raises a ValueError
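
To make the combination rule concrete, here is a rough sketch of how such a check could be written with a standard `enum.Flag`. Everything in it is an assumption for illustration: `NormalizationSketch`, its member values and `check_combination` are hypothetical and are not taken from osekit. The sketch also validates in a separate helper, whereas the flag described in this PR raises as soon as the invalid combination is built.

```python
from enum import Flag


class NormalizationSketch(Flag):
    """Hypothetical stand-in for the Normalization flag (values are assumed)."""

    RAW = 0
    DC_REJECT = 1
    PEAK = 2
    ZSCORE = 4


def check_combination(flag):
    """Return the flag if valid, else raise ValueError.

    Rule: DC_REJECT may accompany at most one other preset.
    """
    # Mask out the DC_REJECT bit, then count the remaining presets.
    others = flag.value & ~NormalizationSketch.DC_REJECT.value
    if bin(others).count("1") > 1:
        raise ValueError(f"Invalid normalization combination: {flag}")
    return flag


N = NormalizationSketch
check_combination(N.DC_REJECT | N.PEAK)    # accepted
check_combination(N.DC_REJECT | N.ZSCORE)  # accepted

try:
    check_combination(N.ZSCORE | N.PEAK)   # two non-DC presets: rejected
except ValueError as err:
    print(err)
```

Masking out the DC_REJECT bit first keeps the rule to a single bit count, so adding another preset to the sketch would not require changing the check.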

🐬 Core API

Simply set the normalization property of AudioData or AudioDataset objects:

from osekit.core_api.audio_dataset import AudioDataset
from osekit.utils.audio_utils import Normalization

ads = AudioDataset(
    ...,
    normalization=Normalization.ZSCORE
)

ads.write(...) # The written audio files will be normalized z-scores.

🐬 Public API

Simply set the normalization property of the Analysis object:

from osekit.public_api.dataset import Dataset
from osekit.public_api.analysis import Analysis
from osekit.utils.audio_utils import Normalization

dataset = Dataset(...)

analysis = Analysis(
    ...,
    normalization=Normalization.ZSCORE,
)

dataset.run_analysis(analysis=analysis) # The audio data is turned to z-score during the analysis

@mathieudpnt
Contributor

mathieudpnt commented Sep 8, 2025

thumbs-thumbs-up

@Gautzilla
Contributor Author

Gautzilla commented Sep 11, 2025

Resolving conflicts in notebooks looks buggy as f, so I had to make a few tweaks and now I need your doppelganger to approve the PR

@mathieudpnt mathieudpnt merged commit e1627eb into Project-OSmOSE:main Sep 15, 2025
1 check passed
@Gautzilla Gautzilla deleted the feature/audio-normalization branch September 15, 2025 13:23