Merged
Changes from 29 commits
Commits
32 commits
5f2755d
add normalization functions to audio_utils module
Gautzilla Aug 28, 2025
05afcc1
add normalization util tests
Gautzilla Aug 28, 2025
a0b9c91
add AudioData.normalization property
Gautzilla Aug 28, 2025
454f972
remove AudioData.get_value reject_dc parameter
Gautzilla Aug 28, 2025
2980418
add AudioData normalization serialization
Gautzilla Aug 28, 2025
6b88cab
add AudioData normalization serialization tests
Gautzilla Aug 28, 2025
c4b2752
add normalization to AudioDataset
Gautzilla Aug 28, 2025
8f2f6b3
add AudioData and AudioDataset normalization tests
Gautzilla Aug 29, 2025
b74a395
add AudioDataset normalization serialization tests
Gautzilla Aug 29, 2025
0ec5eba
add public_api normalization in analysis
Gautzilla Aug 29, 2025
1db6340
add AudioData.normalization in the docs
Gautzilla Sep 1, 2025
640ff7f
add AudioDataset.normalization in the docs
Gautzilla Sep 1, 2025
fe4a0d0
add public API normalization in doc
Gautzilla Sep 1, 2025
65e0aa9
add AudioDataset.from_folder sample_rate parameter
Gautzilla Sep 1, 2025
76b7a40
add normalization in doc notebooks
Gautzilla Sep 1, 2025
cb04f13
remove reset cell from public LTAS notebook
Gautzilla Sep 1, 2025
0861c58
change normalization to a Flag
Gautzilla Sep 2, 2025
cf32e72
adapt normalization test to new normalization system
Gautzilla Sep 2, 2025
9cd31f2
add combined normalization test
Gautzilla Sep 2, 2025
3df6e01
use metaclass to check normalization validity on call
Gautzilla Sep 3, 2025
bf0a84d
use new Normalization flag in AudioData
Gautzilla Sep 3, 2025
9147277
use new Normalization flag in AudioDataset
Gautzilla Sep 3, 2025
0392ee4
use Normalization flag in the public API
Gautzilla Sep 3, 2025
b99efcf
use Normalization flag in example notebooks
Gautzilla Sep 3, 2025
93f82b4
update docs with Normalization flag
Gautzilla Sep 3, 2025
54f8ccd
add Normalization flag to API doc
Gautzilla Sep 3, 2025
f62386c
add negative peak tests
Sep 5, 2025
c04c0f5
fix peak normalization with negative values
Sep 5, 2025
9259f42
Merge branch 'main' into feature/audio-normalization
Sep 9, 2025
60c0634
move Normalization import to appropriate cell
Sep 11, 2025
009c6f3
Merge branch 'main' into feature/audio-normalization
Sep 11, 2025
3760c7e
Merge branch 'main' into feature/audio-normalization
Gautzilla Sep 15, 2025
3 changes: 2 additions & 1 deletion docs/source/api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -5,4 +5,5 @@
:maxdepth: 2

publicapi
coreapi
coreapi
utils
62 changes: 59 additions & 3 deletions docs/source/coreapi_usage.rst
Expand Up @@ -110,6 +110,59 @@ The data is fetched seamlessly on-demand from the audio file(s). The opening/clo

Any time gaps between audio items are filled with ``0.`` values.

Normalization
"""""""""""""

The fetched audio data can be normalized according to the presets given by the :class:`osekit.utils.audio_utils.Normalization` flag:

.. list-table:: Normalization presets
:widths: 10 10
:header-rows: 1

* - Name
- Description
* - ``Normalization.RAW``
- :math:`x`
* - ``Normalization.DC_REJECT``
- :math:`x-\overline{ x }`
* - ``Normalization.PEAK``
- :math:`\frac{x}{x_\text{max}}`
* - ``Normalization.ZSCORE``
- :math:`\frac{ x-\overline{x} }{\sigma (x)}`
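
The presets in the table above can be sketched in plain Python. This is only an illustration of the formulas, not the OSEkit implementation: the function names are hypothetical, :math:`x_\text{max}` is assumed to be the largest absolute sample value, and :math:`\sigma(x)` is assumed to be the population standard deviation.

```python
from statistics import fmean, pstdev


def dc_reject(x):
    """Subtract the mean so the signal is centred on zero."""
    mean = fmean(x)
    return [v - mean for v in x]


def peak(x):
    """Scale by the largest absolute value (assumed meaning of x_max)."""
    x_max = max(abs(v) for v in x)
    return [v / x_max for v in x]


def zscore(x):
    """Centre on zero and scale to unit population standard deviation (assumed)."""
    mean, sigma = fmean(x), pstdev(x)
    return [(v - mean) / sigma for v in x]


samples = [0.5, 1.5, -0.5, 2.5]  # mean is 1.0
print(dc_reject(samples))  # [-0.5, 0.5, -1.5, 1.5]
print(peak(samples))       # [0.2, 0.6, -0.2, 1.0]
```

With these definitions, ``Normalization.RAW`` corresponds to returning the samples unchanged.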

To normalize the data, simply set the :attr:`osekit.core_api.audio_data.AudioData.normalization` property to the
requested normalization flag:

.. code-block:: python

from osekit.core_api.audio_data import AudioData
from osekit.utils.audio_utils import Normalization

ad = AudioData(...)
ad.normalization = Normalization.ZSCORE # Note: normalization is also a parameter of the AudioData initializer

v = ad.get_value() # The fetched data will then be normalized

.. note::

The ``Normalization.DC_REJECT`` normalization can be combined with any single other normalization:

.. code-block:: python

from osekit.utils.audio_utils import Normalization

dc_peak = Normalization.DC_REJECT | Normalization.PEAK

.. warning::

Any other combination of normalizations raises an error:

.. code-block:: python

from osekit.utils.audio_utils import Normalization

# Each of the following lines raises an error:
incorrect_normalization = Normalization.RAW | Normalization.PEAK # Two presets, neither of which is DC_REJECT
incorrect_normalization = Normalization.DC_REJECT | Normalization.RAW | Normalization.PEAK # DC_REJECT with more than one other preset
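
The combination rule described above can be sketched with a standard ``enum.Flag``. The ``Norm`` class and the ``check`` helper below are hypothetical stand-ins, not the OSEkit ``Normalization`` class. The commit list says the library validates combinations through a metaclass; this sketch uses a plain function instead and assumes the rule is "at most one preset besides ``DC_REJECT``".

```python
from enum import Flag, auto


class Norm(Flag):
    """Hypothetical stand-in for the Normalization flag."""

    RAW = auto()
    DC_REJECT = auto()
    PEAK = auto()
    ZSCORE = auto()


def check(combo: Norm) -> Norm:
    """Return combo if it is a valid combination, else raise ValueError."""
    # Single-bit members contained in the combination, DC_REJECT excluded.
    others = [m for m in Norm if m in combo and m is not Norm.DC_REJECT]
    if len(others) > 1:
        names = " | ".join(m.name for m in others)
        raise ValueError(f"Cannot combine {names}")
    return combo


check(Norm.DC_REJECT | Norm.PEAK)  # Valid: DC_REJECT plus one other preset

try:
    check(Norm.RAW | Norm.PEAK)  # Invalid: two presets besides DC_REJECT
except ValueError as error:
    print(error)
```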

Calibration
"""""""""""
Expand All @@ -124,8 +177,8 @@ allows for retrieving the data in the shape of the recorded acoustic pressure.

.. code-block:: python

from osekit.core_api.instrument import Instrument
from osekit.core_api.audio_data import AudioData
from osekit.core_api.instrument import Instrument
import numpy as np

instrument = Instrument(end_to_end_db = 150) # The raw 1. WAV value equals 150 dB SPL re 1 uPa
Expand Down Expand Up @@ -170,6 +223,7 @@ an ``AudioDataset`` from a given folder containing audio files:

from pathlib import Path
from osekit.core_api.audio_dataset import AudioDataset
from osekit.core_api.instrument import Instrument
from osekit.utils.audio_utils import Normalization
from pandas import Timestamp, Timedelta

folder = Path(r"...")
Expand All @@ -179,7 +233,9 @@ an ``AudioDataset`` from a given folder containing audio files:
strptime_format="%y_%m_%d_%H_%M_%S", # To parse the begin Timestamp of each file
begin=Timestamp("2009-01-06 12:00:00"),
end=Timestamp("2009-01-06 14:00:00"),
data_duration=Timedelta("10s")
data_duration=Timedelta("10s"),
instrument=Instrument(end_to_end_db=150),
normalization=Normalization.DC_REJECT
)

The resulting ``AudioDataset`` will contain 10s-long ``AudioData`` ranging from ``2009-01-06 12:00:00`` to ``2009-01-06 14:00:00``.
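
As a quick check of the segmentation arithmetic, the sketch below counts the 10 s segments between the two timestamps. It uses only the standard library and assumes contiguous, non-overlapping segments; it does not call OSEkit.

```python
from datetime import datetime, timedelta

begin = datetime(2009, 1, 6, 12, 0, 0)
end = datetime(2009, 1, 6, 14, 0, 0)
data_duration = timedelta(seconds=10)

# Begin time of each segment, assuming segments follow each other without overlap.
segment_begins = []
t = begin
while t < end:
    segment_begins.append(t)
    t += data_duration

print(len(segment_begins))  # 720
print(segment_begins[-1])   # 2009-01-06 13:59:50
```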
Expand Down Expand Up @@ -366,4 +422,4 @@ should be provided:
ltas.plot()
plt.show()

A ``SpectroData`` object can be turned into a ``LTASData`` thanks to the :meth:`osekit.core_api.ltas_data.LTASData.from_spectro_data` method.
A ``SpectroData`` object can be turned into a ``LTASData`` thanks to the :meth:`osekit.core_api.ltas_data.LTASData.from_spectro_data` method.
1 change: 1 addition & 0 deletions docs/source/example_ltas.rst
Expand Up @@ -13,6 +13,7 @@ This LTAS will:
* Start at the beginning of the first audio file
* End at the end of the last audio file
* Be downsampled to ``24 kHz``
* Have its DC component removed

| The FFT used for computing the spectrograms will use a ``1024 samples``-long Hamming window.
| The ``hop`` of LTAS ``ShortTimeFFT`` objects is forced to the size of the window (no overlap).
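
The consequences of these FFT settings can be worked out with a few lines of arithmetic. The variable names are illustrative, and the bin count assumes a one-sided spectrum.

```python
sample_rate = 24_000  # Hz, after downsampling
window_size = 1024    # samples
hop = window_size     # LTAS: hop forced to the window size

frame_duration = window_size / sample_rate        # seconds covered by one frame
frequency_resolution = sample_rate / window_size  # Hz between two frequency bins
overlap = 1 - hop / window_size                   # fraction shared by consecutive frames
nb_frequency_bins = window_size // 2 + 1          # assumes a one-sided spectrum

print(f"{frame_duration * 1000:.2f} ms per frame")  # 42.67 ms per frame
print(f"{frequency_resolution} Hz per bin")         # 23.4375 Hz per bin
print(f"{overlap:.0%} overlap")                     # 0% overlap
```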
Expand Down
6 changes: 5 additions & 1 deletion docs/source/example_ltas_core.ipynb
Expand Up @@ -51,6 +51,7 @@
"audio_folder = Path(r\"_static/sample_audio\")\n",
"\n",
"from osekit.core_api.audio_dataset import AudioDataset\n",
"from osekit.utils.audio_utils import Normalization\n",
"from osekit.core_api.instrument import Instrument\n",
"\n",
"audio_data = AudioDataset.from_folder(\n",
Expand All @@ -60,7 +61,10 @@
").data[0]\n",
"\n",
"# Resampling at 24 kHz\n",
"audio_data.sample_rate = 24_000"
"audio_data.sample_rate = 24_000\n",
"\n",
"# Removing the DC component\n",
"audio_data.normalization = Normalization.DC_REJECT"
]
},
{
Expand Down
8 changes: 7 additions & 1 deletion docs/source/example_ltas_public.ipynb
Expand Up @@ -134,13 +134,15 @@
"metadata": {},
"outputs": [],
"source": [
"from osekit.utils.audio_utils import Normalization\n",
"from osekit.public_api.analysis import Analysis, AnalysisType\n",
"\n",
"analysis = Analysis(\n",
" analysis_type=AnalysisType.SPECTROGRAM\n",
" | AnalysisType.MATRIX, # we want to export both the spectrogram and the sx matrix\n",
" nb_ltas_time_bins=3000, # This will turn the regular spectrum computation in a LTAS\n",
" sample_rate=sample_rate,\n",
" normalization=Normalization.DC_REJECT, # Removes the DC component\n",
" fft=sft,\n",
" v_lim=(0.0, 150.0), # Boundaries of the spectrograms\n",
" colormap=\"viridis\", # Default value\n",
Expand Down Expand Up @@ -196,7 +198,11 @@
"cell_type": "code",
"execution_count": null,
"id": "e05d653bc1e8bfe2",
"metadata": {},
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"# Reset the dataset to get all files back to place.\n",
Expand Down
1 change: 1 addition & 0 deletions docs/source/example_multiple_spectrograms.rst
Expand Up @@ -9,6 +9,7 @@ In this example, we want to export spectrograms drawn from the sample audio data
* Last spectrogram ends at ``2022-09-25 22:36:25``
* Spectrograms represent ``5 s``-long audio data
* Audio data are downsampled to ``24 kHz`` before spectrograms are computed
* The DC component of the audio data is rejected before spectrograms are computed
* Spectrograms that are in the gap between recordings should be skipped

The FFT used for computing the spectrograms will use a ``1024 samples``-long Hamming window, with a ``128 samples``-long hop.
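
To give an idea of what a ``128 samples``-long hop means for a ``5 s`` spectrogram at ``24 kHz``, the sketch below computes the overlap and a frame count. ``frame_count`` is a hypothetical helper that only counts windows fitting entirely inside the signal; a real STFT implementation that pads the signal edges would return more frames.

```python
def frame_count(nb_samples: int, window_size: int, hop: int) -> int:
    """Number of full windows fitting in the signal, without edge padding."""
    if nb_samples < window_size:
        return 0
    return (nb_samples - window_size) // hop + 1


sample_rate = 24_000
window_size = 1024
hop = 128
nb_samples = 5 * sample_rate  # 5 s of audio

overlap = 1 - hop / window_size
time_step = hop / sample_rate

print(f"{overlap:.1%} overlap")                       # 87.5% overlap
print(f"{time_step * 1000:.2f} ms between frames")    # 5.33 ms between frames
print(frame_count(nb_samples, window_size, hop))      # 930
```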
Expand Down
5 changes: 4 additions & 1 deletion docs/source/example_multiple_spectrograms_core.ipynb
Expand Up @@ -52,6 +52,7 @@
"\n",
"from osekit.core_api.audio_dataset import AudioDataset\n",
"from osekit.core_api.instrument import Instrument\n",
"from osekit.utils.audio_utils import Normalization\n",
"from pandas import Timestamp, Timedelta\n",
"\n",
"audio_dataset = AudioDataset.from_folder(\n",
Expand All @@ -61,6 +62,8 @@
" end=Timestamp(\"2022-09-25 22:36:25\"),\n",
" data_duration=Timedelta(seconds=5),\n",
" instrument=Instrument(end_to_end_db=150.0),\n",
" sample_rate=24_000,\n",
" normalization=Normalization.DC_REJECT,\n",
")"
]
},
Expand Down Expand Up @@ -192,7 +195,7 @@
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"spectro_dataset.data[0].plot()\n",
"spectro_dataset.data[1].plot()\n",
"plt.show()"
]
},
Expand Down
2 changes: 2 additions & 0 deletions docs/source/example_multiple_spectrograms_public.ipynb
Expand Up @@ -133,6 +133,7 @@
"outputs": [],
"source": [
"from osekit.public_api.analysis import Analysis, AnalysisType\n",
"from osekit.utils.audio_utils import Normalization\n",
"from pandas import Timestamp, Timedelta\n",
"\n",
"analysis = Analysis(\n",
Expand All @@ -143,6 +144,7 @@
" end=Timestamp(\"2022-09-25 22:36:25\"),\n",
" data_duration=Timedelta(seconds=5),\n",
" sample_rate=sample_rate,\n",
" normalization=Normalization.DC_REJECT,\n",
" fft=sft,\n",
" v_lim=(0.0, 150.0), # Boundaries of the spectrograms\n",
" colormap=\"viridis\", # Default value\n",
Expand Down
1 change: 1 addition & 0 deletions docs/source/example_reshaping_multiple_files.rst
Expand Up @@ -9,6 +9,7 @@ In this example, we want to export reshaped files from the sample audio dataset
* Last file ends at ``2022-09-25 22:36:25``
* Files are ``5 s``-long
* Files are sampled at ``24 kHz``
* Files are DC-filtered
* Files that are in the gap between recordings should be skipped
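
The requirements listed above can be sketched as follows. The recording intervals are hypothetical (the actual coverage of the sample dataset is not stated here), and the sketch assumes a segment is skipped only when it overlaps no recording at all. It uses the standard library only and does not call OSEkit.

```python
from datetime import datetime, timedelta

begin = datetime(2022, 9, 25, 22, 35, 15)
end = datetime(2022, 9, 25, 22, 36, 25)
data_duration = timedelta(seconds=5)

# Hypothetical recordings, with a gap between 22:35:40 and 22:36:00.
recordings = [
    (datetime(2022, 9, 25, 22, 35, 15), datetime(2022, 9, 25, 22, 35, 40)),
    (datetime(2022, 9, 25, 22, 36, 0), datetime(2022, 9, 25, 22, 36, 25)),
]


def has_audio(seg_begin: datetime, seg_end: datetime) -> bool:
    """True if the segment overlaps at least one recording."""
    return any(seg_begin < r_end and seg_end > r_begin for r_begin, r_end in recordings)


kept, skipped = [], []
t = begin
while t < end:
    (kept if has_audio(t, t + data_duration) else skipped).append(t)
    t += data_duration

print(len(kept), len(skipped))  # 10 4
```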

.. toctree::
Expand Down
3 changes: 3 additions & 0 deletions docs/source/example_reshaping_multiple_files_core.ipynb
Expand Up @@ -47,6 +47,7 @@
"audio_folder = Path(r\"_static/sample_audio\")\n",
"\n",
"from osekit.core_api.audio_dataset import AudioDataset\n",
"from osekit.utils.audio_utils import Normalization\n",
"from pandas import Timestamp, Timedelta\n",
"\n",
"audio_dataset = AudioDataset.from_folder(\n",
Expand All @@ -55,6 +56,8 @@
" begin=Timestamp(\"2022-09-25 22:35:15\"),\n",
" end=Timestamp(\"2022-09-25 22:36:25\"),\n",
" data_duration=Timedelta(seconds=5),\n",
" sample_rate=24_000,\n",
" normalization=Normalization.DC_REJECT,\n",
")"
],
"outputs": [],
Expand Down