[Feature]: internal processing is 32 bit float but output is truncated to 16 bit integer - allow for 32 or 24 bit output #258

@dts350z

Description

Claude Code:

1. 24-bit PCM output support via soundfile

Files:

  • audio_separator\separator\common_separator.py: __init__() + write_audio_pydub()
  • audio_separator\separator\separator.py: __init__() + config dict
  • audio_separator\utils\cli.py: argument parser + Separator instantiation

Date: 2025-02-12

Problem:
Models output float32 (a 24-bit significand, so roughly 24 bits of precision), but write_audio_pydub() converts all output to int16 (16-bit) before writing. The existing write_audio_soundfile() path (--use_soundfile) has a bug: float data is assigned to an int16 interleave array without first scaling by 32767, so samples with magnitude below 1.0 truncate to 0 and the output is silent. The documentation claimed 24-bit output, but the library always wrote 16-bit.
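The scaling bug described above can be reproduced without any audio library. This is a minimal sketch, not the library's code: the helper names are hypothetical, and plain `int()` stands in for the truncation toward zero that a NumPy float-to-int16 cast performs.

```python
def to_int16_unscaled(samples):
    # Mirrors the reported bug: floats in [-1.0, 1.0] are cast straight
    # to integers, so anything with magnitude below 1.0 becomes 0.
    return [int(s) for s in samples]


def to_int16_scaled(samples):
    # Scale to the int16 range first, then round and clip.
    return [max(-32768, min(32767, round(s * 32767))) for s in samples]


samples = [0.0, 0.25, -0.5, 0.999]
unscaled = to_int16_unscaled(samples)  # every sample collapses to 0
scaled = to_int16_scaled(samples)      # samples keep their relative level
```

With the scaling step in place the samples keep their level; without it, the whole buffer is zeros, which matches the silent output reported for --use_soundfile.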

Changes:

common_separator.py

Added output_subtype from config (after line ~85):

self.output_subtype = config.get("output_subtype", "PCM_16")

Added early-return soundfile path in write_audio_pydub() (before the int16 conversion):

# For WAV/FLAC with PCM_24, use soundfile directly (pydub can't do 24-bit)
file_format = stem_path.lower().split(".")[-1]
if self.output_subtype == "PCM_24" and file_format in ("wav", "flac"):
    import soundfile as sf
    sf.write(stem_path, stem_source, self.sample_rate, subtype="PCM_24")
    self.logger.debug(f"Exported 24-bit {file_format.upper()} via soundfile to {stem_path}")
    return
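The routing condition in that branch can be checked in isolation. The sketch below is an illustration only: `uses_soundfile_path` and `LOSSLESS_24BIT_FORMATS` are hypothetical names, and it uses `os.path.splitext` rather than the `split(".")` call in the patch.

```python
import os

# Formats for which the patch writes 24-bit output through soundfile.
LOSSLESS_24BIT_FORMATS = ("wav", "flac")


def uses_soundfile_path(stem_path, output_subtype):
    """Return True when the 24-bit early-return branch would handle this file."""
    file_format = os.path.splitext(stem_path)[1].lower().lstrip(".")
    return output_subtype == "PCM_24" and file_format in LOSSLESS_24BIT_FORMATS


wav_24 = uses_soundfile_path("vocals.WAV", "PCM_24")    # soundfile branch
mp3_24 = uses_soundfile_path("vocals.mp3", "PCM_24")    # falls through to pydub
flac_16 = uses_soundfile_path("vocals.flac", "PCM_16")  # falls through to pydub
```

Anything that returns False here continues into the existing int16 pydub path, so lossy formats and the default PCM_16 setting behave as before.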

separator.py

Added output_subtype="PCM_16" parameter to __init__(), stored as self.output_subtype, and included in the config dict passed to architecture-specific separators.
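How the setting travels from the constructor to the architecture-specific separator can be sketched with two stand-in classes. `SeparatorSketch` and `CommonSeparatorSketch` are hypothetical simplifications; the real classes take many more parameters, and only the `output_subtype` plumbing is shown.

```python
class CommonSeparatorSketch:
    """Stand-in for the base class that reads values from the config dict."""

    def __init__(self, config):
        # Same lookup as in the patch: default to 16-bit when the key is absent.
        self.output_subtype = config.get("output_subtype", "PCM_16")


class SeparatorSketch:
    """Stand-in for the top-level class that builds the config dict."""

    def __init__(self, output_subtype="PCM_16"):
        self.output_subtype = output_subtype

    def build_config(self):
        # Only the new key is shown; the real dict carries other settings too.
        return {"output_subtype": self.output_subtype}


top_level = SeparatorSketch(output_subtype="PCM_24")
arch_separator = CommonSeparatorSketch(top_level.build_config())
legacy_separator = CommonSeparatorSketch({})  # config without the new key
```

Because the lookup falls back to "PCM_16", a config dict built by older code that does not know about the key still produces 16-bit output.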

cli.py

Added --output_subtype argument (default "PCM_16") and passed it to the Separator constructor.
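A minimal argparse sketch of such an option follows. The flag name and default come from the description above; restricting `choices` to the two values and the help text are assumptions, not necessarily what the patch does.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the new output option")
    parser.add_argument(
        "--output_subtype",
        default="PCM_16",
        choices=["PCM_16", "PCM_24"],  # assumption: the patch may not restrict values
        help="PCM subtype for WAV/FLAC output (default: %(default)s)",
    )
    return parser


default_args = build_parser().parse_args([])
hires_args = build_parser().parse_args(["--output_subtype", "PCM_24"])
```

The parsed value would then be forwarded as the `output_subtype` keyword when the Separator is constructed.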

Why: Preserves far more of the model output's precision. 24-bit PCM matches the 24-bit significand of float32, so it avoids the extra quantization noise introduced by reducing the output to 16 bits. The existing pydub int16 path remains untouched for 16-bit output and lossy formats (MP3, M4A, etc.).
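The size of the difference can be estimated with the standard formula for an ideal N-bit quantizer driven by a full-scale sine wave (about 6.02·N + 1.76 dB). This is textbook arithmetic, not a measurement of the library's output, and the function names are illustrative.

```python
import math


def quantization_step(bits):
    # Step size when the range [-1.0, 1.0) is divided into 2**bits levels.
    return 2.0 / (2 ** bits)


def ideal_snr_db(bits):
    # Theoretical SNR of an ideal quantizer for a full-scale sine wave.
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


snr_16 = ideal_snr_db(16)   # roughly 98 dB
snr_24 = ideal_snr_db(24)   # roughly 146 dB
step_ratio = quantization_step(16) / quantization_step(24)  # 256x finer steps
```

Each extra bit adds about 6 dB, so moving from 16 to 24 bits lowers the theoretical quantization noise floor by roughly 48 dB.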

Note: The broken write_audio_soundfile() path was not modified — instead, the fix uses soundfile.write() directly inside the existing write_audio_pydub() method with an early return for 24-bit lossless formats. This avoids the interleaving bugs in the soundfile path while keeping the change minimal.

Configuration: Controlled by separation_bit_depth in the [output_processing] section of UpMixery.ini. The value is passed to audio_separator_launcher.py via --bit-depth, which maps it to --output_subtype PCM_24 or PCM_16.
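The mapping from the INI setting to the CLI flag might look like the sketch below. The section and key names come from the description above; the sample value `24`, the fallback to 16-bit, and both helper functions are assumptions, since the launcher's actual code is not shown here.

```python
import configparser

# Hypothetical INI content; only the section and key names come from the issue.
SAMPLE_INI = """
[output_processing]
separation_bit_depth = 24
"""


def subtype_from_bit_depth(bit_depth):
    # Assumption: anything other than 24 falls back to 16-bit output.
    return "PCM_24" if int(bit_depth) == 24 else "PCM_16"


def launcher_args(ini_text):
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    bit_depth = config.get("output_processing", "separation_bit_depth", fallback="16")
    return ["--output_subtype", subtype_from_bit_depth(bit_depth)]


args_24 = launcher_args(SAMPLE_INI)
args_default = launcher_args("[output_processing]\n")
```

Falling back to PCM_16 when the key is missing keeps existing configuration files working unchanged.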

Labels: enhancement (New feature or request)
