Problem Statement
When benchmarking transcription/translation workloads, GuideLLM currently transcodes input audio to MP3 by default in the encode_media preprocessing path (encode_audio defaults to audio_format="mp3"). As a result, the dataset's original audio format is changed before requests are submitted.
Proposed Solution
Change default audio encoding behavior to avoid forced MP3 transcoding.
Suggested behavior:
- If audio_format is explicitly provided by the user, use it.
- Otherwise, infer format from source metadata/path (file suffix, URL suffix, or dataset-provided format).
- If format cannot be inferred, default to wav (safe/common fallback), not mp3.
Optional:
- add a warning/log when fallback-to-WAV is used due to missing format metadata,
- document this behavior in benchmarking docs and CLI examples.
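The suggested behavior could be sketched roughly as follows. This is a standalone illustration, not GuideLLM's actual code: resolve_audio_format, KNOWN_FORMATS, and the source and dataset_format parameters are hypothetical names, and the order in which dataset-provided format and path/URL suffix are checked is an assumption, since the proposal does not specify one.

```python
import logging
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

logger = logging.getLogger(__name__)

# Hypothetical allow-list of suffixes this sketch accepts as audio formats.
KNOWN_FORMATS = {"wav", "mp3", "flac", "ogg", "m4a", "webm"}


def _normalize(value: str) -> str:
    """Lowercase a format string and drop any leading dot."""
    return value.lower().lstrip(".")


def resolve_audio_format(
    audio_format: Optional[str] = None,
    source: Optional[str] = None,
    dataset_format: Optional[str] = None,
    fallback: str = "wav",
) -> str:
    """Pick an audio format without forcing MP3 (illustrative only)."""
    # 1. An explicit user choice always wins.
    if audio_format:
        return _normalize(audio_format)

    # 2. Assumed order: dataset-provided format before the path/URL suffix.
    if dataset_format and _normalize(dataset_format) in KNOWN_FORMATS:
        return _normalize(dataset_format)

    # 3. Infer from the file or URL suffix; urlparse drops any query string.
    if source:
        suffix = _normalize(PurePosixPath(urlparse(source).path).suffix)
        if suffix in KNOWN_FORMATS:
            return suffix

    # 4. Nothing could be inferred: warn and fall back to WAV, not MP3.
    logger.warning("Could not infer audio format; falling back to %s", fallback)
    return fallback
```

With this ordering, resolve_audio_format(source="https://example.com/clip.ogg?token=1") would yield ogg, while a source with no recognizable suffix would log a warning and yield wav.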
Alternatives Considered
Always force WAV instead of MP3: better than MP3 for fidelity, but still ignores source format unnecessarily.
Usage Examples
Additional Context
No response