Add OpenVINO export and inference support for MedASR (google/medasr) by padatta · Pull Request #1745 · huggingface/optimum-intel

padatta · 2026-05-21T08:30:12Z

Description:

What does this PR do?

Adds OpenVINO export, inference, and full INT8 quantization support for google/medasr (model_type=lasr_ctc).

Changes

Export (optimum/exporters/openvino/model_configs.py):

DummyLasrCtcAudioInputGenerator: generates input_features [batch, time, features] + attention_mask [batch, time] with random_mask_tensor
LasrCtcOpenVINOConfig: registered for lasr_ctc → automatic-speech-recognition task via AutoModelForCTC
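The input shapes listed above can be illustrated with a small standalone sketch. This is not the PR's DummyLasrCtcAudioInputGenerator: the helper name, the default sizes (including features=128), and the right-padded mask layout are assumptions made only to show what input_features [batch, time, features] and attention_mask [batch, time] look like.

```python
import random


def make_dummy_lasr_ctc_inputs(batch=2, time=50, features=128, seed=0):
    """Build nested lists shaped like the export inputs described above.

    Hypothetical helper for illustration only. The default sizes and the
    "ones followed by zeros" mask layout are assumptions, not values
    taken from the PR.
    """
    rng = random.Random(seed)

    # input_features: [batch, time, features] of random floats
    input_features = [
        [[rng.uniform(-1.0, 1.0) for _ in range(features)] for _ in range(time)]
        for _ in range(batch)
    ]

    # attention_mask: [batch, time], 1 for valid frames and 0 for padding
    attention_mask = []
    for _ in range(batch):
        valid = rng.randint(1, time)  # keep at least one valid frame
        attention_mask.append([1] * valid + [0] * (time - valid))

    return {"input_features": input_features, "attention_mask": attention_mask}


dummy = make_dummy_lasr_ctc_inputs(batch=2, time=5, features=3)
print(len(dummy["input_features"]), len(dummy["input_features"][0]))
```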

Inference (optimum/intel/openvino/modeling.py):

OVModelForCTC.forward(): handles input_features → input_values naming and conditionally passes attention_mask
OVModelForCTC._preprocess_quantization_config(): auto-sets processor from model_name_or_path (mirrors Whisper/Seq2Seq pattern)
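One way to picture the forward() change is a helper that picks the audio input name the compiled model actually exposes and forwards attention_mask only when the model accepts it. The function below is a hypothetical sketch under that reading; build_ov_inputs and its parameters are not names from the PR, and the real logic in OVModelForCTC.forward() may differ.

```python
def build_ov_inputs(model_input_names, input_values=None, input_features=None,
                    attention_mask=None):
    """Map caller-side tensors onto the names the exported model expects.

    Hypothetical illustration: `model_input_names` stands in for the set of
    input names reported by the compiled model.
    """
    # Accept the audio tensor under either keyword.
    audio = input_features if input_features is not None else input_values
    if audio is None:
        raise ValueError("Either input_values or input_features must be provided")

    # Use whichever audio input name the exported model declares.
    if "input_features" in model_input_names:
        audio_name = "input_features"
    else:
        audio_name = "input_values"
    inputs = {audio_name: audio}

    # Pass the mask only if it was given and the model has such an input.
    if attention_mask is not None and "attention_mask" in model_input_names:
        inputs["attention_mask"] = attention_mask
    return inputs


print(build_ov_inputs({"input_features", "attention_mask"},
                      input_values=[0.1, 0.2], attention_mask=[1, 1]))
```

Keeping the name resolution in one place means callers written for models that take input_values keep working when the exported graph names its input input_features.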

Quantization (optimum/intel/openvino/quantization.py):

OVModelForCTC branch in build_from_quantization_config() to route CTC models to speech-to-text calibration datasets (librispeech)
OVModelForCTC added to build_from_dataset() isinstance check
_prepare_ctc_calibration_data(): collects audio calibration inputs via InferRequestWrapper
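The calibration step can be sketched as wrapping the inference request so that every input passing through it is recorded. The classes below are stand-ins written for illustration: RecordingRequest is not the library's InferRequestWrapper, and collect_calibration_inputs is a hypothetical name, so the PR's _prepare_ctc_calibration_data() may be organized differently.

```python
import copy


class RecordingRequest:
    """Stand-in wrapper (hypothetical) that records inputs before inference."""

    def __init__(self, request, collected):
        self.request = request      # the callable that actually runs inference
        self.collected = collected  # shared list receiving the recorded inputs

    def __call__(self, inputs):
        # Copy so later mutation by the caller cannot alter the stored sample.
        self.collected.append(copy.deepcopy(inputs))
        return self.request(inputs)


def collect_calibration_inputs(model_request, samples, num_samples):
    """Run samples through a wrapped request and return the recorded inputs."""
    collected = []
    wrapped = RecordingRequest(model_request, collected)
    for sample in samples:
        if len(collected) >= num_samples:
            break
        wrapped(sample)
    return collected


# Usage with a dummy request that simply echoes its inputs.
samples = [{"input_features": [[0.0] * 4], "attention_mask": [1]} for _ in range(5)]
print(len(collect_calibration_inputs(lambda x: x, samples, num_samples=3)))
```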

CLI (optimum/exporters/openvino/__main__.py):

CTC model detection in _main_quantize() for weight compression

Tests & Docs:

Tests gated behind RUN_SLOW_EXPORT_TESTS=1 and transformers>=5.0
MedASR entry in docs/source/openvino/models.mdx
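A gate of this kind might look like the following sketch, which skips a test unless the environment variable is set and the installed transformers version is new enough. All helper and class names here are hypothetical, the version parsing is deliberately simplistic, and the PR's actual test decorators are not shown in the description.

```python
import os
import re
import unittest
from importlib import metadata


def version_at_least(version, minimum):
    """Compare dotted versions on their leading numeric parts only."""
    def parse(text):
        parts = []
        for piece in text.split("."):
            match = re.match(r"\d+", piece)
            if not match:
                break  # stop at suffixes such as "dev0"
            parts.append(int(match.group()))
        parts = parts[:3]
        return tuple(parts + [0] * (3 - len(parts)))  # pad to major.minor.patch
    return parse(version) >= parse(minimum)


def installed_transformers_version():
    """Return the installed transformers version, or "0" if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "0"


def should_run_medasr_tests(transformers_version, env=None):
    """True only when the slow-test flag is set and transformers is >= 5.0."""
    env = os.environ if env is None else env
    return (env.get("RUN_SLOW_EXPORT_TESTS") == "1"
            and version_at_least(transformers_version, "5.0"))


@unittest.skipUnless(
    should_run_medasr_tests(installed_transformers_version()),
    "requires RUN_SLOW_EXPORT_TESTS=1 and transformers>=5.0",
)
class MedASRExportTest(unittest.TestCase):  # hypothetical test class
    def test_placeholder(self):
        self.assertTrue(True)
```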

Add OpenVINO export, inference, and quantization support for google/medasr (model_type=lasr_ctc):

Export:

- Add LasrCtcOpenVINOConfig with custom DummyLasrCtcAudioInputGenerator (input_features [batch, time, features] + attention_mask)
- Register lasr_ctc in TasksManager custom classes (AutoModelForCTC)

Inference:

- Update OVModelForCTC.forward() to handle input_features naming and conditionally pass attention_mask

Quantization:

- Add OVModelForCTC._preprocess_quantization_config() for automatic processor resolution (mirrors Whisper/Seq2Seq pattern)
- Add OVModelForCTC branch in build_from_quantization_config() to route CTC models to speech-to-text calibration datasets
- Add OVModelForCTC to build_from_dataset() isinstance check
- Add _prepare_ctc_calibration_data() method for collecting audio calibration inputs via InferRequestWrapper
- Add CTC model detection in _main_quantize() for weight compression

Tests & Docs:

- Add gated tests (RUN_SLOW_EXPORT_TESTS=1, transformers>=5.0)
- Add MedASR entry to supported models documentation

Verified on Intel Arc iGPU + CPU:

- FP16 and INT8 weight-only: cosine sim >= 0.9999, token match >= 99%
- INT8 full quantization (32 LibriSpeech samples): 2.9x CPU speedup
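The description does not say how the cosine similarity and token match figures were computed. One plausible way, sketched here with hypothetical helper names, is to compare flattened logits with cosine similarity and to compare per-frame greedy (argmax) token IDs between a reference model and the exported model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length flat lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def greedy_tokens(logits):
    """Per-frame argmax over the vocabulary; logits is [time][vocab]."""
    return [max(range(len(frame)), key=frame.__getitem__) for frame in logits]


def token_match_rate(ref_logits, test_logits):
    """Fraction of frames where both models pick the same token ID."""
    ref = greedy_tokens(ref_logits)
    test = greedy_tokens(test_logits)
    matches = sum(r == t for r, t in zip(ref, test))
    return matches / len(ref)


# Toy logits standing in for reference and exported-model outputs.
reference = [[0.1, 0.9], [0.8, 0.2]]
exported = [[0.2, 0.7], [0.3, 0.6]]
flat_ref = [v for frame in reference for v in frame]
flat_exp = [v for frame in exported for v in frame]
print(cosine_similarity(flat_ref, flat_exp), token_match_rate(reference, exported))
```

Frame-level argmax comparison is used here for simplicity; comparing tokens after CTC collapsing (merging repeats and dropping blanks) would be an equally reasonable reading of "token match".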

padatta wants to merge 1 commit into huggingface:main from padatta:medasr-openvino-support
