
Commit 8c36f5a

[5676209][ONNX][Autocast] Add support for single npz file with multiple samples (#815)
## What does this PR do?

**Type of change:** New feature

**Overview:** Currently, Autocast only supports calibration data whose shape matches the model's input. This PR adds support for calibration data whose batch dimension is a multiple of the model's input batch size. It does so by re-arranging the data into multiple samples whose shapes match the model's input.

Simplified example:

- ONNX input: `[1, 3, 224, 224]`
- Calibration data: `[10, 3, 224, 224]`
- Calibration data with multiple samples: `[1, 3, 224, 224] * 10`

## Usage

Single `npz` file with multiple samples:

```sh
$ python -m modelopt.onnx.autocast --onnx_path=$MODEL_NAME.onnx --calibration_data=calib_data_10.npz
```

## Testing

See bug 5676209.

## Before your PR is "*Ready for review*"

- **Make sure you read and follow the [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update the [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes

## Additional Information

Equivalent support is already included in the quantization workflow:
https://github.com/NVIDIA/Model-Optimizer/blob/1cc8e6bf3917f61500e81d4ded0af5d5a00e2e25/modelopt/onnx/quantization/calib_utils.py#L50

## Summary by CodeRabbit

- **New Features**
  - `CalibrationDataProvider` now accepts both file paths and pre-loaded ONNX models as input.
  - NPZ calibration files can now provide multiple batches for calibration workflows.
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
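The usage above can be sketched end to end. Below is a minimal, illustrative example of preparing a single `npz` file with 10 stacked samples; the input name `input_0` and the file name are assumptions for illustration, not taken from any particular model:

```python
import os
import tempfile

import numpy as np

# Hypothetical model input: name "input_0", shape [1, 3, 224, 224].
# Stack 10 calibration samples along the batch dimension.
stacked = np.random.rand(10, 3, 224, 224).astype(np.float32)

path = os.path.join(tempfile.mkdtemp(), "calib_data_10.npz")
np.savez(path, input_0=stacked)

# np.load on an .npz archive returns an NpzFile, which is the case the
# new code path in this PR detects and splits into per-sample batches.
loaded = np.load(path)
print(type(loaded).__name__)    # NpzFile
print(loaded["input_0"].shape)  # (10, 3, 224, 224)
```

The resulting file is what would be passed as `--calibration_data` in the usage example above.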
1 parent 5cc2a54 commit 8c36f5a

File tree

3 files changed: +11 −3 lines changed

CHANGELOG.rst

Lines changed: 1 addition & 0 deletions

```diff
@@ -13,6 +13,7 @@ NVIDIA Model Optimizer Changelog (Linux)
 - Add standalone type inference option (``--use_standalone_type_inference``) in ONNX AutoCast as an alternative to ONNX's ``infer_shapes``. This experimental feature performs type-only inference without shape inference, useful as a workaround when shape inference fails or to avoid unnecessary shape inference overhead.
 - Add support for Kimi K2 Thinking model quantization from the original int4 checkpoint.
 - Add support for ``params`` constraint based automatic neural architecture search in Minitron pruning (``mcore_minitron``) as an alternative to manual pruning (using ``export_config``). See `examples/pruning/README.md <https://github.com/NVIDIA/Model-Optimizer/tree/main/examples/pruning>`_ for more details on its usage.
+- Add support for calibration data with multiple samples in ``npz`` format in the ONNX Autocast workflow.
 
 0.41 (2026-01-19)
 ^^^^^^^^^^^^^^^^^
```

modelopt/onnx/autocast/referencerunner.py

Lines changed: 8 additions & 1 deletion

```diff
@@ -30,6 +30,7 @@
 import onnx
 
 from modelopt.onnx.autocast.logging_config import configure_logging, logger
+from modelopt.onnx.quantization.calib_utils import CalibrationDataProvider
 from modelopt.onnx.quantization.ort_utils import _prepare_ep_list
 
 configure_logging()
@@ -70,7 +71,13 @@ def _load_inputs_from_json(self, input_data_path):
 
     def _load_inputs_from_npz(self, input_data_path):
         """Load inputs from NPZ format."""
-        return [np.load(input_data_path)]
+        calib_data = np.load(input_data_path)
+
+        if isinstance(calib_data, np.lib.npyio.NpzFile):
+            # Wrap data into a CalibDataProvider to support a single NPZ file containing data from multiple batches
+            data_loader = {key: calib_data[key] for key in calib_data.files}
+            return CalibrationDataProvider(self.model, data_loader).calibration_data_list
+        return [calib_data]
 
     def _validate_inputs(self, data_loader):
         """Validate that input names and shapes match the model."""
```

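The diff above delegates the splitting to `CalibrationDataProvider`. Its effect can be sketched independently as a simplified stand-in (an assumption about the provider's behavior for illustration; the real class also handles shape parsing and validation), splitting each stacked input array into per-sample feed dicts:

```python
import numpy as np


def split_npz_batches(arrays: dict, batch_size: int = 1) -> list[dict]:
    """Split stacked calibration arrays into per-sample feed dicts.

    Simplified, hypothetical stand-in for what CalibrationDataProvider
    does with a multi-sample NPZ: each input of shape [N, ...] becomes
    N // batch_size dicts, each holding a [batch_size, ...] slice.
    """
    num_samples = next(iter(arrays.values())).shape[0] // batch_size
    return [
        {name: arr[i * batch_size : (i + 1) * batch_size] for name, arr in arrays.items()}
        for i in range(num_samples)
    ]
```

Each returned dict matches the model's input signature, so the reference runner can feed them one at a time, exactly as in the `[1, 3, 224, 224] * 10` example from the PR description.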
modelopt/onnx/quantization/calib_utils.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -38,7 +38,7 @@ class CalibrationDataProvider(CalibrationDataReader):
 
     def __init__(
         self,
-        onnx_path: str,
+        onnx_path: str | onnx.ModelProto,
         calibration_data: CalibrationDataType,
         calibration_shapes: str | None = None,
     ):
@@ -58,7 +58,7 @@ def __init__(
         logger.info("Setting up CalibrationDataProvider for calibration")
         # Tensor data is not required to generate the calibration data
        # So even if the model has external data, we don't need to load them here
-        onnx_model = onnx.load(onnx_path)
+        onnx_model = onnx.load(onnx_path) if isinstance(onnx_path, str) else onnx_path
         input_names = get_input_names(onnx_model)
         input_shapes = {} if calibration_shapes is None else parse_shapes_spec(calibration_shapes)
         inferred_input_shapes = get_input_shapes(onnx_model)
```

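The `calib_utils.py` change follows a common accept-path-or-object pattern: load only when given a path, otherwise use the already-loaded model directly. A generic sketch of the pattern, where `Model` and `load_from_disk` are hypothetical stand-ins for `onnx.ModelProto` and `onnx.load`:

```python
class Model:
    """Hypothetical stand-in for onnx.ModelProto."""

    def __init__(self, name: str):
        self.name = name


def load_from_disk(path: str) -> Model:
    """Hypothetical stand-in for onnx.load: deserialize a model from disk."""
    return Model(path)


def resolve_model(model_or_path: "str | Model") -> Model:
    # Mirrors the diff above: load from disk only when given a path string,
    # otherwise pass the pre-loaded model through without re-deserializing.
    return load_from_disk(model_or_path) if isinstance(model_or_path, str) else model_or_path
```

Accepting a pre-loaded model is what lets the reference runner in this PR pass `self.model` straight to `CalibrationDataProvider` without a redundant load from disk.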