Skip to content

fix(ml): use SafeInt checked arithmetic in ML operator coefficient size validation#28001

Merged
tianleiwu merged 7 commits intomainfrom
tlwu/20260407/fix_linear_classifier
Apr 17, 2026
Merged

fix(ml): use SafeInt checked arithmetic in ML operator coefficient size validation#28001
tianleiwu merged 7 commits intomainfrom
tlwu/20260407/fix_linear_classifier

Conversation

@tianleiwu
Copy link
Copy Markdown
Contributor

@tianleiwu tianleiwu commented Apr 7, 2026

Replaces unchecked static_cast<size_t> multiplications in coefficient/support-vector size guards with SafeInt<size_t> arithmetic. Without this, a crafted model with adversarial dimension values could cause the product to silently wrap around, bypassing the validation entirely before downstream GEMM/dot-product reads.

Also improves the SVMClassifier feature-count mismatch error to include expected feature_count, actual num_features, input_rank, and num_batches for easier diagnosis.

Changes

  • linearclassifier.ccSafeInt<size_t>(class_count_) * SafeInt<size_t>(num_features) with try/catch returning INVALID_ARGUMENT on overflow
  • linearregressor.cc — Same pattern for num_targets_ * num_features
  • svmclassifier.cc — SafeInt overflow protection for both the linear (class_count_ * num_features) and non-linear (vector_count_ * num_features) branches; descriptive error message on feature-count mismatch
  • svmclassifier_test.cc — Update assertion substring to match the new error message

Motivation and Context

Unchecked integer multiplication in size-validation guards can be bypassed by a crafted model whose dimension values wrap around size_t, effectively disabling the bounds check before out-of-bounds memory reads occur downstream. This hardens the existing validation added for LinearClassifier, LinearRegressor, SVMClassifier, and SVMRegressor.

@tianleiwu tianleiwu requested review from Copilot, vraspar and xadupre and removed request for vraspar and xadupre April 7, 2026 18:51
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds runtime shape/attribute validation for CPU ML operators so malformed model metadata (e.g., undersized coefficients/support vectors) is rejected early, preventing potential out-of-bounds reads. Also adds regression tests for the new validation paths.

Changes:

  • Add runtime checks for coefficient and support-vector buffer sizes based on runtime num_features for Linear* and SVM* operators.
  • Tighten SVMRegressor coefficient validation against n_supports and validate input rank before reading feature dimension.
  • Add negative tests covering invalid coefficient sizes and feature-count mismatches.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
onnxruntime/core/providers/cpu/ml/linearclassifier.cc Validates coefficients_ matches class_count_ * num_features at runtime.
onnxruntime/core/providers/cpu/ml/linearregressor.cc Validates coefficients_ matches num_targets_ * num_features before output/GEMM.
onnxruntime/core/providers/cpu/ml/svmclassifier.cc Validates input rank/features and validates attribute buffer sizes against runtime feature count.
onnxruntime/core/providers/cpu/ml/svmregressor.cc Tightens coefficients_ size check and validates input rank before feature extraction.
onnxruntime/test/providers/cpu/ml/linearclassifer_test.cc Adds failing LinearClassifier case for undersized coefficients.
onnxruntime/test/providers/cpu/ml/linearregressor_test.cc Adds failing LinearRegressor case for undersized coefficients.
onnxruntime/test/providers/cpu/ml/svmclassifier_test.cc Adds failing SVMClassifier case for input feature-count mismatch.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/core/providers/cpu/ml/linearclassifier.cc Outdated
Comment thread onnxruntime/core/providers/cpu/ml/linearregressor.cc Outdated
Comment thread onnxruntime/core/providers/cpu/ml/svmclassifier.cc Outdated
Comment thread onnxruntime/core/providers/cpu/ml/svmclassifier.cc Outdated
@tianleiwu
Copy link
Copy Markdown
Contributor Author

@copilot apply changes based on the comments in this thread

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 7, 2026

@copilot apply changes based on the comments in this thread

Applied all four review suggestions in commit 480b3d7:

  • linearclassifier.cc, linearregressor.cc, svmclassifier.cc (linear/SVM size checks): Replaced unchecked static_cast<size_t> multiplication with SafeInt<size_t> wrapped in try/catch(...), returning INVALID_ARGUMENT with a descriptive message on overflow.
  • svmclassifier.cc feature-count check: Expanded the error message to include expected feature_count, actual num_features, input_rank, and num_batches.
  • svmclassifier_test.cc: Updated SVMClassifierInvalidInputFeatureCount test to assert on the new "Invalid input for SVMClassifier" substring.

Note: catch (...) is used (matching the reviewer's own suggestion) because ORT's safeint.h overrides the SafeInt exception handler to throw OnnxRuntimeException — there is no standalone SafeIntException type in this codebase.

Copilot AI changed the title Validate ML model coefficient sizes fix(ml): use SafeInt checked arithmetic in ML operator coefficient size validation Apr 7, 2026
Copy link
Copy Markdown
Contributor

@vraspar vraspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SVMClassifier constructor still has unchecked overflow

The constructor computescoefficient and classifier counts with bare static_cast<size_t> multiplications:

const size_t expected_coefficients = static_cast<size_t>(class_count_ - 1) * static_cast<size_t>(vector_count_);
const size_t num_classifiers = static_cast<size_t>(class_count_) * static_cast<size_t>(class_count_ - 1) / 2;

In SVC mode, Compute relies on the constructor's validation of coefficients_, rho_, prob_a_, and prob_b_ sizes. The new SafeInt checks in Compute only cover support_vectors_ for SVC mode. If an attacker crafts class_count_ and vector_count_ values that overflow expected_coefficients to a small number, the constructor check passes, and Compute later does OOB reads on coefficients_.

Fix is straightforward -- wrap both multiplications in SafeInt the same way the Compute methods do it:

const size_t expected_coefficients = SafeInt<size_t>(class_count_ - 1) * static_cast<size_t>(vector_count_);
const size_t num_classifiers = SafeInt<size_t>(class_count_) * static_cast<size_t>(class_count_ - 1) / 2;

This needs to land in the same PR since the constructor path is part of the same attack surface.

No overflow regression tests

The new negative tests all check "wrong size" mismatches, which is useful but doesn't exercise the overflow path. That's the actual security scenario this PR is fixing. At least one test should use dimension values large enough to overflow size_t multiplication without requiring a huge allocation. For example, for LinearClassifier you could set coefficients to a small vector but set dimensions such that class_count * feature_count overflows to match that small size. The test passes if the operator rejects the model with an error instead of silently accepting the overflowed value.

Without this, we don't have regression coverage for the specific bug being fixed.

Minor notes

SVMRegressor still uses a generic "Invalid argument" message in its size check, while LinearClassifier, LinearRegressor, and SVMClassifier now have descriptive messages. Not blocking, but it would be nice to make it consistent in the same pass.

Separately, rank validation is inconsistent across the four operators. SVMClassifier/SVMRegressor check input_rank > 0 && input_rank <= 2, LinearClassifier only checks dims == 0, and LinearRegressor checks > 2 but not == 0. This is pre-existing and out of scope for this PR, just flagging it.

@tianleiwu tianleiwu force-pushed the tlwu/20260407/fix_linear_classifier branch from fa5b942 to cc19972 Compare April 13, 2026 20:52
@tianleiwu tianleiwu requested review from Copilot and vraspar April 13, 2026 20:53
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/test/providers/cpu/ml/svmclassifier_test.cc
Comment thread onnxruntime/core/providers/cpu/ml/linearclassifier.cc Outdated
Comment thread onnxruntime/core/providers/cpu/ml/linearregressor.cc Outdated
Comment thread onnxruntime/core/providers/cpu/ml/svmclassifier.cc
Comment thread onnxruntime/core/providers/cpu/ml/svmclassifier.cc
ONNX models may have more coefficients than needed for a given input
dimension (e.g. logreg_iris.onnx has 12 coefficients for 3 classes but
input has only 2 features). The safety check only needs to ensure the
coefficient array is large enough to prevent OOB reads, not exactly
equal.
- Replace try/catch SafeInt with SafeMultiply for builds where
  exceptions are disabled (minimal build).
- Fix wasm -Wshorten-64-to-32 by using SafeInt<size_t> for resize args.
- Add explicit safeint.h include.
Comment thread onnxruntime/core/providers/cpu/ml/svmclassifier.cc
Comment thread onnxruntime/test/providers/cpu/ml/svmclassifier_test.cc
@tianleiwu tianleiwu merged commit 7063e74 into main Apr 17, 2026
96 checks passed
@tianleiwu tianleiwu deleted the tlwu/20260407/fix_linear_classifier branch April 17, 2026 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants