
Fix integer overflows in tensor byte-size computations #19055

Merged
lucylq merged 1 commit into main from security-19 on Apr 23, 2026

Conversation

@lucylq
Contributor

@lucylq lucylq commented Apr 22, 2026

Three tensor-byte-size multiplications had no overflow check, letting a malicious PTE trigger wrap-to-small size_t values while kernels iterate on the un-wrapped element count, producing heap buffer overflows.

Fixed here:

  • extension/tensor/tensor_ptr.h: data.size() * elementSize(type) in make_tensor_ptr cast path.
  • extension/tensor/tensor_ptr_maker.cpp: compute_numel(...) * elementSize(type) in empty_strided.
  • runtime/core/tensor_layout.cpp: dim-product loop and final * elementSize(scalar_type) in calculate_nbytes; now returns Error::InvalidArgument on overflow since the function already returns Result<size_t>.

All guards use c10::mul_overflows, matching the existing pattern in MethodMeta::calculate_nbytes, the data loaders, and PlatformMemoryAllocator.

runtime/core/portable_type/tensor_impl.cpp is intentionally left alone in this branch; guarding the nbytes() / compute_numel multiplications there breaks internal callers and will be handled separately.

Authored with Claude.

@pytorch-bot

pytorch-bot Bot commented Apr 22, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19055

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 4 New Failures, 2 Unrelated Failures

As of commit a879332 with merge base 217ad45:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 22, 2026
@lucylq lucylq marked this pull request as ready for review April 22, 2026 22:01
Copilot AI review requested due to automatic review settings April 22, 2026 22:01
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Contributor

Copilot AI left a comment


Pull request overview

This PR hardens tensor byte-size computations against integer overflow (wrap-to-small) scenarios that could otherwise lead to under-allocation and subsequent heap buffer overflows when kernels iterate using the true element count.

Changes:

  • Add overflow-checked dimension product and * elementSize handling in TensorLayout byte-size calculation (returning Error::InvalidArgument on overflow).
  • Add c10::mul_overflows guards for numel * elementSize in empty_strided().
  • Add c10::mul_overflows guards for data.size() * elementSize when allocating cast buffers in make_tensor_ptr().

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
runtime/core/tensor_layout.cpp Uses c10::mul_overflows to safely compute nbytes and returns InvalidArgument on overflow.
extension/tensor/tensor_ptr_maker.cpp Adds overflow checking before allocating the backing std::vector<uint8_t> in empty_strided().
extension/tensor/tensor_ptr.h Adds overflow checking for cast-buffer allocation when casting vector-backed tensor data.


Comment on lines 117 to +121
if (type != deduced_type) {
  ET_CHECK_MSG(
      runtime::canCast(deduced_type, type),
      "Cannot cast deduced type to specified type.");
  std::vector<uint8_t> casted_data(data.size() * aten::elementSize(type));
  size_t casted_bytes = 0;

Copilot AI Apr 22, 2026


In executor (non-ATen) builds, executorch::aten::compute_numel() ultimately multiplies sizes using signed ssize_t arithmetic without overflow checks. That means the earlier data.size() == compute_numel(...) validation can itself invoke signed-overflow UB for large shapes, and may incorrectly succeed with a wrapped result. Since this block then allocates based on data.size(), you can end up with a tensor whose declared shape is inconsistent with its backing storage.

To make the new overflow-checked allocation meaningful, consider replacing the compute_numel(...) usage in this overload with a local, overflow-checked size_t product over sizes (similar to the mul_overflows loop used elsewhere), and compare against data.size() only after that succeeds.

Comment on lines +28 to +32
size_t next = 0;
if (c10::mul_overflows(n, static_cast<size_t>(sizes[i]), &next)) {
  return Error::InvalidArgument;
}
n = next;

Copilot AI Apr 22, 2026


The new overflow handling in calculate_nbytes() changes behavior to return Error::InvalidArgument on overflow, but runtime/core/test/tensor_layout_test.cpp currently has no coverage for this path. Add a unit test that uses sizes whose product (or product * elementSize) overflows size_t (e.g., {INT32_MAX, INT32_MAX} with ScalarType::Double on 64-bit) and assert TensorLayout::create() returns InvalidArgument.

Comment on lines +116 to +117
const auto numel = static_cast<size_t>(
    executorch::aten::compute_numel(sizes.data(), sizes.size()));

Copilot AI Apr 22, 2026


empty_strided() still relies on executorch::aten::compute_numel() (which returns ssize_t and, in executor mode, multiplies sizes using signed arithmetic without overflow checks). If the size product overflows inside compute_numel, that's undefined behavior and may yield a wrapped numel before you even reach the mul_overflows(numel, elem_size, ...) guard.

Consider computing numel locally using size_t + c10::mul_overflows over the sizes vector (and validating non-negative sizes) so the overflow is caught deterministically before any UB occurs, then multiply by elem_size as you do now.

Suggested change

-    const auto numel = static_cast<size_t>(
-        executorch::aten::compute_numel(sizes.data(), sizes.size()));
+    size_t numel = 1;
+    for (const auto dim : sizes) {
+      ET_CHECK_MSG(
+          dim >= 0,
+          "empty_strided requires non-negative sizes, got %zd",
+          static_cast<ssize_t>(dim));
+      size_t next_numel = 0;
+      ET_CHECK_MSG(
+          !c10::mul_overflows(
+              numel, static_cast<size_t>(dim), &next_numel),
+          "empty_strided size overflow while computing numel: %zu * %zu",
+          numel,
+          static_cast<size_t>(dim));
+      numel = next_numel;
+    }

@meta-codesync
Contributor

meta-codesync Bot commented Apr 22, 2026

@lucylq has imported this pull request. If you are a Meta employee, you can view this in D102058400.

executorch::aten::compute_numel(sizes.data(), sizes.size()) *
executorch::aten::elementSize(type));
const auto numel = static_cast<size_t>(
executorch::aten::compute_numel(sizes.data(), sizes.size()));
Contributor Author


need to use safe_numel here.

@lucylq lucylq merged commit 30f6b89 into main Apr 23, 2026
178 of 186 checks passed
@lucylq lucylq deleted the security-19 branch April 23, 2026 00:57
Gasoonjia pushed a commit that referenced this pull request Apr 23, 2026
zeel2104 pushed a commit to zeel2104/executorch that referenced this pull request May 5, 2026

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.


3 participants