Skip to content

Arm backend: Fix integer overflow in VGFBackend IO size computation#19256

Draft
lucylq wants to merge 1 commit intomainfrom
security27-fix-overflows-arm
Draft

Arm backend: Fix integer overflow in VGFBackend IO size computation#19256
lucylq wants to merge 1 commit intomainfrom
security27-fix-overflows-arm

Conversation

@lucylq
Copy link
Copy Markdown
Contributor

@lucylq lucylq commented May 1, 2026

Replace std::accumulate with std::multiplies<>() with an explicit loop using c10::mul_overflows() to detect overflow before each multiplication. The previous code would silently wrap on overflow, producing an undersized memcpy size that could lead to out-of-bounds reads/writes when copying tensor data to/from Vulkan device memory.

Also reject negative dimensions before casting to size_t.

This PR was authored with the assistance of Claude.

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

Replace std::accumulate with std::multiplies<>() with an explicit loop
using c10::mul_overflows() to detect overflow before each multiplication.
The previous code would silently wrap on overflow, producing an
undersized memcpy size that could lead to out-of-bounds reads/writes
when copying tensor data to/from Vulkan device memory.

Also reject negative dimensions before casting to size_t.

Addresses TOB-EXECUTORCH-27.

This PR was authored with the assistance of Claude.
@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented May 1, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19256

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Unrelated Failures

As of commit aa1f963 with merge base 5e8a0df (image):

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 1, 2026
@github-actions github-actions Bot added ciflow/trunk module: arm Issues related to arm backend labels May 1, 2026
@lucylq
Copy link
Copy Markdown
Contributor Author

lucylq commented May 1, 2026

@claude review

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@claude
Copy link
Copy Markdown

claude Bot commented May 1, 2026

Claude finished @lucylq's task in 1m 6s —— View job


Code Review: Fix integer overflow in VGFBackend IO size computation

  • Read the changed file and understand the context
  • Analyze the overflow fix implementation
  • Post review feedback

Overall: This is a solid security fix. The approach of using c10::mul_overflows with an explicit loop correctly detects overflow before each multiplication, and rejecting negative dimensions prevents signed-to-unsigned conversion issues. The error messages are clear and actionable.

Minor issues:

  1. <numeric> is now unused (backends/arm/runtime/VGFBackend.cpp:10). After removing std::accumulate, this include is dead. Consider removing it.
    Fix this →

  2. Duplicated logic for input/output loops (lines 195–206 and 240–251). The two size-computation blocks are identical. This is fine for a security fix PR — extracting a helper would change the scope. Just noting for future reference.

  3. Edge case: empty io->size vector (scalar tensor). If io->size is empty, the loop body never executes and io_size remains io->elt_size. This is correct — a scalar tensor should copy exactly one element — so no issue here, just confirming the behavior is intentional.

  4. Edge case: zero dimension. If any dimension is 0, io_size becomes 0, and memcpy(..., 0) is a no-op. This is well-defined behavior, so no concern.

Looks good to merge. The only actionable item is the stale <numeric> include — trivial to clean up now or later.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. module: arm Issues related to arm backend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant