Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19057

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV: There is 1 currently active SEV. If your PR is affected, please view it below.

❌ 12 New Failures, 6 Unrelated Failures as of commit d1d4471 with merge base 2d53535:
- NEW FAILURES: The following jobs have failed.
- FLAKY: The following jobs failed but were likely due to flakiness present on trunk.
- BROKEN TRUNK: The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR improves integer overflow safety across ExecuTorch runtime/components by replacing potentially overflowing additions with checked arithmetic using c10::add_overflows.
Changes:
- Add overflow-safe end-offset computations in core runtime helpers (`HierarchicalAllocator`, `ArrayRef`).
- Harden flat_tensor segment validation against addition overflow.
- Add overflow-checked argument-count and sequence-length validations in CUDA/Metal backends and Qualcomm llama runner code.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| runtime/core/hierarchical_allocator.h | Uses c10::add_overflows to safely compute offset_bytes + size_bytes before bounds checks. |
| runtime/core/array_ref.h | Prevents N + M overflow in ArrayRef::slice bounds validation. |
| extension/flat_tensor/flat_tensor_data_map.cpp | Validates segment (offset + size) with overflow detection before comparing to segment end. |
| examples/qualcomm/oss_scripts/llama/runner/runner.cpp | Adds overflow-checked cur_pos_ + num_prompt_tokens validation vs seq_len. |
| examples/qualcomm/oss_scripts/llama/runner/prompt_processor.cpp | Adds overflow-checked start_pos + num_prompt_tokens validation vs context limits. |
| backends/cuda/runtime/cuda_backend.cpp | Prevents overflow in n_inputs + n_outputs when validating args.size(). |
| backends/apple/metal/runtime/metal_backend.cpp | Prevents overflow in n_inputs + n_outputs when validating args.size(). |
```diff
+  int64_t end_pos = 0;
   ET_CHECK_MSG(
-      cur_pos_ + num_prompt_tokens < seq_len,
+      !c10::add_overflows(
+          cur_pos_, static_cast<int64_t>(num_prompt_tokens), &end_pos) &&
+          end_pos < static_cast<int64_t>(seq_len),
```
On MSVC, c10::add_overflows' fallback implementation performs a + b for signed types, which can itself invoke signed-overflow UB before the overflow check triggers. Since cur_pos_/end_pos are int64_t, this reintroduces a platform-specific UB risk. Consider switching this check to unsigned arithmetic (e.g., validate cur_pos_ >= 0, cast operands to uint64_t/size_t, then use add_overflows) or use an explicit bounds check like cur_pos_ <= seq_len - num_prompt_tokens after normalizing types.
```diff
+  int64_t end_pos = 0;
   ET_CHECK_MSG(
-      (start_pos + num_prompt_tokens) <=
-          (metadata_.context_len - metadata_.ar_len),
+      !c10::add_overflows(
+          start_pos, static_cast<int64_t>(num_prompt_tokens), &end_pos) &&
+          end_pos <= static_cast<int64_t>(metadata_.context_len) -
+              static_cast<int64_t>(metadata_.ar_len),
       "The sequence length exceeds the maximum limit that the prompt processor can handle.");
```
This uses c10::add_overflows with signed int64_t operands. In this repo's MSVC fallback implementation, add_overflows computes a + b for signed types, which can trigger signed-overflow UB before detection. To keep this check safe cross-platform, prefer unsigned arithmetic (after asserting start_pos >= 0) or rewrite as a non-overflowing bounds check using wider/unsigned types.
```diff
+  uint64_t seg_end = 0;
   ET_CHECK_OR_RETURN_ERROR(
-      (segments->Get(segment_index)->offset() +
-          segments->Get(segment_index)->size()) <= segment_end_offset,
+      !c10::add_overflows(
+          static_cast<uint64_t>(segments->Get(segment_index)->offset()),
+          static_cast<uint64_t>(segments->Get(segment_index)->size()),
+          &seg_end) &&
+          seg_end <= static_cast<uint64_t>(segment_end_offset),
       InvalidExternalData,
```
segment_end_offset is a size_t, but the value passed in is header_.segment_base_offset + header_.segment_data_size, computed in uint64_t. On 32-bit builds the conversion to size_t can truncate, making the bounds check ineffective for large offsets even though seg_end is computed in uint64_t. Consider changing segment_end_offset (and the related computations) to uint64_t throughout this helper to avoid truncation and remove the need for casts.
Summary
Check various overflows