Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18724
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ❌ 3 New Failures, 2 Unrelated FailuresAs of commit eba5644 with merge base 3d2c853 ( NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but was present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
6be51fa to
c8339db
Compare
There was a problem hiding this comment.
Pull request overview
This PR strengthens runtime validation of serialized method metadata to fail fast on malformed programs, focusing on tensor shape sizing and memory planning buffer sizes.
Changes:
- Add validation to reject negative tensor dimension sizes during byte-size calculation.
- Add null checks when extracting tensor metadata for inputs/outputs.
- Reject negative memory-planned buffer sizes and add required formatting support (
<cinttypes>forPRId64).
Comments suppressed due to low confidence (3)
runtime/executor/method_meta.cpp:111
calculate_nbytes()can now returnInvalidProgramfor negative sizes, butTensorInfo::create()converts any failure intoInvalidArgumentand logs only a generic message. This loses the specific error kind and makes invalid programs look like caller misuse; consider returningnbytes.error()(and/or forwarding the detailed log message) instead of hardcodingInvalidArgumenthere.
auto nbytes = calculate_nbytes(sizes, scalar_type);
ET_CHECK_OR_RETURN_ERROR(
nbytes.ok(),
InvalidArgument,
"Failed to calculate nbytes for TensorInfo");
runtime/executor/method_meta.cpp:211
- Even with the new
tensor_value != nullptrguard, this code still dereferencestensor_value->sizes()andtensor_value->dim_order()without checking those vectors are present. A malformed/partial flatbuffer can have these fields unset (nullptr), which would crash beforeTensorInfo::create()runs. Consider validatingtensor_value->sizes()/dim_order()are non-null before calling->data()/->size().
return TensorInfo::create(
Span<const int32_t>(
tensor_value->sizes()->data(), tensor_value->sizes()->size()),
Span<const uint8_t>(
tensor_value->dim_order()->data(), tensor_value->dim_order()->size()),
static_cast<executorch::aten::ScalarType>(tensor_value->scalar_type()),
runtime/executor/method_meta.cpp:267
- Same null-dereference risk as inputs:
tensor_value->sizes()/tensor_value->dim_order()are used without checking for nullptr. Please validate these flatbuffer vectors exist before dereferencing so malformed programs fail gracefully instead of crashing.
return TensorInfo::create(
Span<const int32_t>(
tensor_value->sizes()->data(), tensor_value->sizes()->size()),
Span<const uint8_t>(
tensor_value->dim_order()->data(), tensor_value->dim_order()->size()),
static_cast<executorch::aten::ScalarType>(tensor_value->scalar_type()),
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ET_CHECK_OR_RETURN_ERROR( | ||
| sizes[i] >= 0, | ||
| InvalidProgram, | ||
| "Invalid size[%zu]: %d. Size must not be negative", | ||
| i, | ||
| sizes[i]); |
There was a problem hiding this comment.
Format string uses %d for int32_t (sizes[i]). In this codebase, PRId32 is used for int32_t to avoid type/format mismatches across platforms; update this (and the other %d occurrences in calculate_nbytes) to use "%" PRId32.
| int64_t size = s_plan_->non_const_buffer_sizes()->Get(index + 1); | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| size >= 0, | ||
| InvalidProgram, | ||
| "memory_planned_buffer_size(%zu) has invalid negative size: %" PRId64, | ||
| index, | ||
| size); | ||
| return size; |
There was a problem hiding this comment.
New validation paths (negative dimension sizes, null tensor_value, negative memory-planned buffer sizes) don’t appear to have dedicated tests. Since there are already gtests for MethodMeta/TensorInfo, please add coverage that asserts the expected Error codes (and that we don’t crash) for these invalid metadata cases.
| tensor_value->dim_order() != nullptr, | ||
| InvalidProgram, | ||
| "Null tensor metadata for input %zu", | ||
| index); |
There was a problem hiding this comment.
Note: i'd like to move these to program_validation later. Currently program_validation is only run when Verification::InternalConsistency is set, though.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ET_CHECK_OR_RETURN_ERROR( | ||
| sizes[i] >= 0, | ||
| InvalidProgram, | ||
| "Invalid size[%zu]: %d. Size must not be negative", | ||
| i, | ||
| sizes[i]); |
There was a problem hiding this comment.
The new negative-dimension check returns Error::InvalidProgram from calculate_nbytes(), but TensorInfo::create() currently converts any calculate_nbytes() failure into Error::InvalidArgument and logs a second generic message. If callers are expected to distinguish invalid programs from invalid arguments (and to avoid double-logging), propagate the underlying error from calculate_nbytes() instead of overriding it in TensorInfo::create().
| auto input_index = s_plan_->inputs()->Get(index); | ||
| // input_index was already validated by input_tag(). | ||
| auto tensor_value = s_plan_->values()->Get(input_index)->val_as_Tensor(); | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| tensor_value != nullptr && tensor_value->sizes() != nullptr && |
There was a problem hiding this comment.
input_tensor_meta() still dereferences s_plan_->values()->Get(input_index) before the new null checks. Flatbuffer vectors can contain null table offsets, so an invalid program could still crash here. Add a null check for the EValue pointer (and return InvalidProgram) before calling ->val_as_Tensor().
| auto output_index = s_plan_->outputs()->Get(index); | ||
| // output_index was already validated by output_tag(). | ||
| auto tensor_value = s_plan_->values()->Get(output_index)->val_as_Tensor(); | ||
|
|
||
| ET_CHECK_OR_RETURN_ERROR( | ||
| tensor_value != nullptr && tensor_value->sizes() != nullptr && |
There was a problem hiding this comment.
output_tensor_meta() still dereferences s_plan_->values()->Get(output_index) before the new null checks. If the values vector contains a null EValue offset, this will crash before we can return InvalidProgram. Add a null check for the EValue pointer prior to calling ->val_as_Tensor().
| int64_t size = s_plan_->non_const_buffer_sizes()->Get(index + 1); | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| size >= 0, | ||
| InvalidProgram, | ||
| "memory_planned_buffer_size(%zu) has invalid negative size: %" PRId64, |
There was a problem hiding this comment.
These new validation paths (negative tensor sizes, null tensor metadata, and negative non_const_buffer_sizes entries) are not currently covered by tests. Please add unit tests that construct/patch an invalid ExecutionPlan and assert the expected Error codes for input/output/attribute meta access and memory_planned_buffer_size().
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ET_CHECK_OR_RETURN_ERROR( | ||
| sizes[i] >= 0, | ||
| InvalidProgram, | ||
| "Invalid size[%zu]: %d. Size must not be negative", | ||
| i, | ||
| sizes[i]); |
There was a problem hiding this comment.
sizes[i] is an int32_t, but this log uses %d for printing it. That’s not guaranteed to be correct for int32_t across platforms (printf type mismatch is UB if int32_t isn’t int). Prefer using the <cinttypes> macros (e.g., PRId32) and cast to int32_t to make the formatting portable (similar to other runtime/executor code).
| auto value = values->Get(i); | ||
| if (value->val_type() == executorch_flatbuffer::KernelTypes::Tensor) { | ||
| auto tensor_value = value->val_as_Tensor(); | ||
| if (tensor_value->extra_tensor_info() != nullptr && | ||
| if (tensor_value != nullptr && | ||
| tensor_value->extra_tensor_info() != nullptr && |
There was a problem hiding this comment.
values->Get(i) can be nullptr (other code paths explicitly validate serialization_value != nullptr before dereferencing). In both num_attributes() and attribute_tensor_meta(), value->val_type() is called without checking value, which can still crash on corrupt/partially-validated flatbuffers. Consider guarding with if (value == nullptr) continue; (or an ET_CHECK_OR_RETURN_ERROR in the Result-returning function) before accessing val_type()/val_as_Tensor().
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| auto serialization_value = s_plan_->values()->Get(input_index); | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| serialization_value != nullptr, | ||
| InvalidProgram, | ||
| "Null value for input %zu at index %d", |
There was a problem hiding this comment.
The new serialization_value != nullptr check is too late to prevent crashes on malformed flatbuffers: input_tensor_meta() calls input_tag() first, and input_tag() dereferences s_plan_->values()->Get(input_index) without a null check. Consider moving the null check into input_tag() / get_tag() (and/or validating s_plan_->values() is non-null) so a null entry is reported as InvalidProgram instead of segfaulting.
| auto serialization_value = s_plan_->values()->Get(output_index); | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| serialization_value != nullptr, | ||
| InvalidProgram, | ||
| "Null value for output %zu at index %d", |
There was a problem hiding this comment.
Same issue as inputs: output_tensor_meta() calls output_tag() before this null check, but output_tag() dereferences s_plan_->values()->Get(output_index) without validating the entry is non-null. To reliably catch corrupted flatbuffers, add the null check in output_tag() / get_tag() (and/or validate s_plan_->values() is non-null) before any dereference.
| for (size_t i = 0; i < values->size(); ++i) { | ||
| auto value = values->Get(i); | ||
| if (value->val_type() == executorch_flatbuffer::KernelTypes::Tensor) { | ||
| auto tensor_value = value->val_as_Tensor(); | ||
| if (tensor_value->extra_tensor_info() != nullptr && | ||
| if (tensor_value != nullptr && |
There was a problem hiding this comment.
attribute_tensor_meta() also dereferences value without a null check (value->val_type()), which can segfault if ExecutionPlan.values contains null table offsets. Add values != nullptr / value != nullptr validation (or skip null entries) and return InvalidProgram for corrupted metadata.
| int64_t size = s_plan_->non_const_buffer_sizes()->Get(index + 1); | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| size >= 0, | ||
| InvalidProgram, | ||
| "memory_planned_buffer_size(%zu) has invalid negative size: %" PRId64, |
There was a problem hiding this comment.
Add test coverage for the new negative-size rejection in memory_planned_buffer_size(): a unit test can mutate non_const_buffer_sizes[index+1] to a negative value and verify the API returns InvalidProgram (and does not crash).
…ethodMeta memory_planned_buffer_size() returned buffer sizes from the PTE FlatBuffer without validation. Negative int64_t values would become huge size_t values when used by callers for allocation, causing OOM/DoS. - Reject negative values from memory_planned_buffer_size() with Error::InvalidProgram - Reject negative tensor dimension sizes in calculate_nbytes() before the implicit cast to size_t, preventing wraparound to huge values - Add null-pointer checks on tensor metadata (tensor_value, sizes, dim_order) in input_tensor_meta() and output_tensor_meta() to guard against malformed PTE files This PR was authored with the assistance of Claude.
JacobSzwejbka
left a comment
There was a problem hiding this comment.
Would be ideal to just front load as much value checking as possible in program creation.
- Reject negative memory planning values - Reject negative tensor dimension sizes in calculate_nbytes() - Add null-pointer checks before TensorInfo::create to catch invalid tensor metadata in the flatbuffer. This PR was authored with the assistance of Claude. Co-authored-by: Github Executorch <github_executorch@arm.com>
This PR was authored with the assistance of Claude.