Xnnpack flatbuffer: check non-null and num_dims matches array size in defineTensor#18896
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18896
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 11 Pending — as of commit 8d6ad8a with merge base 799bf5a.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@claude review
This PR needs a
Claude finished @lucylq's task in 1m 33s — View job

**Review of PR #18896**

This PR adds input validation to prevent heap buffer overflows (TOB-EXECUTORCH-39, -42) from malicious flatbuffer payloads. The approach is sound — validating at deserialization time before passing untrusted dimension values to XNNPACK APIs. Here's a detailed review:

**Looks Good**

- Null check in
- Combined null check in
- Error propagation at all call sites: All callers (

**Issues**

1.
2. `constexpr uint32_t kMaxDimensionValue = 1 << 24; // 16M per dimension` — The literal
3. No total-elements overflow check (suggestion, not a blocker): The per-dimension cap of 16M prevents any single dimension from being absurdly large, but a tensor with dimensions

**Summary**

The PR is a clear security improvement. It correctly validates deserialized flatbuffer tensor data before it reaches XNNPACK APIs, addressing heap buffer overflow risks from crafted inputs. The
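The total-elements overflow check suggested in point 3 can be sketched standalone. `totalElementsOk` and its default cap are hypothetical (not from the PR), and a plain `std::vector<uint32_t>` stands in for the flatbuffers dims vector:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reject shapes whose total element count would
// overflow (or exceed a chosen cap) when the dimensions are multiplied.
// The cap value here is an illustrative assumption, not from the PR.
bool totalElementsOk(const std::vector<uint32_t>& dims,
                     size_t max_elements = size_t{1} << 32) {
  size_t total = 1;
  for (uint32_t d : dims) {
    // Division-based guard avoids overflowing `total * d` itself.
    if (d != 0 && total > max_elements / d) {
      return false;
    }
    total *= d;
  }
  return total <= max_elements;
}
```

The division-based guard is the standard way to detect multiplication overflow without triggering it; individually modest dimensions (e.g. several 16M-sized axes) still fail the product check.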
addressed claude's comments
Pull request overview
This PR hardens the XNNPACK runtime flatbuffer deserialization path by validating shape vectors before passing raw pointers into XNNPACK APIs, aiming to prevent heap buffer overflows and unbounded allocations from malformed or malicious programs.
Changes:
- Change `flatbufferDimsToVector` to return `Result<std::vector<T>>`, add NULL-checking, and validate each dimension value against a max limit.
- Add validation in `defineTensor` to ensure dims exist and `num_dims` matches the flatbuffer vector length, and propagate `Result` errors to callers.
- Update several node-definition helpers (transpose/reshape/slice/pad) to handle `Result` from `flatbufferDimsToVector`.
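The null-check-then-widen pattern the first bullet describes can be sketched in a self-contained way. Here a hypothetical `dimsToVector` uses `std::optional` in place of ExecuTorch's `Result<>` and a plain `std::vector<uint32_t>` pointer in place of `flatbuffers::Vector<uint32_t>`:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Sketch of the validation pattern. std::optional stands in for
// ExecuTorch's Result<>, and std::vector<uint32_t> for the flatbuffers
// vector, purely to keep the example self-contained.
std::optional<std::vector<size_t>> dimsToVector(
    const std::vector<uint32_t>* fb_dims) {
  if (fb_dims == nullptr) {
    return std::nullopt;  // the real code returns an InvalidProgram error
  }
  std::vector<size_t> dims;
  dims.reserve(fb_dims->size());
  for (uint32_t d : *fb_dims) {
    dims.push_back(static_cast<size_t>(d));  // widen uint32_t -> size_t
  }
  return dims;
}
```

Returning a result type rather than a bare vector forces every caller to handle the malformed-input case explicitly, which is why the PR touches all the node-definition helpers.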
Comments suppressed due to low confidence (3)
backends/xnnpack/runtime/XNNCompiler.cpp:1160
`xnn_define_static_reshape` is called with `graph_node->num_dims()` but `new_shape()->size()` is not validated against `num_dims`. A malformed flatbuffer could cause XNNPACK to read past `dims_data`.
Please add `new_shape != nullptr` and `new_shape->size() == num_dims` validation (and a reasonable `num_dims` cap) before the call.
```cpp
auto graph_node = node->xnode_union_as_XNNStaticReshape();

// Get tensor dims, we need to convert the uint32_t* to size_t*
auto dims_result = flatbufferDimsToVector(graph_node->new_shape());
if (!dims_result.ok()) {
  return dims_result.error();
}
std::vector<size_t> dims_data = std::move(dims_result.get());

xnn_status status = xnn_define_static_reshape(
    subgraph_ptr,
    graph_node->num_dims(),
    dims_data.data(),
    remapped_ids.at(graph_node->input_id()),
```
`xnn_define_static_slice` is called with `graph_node->num_dims()` but there is no validation that `offsets()->size()` and `sizes()->size()` both equal `num_dims`. If either vector is shorter, XNNPACK will read past `offsets`/`sizes`.
Add checks for non-null vectors, matching sizes, and a reasonable `num_dims` cap before calling XNNPACK.
```cpp
auto graph_node = node->xnode_union_as_XNNStaticSlice();

auto offsets_result = flatbufferDimsToVector(graph_node->offsets());
if (!offsets_result.ok()) {
  return offsets_result.error();
}
std::vector<size_t> offsets = std::move(offsets_result.get());

auto sizes_result = flatbufferDimsToVector(graph_node->sizes());
if (!sizes_result.ok()) {
  return sizes_result.error();
}
std::vector<size_t> sizes = std::move(sizes_result.get());

xnn_status status = xnn_define_static_slice(
    subgraph_ptr,
    graph_node->num_dims(),
    offsets.data(),
    sizes.data(),
    remapped_ids.at(graph_node->input_id()),
```
backends/xnnpack/runtime/XNNCompiler.cpp:1004
`xnn_define_static_transpose` is called with `graph_node->num_dims()` but there is no validation that `perm()->size()` matches `num_dims`. If `num_dims` is larger than the perm vector length, XNNPACK will read past `dims_data`.
Add a check that `perm != nullptr` and `perm->size() == num_dims` (and ideally `num_dims <= XNN_MAX_TENSOR_DIMS`) before calling into XNNPACK.
```cpp
auto graph_node = node->xnode_union_as_XNNStaticTranspose();

// Get tensor dims, we need to convert the uint32_t* to size_t*
auto dims_result = flatbufferDimsToVector(graph_node->perm());
if (!dims_result.ok()) {
  return dims_result.error();
}
std::vector<size_t> dims_data = std::move(dims_result.get());

xnn_status status = xnn_define_static_transpose(
    subgraph_ptr,
    graph_node->num_dims(),
    dims_data.data(),
    remapped_ids.at(graph_node->input_id()),
```
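The pre-call validation this comment asks for can be sketched as a standalone predicate. `permValid` is a hypothetical helper, and `kMaxTensorDims` merely stands in for XNNPACK's `XNN_MAX_TENSOR_DIMS` (an assumed value here; real code should use the constant from `xnnpack.h`):

```cpp
#include <cstdint>
#include <vector>

// Assumed stand-in for XNN_MAX_TENSOR_DIMS from <xnnpack.h>.
constexpr uint32_t kMaxTensorDims = 6;

// Hypothetical predicate for the suggested check: perm must exist, its
// length must equal the rank the node claims, and the rank must be
// within XNNPACK's supported maximum.
bool permValid(const std::vector<uint32_t>* perm, uint32_t num_dims) {
  return perm != nullptr && perm->size() == num_dims &&
         num_dims <= kMaxTensorDims;
}
```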
Converts dims from uint32 to size_t. Takes in a flatbuffer vector of uint32_t and returns a std::vector of size_t. XNNPACK takes in dims of size_t* but tensor shape is serialized in flatbuffer as int32_t. As a result, we need to static cast the shapes to size_t. Individual dimension values are validated to prevent unbounded memory allocation from malicious inputs.
```cpp
ET_CHECK_OR_RETURN_ERROR(
    fb_dims != nullptr,
    InvalidProgram,
    "flatbufferDimsToVector: dims vector is null");
std::vector<T> dims_data;
dims_data.reserve(fb_dims->size());
for (auto fb_dim : *fb_dims) {
```
Clamping max elements at 16 million makes me a little bit nervous. Flattening large tensors isn't uncommon, and if a graph break happened in between the flatten call and whatever uses it, it could break existing PTEs. How critical is this check?
Force-pushed 4a24c25 to b5faf2f
Pull request overview
This PR aims to harden XNNPACK flatbuffer deserialization in XNNCompiler by validating tensor dimension metadata and making flatbufferDimsToVector return a Result<> to propagate validation failures.
Changes:
- Changed `flatbufferDimsToVector` to return `Result<std::vector<T>>` and added a null-check for the incoming flatbuffer vector.
- Added validation in `defineTensor` that `dims` is non-null and that `num_dims` matches `dims()->size()`.
- Updated multiple node builders to handle `flatbufferDimsToVector` failures and return the corresponding `Error`.
Comments suppressed due to low confidence (4)
backends/xnnpack/runtime/XNNCompiler.cpp:171
`flatbufferDimsToVector` currently only null-checks and casts dimensions. The PR description mentions per-dimension validation (e.g., enforcing a 16M max) to prevent unbounded allocations/overflows from malicious dimension values, but that validation isn't implemented here. Add bounds checks (and ideally a safe product/overflow check where total element count is derived) before pushing dims into the vector.
```cpp
std::vector<T> flatbufferDimsToVector(
    const flatbuffers::Vector<uint32_t>* fb_dims) {
  std::vector<T> dims_data;
  dims_data.reserve(fb_dims->size());
  for (auto fb_dim : *fb_dims) {
    dims_data.push_back(static_cast<T>(fb_dim));
  }
  return dims_data;
}

/**
Gets the constant data pointer associated with the given tensor value.
```
backends/xnnpack/runtime/XNNCompiler.cpp:996
`xnn_define_static_transpose` consumes `num_dims` elements from `perm`, but the code doesn't validate that `graph_node->perm()->size()` matches `graph_node->num_dims()`. A malformed flatbuffer could set `num_dims` larger than the perm vector, leading to out-of-bounds reads. Add a check that the converted `dims_data.size()` equals `graph_node->num_dims()` (and handle NULL perm appropriately).
```cpp
    graph_node->num_dims(),
    dims_data.data(),
    remapped_ids.at(graph_node->input_id()),
    remapped_ids.at(graph_node->output_id()),
    graph_node->flags());
ET_CHECK_OR_RETURN_ERROR(
    status == xnn_status_success,
    Internal,
    "Failed to create static transpose node %i with code: %s",
    node->debug_handle(),
    xnn_status_to_string(status));
return Error::Ok;
}
```
backends/xnnpack/runtime/XNNCompiler.cpp:1152
`xnn_define_static_reshape` uses `graph_node->num_dims()` to read from `new_shape`, but there is no validation that `new_shape()->size()` equals `num_dims`. If `num_dims` is larger than the vector, this can cause out-of-bounds reads. Add a check that `dims_data.size() == graph_node->num_dims()` before calling into XNNPACK.
```cpp
    dims_data.data(),
    remapped_ids.at(graph_node->input_id()),
    remapped_ids.at(graph_node->output_id()),
    graph_node->flags());
ET_CHECK_OR_RETURN_ERROR(
    status == xnn_status_success,
    Internal,
    "Failed to create squeeze node %i with code: %s",
    node->debug_handle(),
    xnn_status_to_string(status));
return Error::Ok;
}
```
backends/xnnpack/runtime/XNNCompiler.cpp:1457
`xnn_define_static_slice` reads `num_dims` elements from both `offsets` and `sizes`, but the code does not validate that `offsets()->size()` and `sizes()->size()` match `graph_node->num_dims()`. A malformed flatbuffer could set `num_dims` larger than these vectors and trigger out-of-bounds reads. Add checks that both converted vectors have length `num_dims` before calling XNNPACK.
```cpp
    offsets.data(),
    sizes.data(),
    remapped_ids.at(graph_node->input_id()),
    remapped_ids.at(graph_node->output_id()),
    graph_node->flags());
ET_CHECK_OR_RETURN_ERROR(
    status == xnn_status_success,
    Internal,
    "Failed to create static slice node %i with code: %s",
    node->debug_handle(),
    xnn_status_to_string(status));
return Error::Ok;
}
```
```cpp
/*
Defines batch matrix multiply node into the subgraph,
using the remapped ids to map the serialized ids,
```
@claude review
Claude finished @lucylq's task in 1m 39s — View job

**Review of PR #18896 (re-review after updates)**

This re-review covers the current state after lucylq addressed previous feedback (the

**What Changed (current diff)**

The PR now makes two categories of changes:

**Issues**

1. Scalar tensor rejection (bug)

```cpp
ET_CHECK_OR_RETURN_ERROR(
    tensor_value != nullptr && tensor_value->dims() != nullptr,
    InvalidProgram,
    "Deserialized tensor is null, or tensor dims is null");
```

The flatbuffer schema explicitly states: "If num_dims is 0, this pointer can be NULL." This check rejects scalar tensors (0-dimensional) where

Fix: allow

```cpp
ET_CHECK_OR_RETURN_ERROR(
    tensor_value != nullptr,
    InvalidProgram,
    "Deserialized tensor is null");
ET_CHECK_OR_RETURN_ERROR(
    tensor_value->num_dims() == 0 || tensor_value->dims() != nullptr,
    InvalidProgram,
    "Tensor dims is null but num_dims is %u",
    tensor_value->num_dims());
```

And guard the

```cpp
if (tensor_value->dims() != nullptr) {
  ET_CHECK_OR_RETURN_ERROR(
      tensor_value->num_dims() == tensor_value->dims()->size(),
      InvalidProgram,
      "Tensor num_dims %u does not match dims array size %u",
      tensor_value->num_dims(),
      tensor_value->dims()->size());
}
```

2. The

Adding

**Re: GregoryComer's concern about 16M limit**

The per-dimension

**Looks Good**

**Summary**

The structural validation approach is correct and simpler than the previous
@claude review
Claude finished @lucylq's task in 1m 13s — View job

**Review of PR #18896 (re-review after updates)**

This is the third review, covering the current state after the

**Previous Issues — Status**

All three issues from previous reviews have been addressed. The PR is significantly improved.

**Remaining Issue**

Null pointer dereference for scalar tensors in

```cpp
// Line 307
std::vector<size_t> dims_data = flatbufferDimsToVector(tensor_value->dims());

dims_data.reserve(fb_dims->size());  // line 163 — null deref if fb_dims is nullptr
for (auto fb_dim : *fb_dims) {       // line 164 — also null deref
```

So a valid scalar tensor with

```cpp
std::vector<size_t> dims_data;
if (tensor_value->dims() != nullptr) {
  dims_data = flatbufferDimsToVector(tensor_value->dims());
}
```

**Looks Good**

**Minor Nit**

The comment at

**Summary**

The PR addresses the scalar tensor issue, removes the controversial
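The scalar-tensor guard this review converges on can be captured as a small standalone predicate, with a plain `std::vector<uint32_t>` pointer standing in for the flatbuffers dims vector (`tensorDimsValid` is a hypothetical name, not code from the PR):

```cpp
#include <cstdint>
#include <vector>

// Mirrors the scalar-tensor fix discussed above: a null dims pointer is
// legal only for a 0-dimensional (scalar) tensor, and a non-null one
// must match the claimed num_dims.
bool tensorDimsValid(const std::vector<uint32_t>* dims, uint32_t num_dims) {
  if (dims == nullptr) {
    return num_dims == 0;  // schema: "If num_dims is 0, this pointer can be NULL"
  }
  return dims->size() == num_dims;
}
```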
Pull request overview
This PR strengthens XNNPACK FlatBuffer deserialization/graph-definition validation in XNNCompiler.cpp to better reject malformed programs and prevent potential out-of-bounds access when interpreting dimension/parameter arrays.
Changes:
- Add validation for `XNNTensorValue` nullness and dims pointer/length consistency in `defineTensor`.
- Add null checks for several node parameter vectors (e.g., transpose perm, reshape new_shape, slice offsets/sizes).
- Prefer using actual vector sizes (converted to `std::vector<size_t>`) when passing ranks to XNNPACK APIs.
Comments suppressed due to low confidence (2)
backends/xnnpack/runtime/XNNCompiler.cpp:1065
`xnn_define_static_constant_pad` is called with `pre_paddings_dims.data()`/`post_paddings_dims.data()` but no explicit length; this makes the runtime rely on the vectors being the correct length for the input tensor rank. In addition to non-null checks, validate that `pre_paddings_dims.size() == post_paddings_dims.size()` and that this size matches the input tensor's `num_dims` (can be obtained by looking up the `XNNTensorValue` with `id_out == graph_node->input_id()`, similar to `isQP8`).
```cpp
ET_CHECK_OR_RETURN_ERROR(
    graph_node->pre_paddings() != nullptr &&
        graph_node->post_paddings() != nullptr,
    InvalidProgram,
    "StaticConstantPad: pre_paddings or post_paddings is null");
std::vector<size_t> pre_paddings_dims =
    flatbufferDimsToVector(graph_node->pre_paddings());
std::vector<size_t> post_paddings_dims =
    flatbufferDimsToVector(graph_node->post_paddings());

xnn_status status = xnn_define_static_constant_pad(
    subgraph_ptr,
    pre_paddings_dims.data(),
    post_paddings_dims.data(),
    graph_node->padding_value(),
```
backends/xnnpack/runtime/XNNCompiler.cpp:308
`tensor_value->num_dims() == 0` is allowed to pass with `tensor_value->dims() == nullptr`, but immediately after this block the code unconditionally calls `flatbufferDimsToVector(tensor_value->dims())`, which will dereference a null pointer. Handle the scalar case by skipping `flatbufferDimsToVector` when `dims()` is null (e.g., leave `dims_data` empty and pass a null/empty dims pointer when `num_dims == 0`).
```cpp
ET_CHECK_OR_RETURN_ERROR(
    tensor_value->num_dims() == 0 || tensor_value->dims() != nullptr,
    InvalidProgram,
    "Tensor dims is null but num_dims is %u",
    tensor_value->num_dims());
if (tensor_value->dims() != nullptr) {
  ET_CHECK_OR_RETURN_ERROR(
      tensor_value->num_dims() == tensor_value->dims()->size(),
      InvalidProgram,
      "Tensor num_dims %u does not match dims array size %u",
      tensor_value->num_dims(),
      tensor_value->dims()->size());
}

// Get tensor dims, here we need to use a vector in order
// to properly convert the uint32_t* to size_t*
std::vector<size_t> dims_data = flatbufferDimsToVector(tensor_value->dims());
```
…piler

Validate that dims array is non-null and num_dims matches the actual array size in defineTensor to prevent heap buffer overflows. Change flatbufferDimsToVector to return Result<> with null-check and per-dimension validation against a 16M limit to prevent unbounded memory allocation from malicious dimension values.

Authored-with: Claude
Pull request overview
This PR primarily hardens XNNPACK flatbuffer deserialization to reject malformed graphs (preventing potential OOB reads), and additionally extends the Qwen3.5 MoE export tooling with a new Metal backend path and a new fpa4w quantization option.
Changes:
- Add XNNPACK runtime validation for tensor dims pointers/sizes in `defineTensor`, plus additional null checks and safer dimension handling in several static ops.
- Add Metal export + quantization pipeline for `examples/models/qwen3_5_moe` and expose `--backend metal`.
- Add `fpa4w` shape-compatibility logic for Metal/MPS UIntx INT4 quantization.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| extension/llm/export/quantize.py | Adds fpa4w shape compatibility rules for Metal/MPS INT4 weight-only quantization. |
| examples/models/qwen3_5_moe/export.py | Introduces Metal preparation/export flow and updates CLI to allow metal backend and fpa4w quantization. |
| backends/xnnpack/runtime/XNNCompiler.cpp | Adds null/size validation for flatbuffer vectors (dims/perm/new_shape/offsets/sizes) and uses vector sizes when calling XNNPACK APIs. |
Comments suppressed due to low confidence (1)
backends/xnnpack/runtime/XNNCompiler.cpp:1155
`StaticConstantPad` now checks that both padding vectors are non-null, but it still doesn't validate that `pre_paddings` and `post_paddings` have the same length. If they differ in a malformed/corrupted flatbuffer, passing the raw pointers into XNNPACK can lead to out-of-bounds reads. Please add a check that `pre_paddings_dims.size() == post_paddings_dims.size()` (and ideally fail with `InvalidProgram` if not).
```cpp
ET_CHECK_OR_RETURN_ERROR(
    graph_node->pre_paddings() != nullptr &&
        graph_node->post_paddings() != nullptr,
    InvalidProgram,
    "StaticConstantPad: pre_paddings or post_paddings is null");
std::vector<size_t> pre_paddings_dims =
    flatbufferDimsToVector(graph_node->pre_paddings());
std::vector<size_t> post_paddings_dims =
    flatbufferDimsToVector(graph_node->post_paddings());
```
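The missing equal-length check can be sketched as a standalone predicate (`paddingsValid` is a hypothetical helper, not part of the PR):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical predicate for the suggested check: both padding vectors
// must be present and the same length before their raw pointers are
// handed to xnn_define_static_constant_pad.
bool paddingsValid(const std::vector<size_t>* pre,
                   const std::vector<size_t>* post) {
  return pre != nullptr && post != nullptr && pre->size() == post->size();
}
```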
```python
shape = m.weight.shape
if config_name == "nvfp4":
    compatible = shape[-2] % group_size == 0 and shape[-1] % group_size == 0
elif config_name == "fpa4w":
```
|
```python
elif backend == "metal":
    if args.prequantized:
        raise ValueError("Metal backend does not support --prequantized.")
```
```cpp
ET_CHECK_OR_RETURN_ERROR(
    tensor_value->num_dims() == tensor_value->dims()->size(),
    InvalidProgram,
    "Tensor num_dims %u does not match dims array size %u",
```

```cpp
ET_CHECK_OR_RETURN_ERROR(
    graph_node->perm() != nullptr,
    InvalidProgram,
    "StaticTranspose: perm is null");
```

```cpp
ET_CHECK_OR_RETURN_ERROR(
    graph_node->new_shape() != nullptr,
    InvalidProgram,
    "StaticReshape: new_shape is null");
```

```cpp
    InvalidProgram,
    "StaticSlice: offsets size %zu does not match sizes size %zu",
    offsets.size(),
    sizes.size());
```

```cpp
ET_CHECK_OR_RETURN_ERROR(
    tensor_value->num_dims() == 0 || tensor_value->dims() != nullptr,
    InvalidProgram,
    "Tensor dims is null but num_dims is %u",
    tensor_value->num_dims());

if (tensor_value->dims() != nullptr) {
  ET_CHECK_OR_RETURN_ERROR(
      tensor_value->num_dims() == tensor_value->dims()->size(),
      InvalidProgram,
      "Tensor num_dims %u does not match dims array size %u",
      tensor_value->num_dims(),
      tensor_value->dims()->size());
}
```
… defineTensor (pytorch#18896)

Validate that dims array is non-null and num_dims matches the actual array size in defineTensor to prevent heap buffer overflows.

Authored-with: Claude
Co-authored-by: Github Executorch <github_executorch@arm.com>
Validate that dims array is non-null and num_dims matches the actual array size in defineTensor to prevent heap buffer overflows.
Authored-with: Claude