Skip to content

Fix stack buffer overflows in ynnpack channelwise quantized tensor and reduce#9843

Open
mohammadmseet-hue wants to merge 2 commits intogoogle:masterfrom
mohammadmseet-hue:fix/channelwise-quantized-channel-dim-check
Open

Fix stack buffer overflows in ynnpack channelwise quantized tensor and reduce#9843
mohammadmseet-hue wants to merge 2 commits intogoogle:masterfrom
mohammadmseet-hue:fix/channelwise-quantized-channel-dim-check

Conversation

@mohammadmseet-hue
Copy link
Copy Markdown
Contributor

Summary

Two stack buffer overflows in ynnpack:

Bug 1: Stack buffer overflow in xnn_define_channelwise_quantized_tensor_value_v3

File: ynnpack/xnnpack/tensor.cc:146

std::copy_n(dims, channel_dim + 1, quantization_dims) copies channel_dim + 1 elements into quantization_dims[YNN_MAX_TENSOR_RANK] (size 8) without checking channel_dim < num_dims or channel_dim < YNN_MAX_TENSOR_RANK. Additionally, xnn_validate_channelwise_quantized_tensor (line 112-116) is empty (return xnn_status_success).

With channel_dim >= 8, this writes past the end of the stack buffer — 88 bytes of controlled overflow confirmed.

ASAN output:

==ERROR: AddressSanitizer: stack-buffer-overflow
WRITE of size 88
    #0 memmove
    #8 xnn_define_channelwise_quantized_tensor_value_v3
[96, 160) 'quantization_dims' (line 145) <== Memory access at offset 160 overflows this variable

Reachable from public API: xnn_define_channelwise_quantized_tensor_value(), xnn_define_channelwise_quantized_tensor_value_v2(), xnn_define_channelwise_quantized_tensor_value_v3().

Fix: Add channel_dim >= num_dims || num_dims > YNN_MAX_TENSOR_RANK check before the std::copy_n.

Bug 2: Stack buffer overflow in get_reduce_identity_value

File: ynnpack/subgraph/reduce.cc:243

For ynn_reduce_min_max with keep_dims=true on a rank-8 tensor, output.extents.push_back(2) (line 345) increases the output rank to 9. Then get_reduce_identity_value computes rank = output.rank() = 9 and accesses dims[rank - 1] = dims[8], writing one element past the size_t dims[YNN_MAX_TENSOR_RANK] (size 8) stack array.

Fix: Add rank bounds check in the ynn_reduce_min_max case before accessing dims[rank - 1].

Comment thread ynnpack/subgraph/reduce.cc Outdated
Comment thread ynnpack/xnnpack/tensor.cc
…d reduce

Bug 1: xnn_define_channelwise_quantized_tensor_value_v3 (tensor.cc:146)
std::copy_n(dims, channel_dim + 1, quantization_dims) copies channel_dim + 1
elements into quantization_dims[YNN_MAX_TENSOR_RANK] (size 8) without checking
channel_dim < num_dims or channel_dim < YNN_MAX_TENSOR_RANK. With
channel_dim >= 8, this writes past the stack buffer.

ASAN trace:
  ==ERROR: AddressSanitizer: stack-buffer-overflow
  WRITE of size 88
  google#8 xnn_define_channelwise_quantized_tensor_value_v3
  [96, 160) 'quantization_dims' (line 145) <== overflows this variable

Fix: Add channel_dim >= num_dims and num_dims > YNN_MAX_TENSOR_RANK checks.

Bug 2: get_reduce_identity_value (reduce.cc:243)
For ynn_reduce_min_max with keep_dims=true on a rank-8 tensor,
output.extents.push_back(2) increases rank to 9. Then dims[rank - 1] = dims[8]
writes one element past the size-8 stack array.

Fix: Add rank bounds check before array access.
- tensor.cc: Split combined check into separate num_dims and channel_dim
  validations with YNN_LOG_ERROR messages. Replace asserts with proper
  error returns for channelwise_zero_point. Remove assert(data) per
  reviewer (XNNPACK limitation, not YNNPACK).
- reduce.cc: Change define_reduce to return ynn_status. Add output rank
  validation after min_max dimension push. Keep rank >= 1 as assert
  (internal invariant). Propagate error via YNN_RETURN_IF_ERROR at call
  site.
@mohammadmseet-hue mohammadmseet-hue force-pushed the fix/channelwise-quantized-channel-dim-check branch from be986be to f781ad1 Compare April 1, 2026 08:16
@mohammadmseet-hue
Copy link
Copy Markdown
Contributor Author

ASan proof for the reduce.cc rank-9 stack-buffer-overflow in this PR

Re-verified on today's master (HEAD 37d77e566). The recent refactor 6c008cf68 Optimize the subgraph glue code for reduce kernels did not add the missing validate_rank in ynn_define_reduce, so the bug is still live.

Trigger chain (public C API only)

  1. ynn_create_subgraph
  2. ynn_define_tensor_value with num_dims = YNN_MAX_TENSOR_RANK = 8
  3. ynn_define_reduce(subgraph, ynn_reduce_min_max, /*num_axes=*/0, nullptr, input_id, YNN_INVALID_VALUE_ID, &output_id, YNN_NODE_FLAG_KEEP_DIMS)
  4. ynn_define_reduce (reduce.cc:413) skips validate_rank — unlike stack, static_expand_dims, split_dim, stencil_copy, static_transpose which all validate.
  5. In define_reduce, the keep-dims loop (reduce.cc:310-318) leaves output.extents.size() == 8, then reduce.cc:322 unconditionally runs output.extents.push_back(2) → size becomes 9.
  6. get_reduce_identity_value(subgraph, output, ynn_reduce_min_max) then executes rank = output.rank() == 9; dims[rank - 1] = 2; → 8-byte OOB write on the 8-element size_t dims[YNN_MAX_TENSOR_RANK] stack array. std::reverse(dims, dims + 9) immediately after compounds the OOB.

Also reachable from the TFLite-delegate dynamic-quant path via ynnpack/xnnpack/dynamic_quantization.cc:214.

Build

g++ -std=c++17 -fsanitize=address,undefined -fno-omit-frame-pointer -g -O0 poc_reduce_minmax_rank9.cc -o poc
./poc

Output (clang/g++ ASan + UBSan)

poc_reduce_minmax_rank9.cc:70:20: runtime error: index 8 out of bounds for type 'long unsigned int [8]'
poc_reduce_minmax_rank9.cc:70:22: runtime error: store to address 0x7f...6a0 with insufficient space for an object of type 'size_t'
=================================================================
==1131527==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7f...6a0
WRITE of size 8 at 0x7f...6a0 thread T0
    #0 get_reduce_identity_value(ynn_value const&, ynn_reduce_operator)
       poc_reduce_minmax_rank9.cc:70
    #1 main poc_reduce_minmax_rank9.cc:105

Address 0x7f...6a0 is located in stack of thread T0 at offset 160 in frame
    #0 get_reduce_identity_value(ynn_value const&, ynn_reduce_operator) poc_reduce_minmax_rank9.cc:50

  This frame has 3 object(s):
    [48, 52) '<unknown>'
    [64, 72) 'value_f32' (line 51)
    [96, 160) 'dims' (line 53) <== Memory access at offset 160 overflows this variable

SUMMARY: AddressSanitizer: stack-buffer-overflow in get_reduce_identity_value

The PoC file is a standalone C++ file that copies the vulnerable get_reduce_identity_value and the define_reduce shape-propagation prelude verbatim from subgraph/reduce.cc, so ASan acts on the real function body.

Suggested regression test for the merge: calling ynn_define_reduce with a rank-8 tensor + ynn_reduce_min_max + YNN_NODE_FLAG_KEEP_DIMS should return xnn_status_unsupported_parameter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants