`PushBuffers::clear_ranges` is quadratic and leaks

  **Describe the bug**

`PushBuffers::clear_ranges` performs an O(N*M) scan to release consumed buffers, where N is the number of buffered ranges and M is the number of ranges to clear. On wide schemas (10k+ columns), this produces quadratic overhead in `PushDecoder` row group construction.

  Additionally, `clear_ranges` matches buffers by exact range equality. When the IO layer coalesces adjacent requested ranges into fewer, larger fetches, the coalesced buffer never exactly matches any individual requested range, so `clear_ranges` silently skips it. The buffer leaks in `PushBuffers` until the decoder finishes or the caller manually calls `release_all_ranges`, increasing peak RSS proportionally to the amount of data coalesced ahead of the current row group.

This puts coalescing in a bind: without it, buffer count scales with range count and the quadratic `clear_ranges` dominates. With it, memory is not reclaimed incrementally.

  **To Reproduce**

Use `PushDecoder` on a Parquet file with a wide schema (10k+ columns). Push data without coalesced buffers. Observe row group construction time scaling quadratically with column count. Conversely if using coalescing, observe memory not being released by `clear_ranges`, growing RSS until the decoder finishes.

  **Expected behavior**

Buffer release should scale with buffer count (not range count), and coalesced or arbitrarily-sized buffers should be released incrementally as the decoder progresses.

  **Additional context**

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`PushBuffers::clear_ranges` is quadratic and leaks #9695

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

PushBuffers::clear_ranges is quadratic and leaks #9695

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

`PushBuffers::clear_ranges` is quadratic and leaks #9695