[fix](be) Report physical new parquet skip metrics
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #64214
Problem Summary: The page skip profile of the new Parquet reader mixed logical pruning information with physical reader work. The page skip counters had generic names that could be read as counting every page pruned by the page index, but they are updated only when Arrow's data page filter callback actually skips a page. ReaderSkipRows also counted scheduler-level logical skips, including rows already removed by page filtering, so it could overstate the actual RecordReader::SkipRecords work.

This change renames the page skip counters to data-page-filter-specific names and updates ReaderSkipRows only for rows actually passed to Arrow RecordReader::SkipRecords. Parent complex readers and the synthetic row-position reader no longer add logical read/skip rows to the physical reader counters.
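The distinction between logical and physical skips can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `SkipProfile`, `Page`, `skip_records`, and `simulate` are hypothetical names, and `skip_records` only stands in for Arrow's `RecordReader::SkipRecords`. The sketch assumes each page is either dropped by the data page filter callback or read, with some leading rows skipped.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical counters; the actual profile counter names are not listed in this description.
struct SkipProfile {
    int64_t data_page_filter_skipped_pages = 0;  // pages dropped by the data page filter callback
    int64_t reader_skip_rows = 0;                // rows actually passed to the SkipRecords stand-in
    int64_t logical_skip_rows = 0;               // scheduler-level skips, tracked separately
};

// Hypothetical description of one data page as seen by the scheduler.
struct Page {
    int64_t num_rows;             // rows stored in the page
    bool dropped_by_page_filter;  // true when the filter callback skips the whole page
    int64_t rows_to_skip;         // rows the scheduler wants skipped in this page (<= num_rows)
};

// Stand-in for RecordReader::SkipRecords: the only place the physical counter grows.
int64_t skip_records(SkipProfile& profile, int64_t num_rows) {
    profile.reader_skip_rows += num_rows;
    return num_rows;
}

SkipProfile simulate(const std::vector<Page>& pages) {
    SkipProfile profile;
    for (const Page& page : pages) {
        if (page.dropped_by_page_filter) {
            // The page never reaches the record reader, so it adds to the
            // page counter and the logical total, not to reader_skip_rows.
            ++profile.data_page_filter_skipped_pages;
            profile.logical_skip_rows += page.num_rows;
            continue;
        }
        profile.logical_skip_rows += page.rows_to_skip;
        if (page.rows_to_skip > 0) {
            skip_records(profile, page.rows_to_skip);
        }
    }
    return profile;
}
```

With three 100-row pages, where the first is dropped by the page filter and the second has 30 rows skipped, the logical skip total is 130 rows while the physical counter records only the 30 rows handed to `skip_records`.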
### Release note
None
### Check List (For Author)
- Test: Unit Test
- Pending: NewParquetReaderTest.*
- Behavior changed: No
- Does this need documentation: No