Commit 1c7e71e
[fix](scan) Fix OOB crash in partition column generation for Iceberg/Paimon tables (#62177)
### What problem does this PR solve?
Problem Summary:
```
3# raise at ../sysdeps/posix/raise.c:27
4# abort at ./stdlib/abort.c:81
5# 0x0000556DDAC000A1 in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be
6# std::vector, std::allocator >, std::allocator, std::allocator > > >::operator[](unsigned long) const at /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/stl_vector.h:1282
7# doris::FileScanner::_generate_partition_columns() at ./be/build_ASAN/../src/exec/scan/file_scanner.cpp:1653
8# doris::FileScanner::_get_next_reader() at ./be/build_ASAN/../src/exec/scan/file_scanner.cpp:957
9# doris::FileScanner::_get_block_wrapped(doris::RuntimeState*, doris::Block*, bool*) at ./be/build_ASAN/../src/exec/scan/file_scanner.cpp:439
10# doris::FileScanner::_get_block_impl(doris::RuntimeState*, doris::Block*, bool*) at ./be/build_ASAN/../src/exec/scan/file_scanner.cpp:403
11# doris::Scanner::get_block(doris::RuntimeState*, doris::Block*, bool*) at ./be/build_ASAN/../src/exec/scan/scanner.cpp:143
12# doris::Scanner::get_block_after_projects(doris::RuntimeState*, doris::Block*, bool*) at ./be/build_ASAN/../src/exec/scan/scanner.cpp:119
13# doris::ScannerScheduler::_scanner_scan(std::shared_ptr, std::shared_ptr) at ./be/build_ASAN/../src/exec/scan/scanner_scheduler.cpp:179
14# doris::ScannerScheduler::submit(std::shared_ptr, std::shared_ptr)::$_0::operator()() const::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const at ./be/build_ASAN/../src/exec/scan/scanner_scheduler.cpp:76
```
### Release note
BE crashes with SIGABRT (vector out-of-bounds) in
`FileScanner::_generate_partition_columns()` when scanning Iceberg or
Paimon partitioned tables.
Root cause: `_partition_slot_index_map` is built once from the first
scan range's `columns_from_path_keys`, but different ranges may have
different `columns_from_path` sizes due to:
1. **Iceberg partition evolution**: Tables evolving from non-identity
transforms (e.g., `day(ts)`) to identity transforms (`ts`).
`IcebergUtils.getPartitionInfoMap()` returns null for non-identity
transforms, so `setIcebergParams()` leaves `columnsFromPath` as an empty
list for those ranges while populating it for identity-transform ranges.
2. **Paimon mixed reader paths**: Native reader splits (Parquet/ORC)
call `setPaimonPartitionValues(partitionInfoMap)`, but JNI scanner
splits (for merge-required or unsupported-format data) skip this call
entirely, leaving `paimonPartitionValues` as null.
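The mismatch described above can be modelled in a few lines. This is a minimal Java sketch of the failure mode, not the BE code (which is C++): `ScanRange`, `buildIndexMap`, and `indexesFit` are hypothetical names introduced only for illustration. It assumes, as the description states, that the index map is derived once from the first range's keys and then reused for every later range.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartitionIndexMismatch {
    // Hypothetical stand-in for one scan range: path-derived partition keys and values.
    static final class ScanRange {
        final List<String> keys;
        final List<String> values;

        ScanRange(List<String> keys, List<String> values) {
            this.keys = keys;
            this.values = values;
        }
    }

    // Built once from the first range, mirroring how the description says
    // _partition_slot_index_map is derived from columns_from_path_keys.
    static Map<String, Integer> buildIndexMap(ScanRange first) {
        Map<String, Integer> indexMap = new LinkedHashMap<>();
        for (int i = 0; i < first.keys.size(); i++) {
            indexMap.put(first.keys.get(i), i);
        }
        return indexMap;
    }

    // True when every cached index is a valid position in this range's value list.
    static boolean indexesFit(Map<String, Integer> indexMap, ScanRange range) {
        return indexMap.values().stream().allMatch(i -> i < range.values.size());
    }

    public static void main(String[] args) {
        ScanRange identityRange = new ScanRange(List.of("ts"), List.of("2024-01-01"));
        ScanRange nonIdentityRange = new ScanRange(List.of(), List.of());

        Map<String, Integer> indexMap = buildIndexMap(identityRange);
        System.out.println(indexesFit(indexMap, identityRange));    // true
        System.out.println(indexesFit(indexMap, nonIdentityRange)); // false: index 0 is out of bounds
    }
}
```

In the model, the second range carries an empty value list, so the cached index 0 has nothing to point at. That corresponds to the `operator[]` abort in the stack trace.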
The crash is triggered because `createFileRangeDesc()` unconditionally
called `setColumnsFromPath([])` for all ranges, which set the Thrift
`__isset` flag to true even for an empty list. When BE sees
`__isset=true`, it enters the partition column loop and indexes out of
bounds for ranges that were never populated with actual partition values.
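The difference between "never set" and "set to an empty list" is easy to miss. The sketch below is a minimal Java model of that isset behaviour. `RangeDesc` is a hypothetical stand-in for the Thrift-generated struct, not the generated class itself, and `entersPartitionLoop` is an assumed simplification of the reader-side check.

```java
import java.util.ArrayList;
import java.util.List;

final class IssetSketch {
    // Hypothetical stand-in for a Thrift-generated struct; only isset behaviour is modelled.
    static final class RangeDesc {
        private List<String> columnsFromPath; // null means "unset"

        void setColumnsFromPath(List<String> values) {
            this.columnsFromPath = values;
        }

        boolean isSetColumnsFromPath() {
            return columnsFromPath != null;
        }
    }

    // Models the reader-side decision: the partition column loop runs whenever the field is set,
    // regardless of how many values the list actually holds.
    static boolean entersPartitionLoop(RangeDesc desc) {
        return desc.isSetColumnsFromPath();
    }

    public static void main(String[] args) {
        RangeDesc untouched = new RangeDesc();

        RangeDesc emptyButSet = new RangeDesc();
        emptyButSet.setColumnsFromPath(new ArrayList<>());

        System.out.println(entersPartitionLoop(untouched));   // false
        System.out.println(entersPartitionLoop(emptyButSet)); // true
    }
}
```

An empty list passed to the setter is still a set field in this model, so the reader cannot tell it apart from a range that really has partition values.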
Fix:
- **FileQueryScanNode.createFileRangeDesc()**: Only call
`setColumnsFromPath()` and `setColumnsFromPathKeys()` when
`columnsFromPathKeys` is non-empty. This keeps `__isset=false` for
Iceberg/Paimon ranges that have no path-derived partition keys, so BE
skips partition column generation entirely for those ranges. Ranges that
do have partition values (set later by
`setIcebergParams`/`setPaimonParams`) work correctly since they
explicitly call `setColumnsFromPath()` with actual values.
- **PaimonScanNode**: Additionally set partition values on JNI scanner
splits (previously missing), enabling runtime filter partition pruning
for JNI splits.
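The guard in the first bullet can be sketched as follows. This is a simplified illustration, not the code from the commit: the real `createFileRangeDesc()` has a different signature, and `RangeDesc` here is a hypothetical stand-in for the Thrift struct.

```java
import java.util.List;

final class GuardedSetterSketch {
    // Hypothetical stand-in for the Thrift range descriptor.
    static final class RangeDesc {
        private List<String> columnsFromPath;
        private List<String> columnsFromPathKeys;

        void setColumnsFromPath(List<String> values) {
            this.columnsFromPath = values;
        }

        void setColumnsFromPathKeys(List<String> keys) {
            this.columnsFromPathKeys = keys;
        }

        boolean isSetColumnsFromPath() {
            return columnsFromPath != null;
        }

        boolean isSetColumnsFromPathKeys() {
            return columnsFromPathKeys != null;
        }
    }

    // Simplified model of the fix: leave both fields unset when there are no path-derived keys,
    // so the reader skips partition column generation for that range.
    static RangeDesc createFileRangeDesc(List<String> columnsFromPathKeys, List<String> columnsFromPath) {
        RangeDesc desc = new RangeDesc();
        if (columnsFromPathKeys != null && !columnsFromPathKeys.isEmpty()) {
            desc.setColumnsFromPath(columnsFromPath);
            desc.setColumnsFromPathKeys(columnsFromPathKeys);
        }
        return desc;
    }

    public static void main(String[] args) {
        RangeDesc noKeys = createFileRangeDesc(List.of(), List.of());
        RangeDesc withKeys = createFileRangeDesc(List.of("dt"), List.of("2024-01-01"));

        System.out.println(noKeys.isSetColumnsFromPath());   // false
        System.out.println(withKeys.isSetColumnsFromPath()); // true
    }
}
```

Table-specific code that later calls the setter with real values still marks the field as set, which matches the behaviour described for `setIcebergParams`/`setPaimonParams`.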
Fix a BE crash (SIGABRT) that could occur when querying Iceberg tables
with partition evolution or Paimon tables with mixed-format data splits.
1 parent 9125b17 commit 1c7e71e
2 files changed
Lines changed: 7 additions & 3 deletions
File tree
- fe/fe-core/src/main/java/org/apache/doris/datasource
- paimon/source
Lines changed: 4 additions & 2 deletions
@@ -539,8 +539,10 @@
Lines changed: 3 additions & 1 deletion
@@ -443,7 +443,9 @@