Skip to content

Commit 4eeced5

Browse files
adriangbclaude
andcommitted
fix(sort-pushdown): drop runtime row-group reorder hints on Inexact→Exact upgrade
When `try_pushdown_sort` upgrades the `Inexact` result back to `Exact` (post-sort file groups non-overlapping + declared ordering re-validates), the `SortExec` is removed. But the source rebuilt from the `Inexact` result still carried `sort_order_for_reorder` / `reverse_row_groups`, so the parquet opener kept reordering row groups at runtime. With the `SortExec` gone, that runtime reorder is no longer a harmless optimisation — it is load-bearing for correctness, and it is wrong for DESC requests. The opener sorts row groups ASC-by-min and then `reverse()`s them; two row groups within a single file that share the same `min` but differ in `max` get mis-ordered by the reverse. Example: a file declared `WITH ORDER (id DESC)` containing `[10,8,8,8]` whose row groups are `[10,8]` and `[8,8]` (both min 8) streams as `8,8,10,8`. Previously the `SortExec` re-sorted that; once eliminated, the wrong order is the final result. Because the upgrade only fires when each file's declared ordering re-validates against the non-overlapping post-sort file groups, reading the files in their natural (declared-sorted) order already yields the requested ordering — no runtime reorder is needed. Restore the original, hint-free source on upgrade, matching the `Unsupported → Exact` path which also reads files naturally. Existing tests 4.1 / 6.1 / 8.1 / G.1 / G.2 lose the now-absent `sort_order_for_reorder` / `reverse_row_groups` from their `DataSourceExec` plans; data results are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent a22da9f commit 4eeced5

2 files changed

Lines changed: 26 additions & 6 deletions

File tree

datafusion/datasource/src/file_scan_config/mod.rs

Lines changed: 21 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -978,7 +978,7 @@ impl DataSource for FileScanConfig {
978978
}
979979
}
980980
SortOrderPushdownResult::Inexact { inner } => {
981-
let config = self.rebuild_with_source(inner, false, order)?;
981+
let mut config = self.rebuild_with_source(inner, false, order)?;
982982
// `rebuild_with_source` reorders files by stats; if the
983983
// post-sort files are non-overlapping AND the request now
984984
// validates against the new file groups, `output_ordering`
@@ -992,6 +992,26 @@ impl DataSource for FileScanConfig {
992992
inner: Arc::new(config),
993993
})
994994
} else {
995+
// Upgrading to Exact: the post-sort file groups are
996+
// non-overlapping and each file's declared ordering
997+
// re-validates, so reading the files in their natural
998+
// (declared-sorted) order already yields the requested
999+
// ordering — exactly like the `Unsupported` → Exact path,
1000+
// which reads files in natural order too.
1001+
//
1002+
// Drop the runtime row-group reorder hints the Inexact
1003+
// source carried (`sort_order_for_reorder` /
1004+
// `reverse_row_groups`) by restoring the original,
1005+
// hint-free source. With the `SortExec` removed those
1006+
// hints are not just redundant but unsafe: for a DESC
1007+
// request the opener sorts row groups ASC-by-min and then
1008+
// reverses them, which mis-orders two row groups within a
1009+
// single file that share the same `min` (e.g. a file
1010+
// `[10,8,8,8]` whose row groups are `[10,8]` and `[8,8]`
1011+
// would stream as `8,8,10,8`). The `SortExec` used to
1012+
// mask this; once it is gone the reordered stream is the
1013+
// final, wrong answer.
1014+
config.file_source = Arc::clone(&self.file_source);
9951015
Ok(SortOrderPushdownResult::Exact {
9961016
inner: Arc::new(config),
9971017
})

datafusion/sqllogictest/test_files/sort_pushdown.slt

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1110,7 +1110,7 @@ EXPLAIN SELECT * FROM reversed_parquet ORDER BY id ASC;
11101110
logical_plan
11111111
01)Sort: reversed_parquet.id ASC NULLS LAST
11121112
02)--TableScan: reversed_parquet projection=[id, value]
1113-
physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/c_low.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/b_mid.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/a_high.parquet]]}, projection=[id, value], output_ordering=[id@0 ASC NULLS LAST], file_type=parquet, sort_order_for_reorder=[id@0 ASC NULLS LAST]
1113+
physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/c_low.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/b_mid.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/a_high.parquet]]}, projection=[id, value], output_ordering=[id@0 ASC NULLS LAST], file_type=parquet
11141114

11151115
# Test 4.2: Results must be correct
11161116
query II
@@ -1336,7 +1336,7 @@ EXPLAIN SELECT * FROM reversed_with_order_parquet ORDER BY id ASC;
13361336
logical_plan
13371337
01)Sort: reversed_with_order_parquet.id ASC NULLS LAST
13381338
02)--TableScan: reversed_with_order_parquet projection=[id, value]
1339-
physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/c_low.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/b_mid.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/a_high.parquet]]}, projection=[id, value], output_ordering=[id@0 ASC NULLS LAST], file_type=parquet, sort_order_for_reorder=[id@0 ASC NULLS LAST]
1339+
physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/c_low.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/b_mid.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/reversed/a_high.parquet]]}, projection=[id, value], output_ordering=[id@0 ASC NULLS LAST], file_type=parquet
13401340

13411341
# Test 6.2: Results must be correct
13421342
query II
@@ -1473,7 +1473,7 @@ EXPLAIN SELECT * FROM desc_reversed_parquet ORDER BY id DESC;
14731473
logical_plan
14741474
01)Sort: desc_reversed_parquet.id DESC NULLS FIRST
14751475
02)--TableScan: desc_reversed_parquet projection=[id, value]
1476-
physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/desc_reversed/b_high.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/desc_reversed/a_low.parquet]]}, projection=[id, value], output_ordering=[id@0 DESC], file_type=parquet, sort_order_for_reorder=[id@0 DESC], reverse_row_groups=true
1476+
physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/desc_reversed/b_high.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/desc_reversed/a_low.parquet]]}, projection=[id, value], output_ordering=[id@0 DESC], file_type=parquet
14771477

14781478
# Test 8.2: Results must be correct
14791479
query II
@@ -2366,7 +2366,7 @@ logical_plan
23662366
physical_plan
23672367
01)SortPreservingMergeExec: [id@0 ASC NULLS LAST]
23682368
02)--BufferExec: capacity=1073741824
2369-
03)----DataSourceExec: file_groups={2 groups: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/b_mid.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/a_high.parquet], [WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/c_low.parquet]]}, projection=[id, value], output_ordering=[id@0 ASC NULLS LAST], file_type=parquet, sort_order_for_reorder=[id@0 ASC NULLS LAST]
2369+
03)----DataSourceExec: file_groups={2 groups: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/b_mid.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/a_high.parquet], [WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/c_low.parquet]]}, projection=[id, value], output_ordering=[id@0 ASC NULLS LAST], file_type=parquet
23702370

23712371
# Verify correctness
23722372
query II
@@ -2393,7 +2393,7 @@ logical_plan
23932393
physical_plan
23942394
01)SortPreservingMergeExec: [id@0 ASC NULLS LAST], fetch=3
23952395
02)--BufferExec: capacity=1073741824
2396-
03)----DataSourceExec: file_groups={2 groups: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/b_mid.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/a_high.parquet], [WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/c_low.parquet]]}, projection=[id, value], limit=3, output_ordering=[id@0 ASC NULLS LAST], file_type=parquet, sort_order_for_reorder=[id@0 ASC NULLS LAST]
2396+
03)----DataSourceExec: file_groups={2 groups: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/b_mid.parquet, WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/a_high.parquet], [WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/sort_pushdown/tg_buffer/c_low.parquet]]}, projection=[id, value], limit=3, output_ordering=[id@0 ASC NULLS LAST], file_type=parquet
23972397

23982398
query II
23992399
SELECT * FROM tg_buffer ORDER BY id ASC LIMIT 3;

0 commit comments

Comments
 (0)