Skip to content

Commit 8994832

Browse files
committed
address review: early-return + fix typos lint
- adriangb r3294820727: flatten the if/else in try_pushdown_sort Inexact arm into an early-return guard clause so the Exact-upgrade block sits at one less level of indentation - typos CI: 'mis-orders' tripped the spell checker on leading 'mis'; reword the comment as 'reorders ... incorrectly' (no behaviour change)
1 parent f6408c4 commit 8994832

1 file changed

Lines changed: 25 additions & 26 deletions

File tree

  • datafusion/datasource/src/file_scan_config

datafusion/datasource/src/file_scan_config/mod.rs

Lines changed: 25 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -988,34 +988,33 @@ impl DataSource for FileScanConfig {
988988
// path before #21956 routed `column_in_file_schema` cases
989989
// here.
990990
if config.output_ordering.is_empty() {
991-
Ok(SortOrderPushdownResult::Inexact {
991+
return Ok(SortOrderPushdownResult::Inexact {
992992
inner: Arc::new(config),
993-
})
994-
} else {
995-
// Upgrading to Exact: the post-sort file groups are
996-
// non-overlapping and each file's declared ordering
997-
// re-validates, so reading the files in their natural
998-
// (declared-sorted) order already yields the requested
999-
// ordering — exactly like the `Unsupported` → Exact path,
1000-
// which reads files in natural order too.
1001-
//
1002-
// Drop the runtime row-group reorder hints the Inexact
1003-
// source carried (`sort_order_for_reorder` /
1004-
// `reverse_row_groups`) by restoring the original,
1005-
// hint-free source. With the `SortExec` removed those
1006-
// hints are not just redundant but unsafe: for a DESC
1007-
// request the opener sorts row groups ASC-by-min and then
1008-
// reverses them, which mis-orders two row groups within a
1009-
// single file that share the same `min` (e.g. a file
1010-
// `[10,8,8,8]` whose row groups are `[10,8]` and `[8,8]`
1011-
// would stream as `8,8,10,8`). The `SortExec` used to
1012-
// mask this; once it is gone the reordered stream is the
1013-
// final, wrong answer.
1014-
config.file_source = Arc::clone(&self.file_source);
1015-
Ok(SortOrderPushdownResult::Exact {
1016-
inner: Arc::new(config),
1017-
})
993+
});
1018994
}
995+
// Upgrading to Exact: the post-sort file groups are
996+
// non-overlapping and each file's declared ordering
997+
// re-validates, so reading the files in their natural
998+
// (declared-sorted) order already yields the requested
999+
// ordering — exactly like the `Unsupported` → Exact path,
1000+
// which reads files in natural order too.
1001+
//
1002+
// Drop the runtime row-group reorder hints the Inexact
1003+
// source carried (`sort_order_for_reorder` /
1004+
// `reverse_row_groups`) by restoring the original,
1005+
// hint-free source. With the `SortExec` removed those
1006+
// hints are not just redundant but unsafe: for a DESC
1007+
// request the opener sorts row groups ASC-by-min and then
1008+
// reverses them, which reorders two row groups within a
1009+
// single file that share the same `min` incorrectly
1010+
// (e.g. a file `[10,8,8,8]` whose row groups are
1011+
// `[10,8]` and `[8,8]` would stream as `8,8,10,8`).
1012+
// The `SortExec` used to mask this; once it is gone the
1013+
// reordered stream is the final, wrong answer.
1014+
config.file_source = Arc::clone(&self.file_source);
1015+
Ok(SortOrderPushdownResult::Exact {
1016+
inner: Arc::new(config),
1017+
})
10191018
}
10201019
SortOrderPushdownResult::Unsupported => {
10211020
self.try_sort_file_groups_by_statistics(order)

0 commit comments

Comments
 (0)