Skip to content

Commit 1390b5e

Browse files
Dandandanclaude
andcommitted
fix: disable overflow passthrough for distinct aggregates
COUNT(DISTINCT) and similar distinct aggregates produce per-row intermediate state when convert_to_state is called. In overflow mode this turns 100M rows into 100M single-value state objects that the downstream must merge — a 12x regression on Q9. Fix: skip overflow passthrough when any aggregate is distinct. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 01835ce commit 1390b5e

1 file changed

Lines changed: 5 additions & 1 deletion

File tree

datafusion/physical-plan/src/aggregates/row_hash.rs

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -674,10 +674,14 @@ impl GroupedHashAggregateStream {
674674
};
675675

676676
// Overflow passthrough requires convert_to_state support
677-
// (same requirement as skip_aggregation_probe)
677+
// (same requirement as skip_aggregation_probe) and no distinct
678+
// aggregates (convert_to_state for distinct produces per-row
679+
// state objects that are catastrophically expensive to merge).
680+
let has_distinct = aggregate_exprs.iter().any(|e| e.is_distinct());
678681
let overflow_passthrough_max_table_size = if agg.mode == AggregateMode::Partial
679682
&& matches!(group_ordering, GroupOrdering::None)
680683
&& skip_aggregation_probe.is_some()
684+
&& !has_distinct
681685
{
682686
context
683687
.session_config()

0 commit comments

Comments
 (0)