Skip to content

Adaptive query planner does not support sort-based shuffle #1611

@andygrove

Description

@andygrove

Describe the bug

When ballista.planner.adaptive.enabled is true, the BALLISTA_SHUFFLE_SORT_BASED_ENABLED flag is silently ignored and stages are always built with the hash-based ShuffleWriterExec.

The static planner (ballista/scheduler/src/planner.rs) routes shuffle writer construction through create_shuffle_writer_with_config, which picks SortShuffleWriterExec when sort-based shuffle is enabled and the partitioning is Hash. The AQE path does not have an equivalent: BallistaAdapter::adapt_to_ballista in ballista/scheduler/src/state/aqe/adapter.rs unconditionally constructs ShuffleWriterExec (lines 90 and 116).

Downstream AQE code reinforces this — AdaptiveStageInfo::plan (aqe/planner.rs:526) and AdaptiveExecutionGraph::create_resolved_stage (aqe/mod.rs:196) are typed as Arc<ShuffleWriterExec> rather than Arc<dyn ShuffleWriter>, so even fixing the adapter alone is not sufficient.

This is also called out in ballista/client/tests/sort_shuffle.rs:57, where the integration tests disable AQE with the comment "AQE does not support sort shuffle at the moment".

To Reproduce

Enable both flags on a SessionConfig and run any query that introduces a shuffle:

SessionConfig::new_with_ballista()
    .set_str(BALLISTA_SHUFFLE_SORT_BASED_ENABLED, "true")
    .set_bool(BALLISTA_ADAPTIVE_PLANNER_ENABLED, true);

The resulting plan uses ShuffleWriterExec rather than SortShuffleWriterExec.

Expected behavior

When sort-based shuffle is enabled, the adaptive planner should produce SortShuffleWriterExec for hash-partitioned shuffles, matching the behavior of the static planner.

Additional context

Suggested approach:

  • Generalize AdaptiveStageInfo::plan and AdaptiveExecutionGraph::create_resolved_stage over the ShuffleWriter trait.
  • Have BallistaAdapter::adapt_to_ballista use the same writer-selection logic as create_shuffle_writer_with_config, or factor that logic into a shared helper.
  • Re-enable the sort-shuffle integration tests with AQE on once the change lands.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions