Skip to content

chore: Fix all sqllogictest dangling configs#21108

Merged
alamb merged 4 commits intoapache:mainfrom
2010YOUY01:slt-cleanup
Mar 24, 2026
Merged

chore: Fix all sqllogictest dangling configs#21108
alamb merged 4 commits intoapache:mainfrom
2010YOUY01:slt-cleanup

Conversation

@2010YOUY01
Copy link
Copy Markdown
Contributor

@2010YOUY01 2010YOUY01 commented Mar 23, 2026

Which issue does this PR close?

Follow up to #20838

Rationale for this change

in #20838 it adds a way to check and warn about dangling config settings, it has detected several existing violations but has not fixed them:

yongting@Yongtings-MacBook-Pro-2 ~/C/datafusion (cleanup-slt)> git checkout main
Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
yongting@Yongtings-MacBook-Pro-2 ~/C/datafusion (main=)> cargo test --profile=ci --test sqllogictests
   Compiling datafusion-sqllogictest v52.3.0 (/Users/yongting/Code/datafusion/datafusion/sqllogictest)
    Finished `ci` profile [unoptimized] target(s) in 2.09s
     Running bin/sqllogictests.rs (target/ci/deps/sqllogictests-7e2a8cc6115b158a)
Running with 14 test threads (available parallelism: 14)
SLT file aggregate_repartition.slt left modified configuration
  datafusion.execution.target_partitions: 4 -> 1
SLT file arrow_files.slt left modified configuration
  datafusion.sql_parser.map_string_types_to_utf8view: true -> false
SLT file datetime/current_date_timezone.slt left modified configuration
  datafusion.execution.time_zone: NULL -> +00:00
...

This PR does

  1. Fix all the old violations
  2. For future slt dangling configurations, it fails the test run instead of only warn about it.

I tested to add a config statement at the end of one slt file, and run the test, it runs and failed as expected

yongting@Yongtings-MacBook-Pro-2 ~/C/datafusion (cleanup-slt)> cargo test --profile=ci --test sqllogictests
   Compiling datafusion-sqllogictest v52.3.0 (/Users/yongting/Code/datafusion/datafusion/sqllogictest)
    Finished `ci` profile [unoptimized] target(s) in 2.28s
     Running bin/sqllogictests.rs (target/ci/deps/sqllogictests-7e2a8cc6115b158a)
Running with 14 test threads (available parallelism: 14)
Completed 416 test files in 5 seconds
External error: Other Error: SLT file sort_pushdown.slt left modified configuration
  datafusion.optimizer.max_passes: 3 -> 10
Error: Execution("1 failures")
error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests`

What changes are included in this PR?

Are these changes tested?

UT

Are there any user-facing changes?

no

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Mar 23, 2026
@2010YOUY01 2010YOUY01 requested a review from Copilot March 23, 2026 05:43
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens datafusion-sqllogictest hygiene by making dangling DataFusion session configuration changes a hard test failure, and updates existing .slt files to restore any modified configs so each file leaves the runner in a clean state.

Changes:

  • Update the SLT runner to collect “config drift” errors per file and fail the test run (instead of only printing warnings on drop).
  • Add/adjust config reset statements across existing .slt files to eliminate current dangling-config violations.
  • Add a unit test to ensure modified configs are detected by the new validation logic.

Reviewed changes

Copilot reviewed 55 out of 56 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
datafusion/sqllogictest/test_files/window_topk_pushdown.slt Adds explicit resets for enable_topk_repartition and documents/sets target_partitions baseline.
datafusion/sqllogictest/test_files/window.slt Adds cleanup reset statements for several execution/optimizer settings at end of file section.
datafusion/sqllogictest/test_files/update.slt Resets optimizer.max_passes during cleanup.
datafusion/sqllogictest/test_files/union.slt Resets execution.batch_size at end to avoid config drift.
datafusion/sqllogictest/test_files/subquery.slt Resets execution.batch_size and explain.logical_plan_only during cleanup.
datafusion/sqllogictest/test_files/struct.slt Resets SQL dialect setting at end of test.
datafusion/sqllogictest/test_files/string/string_view.slt Resets explain.logical_plan_only at end of test.
datafusion/sqllogictest/test_files/sort_pushdown.slt Restores runner baseline partitions and resets parquet/sort-pushdown related configs.
datafusion/sqllogictest/test_files/sort_merge_join.slt Resets optimizer.prefer_hash_join at end of test.
datafusion/sqllogictest/test_files/set_variable.slt Adds cleanup resets for catalog and execution flags altered during the file.
datafusion/sqllogictest/test_files/select.slt Restores runner baseline partitions and resets optimizer.max_passes.
datafusion/sqllogictest/test_files/repartition_subset_satisfaction.slt Restores runner baseline partitions and resets repartition/subset settings before cleanup drops.
datafusion/sqllogictest/test_files/repartition_scan.slt Adds cleanup resets for catalog creation and repartition file sizing.
datafusion/sqllogictest/test_files/repartition.slt Restores runner baseline partitions and resets round-robin repartition setting.
datafusion/sqllogictest/test_files/push_down_filter_unnest.slt Resets explain.physical_plan_only at end of test.
datafusion/sqllogictest/test_files/push_down_filter_regression.slt Restores runner baseline partitions and resets explain/parquet/dynamic-filter configs.
datafusion/sqllogictest/test_files/push_down_filter_parquet.slt Resets explain.physical_plan_only and parquet filter pushdown.
datafusion/sqllogictest/test_files/projection_pushdown.slt Adds “config reset” blocks restoring runner baseline partitions for multi-partition sections.
datafusion/sqllogictest/test_files/preserve_file_partitioning.slt Restores runner baseline partitions and resets preserve-file-partitions and join thresholds.
datafusion/sqllogictest/test_files/predicates.slt Resets catalog.information_schema at end.
datafusion/sqllogictest/test_files/pipe_operator.slt Resets SQL dialect setting at end.
datafusion/sqllogictest/test_files/parquet_statistics.slt Resets statistics/explain flags after tests.
datafusion/sqllogictest/test_files/parquet_sorted_statistics.slt Restores runner baseline partitions and resets collect/split-by-statistics configs.
datafusion/sqllogictest/test_files/parquet.slt Adds cleanup resets for listing/parquet/runtime/catalog settings and restores baseline partitions.
datafusion/sqllogictest/test_files/order.slt Adds cleanup resets for catalog/optimizer/explain/sql_parser configs and restores baseline partitions.
datafusion/sqllogictest/test_files/optimizer_group_by_constant.slt Restores runner baseline partitions and resets explain.logical_plan_only.
datafusion/sqllogictest/test_files/listing_table_statistics.slt Converts cleanup from set ... = false to reset ... for stats-related settings.
datafusion/sqllogictest/test_files/limit_single_row_batches.slt Resets batch size and restores runner baseline partitions during cleanup.
datafusion/sqllogictest/test_files/limit_pruning.slt Resets parquet pushdown filters at end.
datafusion/sqllogictest/test_files/limit.slt Restores runner baseline partitions and resets explain/repartition-file-min-size settings.
datafusion/sqllogictest/test_files/joins.slt Adds cleanup resets for batch size, explain flags, and join/sort optimizer settings; restores baseline partitions.
datafusion/sqllogictest/test_files/join_only.slt Adds cleanup reset for repartition-joins after including join test part.
datafusion/sqllogictest/test_files/join_limit_pushdown.slt Restores runner baseline partitions and resets explain/hash-join preference.
datafusion/sqllogictest/test_files/insert_to_external.slt Restores runner baseline partitions at end.
datafusion/sqllogictest/test_files/insert.slt Restores runner baseline partitions at end.
datafusion/sqllogictest/test_files/information_schema_table_types.slt Resets catalog.information_schema at end.
datafusion/sqllogictest/test_files/information_schema_multiple_catalogs.slt Resets default catalog/schema and information_schema settings.
datafusion/sqllogictest/test_files/information_schema_columns.slt Resets default catalog/schema and information_schema settings.
datafusion/sqllogictest/test_files/information_schema.slt Adds cleanup resets for information_schema, partitions baseline, planning concurrency, and parquet created_by.
datafusion/sqllogictest/test_files/ident_normalization.slt Resets catalog.information_schema after enabling ident normalization.
datafusion/sqllogictest/test_files/group_by.slt Resets batch size and restores runner baseline partitions.
datafusion/sqllogictest/test_files/floor_preimage.slt Resets explain.logical_plan_only at end.
datafusion/sqllogictest/test_files/explain_tree.slt Resets explain format and tree render width at end.
datafusion/sqllogictest/test_files/explain.slt Converts cleanup from set collect_statistics=false to reset collect_statistics.
datafusion/sqllogictest/test_files/dynamic_filter_pushdown_config.slt Adds a parquet max row group size reset and updates cleanup comment block.
datafusion/sqllogictest/test_files/distinct_on.slt Resets explain.logical_plan_only at end.
datafusion/sqllogictest/test_files/delete.slt Resets optimizer.max_passes at end.
datafusion/sqllogictest/test_files/datetime/current_time_timezone.slt Resets execution.time_zone at end.
datafusion/sqllogictest/test_files/datetime/current_date_timezone.slt Resets execution.time_zone at end.
datafusion/sqllogictest/test_files/cte.slt Resets batch size, recursive CTE enablement, and ident normalization setting.
datafusion/sqllogictest/test_files/arrow_files.slt Resets sql_parser.map_string_types_to_utf8view at end.
datafusion/sqllogictest/test_files/aggregate_skip_partial.slt Adds multiple cleanup blocks to reset batch size, partitions baseline, and skip-partial-aggregation knobs.
datafusion/sqllogictest/test_files/aggregate_repartition.slt Restores runner baseline partitions at end.
datafusion/sqllogictest/test_files/aggregate.slt Adds explicit baseline partition restoration before final cleanup.
datafusion/sqllogictest/src/engines/datafusion_engine/runner.rs Replaces Drop-time warning with explicit validation, collects errors for the harness, and adds a unit test.
datafusion/sqllogictest/bin/sqllogictests.rs Aggregates config-drift errors per file, calls shutdown_async, and fails the run when drift is detected.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @2010YOUY01

I left some suggestions on comments and maybe some cleanup but nothing that is required in my mind

I tested by reverting one of the fixes

diff --git a/datafusion/sqllogictest/test_files/arrow_files.slt b/datafusion/sqllogictest/test_files/arrow_files.slt
index 94e150738..c3bc967ba 100644
--- a/datafusion/sqllogictest/test_files/arrow_files.slt
+++ b/datafusion/sqllogictest/test_files/arrow_files.slt
@@ -388,7 +388,3 @@ physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/
 # querying corrupted stream format should result in error
 query error DataFusion error: Arrow error: Parser error: Unsupported message header type in IPC stream: 'NONE'
 SELECT * FROM arrow_stream_corrupted_metadata_length
-
-# Config reset
-statement ok
-RESET datafusion.sql_parser.map_string_types_to_utf8view;

and running

cargo test --profile=ci --test sqllogictests

Which errors (as expected) with a useful message:

Completed 414 test files in 4 seconds                                                                                   External error: Other Error: SLT file arrow_files.slt left modified configuration
  datafusion.sql_parser.map_string_types_to_utf8view: true -> false
Error: Execution("1 failures")
error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests`

Caused by:
  process didn't exit successfully: `/Users/andrewlamb/Software/datafusion2/target/ci/deps/sqllogictests-7e2a8cc6115b158a` (exit status: 1)


// If there was no correctness error, check that the config is unchanged.
runner.shutdown_async().await;
take_config_change_result(&config_change_errors)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be a mutex? Is it because there is no way to get the errros back from the runner?

Maybe we could add some documentation about what is going on here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's due sqllogictest is an external dependency, and it's not able to directly access the inner DataFusion runner, and I can't find an easier way to do it.

Added comments to explain.

fn take_config_change_result(
config_change_errors: &DataFusionConfigChangeErrors,
) -> Result<()> {
let errors = std::mem::take(&mut *config_change_errors.lock().unwrap());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this just copies the strings anyways, it seems like the take is somewhat unecessary. Maybe you could just get a lock rather than trying to take it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, simplified.

@alamb
Copy link
Copy Markdown
Contributor

alamb commented Mar 24, 2026

I re-ran this locally and everything looks very nice -- thank you

@alamb alamb added this pull request to the merge queue Mar 24, 2026
Merged via the queue into apache:main with commit 98defe6 Mar 24, 2026
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants