Commit d412ba5

Speedup push_down_filter_regression.slt by using uncompressed parquet (#20652)
## Which issue does this PR close?

- Closes #20524

## Rationale for this change

`push_down_filter_regression.slt` is the sqllogictest that takes the longest to run, even after @Tim-53 reduced its time in

- #20586

While reviewing #20586 and trying to make the sqllogictest runs faster, I noticed that a substantial amount of the unit test time was spent doing zstd compression/decompression:

<img width="2423" height="841" alt="Screenshot 2026-03-02 at 12 50 24 PM" src="https://github.com/user-attachments/assets/75cfe12b-3bb2-4ffa-9c36-63ca00b8c3ff" />

Thus, we can improve the test speed by skipping the zstd step.

## What changes are included in this PR?

1. Don't compress the parquet files in the test

## Are these changes tested?

Yes, by CI.

Here are my performance runs using @kosiew's new timing feature:

```shell
cargo test --profile=ci --test sqllogictests -- --timing-summary top
```

Main:

```
Per-file elapsed summary (deterministic):
1. 4.035s push_down_filter_regression.slt <-- takes over 4 seconds
2. 3.573s joins.slt
3. 3.492s aggregate.slt
4. 3.316s imdb.slt
5.
```

This PR:

```
Per-file elapsed summary (deterministic):
1. 3.308s aggregate.slt
2. 3.290s joins.slt
3. 3.181s imdb.slt
4. 2.914s push_down_filter_regression.slt <--- takes less than 3 seconds and is no longer the tallest pole
```

## Are there any user-facing changes?

Faster tests
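For a rough sense of scale, the two timings reported for `push_down_filter_regression.slt` (4.035s on main, 2.914s with this PR) can be turned into a percentage reduction. The sketch below is only illustrative arithmetic on those two numbers; the helper name `pct_reduction` is hypothetical and not part of the DataFusion tooling.

```shell
# Hypothetical helper (not part of DataFusion): percentage reduction
# from a "before" elapsed time to an "after" elapsed time, in seconds.
pct_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Timings taken from the summaries above (main vs. this PR).
pct_reduction 4.035 2.914   # prints 27.8
```

These are single runs on one machine, so the figure is best read as an approximate indication rather than a benchmark result.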
1 parent b092bd4 commit d412ba5

1 file changed: +4 -2 lines changed

datafusion/sqllogictest/test_files/push_down_filter_regression.slt

Lines changed: 4 additions & 2 deletions
@@ -21,7 +21,8 @@
 query I
 COPY (select i as k, i as v from generate_series(1, 10000000) as t(i))
 TO 'test_files/scratch/push_down_filter_regression/t2.parquet'
-STORED AS PARQUET;
+STORED AS PARQUET
+OPTIONS ('format.compression' 'uncompressed');
 ----
 10000000

@@ -92,7 +93,8 @@ COPY (
 SELECT arrow_cast('2025-01-01T00:00:00Z'::timestamptz, 'Timestamp(Microsecond, Some("UTC"))') AS start_timestamp
 )
 TO 'test_files/scratch/push_down_filter_regression/17512.parquet'
-STORED AS PARQUET;
+STORED AS PARQUET
+OPTIONS ('format.compression' 'uncompressed');
 ----
 1
