Upgrade DataFusion to 54 #8044
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | compare[63] | 244.6 µs | 360.3 µs | -32.11% |
| ❌ | Simulation | compare[62] | 254.7 µs | 368.5 µs | -30.89% |
| ❌ | Simulation | compare[56] | 229.5 µs | 332 µs | -30.89% |
| ❌ | Simulation | compare[60] | 248.3 µs | 358.4 µs | -30.71% |
| ❌ | Simulation | compare[61] | 255.4 µs | 367.3 µs | -30.46% |
| ❌ | Simulation | compare[58] | 245.5 µs | 351.9 µs | -30.24% |
| ❌ | Simulation | compare[59] | 250.8 µs | 359 µs | -30.14% |
| ❌ | Simulation | compare[57] | 246.1 µs | 350.5 µs | -29.79% |
| ❌ | Simulation | compare[54] | 236.2 µs | 335.1 µs | -29.51% |
| ❌ | Simulation | compare[55] | 241.1 µs | 341.9 µs | -29.47% |
| ❌ | Simulation | compare[52] | 230 µs | 325.1 µs | -29.27% |
| ❌ | Simulation | compare[48] | 212.2 µs | 300 µs | -29.24% |
| ❌ | Simulation | compare[53] | 236 µs | 333 µs | -29.13% |
| ❌ | Simulation | compare[50] | 226.9 µs | 318.4 µs | -28.72% |
| ❌ | Simulation | compare[51] | 232 µs | 325.3 µs | -28.68% |
| ❌ | Simulation | compare[49] | 227.5 µs | 317.1 µs | -28.25% |
| ❌ | Simulation | compare[46] | 217.9 µs | 301.9 µs | -27.81% |
| ❌ | Simulation | compare[47] | 222.8 µs | 308.6 µs | -27.81% |
| ❌ | Simulation | compare[44] | 211.3 µs | 291.5 µs | -27.53% |
| ❌ | Simulation | compare[45] | 218.2 µs | 300.3 µs | -27.34% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
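As a rough sanity check on the table, the Efficiency column appears to equal BASE / HEAD - 1. This formula is inferred from the displayed numbers and is not taken from CodSpeed documentation. Some rows differ in the second decimal place, which is consistent with the timings being rounded for display. A minimal Rust sketch, where the function name `efficiency_change_pct` is hypothetical:

```rust
/// Relative change in percent, assuming the Efficiency column is BASE / HEAD - 1.
/// This is an inferred formula, not a documented CodSpeed definition.
/// A negative result means HEAD is slower than BASE.
fn efficiency_change_pct(base_us: f64, head_us: f64) -> f64 {
    (base_us / head_us - 1.0) * 100.0
}
```

For the compare[63] row, `efficiency_change_pct(244.6, 360.3)` comes out within 0.01 of the reported -32.11%.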
Comparing adamg/df-54 (585236e) with develop (5e3aedb)
Polar Signals Profiling Results: Latest Run
Previous Runs (4)
Powered by Polar Signals Cloud
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.104x ❌
datafusion / vortex-file-compressed (1.104x ❌, 0↑ 5↓)
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.008x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.019x ➖, 0↑ 0↓)
datafusion / parquet (0.988x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.059x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.040x ➖, 0↑ 0↓)
duckdb / parquet (1.058x ➖, 0↑ 1↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.983x ➖, 1↑ 1↓)
datafusion / vortex-compact (1.020x ➖, 1↑ 3↓)
datafusion / parquet (1.021x ➖, 1↑ 3↓)
datafusion / arrow (0.978x ➖, 4↑ 3↓)
duckdb / vortex-file-compressed (1.000x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.011x ➖, 0↑ 0↓)
duckdb / parquet (1.022x ➖, 1↑ 4↓)
duckdb / duckdb (0.997x ➖, 0↑ 0↓)
File Size Changes (11 files changed, +0.1% overall, 6↑ 5↓)
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.968x ➖, 13↑ 4↓)
datafusion / vortex-compact (0.973x ➖, 10↑ 4↓)
datafusion / parquet (0.977x ➖, 6↑ 5↓)
duckdb / vortex-file-compressed (1.006x ➖, 0↑ 3↓)
duckdb / vortex-compact (1.003x ➖, 0↑ 2↓)
duckdb / parquet (1.005x ➖, 0↑ 0↓)
duckdb / duckdb (0.997x ➖, 0↑ 0↓)
File Size Changes (6 files changed, -0.0% overall, 1↑ 5↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.057x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.046x ➖, 0↑ 0↓)
duckdb / parquet (1.042x ➖, 0↑ 0↓)
File Size Changes (1 files changed, +0.0% overall, 1↑ 0↓)
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.091x ➖, 0↑ 2↓)
datafusion / vortex-compact (1.038x ➖, 0↑ 0↓)
datafusion / parquet (1.132x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.131x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.086x ➖, 0↑ 1↓)
duckdb / parquet (1.090x ➖, 0↑ 0↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.940x ➖, 2↑ 1↓)
datafusion / vortex-compact (0.958x ➖, 5↑ 1↓)
datafusion / parquet (0.984x ➖, 1↑ 1↓)
datafusion / arrow (0.947x ➖, 8↑ 5↓)
duckdb / vortex-file-compressed (0.989x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.997x ➖, 0↑ 0↓)
duckdb / parquet (0.998x ➖, 0↑ 0↓)
duckdb / duckdb (0.991x ➖, 0↑ 0↓)
File Size Changes (27 files changed, -0.0% overall, 13↑ 14↓)
File Sizes: TPC-H SF=10 on NVME
File Size Changes (48 files changed, -0.0% overall, 0↑ 48↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.889x ✅, 13↑ 2↓)
datafusion / parquet (0.820x ✅, 21↑ 0↓)
duckdb / vortex-file-compressed (1.062x ➖, 1↑ 10↓)
duckdb / parquet (1.008x ➖, 2↑ 0↓)
duckdb / duckdb (1.015x ➖, 1↑ 1↓)
File Size Changes (106 files changed, -0.0% overall, 49↑ 57↓)
File Sizes: Clickbench on NVME
File Size Changes (201 files changed, -0.0% overall, 0↑ 201↓)
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.057x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.137x ➖, 0↑ 7↓)
datafusion / parquet (1.071x ➖, 0↑ 2↓)
duckdb / vortex-file-compressed (1.105x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.146x ➖, 0↑ 4↓)
duckdb / parquet (1.099x ➖, 0↑ 0↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.039x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.100x ➖, 0↑ 5↓)
datafusion / parquet (1.034x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.096x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.130x ➖, 0↑ 1↓)
duckdb / parquet (1.159x ➖, 0↑ 0↓)
Benchmarks: Random Access
Vortex (geomean): 0.952x ➖
unknown / unknown (1.012x ➖, 11↑ 6↓)
Benchmarks: Compression
Vortex (geomean): 1.003x ➖
unknown / unknown (1.042x ➖, 2↑ 23↓)
Benchmarks: Appian on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.124x ❌, 0↑ 3↓)
datafusion / parquet (1.101x ❌, 1↑ 2↓)
duckdb / vortex-file-compressed (1.121x ❌, 0↑ 6↓)
duckdb / parquet (1.095x ➖, 0↑ 3↓)
duckdb / duckdb (1.096x ➖, 0↑ 4↓)
File Size Changes (4 files changed, -0.0% overall, 1↑ 3↓)
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
        return Ok(cast(child, cast_dtype));
    }

    if let Some(cast_col_expr) = df.as_any().downcast_ref::<df_expr::CastColumnExpr>() {
Why can we remove the cast expressions?
CastColumn was removed, and its functionality was merged into Cast.
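To illustrate why a single branch is enough once the cast-column case is folded into the regular cast, here is a minimal, self-contained sketch of the `as_any().downcast_ref::<T>()` dispatch pattern seen in the diff. The trait and structs below are hypothetical stand-ins written for this example; they are not DataFusion's or Vortex's real types, and the string output only stands in for the real expression conversion.

```rust
use std::any::Any;

// Hypothetical stand-in for a physical expression trait (NOT DataFusion's real trait).
trait PhysicalExpr: Any {
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical stand-in for a column reference.
struct Column {
    name: String,
}

// Hypothetical stand-in for the single remaining cast node.
struct CastExpr {
    child: Box<dyn PhysicalExpr>,
    target: String,
}

impl PhysicalExpr for Column {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl PhysicalExpr for CastExpr {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Convert an expression tree into a readable string by trying each
// concrete type in turn. With only one cast node type, one branch
// covers every cast; no separate cast-column branch is needed.
fn convert(expr: &dyn PhysicalExpr) -> Result<String, String> {
    if let Some(cast) = expr.as_any().downcast_ref::<CastExpr>() {
        let child = convert(&*cast.child)?;
        return Ok(format!("cast({}, {})", child, cast.target));
    }
    if let Some(col) = expr.as_any().downcast_ref::<Column>() {
        return Ok(col.name.clone());
    }
    Err("unsupported expression".to_string())
}
```

Under these assumptions, converting a `CastExpr` that wraps `Column { name: "a" }` with target `"Int64"` goes through the one cast branch and then recurses into the column branch.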
Summary
This PR upgrades our DataFusion dependency/integration to the upcoming 54 release. It aims to make the minimal set of changes; implementing the new Morselizer API will be part of a future PR (I have an old PR that was based on an earlier PoC, and I'll try to pull from it when the time comes).
54.0.0 (Apr 2026 / May 2026): apache/datafusion#21080