Commit 3950f0b

generate change log (#1485)
1 parent 5e98229 commit 3950f0b

1 file changed

Lines changed: 74 additions & 0 deletions

CHANGELOG.md

@@ -19,6 +19,80 @@
# Changelog

## [52.0.0](https://github.com/apache/datafusion-ballista/tree/52.0.0) (2026-03-02)

**Performance related:**

- perf: optimize shuffle writer with buffered I/O and fix file size bug [#1386](https://github.com/apache/datafusion-ballista/pull/1386) (andygrove)

**Implemented enhancements:**

- feat: add config option for skipping arrow ipc read validation [#1374](https://github.com/apache/datafusion-ballista/pull/1374) (killzoner)
- feat: improve tpch benchmark CLI [#1391](https://github.com/apache/datafusion-ballista/pull/1391) (andygrove)
- feat: Add sort-based shuffle implementation [#1389](https://github.com/apache/datafusion-ballista/pull/1389) (andygrove)
- feat: New ballista python interface [#1338](https://github.com/apache/datafusion-ballista/pull/1338) (milenkovicm)
- feat: Add batch coalescing ability to shuffle reader exec [#1380](https://github.com/apache/datafusion-ballista/pull/1380) (danielhumanmod)
- feat: Add arrow flight proxy to scheduler [#1351](https://github.com/apache/datafusion-ballista/pull/1351) (sebbegg)
- feat: Creating SubstraitSchedulerClient and standalone Substrait examples [#1376](https://github.com/apache/datafusion-ballista/pull/1376) (mattcuento)
- feat: Cluster RPC customisations to support TLS and custom headers [#1400](https://github.com/apache/datafusion-ballista/pull/1400) (phillipleblanc)
- feat: add -c config override flag to tpch benchmark [#1435](https://github.com/apache/datafusion-ballista/pull/1435) (andygrove)
- feat: Extract `execution_graph` to a trait [#1361](https://github.com/apache/datafusion-ballista/pull/1361) (milenkovicm)
- feat: Add spark-compat mode to integrate datafusion-spark features au… [#1416](https://github.com/apache/datafusion-ballista/pull/1416) (mattcuento)
- feat: add `Dataframe.cache()` factory (no planner handling) [#1420](https://github.com/apache/datafusion-ballista/pull/1420) (killzoner)
- feat: Adaptive query execution (AQE) planner fundamentals [#1372](https://github.com/apache/datafusion-ballista/pull/1372) (milenkovicm)
- feat: Make push scheduling policy default as it has lower latency [#1461](https://github.com/apache/datafusion-ballista/pull/1461) (milenkovicm)
- feat: job scheduling with push based job status updates [#1478](https://github.com/apache/datafusion-ballista/pull/1478) (milenkovicm)

**Fixed bugs:**

- fix: compile issue after unsuccessful merge [#1402](https://github.com/apache/datafusion-ballista/pull/1402) (milenkovicm)
- fix: prost build keda and TLS RPC example [#1429](https://github.com/apache/datafusion-ballista/pull/1429) (killzoner)
- fix: remove `scheduler_config_spec.toml` as it is unused [#1462](https://github.com/apache/datafusion-ballista/pull/1462) (milenkovicm)
- fix: Don't use `maxrows` as a "fetched rows" but calculate it from the batches [#1480](https://github.com/apache/datafusion-ballista/pull/1480) (martin-g)

**Documentation updates:**

- docs: fix outdated content in documentation [#1385](https://github.com/apache/datafusion-ballista/pull/1385) (andygrove)
- docs: use tpchgen-rs for TPC-H data generation [#1390](https://github.com/apache/datafusion-ballista/pull/1390) (andygrove)
- docs: add Jupyter notebook support documentation [#1399](https://github.com/apache/datafusion-ballista/pull/1399) (andygrove)
- chore: Document ballista features in README.md [#1418](https://github.com/apache/datafusion-ballista/pull/1418) (mattcuento)

**Merged pull requests:**

- feat: add config option for skipping arrow ipc read validation [#1374](https://github.com/apache/datafusion-ballista/pull/1374) (killzoner)
- docs: fix outdated content in documentation [#1385](https://github.com/apache/datafusion-ballista/pull/1385) (andygrove)
- restrict python CI to python directory [#1383](https://github.com/apache/datafusion-ballista/pull/1383) (Huy1Ng)
- perf: optimize shuffle writer with buffered I/O and fix file size bug [#1386](https://github.com/apache/datafusion-ballista/pull/1386) (andygrove)
- docs: use tpchgen-rs for TPC-H data generation [#1390](https://github.com/apache/datafusion-ballista/pull/1390) (andygrove)
- feat: improve tpch benchmark CLI [#1391](https://github.com/apache/datafusion-ballista/pull/1391) (andygrove)
- doc: Add Ballista extensions example to the docs. [#1382](https://github.com/apache/datafusion-ballista/pull/1382) (LouisBurke)
- feat: Add sort-based shuffle implementation [#1389](https://github.com/apache/datafusion-ballista/pull/1389) (andygrove)
- feat: New ballista python interface [#1338](https://github.com/apache/datafusion-ballista/pull/1338) (milenkovicm)
- doc: add more details for protobuf extension [#1393](https://github.com/apache/datafusion-ballista/pull/1393) (LouisBurke)
- feat: Add batch coalescing ability to shuffle reader exec [#1380](https://github.com/apache/datafusion-ballista/pull/1380) (danielhumanmod)
- docs: add Jupyter notebook support documentation [#1399](https://github.com/apache/datafusion-ballista/pull/1399) (andygrove)
- feat: Add arrow flight proxy to scheduler [#1351](https://github.com/apache/datafusion-ballista/pull/1351) (sebbegg)
- chore: update datafusion to 52 [#1394](https://github.com/apache/datafusion-ballista/pull/1394) (killzoner)
- feat: Creating SubstraitSchedulerClient and standalone Substrait examples [#1376](https://github.com/apache/datafusion-ballista/pull/1376) (mattcuento)
- fix: compile issue after unsuccessful merge [#1402](https://github.com/apache/datafusion-ballista/pull/1402) (milenkovicm)
- feat: Cluster RPC customisations to support TLS and custom headers [#1400](https://github.com/apache/datafusion-ballista/pull/1400) (phillipleblanc)
- chore: Document ballista features in README.md [#1418](https://github.com/apache/datafusion-ballista/pull/1418) (mattcuento)
- fix: prost build keda and TLS RPC example [#1429](https://github.com/apache/datafusion-ballista/pull/1429) (killzoner)
- Improve sort-based shuffle: single spill file per partition and batch coalescing [#1431](https://github.com/apache/datafusion-ballista/pull/1431) (andygrove)
- feat: add -c config override flag to tpch benchmark [#1435](https://github.com/apache/datafusion-ballista/pull/1435) (andygrove)
- feat: Extract `execution_graph` to a trait [#1361](https://github.com/apache/datafusion-ballista/pull/1361) (milenkovicm)
- chore: add confirmation before tarball is released [#1445](https://github.com/apache/datafusion-ballista/pull/1445) (milenkovicm)
- minor: add test to cover IPC arrow file read [#1450](https://github.com/apache/datafusion-ballista/pull/1450) (milenkovicm)
- feat: Add spark-compat mode to integrate datafusion-spark features au… [#1416](https://github.com/apache/datafusion-ballista/pull/1416) (mattcuento)
- feat: add `Dataframe.cache()` factory (no planner handling) [#1420](https://github.com/apache/datafusion-ballista/pull/1420) (killzoner)
- fix: remove `scheduler_config_spec.toml` as it is unused [#1462](https://github.com/apache/datafusion-ballista/pull/1462) (milenkovicm)
- feat: Adaptive query execution (AQE) planner fundamentals [#1372](https://github.com/apache/datafusion-ballista/pull/1372) (milenkovicm)
- feat: Make push scheduling policy default as it has lower latency [#1461](https://github.com/apache/datafusion-ballista/pull/1461) (milenkovicm)
- minor: improve log statements [#1482](https://github.com/apache/datafusion-ballista/pull/1482) (milenkovicm)
- chore: update datafusion to 52.2 and other deps to latest [#1483](https://github.com/apache/datafusion-ballista/pull/1483) (milenkovicm)
- fix: Don't use `maxrows` as a "fetched rows" but calculate it from the batches [#1480](https://github.com/apache/datafusion-ballista/pull/1480) (martin-g)
- feat: job scheduling with push based job status updates [#1478](https://github.com/apache/datafusion-ballista/pull/1478) (milenkovicm)

## [51.0.0](https://github.com/apache/datafusion-ballista/tree/51.0.0) (2026-01-11)

**Implemented enhancements:**
