Commit e80076b

Add higher-order functions changes to upgrade guide (#22107)
## Which issue does this PR close?

Follow up of #21679

## Rationale for this change

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?
1 parent 6b134dd commit e80076b

1 file changed

Lines changed: 155 additions & 4 deletions

docs/source/library-user-guide/upgrading/54.0.0.md

@@ -751,10 +751,10 @@ SET datafusion.runtime.file_statistics_cache_limit = '0'
751751

752752
Use the file statistics cache provided by the `CacheManager` when initializing a new `ListingTable`:
753753

754-
````rust,ignore
754+
```rust,ignore
755755
ListingTable::try_new(config)?
756756
.with_cache(ctx.runtime_env().cache_manager.get_file_statistic_cache())
757-
[#21075]: https://github.com/apache/datafusion/pull/21075
757+
```
758758

759759
### `UnionsToFilter` optimizer rule is now disabled by default
760760

@@ -771,7 +771,7 @@ SELECT * FROM t WHERE a = 2
771771

772772
-- After: one scan
773773
SELECT DISTINCT * FROM t WHERE a = 1 OR a = 2
774-
````
774+
```
775775

776776
**Who is affected:**
777777

@@ -788,7 +788,158 @@ a combined `OR` (e.g., index-backed sources).
788788
SET datafusion.optimizer.enable_unions_to_filter = true;
789789
```
790790

791-
### The `RegisterFunction` enum has a new `HigherOrder` variant
791+
See [PR #21075](https://github.com/apache/datafusion/pull/21075) for more details.
792+
793+
### Higher-order functions and lambdas
794+
795+
The changes below are related to the added support for higher-order functions
796+
and lambdas. For more info, see [#14205], [PR #18921], [PR #21679] and
797+
[EPIC #21172].
798+
799+
[#14205]: https://github.com/apache/datafusion/issues/14205
800+
[pr #18921]: https://github.com/apache/datafusion/pull/18921
801+
[pr #21679]: https://github.com/apache/datafusion/pull/21679
802+
[epic #21172]: https://github.com/apache/datafusion/issues/21172
803+
804+
#### `FunctionRegistry` exposes two additional methods
805+
806+
`FunctionRegistry` exposes two additional methods, `higher_order_function`
807+
which returns the registered higher-order function with the given name, if
808+
any, and `higher_order_function_names` which exposes the set of registered
809+
user-defined higher-order function names.
810+
811+
**Who is affected:**
812+
813+
- Users who implement the `FunctionRegistry` trait
814+
815+
**Migration guide:**
816+
817+
Add `higher_order_function` and `higher_order_function_names` to your implementation.
818+
819+
```diff
820+
impl FunctionRegistry for FunctionRegistryImpl {
821+
fn udfs(&self) -> HashSet<String> {
822+
self.scalar_functions.keys().cloned().collect()
823+
}
824+
+
825+
+ fn higher_order_function(&self, name: &str) -> Result<Arc<dyn HigherOrderUDF>> {
826+
+ self.higher_order_functions
827+
+ .get(name)
828+
+ .cloned()
829+
+ .ok_or_else(|| plan_datafusion_err!("Higher-order function {name} not found"))
830+
+ }
831+
+
832+
+ fn higher_order_function_names(&self) -> HashSet<String> {
833+
+ self.higher_order_functions.keys().cloned().collect()
834+
+ }
835+
}
836+
```
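
The lookup-or-error pattern in the diff above can be tried in isolation. The following is a minimal sketch using only the standard library: `HigherOrderUdf`, `ArrayTransform`, `Registry`, and the `String` error type are hypothetical stand-ins, not DataFusion's `HigherOrderUDF`, `FunctionRegistry`, or `Result` types, and the function name `array_transform` is chosen purely for illustration.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

// Hypothetical, simplified stand-in for DataFusion's `HigherOrderUDF` trait.
trait HigherOrderUdf {
    fn name(&self) -> &str;
}

// Hypothetical higher-order function used only for this sketch.
struct ArrayTransform;

impl HigherOrderUdf for ArrayTransform {
    fn name(&self) -> &str {
        "array_transform"
    }
}

// Hypothetical registry holding higher-order functions keyed by name.
#[derive(Default)]
struct Registry {
    higher_order_functions: HashMap<String, Arc<dyn HigherOrderUdf>>,
}

impl Registry {
    // Stores the function under its own name, replacing any previous entry.
    fn register(&mut self, func: Arc<dyn HigherOrderUdf>) {
        self.higher_order_functions
            .insert(func.name().to_string(), func);
    }

    // Mirrors the shape of `higher_order_function`: a hit clones the `Arc`,
    // a miss becomes an error (a plain `String` here instead of a plan error).
    fn higher_order_function(&self, name: &str) -> Result<Arc<dyn HigherOrderUdf>, String> {
        self.higher_order_functions
            .get(name)
            .cloned()
            .ok_or_else(|| format!("Higher-order function {name} not found"))
    }

    // Mirrors the shape of `higher_order_function_names`.
    fn higher_order_function_names(&self) -> HashSet<String> {
        self.higher_order_functions.keys().cloned().collect()
    }
}
```

Cloning the `Arc` keeps lookups cheap, since only the reference count changes and the registry retains ownership of the function object.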
837+
838+
#### `ContextProvider` exposes two additional methods
839+
840+
`ContextProvider` exposes two additional methods, `get_higher_order_meta`
841+
which returns the registered higher-order function with the given name,
842+
if any, and `higher_order_function_names` which exposes the registered
843+
user-defined higher-order function names.
844+
845+
**Who is affected:**
846+
847+
- Users who implement the `ContextProvider` trait
848+
849+
**Migration guide:**
850+
851+
Add `get_higher_order_meta` and `higher_order_function_names` to your implementation.
852+
853+
```diff
854+
impl ContextProvider for ContextProviderImpl {
855+
fn udfs(&self) -> HashSet<String> {
856+
self.scalar_functions.keys().cloned().collect()
857+
}
858+
+
859+
+ fn get_higher_order_meta(&self, name: &str) -> Option<Arc<dyn HigherOrderUDF>> {
860+
+ self.higher_order_functions.get(name).cloned()
861+
+ }
862+
+
863+
+ fn higher_order_function_names(&self) -> Vec<String> {
864+
+ self.higher_order_functions.keys().cloned().collect()
865+
+ }
866+
}
867+
```
868+
869+
#### Add `higher_order_functions()` method to `Session`
870+
871+
The `higher_order_functions` method has been added to the `Session` trait,
872+
which exposes the registered user-defined higher-order functions.
873+
874+
**Who is affected:**
875+
876+
- Users who implement the `Session` trait
877+
878+
**Migration guide:**
879+
880+
Add `higher_order_functions` to your implementation.
881+
882+
```diff
883+
impl Session for MySession {
884+
...
885+
+ fn higher_order_functions(&self) -> &HashMap<String, Arc<dyn HigherOrderUDF>> {
886+
+ &self.higher_order_functions
887+
+ }
888+
}
889+
```
890+
891+
#### New argument on `TaskContext::new`
892+
893+
`TaskContext::new` expects a new argument, `higher_order_functions`, which is
894+
a map of higher-order functions keyed by name.
895+
896+
**Who is affected:**
897+
898+
- Users who call `TaskContext::new`
899+
900+
**Migration guide:**
901+
902+
Provide the new argument to the function. An empty hash map is sufficient.
903+
904+
```diff
905+
+let higher_order_functions = HashMap::new();
906+
907+
TaskContext::new(
908+
task_id,
909+
session_id,
910+
session_config,
911+
scalar_functions,
912+
+ higher_order_functions,
913+
aggregate_functions,
914+
window_functions,
915+
runtime,
916+
)
917+
```
918+
919+
#### The `Expr` enum has three new variants
920+
921+
- `HigherOrderFunction`: A call to a higher-order function with a set of arguments
922+
- `Lambda`: A lambda expression with a set of parameter names and a body
923+
- `LambdaVariable`: A named reference to a lambda parameter
924+
925+
**Who is affected:**
926+
927+
- Users who match on an `Expr` without a default branch `_ => {}`
928+
929+
**Migration guide:**
930+
931+
Add the new branches to the match with the logic applicable to the context.
932+
933+
```diff
934+
match expr {
935+
Expr::Column(column) => ...,
936+
...,
937+
+ Expr::HigherOrderFunction(func) => {},
938+
+ Expr::Lambda(lambda) => {},
939+
+ Expr::LambdaVariable(lambda_var) => {},
940+
```
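
To illustrate why an exhaustive `match` stops compiling when variants are added, here is a minimal sketch built on a stand-in enum. Only the variant names mirror the guide; the payload shapes (`name`, `args`, `params`, `body`) and the helper `count_columns` are assumptions made for this example, not the layout of DataFusion's `Expr`.

```rust
// Hypothetical stand-in for DataFusion's `Expr`; payloads are assumed.
#[allow(dead_code)]
#[derive(Debug)]
enum Expr {
    Column(String),
    HigherOrderFunction { name: String, args: Vec<Expr> },
    Lambda { params: Vec<String>, body: Box<Expr> },
    LambdaVariable(String),
}

// Counts column references. There is no `_` arm, so removing any of the
// three new arms makes the compiler reject the match as non-exhaustive.
fn count_columns(expr: &Expr) -> usize {
    match expr {
        Expr::Column(_) => 1,
        // Recurse into every argument, which may itself be a lambda.
        Expr::HigherOrderFunction { args, .. } => args.iter().map(count_columns).sum(),
        // A lambda body may still reference outer columns.
        Expr::Lambda { body, .. } => count_columns(body),
        // A lambda parameter is not a column reference.
        Expr::LambdaVariable(_) => 0,
    }
}
```

In a tree walk like this one, the new variants usually need explicit recursion rather than empty arms, since a lambda body can contain any expression.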
941+
942+
#### The `RegisterFunction` enum has a new `HigherOrder` variant
792943

793944
`RegisterFunction` now has a `HigherOrder(Arc<dyn HigherOrderUDF>)` variant
794945
so user-defined higher-order functions can be registered
