Commit 81faba3
Basic Extension Type Registry Implementation (apache#20312)
## Which issue does this PR close?
This is a PR based on apache#18552
that contains a basic implementation of an extension type registry. The
driving use case is pretty-printing data frames with custom types.
- Closes apache#18223.
Ping @paleolimbot @adriangb if you're still interested.
### Most Important Changes to the Old PR
- We no longer use the Logical Type, as there is no real consensus on how
DataFusion should allow "inline" references to extension types. As a
consequence, the query plan formatting use case from the old PR no longer
works. Extension types can only be used where DataFusion has a reference
to a registry (e.g., DataFrame pretty-printing). @paleolimbot I've
called it `DFExtensionType` instead of `BoundExtensionType` to avoid
having to explain "bind". If you think there is merit in the other
term, let me know. Otherwise, I think this aligns with your proposal.
- Added a more complex example with a parameterized type to demonstrate
the full capabilities of the API.
- No extension types are registered by default; users must opt in.
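To make "parameterized type" concrete, here is a minimal, std-only sketch. It is not the example shipped in this PR: the type name `example.temperature`, the `Temperature` struct, and the plain-string parameter encoding are all hypothetical. Only the two metadata keys are taken from the Arrow extension type convention.

```rust
use std::collections::HashMap;

// Arrow field-metadata keys for extension types.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";
const EXTENSION_METADATA_KEY: &str = "ARROW:extension:metadata";

// Hypothetical parameter of the illustrative extension type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Unit {
    Celsius,
    Fahrenheit,
}

// Hypothetical parameterized extension type: one type name, many instances.
#[derive(Debug, PartialEq)]
struct Temperature {
    unit: Unit,
}

impl Temperature {
    // Builds the concrete type from field metadata, or returns None if the
    // field does not carry this extension type or the parameter is unknown.
    fn try_from_metadata(metadata: &HashMap<String, String>) -> Option<Self> {
        if metadata.get(EXTENSION_NAME_KEY)?.as_str() != "example.temperature" {
            return None;
        }
        let unit = match metadata.get(EXTENSION_METADATA_KEY)?.as_str() {
            "celsius" => Unit::Celsius,
            "fahrenheit" => Unit::Fahrenheit,
            _ => return None,
        };
        Some(Self { unit })
    }

    // Formats a storage value (assumed here to be an f64) using the parameter.
    fn format(&self, value: f64) -> String {
        match self.unit {
            Unit::Celsius => format!("{value:.1} °C"),
            Unit::Fahrenheit => format!("{value:.1} °F"),
        }
    }
}

fn main() {
    let mut metadata = HashMap::new();
    metadata.insert(EXTENSION_NAME_KEY.to_string(), "example.temperature".to_string());
    metadata.insert(EXTENSION_METADATA_KEY.to_string(), "celsius".to_string());

    match Temperature::try_from_metadata(&metadata) {
        Some(ty) => println!("{}", ty.format(21.5)),
        None => println!("21.5"), // no extension type: plain storage value
    }
}
```

The point of the sketch is that the same extension name can yield different formatting behavior depending on the serialized parameter, which is why a lookup by name alone is not enough.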
## Rationale for this change
- Allow customized behavior based on extension type metadata.
## What changes are included in this PR?
- Add an `ExtensionTypeRegistry`
- Add `DFArrayFormatterFactory` which creates custom pretty-printers
when formatting data frames.
- Add an extension type registry to the `SessionState` /
`SessionContext`
- A full example of using the API
- An implementation for the UUID canonical extension type
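The idea behind the items above can be sketched without DataFusion at all. The following std-only illustration is not the API added by this PR: `Registry`, `Formatter`, and `format_uuid` are hypothetical names. The only facts assumed are that Arrow marks extension fields with the `ARROW:extension:name` metadata key and that the canonical UUID type is named `arrow.uuid` with 16 bytes of storage.

```rust
use std::collections::HashMap;
use std::fmt::Write;

// Arrow's field-metadata key carrying the extension type name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

// Hypothetical formatter: renders one value's storage bytes, or None if
// the bytes are not valid for the type.
type Formatter = fn(&[u8]) -> Option<String>;

// Hypothetical stand-in for an extension type registry.
#[derive(Default)]
struct Registry {
    formatters: HashMap<String, Formatter>,
}

impl Registry {
    // Nothing is registered by default; callers opt in per type name.
    fn register(&mut self, name: &str, formatter: Formatter) {
        self.formatters.insert(name.to_string(), formatter);
    }

    // Uses the registered formatter if the field metadata names a known
    // extension type; otherwise falls back to plain hex of the storage.
    fn format(&self, metadata: &HashMap<String, String>, value: &[u8]) -> String {
        metadata
            .get(EXTENSION_NAME_KEY)
            .and_then(|name| self.formatters.get(name))
            .and_then(|formatter| formatter(value))
            .unwrap_or_else(|| hex(value))
    }
}

// Lowercase hex encoding of a byte slice.
fn hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        write!(out, "{b:02x}").unwrap();
    }
    out
}

// Formats 16 storage bytes in the usual 8-4-4-4-12 UUID layout.
fn format_uuid(bytes: &[u8]) -> Option<String> {
    if bytes.len() != 16 {
        return None;
    }
    let h = hex(bytes);
    Some(format!(
        "{}-{}-{}-{}-{}",
        &h[0..8], &h[8..12], &h[12..16], &h[16..20], &h[20..32]
    ))
}

fn main() {
    let mut metadata = HashMap::new();
    metadata.insert(EXTENSION_NAME_KEY.to_string(), "arrow.uuid".to_string());
    let value: [u8; 16] = [
        0x12, 0x3e, 0x45, 0x67, 0xe8, 0x9b, 0x12, 0xd3,
        0xa4, 0x56, 0x42, 0x66, 0x14, 0x17, 0x40, 0x00,
    ];

    let mut registry = Registry::default();
    println!("{}", registry.format(&metadata, &value)); // fallback: raw hex

    registry.register("arrow.uuid", format_uuid);
    println!("{}", registry.format(&metadata, &value)); // hyphenated UUID
}
```

Keeping the lookup behind a registry that the session owns is what lets pretty-printing pick up custom types without the core formatter knowing about any of them.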
## Are these changes tested?
- Yes, but only two end-to-end tests.
  - One for pretty-printing UUID values
  - One for pretty-printing in the example

Happy to add more tests if this PR has a chance of being merged.
## Are there any user-facing changes?
Yes, the entire Extension Type API is new.

1 parent a0869e9 commit 81faba3
26 files changed
Lines changed: 1209 additions & 20 deletions
File tree
- datafusion-examples
  - examples/extension_types
- datafusion
  - common
    - src/types
      - canonical_extensions
  - core
    - src
      - dataframe
      - datasource
      - execution
    - tests
      - extension_types
  - datasource-arrow/src
  - datasource/src
  - expr
    - src
      - extension_types
  - ffi/src/session
  - session/src