Commit f63e2c6
feat(arrow/array): Add new arreflect package (#771)
### Rationale for this change
This PR attempts to address apache/arrow-adbc#4185:
there is no built-in way to convert between Arrow arrays/records and
native Go objects and types using reflection. Users currently must
manually construct builders, iterate columns and handle type mapping for
their own schemas. Some other Arrow implementations (e.g. pyarrow) offer
higher-level APIs for this, so we can close the gap for Go.
### What changes are included in this PR?
Adds a new opt-in sub-package `arrow/array/arreflect` providing
bidirectional Go↔Arrow conversion via reflection.
**Public API**:
- `At[T]`, `ToSlice[T]` — Arrow array → Go value/slice
- `FromSlice[T]` — Go slice → Arrow array (variadic `Option` for
dict/listview/ree/decimal/temporal overrides)
- `RecordToSlice[T]`, `RecordFromSlice[T]` — `RecordBatch` ↔ Go struct
slices
- `RecordAt[T]`, `RecordAtAny` — single-row record accessors (typed and
runtime-inferred)
- `RecordToAnySlice` — runtime-inferred full-record conversion (no
compile-time Go type needed)
- `InferSchema[T]`, `InferType[T]` — infer `*arrow.Schema` /
`arrow.DataType` from Go types
- `InferGoType` — invert Arrow→Go type mapping at runtime via
`reflect.StructOf`
- `AtAny`, `ToAnySlice` — dynamic accessors when the Go type is not
known at compile time
- `WithDict()`, `WithListView()`, `WithREE()`, `WithDecimal(p,s)`,
`WithTemporal(s)` — encoding options
- Sentinel errors `ErrUnsupportedType`, `ErrTypeMismatch` (usable with
`errors.Is`)
**Supported Arrow types**: all primitives,
Timestamp/Date32/Date64/Time32/Time64/Duration, Decimal32/64/128/256,
Struct, List/ListView, LargeList/LargeListView (read only), FixedSizeList,
Map, Dictionary (`dict` tag), RunEndEncoded (`ree` tag).
**Struct tag control** (follows `encoding/json` conventions):
```go
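type Row struct {
	Name  string         `arrow:"name"`                // column named "name"
	Score float64        `arrow:"score"`               // column named "score"
	Skip  string         `arrow:"-"`                   // excluded, as with encoding/json
	Enc   string         `arrow:"enc,dict"`            // dictionary-encoded
	When  time.Time      `arrow:"when,date32"`         // temporal override: Date32
	Vals  []int          `arrow:"vals,listview"`       // ListView instead of List
	Price decimal128.Num `arrow:"price,decimal(18,2)"` // precision 18, scale 2
}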
```
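
A tag such as `price,decimal(18,2)` contains a comma inside parentheses, so splitting on every comma would break the decimal option in two. The following is a minimal sketch of parenthesis-aware splitting using only the standard library. The `splitTokens` helper is a hypothetical name for illustration and is not the package's own `splitTagTokens`.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTokens splits a tag value on commas that are not inside parentheses.
// Hypothetical helper for illustration only.
func splitTokens(tag string) []string {
	var tokens []string
	depth, start := 0, 0
	for i, r := range tag {
		switch r {
		case '(':
			depth++
		case ')':
			if depth > 0 {
				depth--
			}
		case ',':
			// Only a top-level comma ends the current token.
			if depth == 0 {
				tokens = append(tokens, strings.TrimSpace(tag[start:i]))
				start = i + 1
			}
		}
	}
	return append(tokens, strings.TrimSpace(tag[start:]))
}

func main() {
	fmt.Printf("%q\n", splitTokens("price,decimal(18,2)")) // ["price" "decimal(18,2)"]
	fmt.Printf("%q\n", splitTokens("enc,dict"))            // ["enc" "dict"]
}
```

Tracking nesting depth keeps each option intact in a single pass, with no need to rejoin fragments afterwards.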
**Key implementation details**:
- Pointer fields → nullable Arrow fields (nil = null); multi-level
pointers fully dereferenced
- Embedded struct fields promoted following `encoding/json` BFS rules
(`collectFieldCandidates` + `resolveFieldCandidates`)
- Struct metadata cached per type via `sync.Map`
- `WithTemporal` validates input, returning `ErrUnsupportedType` for
unrecognized values
- `FromSlice` empty-slice path applies all encoding options consistently
with the non-empty path (decimal, temporal, dict, listview, ree)
- Tag parsing uses parenthesis-aware `splitTagTokens` for decimal(p,s) —
no fragile comma reassembly
- `InferGoType` validates all runes of exported field names, rejects
non-identifier characters (hyphens, dots, spaces, digit prefixes), and
detects duplicate exported names after capitalization
- `validateDictValueType` enforced on all dict paths (struct tags,
`FromSlice` opts, empty-slice)
- Primitive types cached as package-level `reflect.Type` vars
- Internal duplication minimized via helpers: `asTime`/`asDuration`
(TypeAssert), `appendListElement` (list builder dispatch with checked
type assertion), `listLike` interface (Elem() unification)
- Large list variants (`LARGE_LIST`, `LARGE_LIST_VIEW`) supported for
reading but not produced by `FromSlice`
### Are these changes tested?
Yes, comprehensive test coverage along with testable examples that will
show up in the docs.
### Are there any user-facing changes?
Yes, the entirely new public API in the new `arrow/array/arreflect` package.

1 parent: c5f0943
13 files changed
Lines changed: 7658 additions & 0 deletions