
Commit a0d4293

fix(sqlite): reconstruct all basic payload types + validate at build time (#27)
* fix(sqlite): reconstruct LargeUtf8 and Utf8View on read-back

* fix(sqlite): reconstruct List<Utf8View> on read-back

* fix(sqlite): use LargeStringBuilder for List<LargeUtf8> read-back

* fix(sqlite): reconstruct LargeList payloads on read-back

* test(sqlite): cover null and empty list cells on read-back

* feat(sqlite): validate payload types at build time

PR #27 closed the reachable LargeUtf8/Utf8View/list read-back gaps, but the underlying defect class remained: the write side accepted any Arrow type (silent TEXT/NULL fallback) while the read side reconstructed only a subset, so an unsupported payload column built a "successful" sidecar that then failed on the first query (runtimedb#631).

Close the class structurally:

- Add `payload_type_supported` as the single source of truth for which Arrow types round-trip, and `validate_payload_schema` to gate every build entry point (`SqliteSidecarBuilder::begin`, `open_or_build`). An unsupported column now fails at index-build time with the column named, instead of at query time; index-build time is where runtimedb's index-create step will surface it.
- Widen actual support to the common parquet/DuckDB scalar types that previously fell through: Boolean, Date32, Date64, and Timestamp (all units, time zone preserved on read-back).

Tests: reject a Decimal128 payload at build time; round-trip Boolean, Date32, and a tz-carrying Timestamp.

* feat(sqlite): support all basic scalar payload types

Round out the supported set with the remaining primitive Arrow types so a parquet/DuckDB column of any basic type round-trips instead of being rejected at build time:

- small integers: Int8, Int16, UInt8, UInt16 (→ INTEGER)
- time-of-day: Time32 (Second/Millisecond), Time64 (Microsecond/Nanosecond) (→ INTEGER, unit restored from the schema on read-back)
- binary: Binary, LargeBinary (→ BLOB)

These add match arms only; each column hits a single arm, so there is no runtime cost for types a given sidecar doesn't use.

Float16 (needs the `half` crate) and Decimal128/256 (need lossless text encoding) remain out of scope and are still rejected early by `validate_payload_schema`.

Test: round-trip all eight new types with null cells, asserting both the reconstructed Arrow type and the values.

* fix(sqlite): key off output field 0 in payload validation

`validate_payload_schema` was passed `begin`'s `key_col_index`, but that indexes the input batch, while the function skips that position in the *output* schema, whose key is always field 0. For any `key_col_index != 0` this skipped a real payload column (leaving an unsupported type to fail at query time, the exact defect this guards against) and validated the key column instead.

Drop the parameter and always skip output field 0, matching how `finish()` and `open_or_build` derive the key from `schema.field(0)`. Add a regression test driving `begin` with a non-zero `key_col_index` and an unsupported payload column.

---------

Co-authored-by: Anoop Narang <anoop@hotdata.dev>
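
The validation rule described in this message can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: `PayloadType` and `Field` are hypothetical stand-ins for Arrow's `DataType` and schema field, the real function signatures and error type are not shown in this message, and the sketch lists only the types the message names.

```rust
// Hypothetical stand-in for Arrow's DataType, limited to types named in the
// commit message. The real code matches on arrow's own enum.
#[derive(Debug, Clone, PartialEq)]
enum PayloadType {
    Boolean,
    Int8,
    Int16,
    UInt8,
    UInt16,
    Date32,
    Date64,
    Timestamp { tz: Option<String> },
    Time32,
    Time64,
    LargeUtf8,
    Utf8View,
    Binary,
    LargeBinary,
    Float16,
    Decimal128,
    Decimal256,
}

// Hypothetical stand-in for a schema field: a column name plus its type.
#[derive(Debug, Clone)]
struct Field {
    name: String,
    ty: PayloadType,
}

// Convenience constructor used only by this sketch.
fn field(name: &str, ty: PayloadType) -> Field {
    Field { name: name.to_string(), ty }
}

// Single source of truth for which types round-trip. One arm per type, so
// adding support for a new type means adding one arm here.
fn payload_type_supported(ty: &PayloadType) -> bool {
    match ty {
        PayloadType::Boolean
        | PayloadType::Int8
        | PayloadType::Int16
        | PayloadType::UInt8
        | PayloadType::UInt16
        | PayloadType::Date32
        | PayloadType::Date64
        | PayloadType::Timestamp { .. }
        | PayloadType::Time32
        | PayloadType::Time64
        | PayloadType::LargeUtf8
        | PayloadType::Utf8View
        | PayloadType::Binary
        | PayloadType::LargeBinary => true,
        // Out of scope per the commit message: rejected at build time.
        PayloadType::Float16 | PayloadType::Decimal128 | PayloadType::Decimal256 => false,
    }
}

// Validates the *output* schema. Field 0 is always the key, so it is skipped
// unconditionally; no key index from the input batch is consulted.
fn validate_payload_schema(output_fields: &[Field]) -> Result<(), String> {
    for f in output_fields.iter().skip(1) {
        if !payload_type_supported(&f.ty) {
            // Name the offending column so the failure is actionable.
            return Err(format!(
                "unsupported payload column '{}' of type {:?}",
                f.name, f.ty
            ));
        }
    }
    Ok(())
}
```

Because the sketch always skips output field 0 rather than taking a key index, it cannot reproduce the bug fixed in the last commit, where an input-batch index was applied to the output schema. An unsupported column anywhere after field 0 yields an error that names the column.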
1 parent 3d8c7a2 commit a0d4293

2 files changed

Lines changed: 1095 additions & 60 deletions
