Commit 3ca0e03
feat(backend/kernel): plumb _use_arrow_native_complex_types to kernel's complex_types_as_json
The connector's `_use_arrow_native_complex_types` toggle is honoured
by the Thrift backend (forwarded server-side as `complexTypesAsArrow`)
but was silently ignored by the kernel backend — the kernel always
returned native Arrow `List` / `Map` / `Struct` regardless. This was
the root cause of the 5 `THRIFT_VS_KERNEL_COMPLEX_DISABLED` diffs in
the comparator's COMPLEX_TYPES suite.
The kernel side gained an opt-in `complex_types_as_json` post-
processor (kernel PR #36) that rewrites complex columns to `Utf8`
columns of compact JSON text, matching the Thrift wire format
byte-for-byte. This change wires the connector's existing kwarg
through to that flag:
- `session.py`: pass `_use_arrow_native_complex_types` to the kernel
client (it was being dropped on the floor for the kernel branch).
- `backend/kernel/client.py`: read it from kwargs (default `True`,
matching the connector-wide default), invert at the boundary, and
set `complex_types_as_json=not _use_arrow_native_complex_types`
on the kernel `Session()` constructor.
- `backend/kernel/type_mapping.py`: extend `_databricks_type_for_field`
to honour `databricks.type_name` for `ARRAY` / `MAP` / `STRUCT` (it
already did this for `VARIANT`). When the kernel JSON path is on,
the columns arrive as `Utf8` but the kernel preserves the original
SQL type name in metadata; `description` should report `array` /
`map` / `struct`, matching what the Thrift backend reports under
`complexTypesAsArrow=False`.
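The inversion at the boundary and the metadata lookup described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: `kernel_session_kwargs` and `databricks_type_for_field` are hypothetical stand-ins, the Arrow type is passed as a plain string, and the field metadata is modelled as a `str`-keyed dict rather than a real Arrow schema.

```python
from typing import Any, Dict, Optional

# Hypothetical set of type names recovered from metadata (illustration only).
_METADATA_TYPE_NAMES = {"array", "map", "struct", "variant"}


def kernel_session_kwargs(**kwargs: Any) -> Dict[str, bool]:
    """Translate the connector kwarg into the kernel-side flag.

    The connector default is True (native Arrow complex types), so the
    kernel flag is the logical inverse of the connector kwarg.
    """
    use_native = kwargs.get("_use_arrow_native_complex_types", True)
    return {"complex_types_as_json": not use_native}


def databricks_type_for_field(
    arrow_type: str, metadata: Optional[Dict[str, str]]
) -> str:
    """Prefer `databricks.type_name` from metadata for known complex types.

    Unknown or missing names defer to the Arrow-derived type, so an
    unexpected metadata value cannot corrupt the description.
    """
    name = (metadata or {}).get("databricks.type_name", "").lower()
    if name in _METADATA_TYPE_NAMES:
        return name
    return arrow_type
```

With the JSON path on, a `Utf8` column carrying `databricks.type_name=ARRAY` would be described as `array` in this sketch, while a column with an unrecognised type name keeps its Arrow-derived type.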
Verified end-to-end against the pecotesting comparator workspace:
the `THRIFT_VS_KERNEL_COMPLEX_DISABLED` suite drops from 5 type-shape
diffs + 1 row diff to 1 row diff. The remaining row diff is a Thrift
server-side bug — Thrift emits invalid JSON for map values containing
embedded `"` characters (`{"k":"val with "quote""}` — unescaped
inner quote), while the kernel emits the correctly-escaped form
(`{"k":"val with \"quote\""}`). The kernel is right here; matching
Thrift would mean deliberately producing un-parseable output.
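The difference between the two forms can be reproduced with Python's standard `json` module. This is only an illustration of the escaping issue: `json.dumps` stands in for whatever serializer the kernel uses, and `is_valid_json` is a helper introduced here for the example.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


value = {"k": 'val with "quote"'}

# Compact separators give JSON text with no whitespace; inner quotes are escaped.
escaped = json.dumps(value, separators=(",", ":"))

# The unescaped form quoted in the commit message, written out by hand.
unescaped = '{"k":"val with "quote""}'

print(escaped)                   # {"k":"val with \"quote\""}
print(is_valid_json(escaped))    # True
print(is_valid_json(unescaped))  # False
```

The escaped form round-trips back to the original value through `json.loads`; the unescaped form cannot be parsed at all.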
Unit tests:
- Parametrised test of `_use_arrow_native_complex_types` (default /
True / False) → kernel `Session(complex_types_as_json=…)`.
- Parametrised test of `description_from_arrow_schema` recovering
`array` / `map` / `struct` from metadata, case-insensitively.
- Negative test that an unknown `databricks.type_name` defers to the
Arrow type rather than corrupting the description.
85 → 94 kernel unit tests; full suite green; black-formatted.
Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
1 parent c3cae63 commit 3ca0e03
5 files changed
Lines changed: 128 additions & 5 deletions
File tree
- src/databricks/sql
- backend/kernel
- tests/unit