Commit 9760106
authored
refactor: unify ResultSet implementations on Arrow-backed path (#175)
## Summary
Collapse the two ResultSet families (streaming Arrow + row-based
metadata) onto a single Arrow-backed pipeline so there is one accessor
implementation, one set of type semantics, and one place to fix bugs.
Tighten root-allocator hygiene end-to-end while we are in there, and
bring `getTypeInfo()` and integer-column accessor coercion into line
with JDBC 4.2.
## Why
The driver previously carried two parallel result-set implementations:
`StreamingResultSet` for query results (Arrow IPC, columnar accessors)
and `SimpleResultSet` / `DataCloudMetadataResultSet` / `ColumnAccessor`
for metadata (row-oriented `List<List<Object>>`, hand-rolled per-cell
coercion). Same JDBC surface, two divergent code paths. Bugs found in
one were rarely fixed in the other; type semantics drifted (e.g.
`getBoolean` on an integer column behaved differently between the two),
and the metadata path silently `toString()`'d any payload you handed it.
The pre-rebase review of this PR also surfaced several allocator leak
windows and JDBC-spec compliance gaps that the unification made it
natural to fix.
## What changed
**Unified result set.** `StreamingResultSet`,
`DataCloudMetadataResultSet`, `SimpleResultSet`, and `ColumnAccessor`
are removed. Every JDBC metadata call (`getTables`, `getColumns`,
`getSchemas`, `getTypeInfo`, the empty-metadata helpers) now flows
through `DataCloudResultSet` via a new `MetadataResultSets` factory.
`MetadataResultSets` builds a single-batch Arrow IPC stream by reusing
`VectorPopulator` (the same code path the JDBC parameter encoder uses)
and `HyperTypeToArrow.toField`, then hands the resulting reader +
allocator to `DataCloudResultSet.of`.
`DataCloudResultSet` is now a `public class` rather than the prior empty
marker interface; the concrete implementation is no longer a sibling
type called `StreamingResultSet`. The `of(...)` factory takes a
`QueryResultArrowStream.Result` (reader + allocator pair) and owns both
their lifecycles.
**Root allocator hygiene.** Six independent leak windows closed:
- `QueryResultArrowStream.toArrowStreamReader` returns a `Result` holder
that pairs reader + allocator and closes both in order (reader first so
ArrowBuf accounting clears before the allocator's budget check). The 100
MB cap moves to a public `ROOT_ALLOCATOR_BUDGET_BYTES` constant and is
now reused by the metadata path.
- `MetadataResultSets.of` and
`QueryResultArrowStream.toArrowStreamReader` both wrap allocator +
reader construction in try/catch so the allocator is closed if `new
ArrowStreamReader(...)` ever throws before ownership transfers.
- `DataCloudResultSet.of`'s construction-failure cleanup now wraps both
reader.close and allocator.close with `addSuppressed` so neither close
masks the original `SQLException`.
- `ArrowStreamReaderCursor.close` uses try-with-resources over
`(allocator, reader)` so reader closes first and any allocator-close
exception attaches as suppressed onto the reader's instead of replacing
it.
- `DataCloudStatement.executeQuery` and `getResultSet` hoist
`iterator.getQueryStatus().getQueryId()` once before allocator
construction, so a throw between allocator creation and
`DataCloudResultSet.of` taking ownership can no longer strand the
allocator.
- `DataCloudResultSet.close` is idempotent across cursor.close failures:
the `closed` flag flips before delegating, so a defensive caller's retry
is a no-op rather than a double-close.
**JDBC spec compliance.**
- `getTypeInfo()` boolean columns (`CASE_SENSITIVE`,
`UNSIGNED_ATTRIBUTE`, `FIXED_PREC_SCALE`, `AUTO_INCREMENT`) are now
declared as `BOOLEAN` per JDBC 4.2 (DatabaseMetaData.getTypeInfo
Javadoc) and pgjdbc, via a new `bool(...)` helper in `MetadataSchemas`.
They were previously declared as `text(...)` while the row producer
wrote raw `Boolean` values, which only "worked" because
`VarCharVectorSetter` silently `toString()`'d everything.
- `BaseIntVectorAccessor.getBoolean` now matches
`ResultSet.getBoolean`'s spec text on integer columns: 0 returns false,
1 returns true, anything else throws `SQLException` with SQLState
`22018` (matching pgjdbc's strict CANNOT_COERCE behavior in
`BooleanTypeUtil.fromNumber`). Previously inherited the abstract default
that threw `SQLFeatureNotSupportedException` for everything.
- `VectorPopulator.VarCharVectorSetter` is tightened from
`<VarCharVector, Object>` to `<VarCharVector, String>`, so non-String
payloads fail fast at the `BaseVectorSetter` type guard. The `byte[]`
arm was dead — `setBytes` / `setBinaryStream` / `setUnicodeStream` /
`setAsciiStream` all throw FEATURE_NOT_SUPPORTED in
`DataCloudPreparedStatement`.
- Integer-family setters (`IntVectorSetter`, `SmallIntVectorSetter`,
`TinyIntVectorSetter`) now range-check before narrowing rather than
silently truncating; `DecimalVectorSetter` does the same via a
`bitLength() > 63` guard on the unscaled value.
- `QueryJDBCAccessor.getObject(Class)` gains an `isInstance` fallback so
`getObject(col, String.class)` on a VARCHAR (and analogous
identity-class paths on every other accessor) works without each
accessor implementing typed `getObject` itself.
## Observable behavior changes
- `rs.getBoolean("CASE_SENSITIVE")` on `getTypeInfo()` returns a real
Boolean (was `SQLFeatureNotSupportedException` via the broken VARCHAR
path).
- `rs.getBoolean("NULLABLE")` on `getColumns()` (and any other integer
column) returns `false` for `0` and `true` for `1`, instead of throwing.
Other integer values throw `SQLException` (SQLState `22018`).
- `rs.getDate(intCol)` / `getTime(intCol)` / `getTimestamp(intCol)` on
metadata rows throw `SQLException` (was
`UnsupportedOperationException`).
- `rs.getObject(intCol, Boolean.class)` on metadata rows now throws (the
strict `isInstance` path).
- `rs.getMetaData().getColumnType(...)` on the four `getTypeInfo()`
boolean columns returns `Types.BOOLEAN`, not `Types.VARCHAR`.
- `rs.getMetaData().getColumnTypeName(...)` on every metadata result set
(`getTables`, `getColumns`, `getTypeInfo`, …) returns the JDBC type name
derived from the column's `HyperType` (`"VARCHAR"`, `"SMALLINT"`,
`"INTEGER"`, `"BOOLEAN"`) rather than the prior Hyper-flavored labels
(`"TEXT"`, `"SHORT"`, `"INTEGER"`, `"BOOL"`). The JDBC spec only
requires *some* database-specific type name and does not pin specific
strings; this aligns with the names other JDBC consumers in the driver
already use.
- `ps.setObject(idx, x, Types.VARCHAR)` with a non-String / non-byte[]
argument now throws `IllegalArgumentException` instead of silently
`toString()`-ing the argument.
- `ps.setObject(idx, x, Types.INTEGER)` (and INT2/INT8) throws
`IllegalArgumentException` for out-of-range Numbers instead of silently
narrowing; same for `Types.DECIMAL` when the unscaled value exceeds 64
bits.
## Breaking changes
`com.salesforce.datacloud.jdbc.core.DataCloudResultSet` is now a `public
class` rather than a `public interface`. External code that wrote `class
MyRs implements DataCloudResultSet` (decorators, wrappers, hand-rolled
doubles) no longer compiles; code that consumes the standard
`java.sql.ResultSet` / `DataCloudResultSet` API as an opaque type
recompiles unchanged.
The previously-public types `StreamingResultSet`,
`DataCloudMetadataResultSet`, `SimpleResultSet`, `ColumnAccessor` are
removed. External callers of `StreamingResultSet.of(ArrowStreamReader,
...)` should switch to
`DataCloudResultSet.of(QueryResultArrowStream.Result, ...)`.
## Test plan
- [x] `./gradlew :jdbc-core:test` — full module suite green.
- [x] `./gradlew :jdbc-core:spotlessCheck` — formatting clean.
- [x] `./gradlew clean build` — full build including
`:spark-datasource`, JaCoCo coverage, verification.
- New tests pin the load-bearing invariants: `MetadataSchemasTest` adds
three TYPE_INFO position-by-position assertions;
`VarCharVectorSetterStrictTypeTest` regresses on Boolean / byte[] /
Number payloads; `IntegerVectorSetterRangeCheckTest` extends to
`DecimalVectorSetter`; `ArrowStreamReaderCursorTest` pins
reader-before-allocator close ordering plus `addSuppressed` chaining
when both throw; `DataCloudResultSetMethodTest` pins `close()`
idempotence under cursor.close failure;
`DataCloudDatabaseMetadataTest.testGetTypeInfo` now exercises
`getBoolean` on all four boolean columns end-to-end.
BREAKING CHANGE: `DataCloudResultSet` is now a class instead of an
interface; `StreamingResultSet`, `DataCloudMetadataResultSet`,
`SimpleResultSet`, `ColumnAccessor` are removed; metadata int-column
`getDate`/`getTime`/`getTimestamp` throw `SQLException` (was
`UnsupportedOperationException`); `getTypeInfo()` boolean columns are
typed `BOOLEAN` instead of `VARCHAR` (`getObject` returns `Boolean`, not
`String`); `getColumnTypeName` on metadata result sets returns the JDBC
type name (`VARCHAR`/`SMALLINT`/`INTEGER`/`BOOLEAN`) instead of the
prior Hyper-flavored labels (`TEXT`/`SHORT`/`INTEGER`/`BOOL`);
`ps.setObject` with `Types.VARCHAR` rejects non-String/byte[] payloads;
integer-family and DECIMAL setters reject out-of-range values instead of
silently narrowing.1 parent f47714f commit 9760106
39 files changed
Lines changed: 1695 additions & 2008 deletions
File tree
- jdbc-core/src
- main/java/com/salesforce/datacloud/jdbc
- core
- accessor/impl
- metadata
- resultset
- protocol
- data
- util
- test/java/com/salesforce/datacloud/jdbc
- core
- metadata
- resultset
- examples
- protocol
- data
Lines changed: 21 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
| 20 | + | |
20 | 21 | | |
21 | 22 | | |
22 | 23 | | |
23 | 24 | | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
24 | 34 | | |
25 | 35 | | |
26 | 36 | | |
27 | 37 | | |
28 | 38 | | |
29 | 39 | | |
| 40 | + | |
30 | 41 | | |
31 | 42 | | |
32 | 43 | | |
33 | 44 | | |
34 | 45 | | |
35 | 46 | | |
36 | 47 | | |
37 | | - | |
| 48 | + | |
38 | 49 | | |
| 50 | + | |
39 | 51 | | |
40 | 52 | | |
41 | 53 | | |
| |||
91 | 103 | | |
92 | 104 | | |
93 | 105 | | |
94 | | - | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
95 | 114 | | |
96 | 115 | | |
Lines changed: 3 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
48 | 48 | | |
49 | 49 | | |
50 | 50 | | |
| 51 | + | |
51 | 52 | | |
52 | 53 | | |
53 | 54 | | |
| |||
220 | 221 | | |
221 | 222 | | |
222 | 223 | | |
223 | | - | |
| 224 | + | |
224 | 225 | | |
225 | 226 | | |
226 | 227 | | |
| |||
263 | 264 | | |
264 | 265 | | |
265 | 266 | | |
266 | | - | |
| 267 | + | |
267 | 268 | | |
268 | 269 | | |
269 | 270 | | |
| |||
Lines changed: 11 additions & 12 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
16 | | - | |
| 16 | + | |
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
| |||
706 | 706 | | |
707 | 707 | | |
708 | 708 | | |
709 | | - | |
| 709 | + | |
710 | 710 | | |
711 | 711 | | |
712 | 712 | | |
713 | 713 | | |
714 | 714 | | |
715 | | - | |
| 715 | + | |
716 | 716 | | |
717 | 717 | | |
718 | 718 | | |
719 | 719 | | |
720 | 720 | | |
721 | | - | |
| 721 | + | |
722 | 722 | | |
723 | 723 | | |
724 | 724 | | |
725 | 725 | | |
726 | | - | |
| 726 | + | |
727 | 727 | | |
728 | 728 | | |
729 | 729 | | |
730 | 730 | | |
731 | | - | |
| 731 | + | |
732 | 732 | | |
733 | 733 | | |
734 | 734 | | |
735 | 735 | | |
736 | | - | |
| 736 | + | |
737 | 737 | | |
738 | 738 | | |
739 | 739 | | |
740 | 740 | | |
741 | | - | |
| 741 | + | |
742 | 742 | | |
743 | 743 | | |
744 | 744 | | |
| |||
750 | 750 | | |
751 | 751 | | |
752 | 752 | | |
753 | | - | |
| 753 | + | |
754 | 754 | | |
755 | 755 | | |
756 | 756 | | |
757 | 757 | | |
758 | | - | |
759 | | - | |
| 758 | + | |
760 | 759 | | |
761 | 760 | | |
762 | 761 | | |
763 | 762 | | |
764 | 763 | | |
765 | | - | |
| 764 | + | |
766 | 765 | | |
767 | 766 | | |
768 | 767 | | |
| |||
Lines changed: 0 additions & 164 deletions
This file was deleted.
0 commit comments