Commit 9760106

refactor: unify ResultSet implementations on Arrow-backed path (#175)
## Summary

Collapse the two ResultSet families (streaming Arrow + row-based metadata) onto a single Arrow-backed pipeline so there is one accessor implementation, one set of type semantics, and one place to fix bugs. Tighten root-allocator hygiene end-to-end while we are in there, and bring `getTypeInfo()` and integer-column accessor coercion into line with JDBC 4.2.

## Why

The driver previously carried two parallel result-set implementations: `StreamingResultSet` for query results (Arrow IPC, columnar accessors) and `SimpleResultSet` / `DataCloudMetadataResultSet` / `ColumnAccessor` for metadata (row-oriented `List<List<Object>>`, hand-rolled per-cell coercion). Same JDBC surface, two divergent code paths. Bugs found in one were rarely fixed in the other; type semantics drifted (e.g. `getBoolean` on an integer column behaved differently between the two), and the metadata path silently `toString()`'d any payload you handed it.

The pre-rebase review of this PR also surfaced several allocator leak windows and JDBC-spec compliance gaps that the unification made it natural to fix.

## What changed

**Unified result set.** `StreamingResultSet`, `DataCloudMetadataResultSet`, `SimpleResultSet`, and `ColumnAccessor` are removed. Every JDBC metadata call (`getTables`, `getColumns`, `getSchemas`, `getTypeInfo`, the empty-metadata helpers) now flows through `DataCloudResultSet` via a new `MetadataResultSets` factory. `MetadataResultSets` builds a single-batch Arrow IPC stream by reusing `VectorPopulator` (the same code path the JDBC parameter encoder uses) and `HyperTypeToArrow.toField`, then hands the resulting reader + allocator to `DataCloudResultSet.of`.

`DataCloudResultSet` is now a `public class` rather than the prior empty marker interface; the concrete implementation is no longer a sibling type called `StreamingResultSet`. The `of(...)` factory takes a `QueryResultArrowStream.Result` (reader + allocator pair) and owns both their lifecycles.

**Root allocator hygiene.** Six independent leak windows closed:

- `QueryResultArrowStream.toArrowStreamReader` returns a `Result` holder that pairs reader + allocator and closes both in order (reader first so ArrowBuf accounting clears before the allocator's budget check). The 100 MB cap moves to a public `ROOT_ALLOCATOR_BUDGET_BYTES` constant and is now reused by the metadata path.
- `MetadataResultSets.of` and `QueryResultArrowStream.toArrowStreamReader` both wrap allocator + reader construction in try/catch so the allocator is closed if `new ArrowStreamReader(...)` ever throws before ownership transfers.
- `DataCloudResultSet.of`'s construction-failure cleanup now wraps both reader.close and allocator.close with `addSuppressed` so neither close masks the original `SQLException`.
- `ArrowStreamReaderCursor.close` uses try-with-resources over `(allocator, reader)` so reader closes first and any allocator-close exception attaches as suppressed onto the reader's instead of replacing it.
- `DataCloudStatement.executeQuery` and `getResultSet` hoist `iterator.getQueryStatus().getQueryId()` once before allocator construction, so a throw between allocator creation and `DataCloudResultSet.of` taking ownership can no longer strand the allocator.
- `DataCloudResultSet.close` is idempotent across cursor.close failures: the `closed` flag flips before delegating, so a defensive caller's retry is a no-op rather than a double-close.

**JDBC spec compliance.**

- `getTypeInfo()` boolean columns (`CASE_SENSITIVE`, `UNSIGNED_ATTRIBUTE`, `FIXED_PREC_SCALE`, `AUTO_INCREMENT`) are now declared as `BOOLEAN` per JDBC 4.2 (DatabaseMetaData.getTypeInfo Javadoc) and pgjdbc, via a new `bool(...)` helper in `MetadataSchemas`. They were previously declared as `text(...)` while the row producer wrote raw `Boolean` values, which only "worked" because `VarCharVectorSetter` silently `toString()`'d everything.
- `BaseIntVectorAccessor.getBoolean` now matches `ResultSet.getBoolean`'s spec text on integer columns: 0 returns false, 1 returns true, anything else throws `SQLException` with SQLState `22018` (matching pgjdbc's strict CANNOT_COERCE behavior in `BooleanTypeUtil.fromNumber`). Previously inherited the abstract default that threw `SQLFeatureNotSupportedException` for everything.
- `VectorPopulator.VarCharVectorSetter` is tightened from `<VarCharVector, Object>` to `<VarCharVector, String>`, so non-String payloads fail fast at the `BaseVectorSetter` type guard. The `byte[]` arm was dead: `setBytes` / `setBinaryStream` / `setUnicodeStream` / `setAsciiStream` all throw FEATURE_NOT_SUPPORTED in `DataCloudPreparedStatement`.
- Integer-family setters (`IntVectorSetter`, `SmallIntVectorSetter`, `TinyIntVectorSetter`) now range-check before narrowing rather than silently truncating; `DecimalVectorSetter` does the same via a `bitLength() > 63` guard on the unscaled value.
- `QueryJDBCAccessor.getObject(Class)` gains an `isInstance` fallback so `getObject(col, String.class)` on a VARCHAR (and analogous identity-class paths on every other accessor) works without each accessor implementing typed `getObject` itself.

## Observable behavior changes

- `rs.getBoolean("CASE_SENSITIVE")` on `getTypeInfo()` returns a real Boolean (was `SQLFeatureNotSupportedException` via the broken VARCHAR path).
- `rs.getBoolean("NULLABLE")` on `getColumns()` (and any other integer column) returns `false` for `0` and `true` for `1`, instead of throwing. Other integer values throw `SQLException` (SQLState `22018`).
- `rs.getDate(intCol)` / `getTime(intCol)` / `getTimestamp(intCol)` on metadata rows throw `SQLException` (was `UnsupportedOperationException`).
- `rs.getObject(intCol, Boolean.class)` on metadata rows now throws (the strict `isInstance` path).
- `rs.getMetaData().getColumnType(...)` on the four `getTypeInfo()` boolean columns returns `Types.BOOLEAN`, not `Types.VARCHAR`.
- `rs.getMetaData().getColumnTypeName(...)` on every metadata result set (`getTables`, `getColumns`, `getTypeInfo`, …) returns the JDBC type name derived from the column's `HyperType` (`"VARCHAR"`, `"SMALLINT"`, `"INTEGER"`, `"BOOLEAN"`) rather than the prior Hyper-flavored labels (`"TEXT"`, `"SHORT"`, `"INTEGER"`, `"BOOL"`). The JDBC spec only requires *some* database-specific type name and does not pin specific strings; this aligns with the names other JDBC consumers in the driver already use.
- `ps.setObject(idx, x, Types.VARCHAR)` with a non-String / non-byte[] argument now throws `IllegalArgumentException` instead of silently `toString()`-ing the argument.
- `ps.setObject(idx, x, Types.INTEGER)` (and INT2/INT8) throws `IllegalArgumentException` for out-of-range Numbers instead of silently narrowing; same for `Types.DECIMAL` when the unscaled value exceeds 64 bits.

## Breaking changes

`com.salesforce.datacloud.jdbc.core.DataCloudResultSet` is now a `public class` rather than a `public interface`. External code that wrote `class MyRs implements DataCloudResultSet` (decorators, wrappers, hand-rolled doubles) no longer compiles; code that consumes the standard `java.sql.ResultSet` / `DataCloudResultSet` API as an opaque type recompiles unchanged.

The previously-public types `StreamingResultSet`, `DataCloudMetadataResultSet`, `SimpleResultSet`, `ColumnAccessor` are removed. External callers of `StreamingResultSet.of(ArrowStreamReader, ...)` should switch to `DataCloudResultSet.of(QueryResultArrowStream.Result, ...)`.

## Test plan

- [x] `./gradlew :jdbc-core:test`: full module suite green.
- [x] `./gradlew :jdbc-core:spotlessCheck`: formatting clean.
- [x] `./gradlew clean build`: full build including `:spark-datasource`, JaCoCo coverage, verification.
- New tests pin the load-bearing invariants: `MetadataSchemasTest` adds three TYPE_INFO position-by-position assertions; `VarCharVectorSetterStrictTypeTest` regresses on Boolean / byte[] / Number payloads; `IntegerVectorSetterRangeCheckTest` extends to `DecimalVectorSetter`; `ArrowStreamReaderCursorTest` pins reader-before-allocator close ordering plus `addSuppressed` chaining when both throw; `DataCloudResultSetMethodTest` pins `close()` idempotence under cursor.close failure; `DataCloudDatabaseMetadataTest.testGetTypeInfo` now exercises `getBoolean` on all four boolean columns end-to-end.

BREAKING CHANGE: `DataCloudResultSet` is now a class instead of an interface; `StreamingResultSet`, `DataCloudMetadataResultSet`, `SimpleResultSet`, `ColumnAccessor` are removed; metadata int-column `getDate`/`getTime`/`getTimestamp` throw `SQLException` (was `UnsupportedOperationException`); `getTypeInfo()` boolean columns are typed `BOOLEAN` instead of `VARCHAR` (`getObject` returns `Boolean`, not `String`); `getColumnTypeName` on metadata result sets returns the JDBC type name (`VARCHAR`/`SMALLINT`/`INTEGER`/`BOOLEAN`) instead of the prior Hyper-flavored labels (`TEXT`/`SHORT`/`INTEGER`/`BOOL`); `ps.setObject` with `Types.VARCHAR` rejects non-String/byte[] payloads; integer-family and DECIMAL setters reject out-of-range values instead of silently narrowing.
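The strict coercion rules in the commit message (integer to boolean, range-checked narrowing, and the `bitLength() > 63` guard) are not part of the diff shown here, so the following is only a minimal sketch of how such checks might look. The class `StrictCoercionSketch`, its method names, and the exception messages are hypothetical; only the `0`/`1` rule, SQLState `22018`, `IllegalArgumentException` for out-of-range values, and the `bitLength() > 63` test come from the description above. What the real setter does with the unscaled value after the guard is an assumption.

```java
import java.math.BigDecimal;
import java.sql.SQLException;

/** Hypothetical illustration class; not part of the driver. */
final class StrictCoercionSketch {

    /** SQLState quoted in the commit message for values that cannot be coerced. */
    static final String CANNOT_COERCE = "22018";

    /** 0 maps to false, 1 maps to true, every other integer is rejected. */
    static boolean intToBoolean(long value) throws SQLException {
        if (value == 0L) {
            return false;
        }
        if (value == 1L) {
            return true;
        }
        throw new SQLException("Cannot coerce " + value + " to boolean", CANNOT_COERCE);
    }

    /** Range-checks before narrowing instead of silently truncating the high bits. */
    static int narrowToInt(long value) {
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("Value out of range for INTEGER: " + value);
        }
        return (int) value;
    }

    /** Rejects decimals whose unscaled value does not fit in a signed 64-bit long. */
    static long unscaledAsLong(BigDecimal value) {
        // BigInteger.bitLength() excludes the sign bit, so 63 is the largest width a long can hold.
        if (value.unscaledValue().bitLength() > 63) {
            throw new IllegalArgumentException("Unscaled value exceeds 64 bits: " + value);
        }
        return value.unscaledValue().longValue();
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(intToBoolean(1L));                       // true
        System.out.println(narrowToInt(42L));                       // 42
        System.out.println(unscaledAsLong(new BigDecimal("12.34"))); // 1234
    }
}
```

The common thread is that each check runs before the lossy operation, so a caller sees an exception rather than a silently altered value.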
1 parent f47714f commit 9760106

39 files changed

Lines changed: 1695 additions & 2008 deletions

jdbc-core/src/main/java/com/salesforce/datacloud/jdbc/core/ArrowStreamReaderCursor.java

Lines changed: 21 additions & 2 deletions
@@ -17,25 +17,37 @@
 import lombok.SneakyThrows;
 import lombok.extern.slf4j.Slf4j;
 import lombok.val;
+import org.apache.arrow.memory.BufferAllocator;
 import org.apache.arrow.vector.FieldVector;
 import org.apache.arrow.vector.VectorSchemaRoot;
 import org.apache.arrow.vector.ipc.ArrowStreamReader;

+/**
+ * Row cursor over an {@link ArrowStreamReader} that drives the {@link DataCloudResultSet}.
+ *
+ * <p>The cursor owns the supplied {@link BufferAllocator} alongside the reader: closing the
+ * cursor closes the reader (which releases ArrowBuf accounting) and then the allocator (which
+ * returns its budget). This is the single place that guarantees root-allocator hygiene for the
+ * driver; callers of {@link DataCloudResultSet#of} hand ownership over and do not close the
+ * allocator themselves.
+ */
 @Slf4j
 class ArrowStreamReaderCursor implements AutoCloseable {

     private static final int INIT_ROW_NUMBER = -1;

     private final ArrowStreamReader reader;
+    private final BufferAllocator allocator;
     private final ZoneId sessionZone;

     @lombok.Getter
     private int rowsSeen = 0;

     private final AtomicInteger currentIndex = new AtomicInteger(INIT_ROW_NUMBER);

-    ArrowStreamReaderCursor(ArrowStreamReader reader, ZoneId sessionZone) {
+    ArrowStreamReaderCursor(ArrowStreamReader reader, BufferAllocator allocator, ZoneId sessionZone) {
         this.reader = reader;
+        this.allocator = allocator;
         this.sessionZone = sessionZone;
     }

@@ -91,6 +103,13 @@ public boolean next() {
     @SneakyThrows
     @Override
     public void close() {
-        reader.close();
+        // try-with-resources closes in reverse declaration order: reader first (releases the
+        // buffers accounted against the allocator so its closing budget check passes), then
+        // allocator. If both throw, Java attaches the second as suppressed onto the first
+        // instead of dropping the reader exception via the standard try/finally semantics.
+        try (BufferAllocator a = allocator;
+                ArrowStreamReader r = reader) {
+            // resource cleanup happens at exit
+        }
     }
 }
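The close ordering that the new `close()` relies on is standard Java try-with-resources behavior and can be checked without Arrow. The sketch below uses a hypothetical stand-in resource (`Tracked`) in place of `BufferAllocator` and `ArrowStreamReader`; the class and method names are illustrative only and do not exist in the driver.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo class; not part of the driver. */
final class CloseOrderSketch {

    /** Stand-in resource that records when it is closed and can be told to fail. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;
        private final boolean failOnClose;

        Tracked(String name, List<String> log, boolean failOnClose) {
            this.name = name;
            this.log = log;
            this.failOnClose = failOnClose;
        }

        @Override
        public void close() {
            log.add(name);
            if (failOnClose) {
                throw new IllegalStateException(name + " close failed");
            }
        }
    }

    /** Declares "allocator" then "reader", as the cursor does, and returns the close order. */
    static List<String> closeOrder() {
        List<String> log = new ArrayList<>();
        try (Tracked allocator = new Tracked("allocator", log, false);
                Tracked reader = new Tracked("reader", log, false)) {
            // nothing to do; cleanup happens at exit
        }
        return log;
    }

    /** Returns the exception that escapes when both close() calls throw. */
    static IllegalStateException failureWhenBothThrow() {
        List<String> log = new ArrayList<>();
        try (Tracked allocator = new Tracked("allocator", log, true);
                Tracked reader = new Tracked("reader", log, true)) {
            // nothing to do; cleanup happens at exit
        } catch (IllegalStateException e) {
            return e;
        }
        throw new AssertionError("expected close() to throw");
    }

    public static void main(String[] args) {
        System.out.println(closeOrder()); // [reader, allocator]
        IllegalStateException e = failureWhenBothThrow();
        // reader close failed / suppressed: allocator close failed
        System.out.println(e.getMessage() + " / suppressed: " + e.getSuppressed()[0].getMessage());
    }
}
```

Because resources close in reverse declaration order, declaring the allocator first is what makes the reader close first; the allocator's failure then rides along as a suppressed exception on the reader's.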

jdbc-core/src/main/java/com/salesforce/datacloud/jdbc/core/DataCloudConnection.java

Lines changed: 3 additions & 2 deletions
@@ -48,6 +48,7 @@
 import java.sql.Statement;
 import java.sql.Struct;
 import java.time.Duration;
+import java.time.ZoneId;
 import java.util.Arrays;
 import java.util.List;
 import java.util.Map;
@@ -220,7 +221,7 @@ public DataCloudResultSet getRowBasedResultSet(String queryId, long offset, long
                     QueryResultArrowStream.OUTPUT_FORMAT);
             val arrowStream = SQLExceptionQueryResultIterator.createSqlExceptionArrowStreamReader(
                     iterator, connectionProperties.isIncludeCustomerDetailInReason(), queryId, null);
-            return StreamingResultSet.of(arrowStream, queryId);
+            return DataCloudResultSet.of(arrowStream, queryId, ZoneId.systemDefault());
         } catch (StatusRuntimeException ex) {
             throw QueryExceptionHandler.createException(
                     connectionProperties.isIncludeCustomerDetailInReason(), null, queryId, ex);
@@ -263,7 +264,7 @@ public DataCloudResultSet getChunkBasedResultSet(String queryId, long chunkId, l
                     QueryResultArrowStream.OUTPUT_FORMAT);
             val arrowStream = SQLExceptionQueryResultIterator.createSqlExceptionArrowStreamReader(
                     iterator, connectionProperties.isIncludeCustomerDetailInReason(), queryId, null);
-            return StreamingResultSet.of(arrowStream, queryId);
+            return DataCloudResultSet.of(arrowStream, queryId, ZoneId.systemDefault());
         } catch (StatusRuntimeException ex) {
             throw QueryExceptionHandler.createException(
                     connectionProperties.isIncludeCustomerDetailInReason(), null, queryId, ex);

jdbc-core/src/main/java/com/salesforce/datacloud/jdbc/core/DataCloudDatabaseMetadata.java

Lines changed: 11 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@
1313

1414
import com.google.common.collect.ImmutableList;
1515
import com.salesforce.datacloud.jdbc.config.DriverVersion;
16-
import com.salesforce.datacloud.jdbc.core.metadata.DataCloudResultSetMetaData;
16+
import com.salesforce.datacloud.jdbc.core.metadata.MetadataResultSets;
1717
import com.salesforce.datacloud.jdbc.core.types.HyperTypes;
1818
import com.salesforce.datacloud.jdbc.util.JdbcURL;
1919
import com.salesforce.datacloud.jdbc.util.ThrowingJdbcSupplier;
@@ -706,39 +706,39 @@ public ResultSet getColumns(String catalog, String schemaPattern, String tableNa
706706
@Override
707707
public ResultSet getColumnPrivileges(String catalog, String schema, String table, String columnNamePattern)
708708
throws SQLException {
709-
return DataCloudMetadataResultSet.empty();
709+
return MetadataResultSets.emptyNoColumns();
710710
}
711711

712712
@Override
713713
public ResultSet getTablePrivileges(String catalog, String schemaPattern, String tableNamePattern)
714714
throws SQLException {
715-
return DataCloudMetadataResultSet.empty();
715+
return MetadataResultSets.emptyNoColumns();
716716
}
717717

718718
@Override
719719
public ResultSet getBestRowIdentifier(String catalog, String schema, String table, int scope, boolean nullable)
720720
throws SQLException {
721-
return DataCloudMetadataResultSet.empty();
721+
return MetadataResultSets.emptyNoColumns();
722722
}
723723

724724
@Override
725725
public ResultSet getVersionColumns(String catalog, String schema, String table) throws SQLException {
726-
return DataCloudMetadataResultSet.empty();
726+
return MetadataResultSets.emptyNoColumns();
727727
}
728728

729729
@Override
730730
public ResultSet getPrimaryKeys(String catalog, String schema, String table) throws SQLException {
731-
return DataCloudMetadataResultSet.empty();
731+
return MetadataResultSets.emptyNoColumns();
732732
}
733733

734734
@Override
735735
public ResultSet getImportedKeys(String catalog, String schema, String table) throws SQLException {
736-
return DataCloudMetadataResultSet.empty();
736+
return MetadataResultSets.emptyNoColumns();
737737
}
738738

739739
@Override
740740
public ResultSet getExportedKeys(String catalog, String schema, String table) throws SQLException {
741-
return DataCloudMetadataResultSet.empty();
741+
return MetadataResultSets.emptyNoColumns();
742742
}
743743

744744
@Override
@@ -750,19 +750,18 @@ public ResultSet getCrossReference(
750750
String foreignSchema,
751751
String foreignTable)
752752
throws SQLException {
753-
return DataCloudMetadataResultSet.empty();
753+
return MetadataResultSets.emptyNoColumns();
754754
}
755755

756756
@Override
757757
public ResultSet getTypeInfo() throws SQLException {
758-
return DataCloudMetadataResultSet.of(
759-
new DataCloudResultSetMetaData(MetadataSchemas.TYPE_INFO), HyperTypes.typeInfoRows());
758+
return MetadataResultSets.ofRawRows(MetadataSchemas.TYPE_INFO, HyperTypes.typeInfoRows());
760759
}
761760

762761
@Override
763762
public ResultSet getIndexInfo(String catalog, String schema, String table, boolean unique, boolean approximate)
764763
throws SQLException {
765-
return DataCloudMetadataResultSet.empty();
764+
return MetadataResultSets.emptyNoColumns();
766765
}
767766

768767
@Override

jdbc-core/src/main/java/com/salesforce/datacloud/jdbc/core/DataCloudMetadataResultSet.java

Lines changed: 0 additions & 164 deletions
This file was deleted.
