Commit 9770ecf

[refactor](jdbc) Unify JDBC scanning into FileQueryScanNode/JniReader framework (#61141)
### What problem does this PR solve?

Refactor the JDBC data source scanning architecture to integrate with the unified FileScanner/JniReader framework, replacing the standalone ExternalScanNode-based JDBC scan path.

#### Motivation

The JDBC scan path was independently implemented with its own operator (JDBCScanLocalState), scanner (JdbcScanner), and JNI connector (JniConnector → BaseJdbcExecutor hierarchy), while other JNI-based data sources (Paimon, Hudi, MaxCompute, TrinoConnector) already use the unified FileScanner → JniReader → JniScanner path. This caused code duplication, maintenance burden, and architectural inconsistency.

#### Changes

1. BE: Split JniConnector into two focused classes:
   - JniReader: JNI lifecycle management (open/read/close), base class for all JNI readers (PaimonJniReader, HudiJniReader, JdbcJniReader, etc.)
   - JniDataBridge: Stateless data exchange between C++ Blocks and Java shared memory via JNI
2. Java: Introduce Strategy Pattern for database-specific type handling:
   - JdbcTypeHandler interface with DefaultTypeHandler base implementation
   - Per-database handlers: MySQLTypeHandler, OracleTypeHandler, PostgreSQLTypeHandler, ClickHouseTypeHandler, SQLServerTypeHandler, DB2TypeHandler, SapHanaTypeHandler, TrinoTypeHandler, GbaseTypeHandler
   - JdbcTypeHandlerFactory for handler selection
   - JdbcJniScanner (extends JniScanner) for read path
   - JdbcJniWriter (extends JniWriter) for write path
   - Old BaseJdbcExecutor subclasses marked @deprecated but preserved
3. FE: Migrate JdbcScanNode from ExternalScanNode to FileQueryScanNode:
   - JdbcScanNode now extends FileQueryScanNode
   - Introduces JdbcSplit (extends FileSplit) to carry JDBC params
   - Uses TFileScanRange with FORMAT_JNI and table_format_type="jdbc"
   - Adds jdbc_params field to TTableFormatFileDesc in Thrift
   - Adapts PhysicalPlanTranslator.visitPhysicalJdbcScan() accordingly
4. BE: Add JdbcJniReader and integrate into FileScanner:
   - JdbcJniReader handles special types (bitmap, HLL, quantile_state, JSONB) via string-based intermediary and CAST
   - FileScanner._get_next_reader() adds "jdbc" table_format_type branch
   - JdbcUtils utility for JDBC driver URL resolution
   - Existing JdbcScanner preserved as transitional (deprecated)

All predicate push-down logic (createJdbcFilters, getJdbcQueryStr, etc.) is preserved from the original JdbcScanNode implementation.
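The BE side of the change (a shared reader base class with an open/read/close lifecycle, plus a branch on `table_format_type`) can be pictured with a small self-contained sketch. Every name below (`ReaderSketch`, `JdbcReaderSketch`, `make_reader`, `read_all`) is hypothetical and invented for illustration; this is not the Doris `JniReader` or `FileScanner` API, and no JNI is involved. It serves a fixed in-memory result set in place of a real JDBC connection.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: one lifecycle shared by every reader flavour.
class ReaderSketch {
public:
    virtual ~ReaderSketch() = default;
    virtual bool open() = 0;  // acquire resources
    virtual bool get_next(std::vector<std::string>* rows, bool* eof) = 0;
    virtual void close() = 0;  // release resources
};

// Hypothetical JDBC flavour: hands out one pre-loaded row per call.
class JdbcReaderSketch : public ReaderSketch {
public:
    explicit JdbcReaderSketch(std::vector<std::string> rows) : _rows(std::move(rows)) {}

    bool open() override {
        _opened = true;
        return true;
    }

    bool get_next(std::vector<std::string>* rows, bool* eof) override {
        if (!_opened) return false;  // reading before open() is an error
        rows->clear();
        if (_pos < _rows.size()) rows->push_back(_rows[_pos++]);
        *eof = (_pos >= _rows.size());
        return true;
    }

    void close() override { _opened = false; }

private:
    std::vector<std::string> _rows;
    std::size_t _pos = 0;
    bool _opened = false;
};

// Hypothetical dispatch on the table format string; unknown formats yield nullptr.
std::unique_ptr<ReaderSketch> make_reader(const std::string& table_format_type,
                                          std::vector<std::string> rows) {
    if (table_format_type == "jdbc") {
        return std::make_unique<JdbcReaderSketch>(std::move(rows));
    }
    return nullptr;
}

// Drives the full lifecycle: open, read until eof, close.
std::vector<std::string> read_all(ReaderSketch& reader) {
    std::vector<std::string> out;
    if (!reader.open()) return out;
    bool eof = false;
    std::vector<std::string> batch;
    while (!eof && reader.get_next(&batch, &eof)) {
        out.insert(out.end(), batch.begin(), batch.end());
    }
    reader.close();
    return out;
}
```

Because the caller only sees the base interface, adding a data source in this sketch means adding one subclass and one dispatch branch, which is the kind of duplication reduction the motivation section describes.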
1 parent 2205eff commit 9770ecf

81 files changed, +5426 -2817 lines changed

be/src/core/data_type_serde/data_type_quantilestate_serde.h

Lines changed: 6 additions & 4 deletions
@@ -67,15 +67,17 @@ class DataTypeQuantileStateSerDe : public DataTypeSerDe {
     }
     Status deserialize_one_cell_from_json(IColumn& column, Slice& slice,
                                           const FormatOptions& options) const override {
-        return Status::NotSupported("deserialize_one_cell_from_text with type " +
-                                    column.get_name());
+        auto& data_column = assert_cast<ColumnQuantileState&>(column);
+        QuantileState quantile_state(slice);
+        data_column.insert_value(std::move(quantile_state));
+        return Status::OK();
     }
 
     Status deserialize_column_from_json_vector(IColumn& column, std::vector<Slice>& slices,
                                                uint64_t* num_deserialized,
                                                const FormatOptions& options) const override {
-        return Status::NotSupported("deserialize_column_from_text_vector with type " +
-                                    column.get_name());
+        DESERIALIZE_COLUMN_FROM_JSON_VECTOR();
+        return Status::OK();
     }
 
     Status write_column_to_pb(const IColumn& column, PValues& result, int64_t start,
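The hunk above replaces two `NotSupported` stubs with a per-cell deserializer and a vector variant. The sketch below mirrors that shape with stand-in types so it compiles on its own. `SliceSketch`, `StateSketch`, `ColumnSketch`, and both functions are hypothetical, and the loop in `deserialize_column` is only an assumption about what `DESERIALIZE_COLUMN_FROM_JSON_VECTOR()` expands to (call the single-cell function per slice and count successes); the real macro and the real `QuantileState` decoding are not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a non-owning byte range.
struct SliceSketch {
    const char* data;
    std::size_t size;
};

// Stand-in for a state object built directly from serialized bytes.
struct StateSketch {
    std::string bytes;
    explicit StateSketch(const SliceSketch& s) : bytes(s.data, s.size) {}
};

// Stand-in for a column that stores one state object per row.
struct ColumnSketch {
    std::vector<StateSketch> values;
    void insert_value(StateSketch v) { values.push_back(std::move(v)); }
};

// Per-cell path: build the state from the slice, then move it into the column.
bool deserialize_one_cell(ColumnSketch& column, const SliceSketch& slice) {
    StateSketch state(slice);
    column.insert_value(std::move(state));
    return true;
}

// Vector path (assumed macro behaviour): apply the per-cell function to each
// slice, stop at the first failure, and count the cells that succeeded.
bool deserialize_column(ColumnSketch& column, const std::vector<SliceSketch>& slices,
                        std::uint64_t* num_deserialized) {
    for (const SliceSketch& s : slices) {
        if (!deserialize_one_cell(column, s)) return false;
        ++*num_deserialized;
    }
    return true;
}
```

In this sketch the per-cell function never fails; a real decoder would presumably validate the bytes and report an error for malformed input.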