Describe the bug
After upgrading from 3.3.3 to 3.4.1, DatabaseMetaData.getColumns() calls against a SQL warehouse went from sub-second to 20–165 seconds per call. Per the 3.4.1 changelog, getColumns/getTables/getSchemas now execute SQL SHOW commands instead of Thrift metadata RPCs.
Each metadata call is now a full statement execution on the warehouse (queued and polled like any query), so under concurrent load metadata discovery becomes roughly two orders of magnitude slower.
Measurements
Same test suite, same warehouse, same CI executor; only the driver version changed:
| Driver version | 3.3.3 | 3.4.1 |
| --- | --- | --- |
| Full test module (17 tests, ~36 getColumns calls) | 2 min 38 s | > 20 min (killed by CI timeout) |
| Individual getColumns latency | sub-second | 20–165 s observed |
Our code calls getColumns once per table during schema discovery (to introspect a Databricks catalog). With two test modules running in parallel against the warehouse, each module completed only ~36 metadata calls in 19 minutes, an average of one every ~32 s.
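For anyone who wants to reproduce the per-call numbers, the measurement can be sketched roughly as below. This is a minimal sketch, not our actual test code: the class name, the `JDBC_URL` environment variable, and the catalog, schema, and table names are all hypothetical placeholders, and it uses only standard `java.sql` calls. Without `JDBC_URL` set it only prints the average derived from the figures above.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GetColumnsTiming {

    /**
     * Issues one getColumns call per table and records the wall-clock time of each call,
     * including draining the result set. Returns elapsed milliseconds keyed by table name.
     * Note: the table name is passed as a JDBC pattern, so '_' and '%' act as wildcards.
     */
    static Map<String, Long> timeGetColumns(DatabaseMetaData meta, String catalog,
                                            String schema, List<String> tables)
            throws SQLException {
        Map<String, Long> elapsedMs = new LinkedHashMap<>();
        for (String table : tables) {
            long start = System.nanoTime();
            int columns = 0;
            try (ResultSet rs = meta.getColumns(catalog, schema, table, "%")) {
                while (rs.next()) {
                    columns++;
                }
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            elapsedMs.put(table, ms);
            System.out.printf("%s: %d columns in %d ms%n", table, columns, ms);
        }
        return elapsedMs;
    }

    /** Average seconds per metadata call over a measured window. */
    static double averageSecondsPerCall(long totalSeconds, int calls) {
        return (double) totalSeconds / calls;
    }

    public static void main(String[] args) throws SQLException {
        // Figures from this report: ~36 calls completed in 19 minutes.
        System.out.printf("average: %.1f s per call%n", averageSecondsPerCall(19 * 60, 36));

        // JDBC_URL is a hypothetical variable holding the full connection string.
        String url = System.getenv("JDBC_URL");
        if (url == null) {
            return;
        }
        try (Connection conn = DriverManager.getConnection(url)) {
            // Placeholder catalog, schema, and table names; substitute real ones.
            timeGetColumns(conn.getMetaData(), "my_catalog", "my_schema",
                    List.of("table_a", "table_b"));
        }
    }
}
```

Running the same class against 3.3.3 and 3.4.1 with an identical table list should make the difference visible without the rest of our test suite.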
Expected behavior
Metadata discovery latency comparable to 3.3.3
Client Environment (please complete the following information):
- OS: Linux
- Java version: Java 25
- Java vendor: Azul
- Driver Version: 3.4.1 (using jdbc-thin)
Additional context
While investigating, we also found that each SHOW-based getColumns call floods the DriverManager log writer with caught "Invalid column index" stack traces. That is reported separately in #1490.