
[BUG] Connection evictor threads accumulate over time with Databricks JDBC driver 3.1.1 in long‑running service #1221

@aliakseistruh-dr

Describe the bug
In a long‑running Spring Boot WebFlux service using com.databricks:databricks-jdbc:3.1.1, the number of JVM threads steadily increases over time and never comes back down, even when the service is idle and all JDBC connections created by the application are closed.

Thread dumps show an increasing number of threads named "Connection evictor" in TIMED_WAITING state, with stacks rooted in com.databricks.internal.apache.http.impl.client.IdleConnectionEvictor$1.run. From the driver source, each DatabricksHttpClient starts one such evictor thread, which is stopped only in DatabricksHttpClient.close(). It appears some driver code paths create new DatabricksHttpClient instances without reliably closing them, so their evictor threads accumulate over time.
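
For diagnostics, we count these threads straight from the JVM. A minimal sketch of that check (countEvictorThreads is our helper name, not a driver API):

public final class EvictorThreadCount {

  // Counts live "Connection evictor" threads across the whole JVM; in our
  // service this number only ever grows, even when the service is idle.
  static long countEvictorThreads() {
    return Thread.getAllStackTraces().keySet().stream()
        .filter(t -> "Connection evictor".equals(t.getName()))
        .count();
  }

  public static void main(String[] args) {
    System.out.println("Connection evictor threads: " + countEvictorThreads());
  }
}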

To Reproduce
High‑level steps (simplified):

  1. Create a long‑running JVM service (Spring Boot WebFlux) that:
  • Uses DriverManager.getConnection("jdbc:databricks://...", properties) to talk to Databricks SQL.
  • Runs queries via jOOQ / JDBC inside Reactor (many requests over hours).
  2. Scope each JDBC Connection via a helper that always calls connection.close() on complete, error, and cancel using Mono.usingWhen / Flux.usingWhen. Example:
   fun <T : Any> withContext(..., op: (CloseableDSLContext) -> Mono<T>): Mono<T> =
       Mono.usingWhen(
           acquire(...),                  // Mono<CloseableDSLContext>
           op,
           { ctx -> release(ctx) },       // on complete
           { ctx, err -> release(ctx) },  // on error
           { ctx -> release(ctx) }        // on cancel
       )

   private fun release(ctx: CloseableDSLContext): Mono<Void> =
       Mono.fromRunnable { ctx.close() }
  3. Instrument connection lifecycle with simple counters (totalConnections, activeConnections) and run a perf test for several hours.
  4. Observe:
  • activeConnections returns to 0 when the service goes idle.
  • JVM thread count grows over time and never decreases.
  5. Take a thread dump after some runtime and inspect threads named "Connection evictor" (a standalone repro sketch also follows the example dump below).
    Example Connection evictor thread from the dump:
{
  "threadName": "Connection evictor",
  "threadState": "TIMED_WAITING",
  "stackTrace": [
    { "className": "java.lang.Thread", "methodName": "sleep" },
    {
      "className": "com.databricks.internal.apache.http.impl.client.IdleConnectionEvictor$1",
      "fileName": "IdleConnectionEvictor.java",
      "lineNumber": 66,
      "methodName": "run"
    },
    { "className": "java.lang.Thread", "methodName": "run" }
  ]
}

Over time, more of these threads appear and none go away.
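
A minimal standalone sketch of the same repro, with WebFlux/jOOQ stripped out (URL and auth properties elided as above; we have only run the full service, so treat this as the expected shape of a repro rather than a verified standalone one):

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class EvictorLeakRepro {

  private static long evictorThreadCount() {
    return Thread.getAllStackTraces().keySet().stream()
        .filter(t -> "Connection evictor".equals(t.getName()))
        .count();
  }

  public static void main(String[] args) throws Exception {
    String url = "jdbc:databricks://...";    // same URL as in the service (elided)
    Properties properties = new Properties(); // auth properties as in the service (elided)

    for (int i = 0; i < 100; i++) {
      // Correct application-side lifecycle: every connection is closed.
      try (Connection conn = DriverManager.getConnection(url, properties)) {
        conn.createStatement().execute("SELECT 1");
      }
      // Expectation: stays bounded. Observed in the service: grows monotonically.
      System.out.printf("iteration=%d evictorThreads=%d%n", i, evictorThreadCount());
    }
  }
}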

Expected behavior

  • Driver‑owned background threads (especially IdleConnectionEvictor threads) should remain bounded in number in a long‑running service.
  • When a DatabricksHttpClient (or any higher‑level client that owns it) is no longer in use, its associated IdleConnectionEvictor thread should be stopped by reliably calling DatabricksHttpClient.close() internally (a sketch of the evictor lifecycle follows this list).
  • Application code that correctly closes JDBC connections (and, where applicable, public volume clients) should not see an unbounded increase in "Connection evictor" threads.
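
For reference, the shaded com.databricks.internal.apache.http classes are repackaged Apache HttpClient 4.5, where the evictor thread only ever exits via shutdown(). A minimal sketch against plain Apache HttpClient illustrating the lifecycle we expect the driver to honor:

import java.util.concurrent.TimeUnit;
import org.apache.http.impl.client.IdleConnectionEvictor;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class EvictorLifecycleDemo {
  public static void main(String[] args) throws Exception {
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    IdleConnectionEvictor evictor = new IdleConnectionEvictor(cm, 60, TimeUnit.SECONDS);

    evictor.start(); // spawns the "Connection evictor" thread seen in our dumps

    // If shutdown() is never reached, the thread sits in TIMED_WAITING forever.
    evictor.shutdown();                        // the only way the thread exits
    evictor.awaitTermination(1, TimeUnit.SECONDS);
    cm.shutdown();
  }
}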

Screenshots
N/A for now – behavior is visible in thread dumps and thread count metrics (can attach if needed).

Client side logs

  • Application logs show all JDBC connections being closed (instrumented via counters; a simplified sketch of the instrumentation follows this list).
  • No client‑side errors from the driver; the only symptom is growing "Connection evictor" thread count.
  • Representative thread dump snippet shown above.
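
The counters referenced above are plain atomics incremented and decremented in the acquire/release helpers shown earlier; a simplified Java rendering (class and field names are ours):

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;

public final class ConnectionMetrics {

  static final AtomicLong totalConnections = new AtomicLong();
  static final AtomicLong activeConnections = new AtomicLong();

  static Connection acquire(String url, Properties props) throws Exception {
    Connection conn = DriverManager.getConnection(url, props);
    totalConnections.incrementAndGet();
    activeConnections.incrementAndGet();
    return conn;
  }

  static void release(Connection conn) throws Exception {
    try {
      conn.close(); // always invoked (complete / error / cancel paths)
    } finally {
      activeConnections.decrementAndGet(); // drops back to 0 when idle
    }
  }
}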

Client Environment (please complete the following information):

  • OS: macOS
  • Java version: 17.0.18
  • Java vendor: OpenJDK
  • Driver Version: 3.1.1
  • BI Tool (if used): N/A (custom Spring Boot WebFlux service)
  • BI Tool version: N/A

Additional context
From your source code (DatabricksHttpClient):

public class DatabricksHttpClient implements IDatabricksHttpClient, Closeable {

  private IdleConnectionEvictor idleConnectionEvictor;
  // (other fields and methods elided: connectionManager, httpClient, asyncClient)

  DatabricksHttpClient(IDatabricksConnectionContext ctx, HttpClientType type) {
    connectionManager = initializeConnectionManager(ctx);
    httpClient = makeClosableHttpClient(ctx, type);
    idleConnectionEvictor =
        new IdleConnectionEvictor(
            connectionManager, ctx.getIdleHttpConnectionExpiry(), TimeUnit.SECONDS);
    idleConnectionEvictor.start();
    asyncClient = GlobalAsyncHttpClient.getClient();
  }

  @Override
  public void close() throws IOException {
    if (idleConnectionEvictor != null) {
      idleConnectionEvictor.shutdown();
    }
    if (httpClient != null) {
      httpClient.close();
    }
    if (connectionManager != null) {
      connectionManager.shutdown();
    }
    if (asyncClient != null) {
      GlobalAsyncHttpClient.releaseClient();
      asyncClient = null;
    }
  }
}

This clearly ties the lifecycle of the IdleConnectionEvictor thread to DatabricksHttpClient.close().

Given that our application code:

  • only uses the public JDBC API (DriverManager.getConnection(...)),
  • closes all JDBC connections, and
  • does not directly create or manage DatabricksHttpClient,

yet we still see more and more "Connection evictor" threads over time, it appears that some internal code paths are creating DatabricksHttpClient instances without reliably invoking close() when those clients are no longer needed.

From an operational standpoint, this shows up as an apparent “thread leak” attributable to the driver when running in a long‑lived JVM, even though the application itself is correctly closing JDBC connections.
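
The internal call sites are not visible to us, but since DatabricksHttpClient implements Closeable, the fix presumably amounts to every internal owner of a client propagating close(). A minimal sketch of that ownership pattern (HypotheticalClientOwner is an illustrative name, not a real driver class; driver-internal imports omitted):

import java.io.Closeable;
import java.io.IOException;

// Illustrative sketch only: the real driver classes that create
// DatabricksHttpClient instances are not visible to us.
final class HypotheticalClientOwner implements Closeable {

  private final DatabricksHttpClient httpClient;

  HypotheticalClientOwner(IDatabricksConnectionContext ctx, HttpClientType type) {
    // Each client starts one "Connection evictor" thread in its constructor.
    this.httpClient = new DatabricksHttpClient(ctx, type);
  }

  @Override
  public void close() throws IOException {
    // Propagating close() is what shuts the evictor thread down; any owner
    // that skips this leaks one thread per client it created.
    httpClient.close();
  }
}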
