Memory leak with sockets (SSLSocketImpl) in MonitoredCache #2129

@csolem

Bug Description

After upgrading cloud-sql-connector-jdbc-sqlserver from 1.23.1 to 1.24.0, we have seen occasional memory leaks in a couple of our applications after they have been running for hours or even days.

InternalConnectorRegistry holds a map of unnamedConnectors. One of the entries in that map is an instance of MonitoredCache, which seems to have been introduced in #2056.

MonitoredCache has a field List<Socket> sockets, and whenever we observe the memory leak, this list has grown very large. In our most recent case, the list held 91568 instances of sun.security.ssl.SSLSocketImpl with a total memory consumption of 3.7 GB according to the heap dump.

https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/blob/v1.24.0/core/src/main/java/com/google/cloud/sql/core/MonitoredCache.java#L41
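To illustrate the kind of growth described above, here is a minimal sketch of how a list of tracked sockets can grow without bound when closed sockets are never removed. It is not the connector's actual code: SocketTrackerSketch, track, and simulate are hypothetical names, and the pruning shown is only one possible mitigation, not a confirmed fix.

```java
import java.io.IOException;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a cache that keeps a list of the sockets it hands out. */
final class SocketTrackerSketch {
  private final List<Socket> sockets = new ArrayList<>();

  /**
   * Tracks a socket. When pruneClosed is false, sockets that were closed
   * elsewhere stay strongly reachable from the list and cannot be collected.
   */
  synchronized void track(Socket socket, boolean pruneClosed) {
    if (pruneClosed) {
      sockets.removeIf(Socket::isClosed); // drop sockets that are already closed
    }
    sockets.add(socket);
  }

  synchronized int trackedCount() {
    return sockets.size();
  }

  /** Simulates connections that are opened, tracked, and then closed by a pool. */
  static int simulate(int connections, boolean pruneClosed) throws IOException {
    SocketTrackerSketch tracker = new SocketTrackerSketch();
    for (int i = 0; i < connections; i++) {
      Socket socket = new Socket(); // unconnected socket, so no network is needed
      tracker.track(socket, pruneClosed);
      socket.close(); // the pool retires the connection
    }
    return tracker.trackedCount();
  }

  public static void main(String[] args) throws IOException {
    System.out.println("tracked without pruning: " + simulate(1000, false));
    System.out.println("tracked with pruning:    " + simulate(1000, true));
  }
}
```

Without pruning, every retired socket remains in the list, so the count equals the number of connections ever created. With pruning, only sockets that were still open at the last call to track (plus the newest one) remain.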

Applications can run for days before this happens, and we have not figured out what the trigger is. (Our applications have used the cloud-sql-connector-jdbc-sqlserver dependency for years without issues, and the memory leak only occurred after the upgrade to 1.24.0.)

The screenshot below shows the relevant part of the heap dump taken from the application during the memory leak. The dump was imported into IntelliJ for investigation, and it clearly shows that the memory consumption is attributable to MonitoredCache.sockets:
Image
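For anyone who wants to capture a comparable heap dump from a running JVM, the following is a minimal sketch using the HotSpot diagnostic MXBean. It assumes a HotSpot-based JVM with the jdk.management module available. HeapDumpSketch and dumpHeap(Path) are illustrative names, not part of the connector.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

/** Illustrative helper that writes an .hprof heap dump for offline analysis. */
final class HeapDumpSketch {

  /** Writes a heap dump of live objects into the given directory and returns the file path. */
  static Path dumpHeap(Path directory) throws IOException {
    HotSpotDiagnosticMXBean bean =
        ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    // The target file must not exist yet; recent JDKs require the .hprof suffix.
    Path file = directory.resolve("heap-" + System.currentTimeMillis() + ".hprof");
    bean.dumpHeap(file.toString(), true); // true = dump only reachable (live) objects
    return file;
  }

  public static void main(String[] args) throws IOException {
    Path file = dumpHeap(Path.of(System.getProperty("java.io.tmpdir")));
    System.out.println("Heap dump written to " + file);
  }
}
```

The resulting file can be opened in IntelliJ or another heap analyzer to inspect the retained size of MonitoredCache.sockets.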

Relevant parts of dependency tree

[INFO] +- com.google.cloud.sql:cloud-sql-connector-jdbc-sqlserver:jar:1.24.0:compile
[INFO] |  \- com.google.cloud.sql:jdbc-socket-factory-core:jar:1.24.0:compile
[INFO] |     +- com.github.jnr:jnr-enxio:jar:0.32.18:compile
[INFO] |     \- com.google.apis:google-api-services-sqladmin:jar:v1beta4-rev20250310-2.0.0:compile
[INFO] +- com.microsoft.sqlserver:mssql-jdbc:jar:12.10.0.jre11:compile

The application runs in Kubernetes on GKE with the base image eclipse-temurin:21-jre-alpine. It is built with the Jib Maven plugin 3.4.4, and the Java compile/release version is 21. The application consumes messages from Pub/Sub (pull) and writes to a SQL Server database using the socket factory.

Example code (or command)

// JDBC url
jdbc:sqlserver://localhost;databaseName=<databaseName>;socketFactoryConstructorArg=<dbInstance>?ipTypes=PRIVATE;socketFactoryClass=com.google.cloud.sql.sqlserver.SocketFactory;trustServerCertificate=true;encrypt=true;socketTimeout=120000;queryTimeout=30;cancelQueryTimeout=20;

// Setting up data source
  public static HikariDataSource getDataSource(final ParameterResolver parameterResolver, final String databasePassword) {
    final String dbUrl = "jdbc:sqlserver://localhost;databaseName=<databaseName>;socketFactoryConstructorArg=<dbInstance>?ipTypes=PRIVATE;socketFactoryClass=com.google.cloud.sql.sqlserver.SocketFactory;trustServerCertificate=true;encrypt=true;socketTimeout=120000;queryTimeout=30;cancelQueryTimeout=20;";
    final String dbUsername = "username";
    final HikariDataSource dataSource = new HikariDataSource();
    dataSource.setDriverClassName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
    dataSource.setJdbcUrl(dbUrl);
    dataSource.setUsername(dbUsername);
    dataSource.setPassword(databasePassword);
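    dataSource.setLeakDetectionThreshold(40000); // ms: log a warning if a connection is held longer than 40 s
    dataSource.setKeepaliveTime(30000); // ms: keepalive check on idle connections every 30 s
    dataSource.setMaxLifetime(60000); // ms: retire each pooled connection after 60 s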
    dataSource.setTransactionIsolation("TRANSACTION_READ_COMMITTED");
    final int maximumPoolSize = 20;
    dataSource.setMaximumPoolSize(maximumPoolSize);
    return dataSource;
  }

Stacktrace

Steps to reproduce?

It is hard to reproduce this memory leak issue. Will provide more information if possible.

Environment

  1. OS type and version: eclipse-temurin:21-jre-alpine
  2. Java SDK version: OpenJDK 64-Bit Server VM Temurin-21.0.6+7 (build 21.0.6+7-LTS, mixed mode, sharing)
  3. Cloud SQL Java Socket Factory version: 1.24.0

Additional Details

No response

Metadata

Labels

priority: p0 (Highest priority. Critical issue. P0 implies highest priority.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
