[PECOBLR-25] Add server feature flag consumption in the driver by samikshya-db · Pull Request #828 · databricks/databricks-jdbc

samikshya-db · 2025-05-16T08:45:13Z

Description

This PR adds the client-side implementation of server feature flags. It does the following:

Adds server feature flag consumption to the driver.
Consumes the telemetry feature flag when creating the telemetry client.
Removes the fetching of DatabricksConfig from threadContext. We now fetch it from connectionContext, because it is needed to fetch the server-side feature flag context.
Removes all uses of the no-auth endpoint. Fetching server-side flags requires the auth headers, so the no-auth endpoint is no longer valid in the context of this driver.

Testing

Added unit tests

Additional Notes to the Reviewer

The server-side changes are yet to be reviewed and merged: https://github.com/databricks-eng/universe/pull/1038986. Once they are merged, I will test against them and raise a new PR if any further changes are required. (We need to launch telemetry ASAP, hence we are working on both in parallel.)
Additional context on client-side caching for feature flags: we have decided to cache at the compute level rather than at the connectionContext level or driver level. More details here: https://databricks.atlassian.net/browse/PECOBLR-25?focusedCommentId=6550419

NO_CHANGELOG=true

vikrantpuppala · 2025-05-20T14:03:21Z

i think this should be e, message?

i feel the args of our logger are very confusing. i keep making the same mistake every time with our logger impl; we should improve this long term

For error logs, e should be 1st param

vikrantpuppala · 2025-05-20T14:05:33Z

I know this is the pattern across jdbc but I don't know why we started creating factory classes as singleton, is there a reason you know?

if not, can we stop using this pattern going forward and just have a simple factory pattern without the singleton-ness?

Just skimmed through this class. It is not a stateless class and manages the reusable ClientConfigurator beans (again not sure of the whole context around ClientConfigurator instances). in that sense singleton should be fine imo?

the reusable ClientConfigurator instances could just be stored in a static map (instances below changed to static), no?

static map should be fine too, that is another way of imposing singleton instances. We are doing same thing by having the static factory.

> if not, can we stop using this pattern going forward and just have a simple factory pattern without the singleton-ness?

I agree, this is a pattern that I do not understand myself (and, like you pointed out, I have used the other way in the factoryFlagContext). I don't think there are any major differences between the two patterns though.

> static map should be fine too, that is another way of imposing singleton instances. We are doing same thing by having the static factory.

yes it's similar. I think this is not an anti-pattern at all. the name factory is misleading. I will rename such factory instances to Handler like DatabricksAuthClientHandler, DatabricksHttpClientHandler, etc.
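For readers following this thread, here is a minimal sketch of the "static map" alternative mentioned above. The class names, the `Configurator` stand-in, and the choice of a connection ID as the key are all assumptions made for illustration; they are not taken from the driver.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: a registry that holds reusable instances
// in a static map instead of behind a singleton factory object.
final class ConfiguratorRegistry {

    // Stand-in for the driver's ClientConfigurator; the real class carries more state.
    static final class Configurator {
        final String connectionId;

        Configurator(String connectionId) {
            this.connectionId = connectionId;
        }
    }

    // One shared, thread-safe map for the whole JVM (assumed key: connection ID).
    private static final Map<String, Configurator> INSTANCES = new ConcurrentHashMap<>();

    private ConfiguratorRegistry() {
        // No instances: all access goes through the static methods.
    }

    // Returns the existing instance for the key, creating it once if absent.
    static Configurator getOrCreate(String connectionId) {
        return INSTANCES.computeIfAbsent(connectionId, Configurator::new);
    }

    // Drops the instance, e.g. when the connection closes.
    static void remove(String connectionId) {
        INSTANCES.remove(connectionId);
    }
}
```

Either way there is exactly one map per JVM, which is why both sides of the thread describe the two approaches as equivalent in effect.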

vikrantpuppala · 2025-05-20T14:17:38Z

might be better to do statusCode / 100 == 2

We are expecting a 200 only from the server, I don't think we would want to include 2xx codes at this point.

but will the response be available in non-200 cases?
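To make the difference between the two checks in this thread concrete, a small sketch follows. The helper names are hypothetical and are not the driver's code.

```java
// Hypothetical helpers contrasting the two success checks discussed above.
final class StatusCheck {

    private StatusCheck() {}

    // Strict check: only HTTP 200 counts as success.
    static boolean isExactly200(int statusCode) {
        return statusCode == 200;
    }

    // Broader check: integer division maps 200..299 to 2, so any 2xx code passes.
    static boolean isAny2xx(int statusCode) {
        return statusCode / 100 == 2;
    }
}
```

With the strict check, a 204 from the server is treated as a failure; with the broader check it is treated as success, so the caller must then be prepared for a response without a body.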

vikrantpuppala · 2025-05-20T14:19:49Z

should we make this function future-proof accounting for non-boolean flags?

[c1] I will skip it for now. I know a JSON will come our way (based on https://github.com/databricks-eng/universe/pull/1038986/files?w=1). I will make changes after server changes are merged and we know what the response will look like.

+1. We will get all kinds of values in the feature value, which ideally should itself be a JSON. "Feature enabled" will be specific to a particular feature.

I don't think it'll be a json unless we plan on deviating from the existing safe implementation. The allowed types for a safe flag are Integer, String, Boolean, Long, Double, String List

> I don't think it'll be a json unless we plan on deviating from the existing safe implementation

The server implementation PR indicates that the response can be JSON. This is WIP and will change accordingly.
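As a rough idea of what "future-proof for non-boolean flags" could look like on the client, here is a sketch of typed accessors over raw string values. It assumes flag values arrive as strings, which this thread does not confirm; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: typed, fail-safe accessors over raw flag values.
// Assumes the server delivers each flag value as a string.
final class TypedFlags {

    private final Map<String, String> raw;

    TypedFlags(Map<String, String> raw) {
        // Defensive copy so later changes to the caller's map are not visible here.
        this.raw = new HashMap<>(raw);
    }

    // Missing or non-"true" values are treated as false.
    boolean getBoolean(String name) {
        return Boolean.parseBoolean(raw.get(name));
    }

    // Missing or unparseable values fall back to the supplied default.
    long getLong(String name, long fallback) {
        String value = raw.get(name);
        if (value == null) {
            return fallback;
        }
        try {
            return Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Missing values fall back to the supplied default.
    String getString(String name, String fallback) {
        String value = raw.get(name);
        return value != null ? value : fallback;
    }
}
```

If the final server response turns out to be JSON, the same accessor shape could sit on top of a parsed structure instead of a string map.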

vikrantpuppala · 2025-05-20T14:20:39Z

❤️ not using singleton for this factory

gopalldb · 2025-05-21T09:02:52Z

Nice catch!

gopalldb · 2025-05-21T09:09:14Z

handle this gracefully:

The feature name may not exist (null, treat as false)

It may not be boolean (parsing error, treat as false)

Skipping this because:

This is handled by hashmap.get

This is handled by Boolean.parseBoolean
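The behaviour relied on in this reply can be shown with a short sketch. `FlagLookup` and `isEnabled` are hypothetical names; `Map.get` and `Boolean.parseBoolean` are standard JDK methods.

```java
import java.util.Map;

// Hypothetical helper showing why no extra null or type handling is needed.
final class FlagLookup {

    private FlagLookup() {}

    static boolean isEnabled(Map<String, String> flags, String name) {
        // Map.get returns null for a missing key, and Boolean.parseBoolean(null)
        // returns false. Any value other than "true" (case-insensitive) is also false.
        return Boolean.parseBoolean(flags.get(name));
    }
}
```

Note that `Boolean.parseBoolean` never throws; a value such as "yes" or "1" simply yields false rather than a parsing error.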

jayantsing-db · 2025-05-21T09:14:58Z

It looks like each connection will have its own feature flag context. That is okay. But right now, each connection will also send a request to connector-service. This can have serious implications for connector-service and safe-service given our user base and concurrent environments. The connector-service includes a caching TTL in its response. We should consume this field carefully and have appropriate caching on the client side.

gopalldb · 2025-05-21T09:16:06Z

IIUC, we also send expiry time from server. If that is present, we should cache that as well, and keep updating this after reaching the threshold.

We can use readymade cacheLoader for doing this kind of stuff. Check LoadingCache from Guava: https://medium.com/@ramachandrankrish/building-an-in-memory-cache-with-ttl-expiry-using-guava-and-expiringmap-d767e12b4c1b

Instead of session-Id, we should cache this on key cluster-resource-Id (warehouse-Id or workspace-Id). This will allow reusing the value for another connection for the same cluster resource.

https://databricks.atlassian.net/browse/PECOBLR-25?focusedCommentId=6550419 : more context here after the discussion with you and Jayant.

> Check LoadingCache from Guava

Nice suggestion 😄 Implemented ✅

gopalldb · 2025-05-26T05:21:28Z

we can instead use refreshAfterWrite(..), it will thus refresh asynchronously, and save latency when value is needed

refreshAfterWrite does not apply to this use-case as we are fetching the feature flags in bulk. Guava LoadingCache expects per-key refresh, not full map refresh

Why do we need per key expiry here? Won't all the keys be fetched in one shot? I had thought that cache would be keyed on compute resource-Id.

We are not going to have per feature flag separate expiry, thus this expiry will be irrelevant.

I think we can do like this:

CacheBuilder => compute resource-Id -> featureFlags
featureFlags can be regular map, Map<String, Values>

Got it, looks like you are talking about LoadingCache here. Makes sense, implemented ✅
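The shape agreed on in this thread (one cache entry per compute resource ID, holding the whole flag map, with a single TTL) can be sketched as follows. The PR itself uses Guava's LoadingCache; this sketch deliberately swaps in a plain `ConcurrentHashMap` so it runs with the JDK alone. `FlagCache`, its constructor parameters, and the injected clock are assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical JDK-only stand-in for a Guava LoadingCache keyed on compute resource ID.
final class FlagCache {

    // The full flag map for one compute resource, plus its expiry time.
    private static final class Entry {
        final Map<String, String> flags;
        final long expiresAtMillis;

        Entry(Map<String, String> flags, long expiresAtMillis) {
            this.flags = flags;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, Map<String, String>> loader; // bulk fetch of all flags
    private final long ttlMillis;                                // e.g. the TTL sent by the server
    private final LongSupplier clock;                            // injectable for tests

    FlagCache(Function<String, Map<String, String>> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached flag map, reloading all flags in one shot once the entry expires.
    Map<String, String> get(String computeResourceId) {
        long now = clock.getAsLong();
        Entry entry = entries.compute(computeResourceId, (id, old) ->
            (old != null && now < old.expiresAtMillis)
                ? old
                : new Entry(loader.apply(id), now + ttlMillis));
        return entry.flags;
    }
}
```

Because the key is the compute resource ID, a second connection to the same warehouse reuses the cached map instead of calling the service again. Unlike a Guava cache with asynchronous refresh, this sketch reloads synchronously inside `compute`, so the calling thread waits for the fetch when an entry has expired.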

Add server feature flag consumption in the driver

1f525c6

use auth accessor to fetch clientConfigurator

08528e4

samikshya-db added 2 commits May 18, 2025 15:14

Add feature flag context

187a86f

Changes minus tests

b2bcd4a

fix tests

6156dea

Add more tests

8e2da85

samikshya-db requested review from gopalldb and jayantsing-db May 19, 2025 16:25

samikshya-db changed the title ~~[PECOBLR-25] [Draft] Add server feature flag consumption in the driver~~ [PECOBLR-25] Add server feature flag consumption in the driver May 19, 2025

gopalldb requested a review from vikrantpuppala May 20, 2025 06:41

vikrantpuppala reviewed May 20, 2025

gopalldb reviewed May 21, 2025

Comment thread src/main/java/com/databricks/jdbc/exception/DatabricksChunkDownloadException.java Outdated

gopalldb reviewed May 21, 2025

samikshya-db and others added 2 commits May 22, 2025 13:31

Address comments

5db170a

Merge branch 'main' into samikshya-chand_data/telemetryClosure

09b41f8

consume TTL

6bf76b5

Make use of thread safe google cache builder

fd59e21

samikshya-db requested review from gopalldb and vikrantpuppala May 23, 2025 12:30

gopalldb reviewed May 26, 2025

samikshya-db requested a review from gopalldb May 28, 2025 10:40

Implement loading refresh in guava cache

ab47842

gopalldb approved these changes May 30, 2025

samikshya-db merged commit 4d420f2 into databricks:main May 30, 2025
16 checks passed

samikshya-db deleted the samikshya-chand_data/telemetryClosure branch May 30, 2025 07:09
