[PECOBLR-25] Add server feature flag consumption in the driver#828
Conversation
There was a problem hiding this comment.
i think this should be e, message?
i feel the args of our logger are very confusing, i keep making the same mistake everytime with our logger impl, we should improve this long term
There was a problem hiding this comment.
For error logs, e should be 1st param
There was a problem hiding this comment.
I know this is the pattern across jdbc but I don't know why we started creating factory classes as singleton, is there a reason you know?
if not, can we stop using this pattern going forward and just have a simple factory pattern without the singleton-ness?
There was a problem hiding this comment.
Just skimmed through this class. It is not a stateless class and manages the reusable ClientConfigurator beans (again not sure of the whole context around ClientConfigurator instances). in that sense singleton should be fine imo?
There was a problem hiding this comment.
the reusable ClientConfigurator instances could just be stored in a static map (instances below changed to static), no?
There was a problem hiding this comment.
static map should be fine too, that is another way of imposing singleton instances. We are doing same thing by having the static factory.
There was a problem hiding this comment.
if not, can we stop using this pattern going forward and just have a simple factory pattern without the singleton-ness?
I agree this is a pattern that I do not understand myself. (and like you pointed out, I have used the other way in the factoryFlagContext). I don't think there is any major differences to the pattern though.
There was a problem hiding this comment.
static map should be fine too, that is another way of imposing singleton instances. We are doing same thing by having the static factory.
yes it's similar. I think this is not an anti-pattern at all. the name factory is misleading. I will rename such factory instances to Handler like DatabricksAuthClientHandler, DatabricksHttpClientHandler, etc.
There was a problem hiding this comment.
might be better to do statusCode / 100 == 2
There was a problem hiding this comment.
We are expecting a 200 only from the server, I don't think we would want to include 2xx codes at this point.
There was a problem hiding this comment.
but will the response be available in non-200 cases?
There was a problem hiding this comment.
should we make this function future-proof accounting for non-boolean flags?
There was a problem hiding this comment.
[c1] I will skip it for now. I know a JSON will come our way (based on https://github.com/databricks-eng/universe/pull/1038986/files?w=1). I will make changes after server changes are merged and we know what the response will look like.
There was a problem hiding this comment.
+1. We will get all kind of values in the feature value, that ideally itself should be a JSON. Feature enabled will be specific to a particular feature.
There was a problem hiding this comment.
I don't think it'll be a json unless we plan on deviating from the existing safe implementation. The allowed types for a safe flag are Integer, String, Boolean, Long, Double, String List
There was a problem hiding this comment.
I don't think it'll be a json unless we plan on deviating from the existing safe implementation
The server implementation PR indicates that the response can be JSON. This is WIP and will change accordingly.
There was a problem hiding this comment.
❤️ not using singleton for this factory
There was a problem hiding this comment.
handle this gracefully:
- The feature name may not exist (null, treat as false)
- It may not be boolean (parsing error, treat as false)
There was a problem hiding this comment.
Skipping this because:
- This is handled by
hashmap.get - This is handled by
Boolean.parseBoolean
|
It looks like that for each connection, we will have feature flag context. That is okay. But right now, for each connection, it will send a request to connector-service. This can have serious implications to connector-service and safe-service given our user-base and concurrent environments. The connector-service includes a caching TTL in response. We should consume this field carefully and have appropriate caching on the client side. |
There was a problem hiding this comment.
IIUC, we also send expiry time from server. If that is present, we should cache that as well, and keep updating this after reaching the threshold.
We can use readymade cacheLoader for doing this kind of stuff. Check LoadingCache from Guava: https://medium.com/@ramachandrankrish/building-an-in-memory-cache-with-ttl-expiry-using-guava-and-expiringmap-d767e12b4c1b
There was a problem hiding this comment.
Instead of session-Id, we should cache this on key cluster-resource-Id (warehouse-Id or workspace-Id). This will allow reusing the value for another connection for the same cluster resource.
There was a problem hiding this comment.
https://databricks.atlassian.net/browse/PECOBLR-25?focusedCommentId=6550419 : more context here after the discussion with you and Jayant.
There was a problem hiding this comment.
Check LoadingCache from Guava
Nice suggestion 😄 Implemented ✅
There was a problem hiding this comment.
we can instead use refreshAfterWrite(..), it will thus refresh asynchronously, and save latency when value is needed
There was a problem hiding this comment.
refreshAfterWrite does not apply to this use-case as we are fetching the feature flags in bulk. Guava LoadingCache expects per-key refresh, not full map refresh
There was a problem hiding this comment.
Why do we need per key expiry here? Won't all the keys be fetched in one shot? I had thought that cache would be keyed on compute resource-Id.
We are not going to have per feature flag separate expiry, thus this expiry will be irrelevant.
I think we can do like this:
CacheBuilder => compute resource-Id -> featureFlags
featureFlags can be regular map, Map<String, Values>
There was a problem hiding this comment.
Got it, looks like you are talking about LoadingCache here. Makes sense, implemented ✅
Description
This PR adds client implementation of feature flag complex, i.e., achieves the following :
Testing
Additional Notes to the Reviewer
NO_CHANGELOG=true