Skip to content

feat(csharp/src/Drivers/Databricks): Implement ClientCredentialsProvider#2743

Merged
CurtHagenlocher merged 11 commits into
apache:mainfrom
toddmeng-db:toddmeng-db/client-creds-1
Apr 29, 2025
Merged

feat(csharp/src/Drivers/Databricks): Implement ClientCredentialsProvider#2743
CurtHagenlocher merged 11 commits into
apache:mainfrom
toddmeng-db:toddmeng-db/client-creds-1

Conversation

@toddmeng-db

@toddmeng-db toddmeng-db commented Apr 25, 2025

Copy link
Copy Markdown
Contributor

Implements OAuth 2.0 Client Credentials flow for machine-to-machine (M2M) authentication in the Databricks ADBC driver.

This is the first PR in a series to add complete OAuth client credentials support. This PR focuses on the core token
provider infrastructure.

Changes

  • New OAuthClientCredentialsProvider class: Handles OAuth token lifecycle
    • Fetches access tokens from Databricks OAuth service using client credentials
    • Implements automatic token refresh based on expiration
    • Caches tokens to minimize authentication requests
    • Thread-safe token management
  • Unit tests in OAuthClientCredentialsProviderTests:
    • Token retrieval and caching
    • Automatic refresh when token expires
    • Error handling for invalid credentials
  • Test configuration support for integration testing

Testing

Unit Tests

Run the OAuth client credentials tests:

cd csharp
dotnet test --filter "FullyQualifiedName~OAuthClientCredentialsServiceTests"

Integration Testing

To test with real credentials:

1. Create a databricks_test_config.json:
{
  "oauth_client_id": "your-client-id",
  "oauth_client_secret": "your-client-secret",
  "host": "your-workspace.databricks.com"
}
2. Set the environment variable:
  - macOS/Linux:
  export DATABRICKS_TEST_CONFIG_FILE=/path/to/databricks_test_config.json
  - Windows PowerShell:
  $env:DATABRICKS_TEST_CONFIG_FILE = "C:\path\to\databricks_test_config.json"
3. Run the tests (see command above)

Follow-up Work

- Integrate OAuth client credentials provider with the Databricks driver connection flow
- Add support for OAuth scopes configuration
- Integrate with existing authentication parameter handling

Authentication Flow

The provider implements the standard OAuth 2.0 Client Credentials grant:
1. Client sends credentials to OAuth token endpoint
2. Server validates and returns access token with expiration
3. Provider caches token and reuses until near expiration
4. Provider automatically refreshes token when needed

@toddmeng-db toddmeng-db changed the title [databricks] Implement ClientCredentialsProvider [databricks] Implement ClientCredentialsProvider (work in progress) Apr 25, 2025
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-creds-1 branch from 0e1afe3 to 217c58a Compare April 25, 2025 23:41
/// <summary>
/// Service for obtaining OAuth access tokens using the client credentials grant type.
/// </summary>
internal class OAuthClientCredentialsService : IDisposable

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this, it is not a service maybe call it a provider

/// <param name="cancellationToken">A cancellation token to cancel the operation.</param>
/// <returns>The access token.</returns>
/// <exception cref="DatabricksException">Thrown when the token request fails or the response is invalid.</exception>
public async Task<string> GetAccessTokenAsync(CancellationToken cancellationToken = default)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need it to be async? This is get credentials, which should be blocking?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it's a bit awkward to implement a non-async method when there's an http call involved in c#?

public Credentials GetCredentials()
{
    var response = Task.Run(() => _httpClient.GetAsync(...))
                       .Unwrap()
                       .GetAwaiter()
                       .GetResult();
    // process response...
    return credentials;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess we need to take a look at the callsite, maybe check SparkHttpConnection to see how we pass those token as the header, I believe those functions are all sync functions, which means it should call sync functions as well.
Maybe we should just expose a sync version of GetAccessToken() which return GetAccessTokenAsync().GetAwaiter().GetResult();

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, what's the concern with using async here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no concern of using async, but c# appears need to follow a pattern than only async function can call async function. If you integrate with the rest of the driver you can see all the callsite is not async functions. So you must somehow return some results immediately, or else you will only get the Task.

@toddmeng-db toddmeng-db Apr 28, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I noticed that too...

Comment thread csharp/src/Drivers/Databricks/Auth/OAuthClientCredentialsService.cs Outdated
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-creds-1 branch from 96bdd44 to efe06c2 Compare April 28, 2025 05:23
@CurtHagenlocher CurtHagenlocher changed the title [databricks] Implement ClientCredentialsProvider (work in progress) feat(csharp/src/Drivers/Databricks): Implement ClientCredentialsProvider (work in progress) Apr 28, 2025
@toddmeng-db toddmeng-db marked this pull request as ready for review April 29, 2025 00:34
@toddmeng-db toddmeng-db changed the title feat(csharp/src/Drivers/Databricks): Implement ClientCredentialsProvider (work in progress) feat(csharp/src/Drivers/Databricks): Implement ClientCredentialsProvider Apr 29, 2025
@github-actions github-actions Bot added this to the ADBC Libraries 18 milestone Apr 29, 2025
Comment thread csharp/src/Drivers/Databricks/Auth/OAuthClientCredentialsProvider.cs Outdated
Comment thread csharp/src/Drivers/Databricks/Auth/OAuthClientCredentialsProvider.cs Outdated
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-creds-1 branch 4 times, most recently from 8c66a9c to 1d3edf8 Compare April 29, 2025 17:44
@toddmeng-db toddmeng-db force-pushed the toddmeng-db/client-creds-1 branch from 1d3edf8 to ae6bb29 Compare April 29, 2025 17:44

@CurtHagenlocher CurtHagenlocher left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I take it there will be a followup where this is hooked up to the driver?

@CurtHagenlocher CurtHagenlocher merged commit a88751d into apache:main Apr 29, 2025
colin-rogers-dbt pushed a commit to dbt-labs/arrow-adbc that referenced this pull request Jun 10, 2025
…der (apache#2743)

First PR for Class to get token via oauth service for M2M
authentication. Includes simple expiration and refresh logic.

SDK refresh logic
[here](https://github.com/databricks/databricks-sdk-java/blob/main/databricks-sdk-java/src/main/java/com/databricks/sdk/core/oauth/RefreshableTokenSource.java)

Follow up is to integrate with the rest of the driver


To test out:
1. Create a `databricks_test_config.json`
```
   {
     "oauth_client_id": "...",
     "oauth_client_secret": "...",
     "host": "databricks....com" // workspace hostname
   }
```
2.
   On macOS/Linux
export
DATABRICKS_TEST_CONFIG_FILE=/path/to/your/databricks_test_config.json
   
   On Windows PowerShell
$env:DATABRICKS_TEST_CONFIG_FILE =
"C:\path\to\your\databricks_test_config.json"

3.
```/csharp% dotnet test --filter "FullyQualifiedName~OAuthClientCredentialsServiceTests"```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants