Commit 5aa9f1e

feat(csharp/src/Drivers/BigQuery): Add support for AAD/Entra authentication (#2655)
- Adds support for users to log in with their Entra / Azure AD account.
- Adds a retry concept to the driver that checks whether a token needs to be refreshed and then invokes a delegate so an outside caller can perform the token update. The driver only takes this path if the user has defined a handler for `UpdateToken`.
- Includes long-running tests to demonstrate the concept: ![image](https://github.com/user-attachments/assets/0d633848-052d-417e-994b-10138b5e30a7)

---------

Co-authored-by: David Coe <>
1 parent a353b96 commit 5aa9f1e

17 files changed

Lines changed: 966 additions & 148 deletions

csharp/src/Drivers/BigQuery/BigQueryConnection.cs

Lines changed: 217 additions & 76 deletions
Large diffs are not rendered by default.

csharp/src/Drivers/BigQuery/BigQueryParameters.cs

Lines changed: 14 additions & 2 deletions
@@ -20,8 +20,10 @@ namespace Apache.Arrow.Adbc.Drivers.BigQuery
     /// <summary>
     /// Parameters used for connecting to BigQuery data sources.
     /// </summary>
-    public class BigQueryParameters
+    internal class BigQueryParameters
     {
+        public const string AccessToken = "adbc.bigquery.access_token";
+        public const string AudienceUri = "adbc.bigquery.audience_uri";
         public const string ProjectId = "adbc.bigquery.project_id";
         public const string BillingProjectId = "adbc.bigquery.billing_project_id";
         public const string ClientId = "adbc.bigquery.client_id";
@@ -36,6 +38,8 @@ public class BigQueryParameters
         public const string Scopes = "adbc.bigquery.scopes";
         public const string IncludeConstraintsWithGetObjects = "adbc.bigquery.include_constraints_getobjects";
         public const string ClientTimeout = "adbc.bigquery.client.timeout";
+        public const string MaximumRetryAttempts = "adbc.bigquery.maximum_retries";
+        public const string RetryDelayMs = "adbc.bigquery.retry_delay_ms";
         public const string GetQueryResultsOptionsTimeout = "adbc.bigquery.get_query_results_options.timeout";
         public const string MaxFetchConcurrency = "adbc.bigquery.max_fetch_concurrency";
         public const string IncludePublicProjectId = "adbc.bigquery.include_public_project_id";
@@ -47,13 +51,21 @@ public class BigQueryParameters
     /// <summary>
     /// Constants used for default parameter values.
     /// </summary>
-    public class BigQueryConstants
+    internal class BigQueryConstants
     {
         public const string UserAuthenticationType = "user";
+        public const string EntraIdAuthenticationType = "aad";
         public const string ServiceAccountAuthenticationType = "service";
         public const string TokenEndpoint = "https://accounts.google.com/o/oauth2/token";
         public const string TreatLargeDecimalAsString = "true";
 
+        // Entra ID / Azure AD constants
+        public const string EntraGrantType = "urn:ietf:params:oauth:grant-type:token-exchange";
+        public const string EntraSubjectTokenType = "urn:ietf:params:oauth:token-type:id_token";
+        public const string EntraRequestedTokenType = "urn:ietf:params:oauth:token-type:access_token";
+        public const string EntraIdScope = "https://www.googleapis.com/auth/cloud-platform";
+        public const string EntraStsTokenEndpoint = "https://sts.googleapis.com/v1/token";
+
         // default value per https://pkg.go.dev/cloud.google.com/go/bigquery#section-readme
         public const string DetectProjectId = "*detect-project-id*";
     }
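
The `BigQueryConnection.cs` diff is not rendered on this page, so how the driver actually consumes the new Entra constants is not visible here. As a rough sketch only, the constant values line up with the fields of an OAuth 2.0 token-exchange request (RFC 8693), which could be assembled as a form body like the one below. The helper name `BuildTokenExchangeForm`, the form field names, and the placeholder token and audience values are assumptions for illustration, not code from the commit, and nothing is sent over the network.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

// Hypothetical helper (not part of the commit): maps the Entra constant values
// from BigQueryConstants onto RFC 8693 token-exchange form fields.
static FormUrlEncodedContent BuildTokenExchangeForm(string entraToken, string audienceUri)
{
    var fields = new Dictionary<string, string>
    {
        ["grant_type"] = "urn:ietf:params:oauth:grant-type:token-exchange",          // EntraGrantType
        ["subject_token_type"] = "urn:ietf:params:oauth:token-type:id_token",        // EntraSubjectTokenType
        ["requested_token_type"] = "urn:ietf:params:oauth:token-type:access_token",  // EntraRequestedTokenType
        ["scope"] = "https://www.googleapis.com/auth/cloud-platform",                // EntraIdScope
        ["audience"] = audienceUri,      // assumed to come from adbc.bigquery.audience_uri
        ["subject_token"] = entraToken,  // assumed to come from adbc.bigquery.access_token
    };
    return new FormUrlEncodedContent(fields);
}

// Build the body with placeholder values and print the URL-encoded form.
// A real caller would POST it to EntraStsTokenEndpoint (https://sts.googleapis.com/v1/token).
using FormUrlEncodedContent form = BuildTokenExchangeForm("<entra-token>", "<audience-uri>");
Console.WriteLine(await form.ReadAsStringAsync());
```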

csharp/src/Drivers/BigQuery/BigQueryStatement.cs

Lines changed: 157 additions & 57 deletions
Large diffs are not rendered by default.
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+using System.Text.Json.Serialization;
+
+namespace Apache.Arrow.Adbc.Drivers.BigQuery
+{
+    /// <summary>
+    /// The token response from BigQuery
+    /// </summary>
+    internal class BigQueryStsTokenResponse
+    {
+        [JsonPropertyName("access_token")]
+        public string? AccessToken { get; set; }
+
+        [JsonPropertyName("issued_token_type")]
+        public string? IssuedTokenType { get; set; }
+
+        [JsonPropertyName("token_type")]
+        public string? TokenType { get; set; }
+
+        [JsonPropertyName("expires_in")]
+        public int? ExpiresIn { get; set; }
+    }
+}
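
For readers unfamiliar with `JsonPropertyName`, here is a minimal sketch of how a class shaped like the one above could be populated with `System.Text.Json`. The `StsTokenResponse` type is a local stand-in (the real class is `internal` to the driver), and the JSON payload is made up for illustration rather than captured from a real token endpoint.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Made-up response body; real values would come from the token endpoint.
string json = "{\"access_token\":\"example-token\","
    + "\"issued_token_type\":\"urn:ietf:params:oauth:token-type:access_token\","
    + "\"token_type\":\"Bearer\",\"expires_in\":3600}";

StsTokenResponse? response = JsonSerializer.Deserialize<StsTokenResponse>(json);
if (response?.AccessToken is null)
{
    throw new InvalidOperationException("The response did not contain an access token.");
}

// expires_in is a lifetime in seconds, so an absolute expiry has to be derived from it.
DateTimeOffset expiresAt = DateTimeOffset.UtcNow.AddSeconds(response.ExpiresIn ?? 0);
Console.WriteLine($"{response.TokenType} token, expires at {expiresAt:O}");

// Local stand-in that mirrors the property mapping of BigQueryStsTokenResponse.
class StsTokenResponse
{
    [JsonPropertyName("access_token")]
    public string? AccessToken { get; set; }

    [JsonPropertyName("issued_token_type")]
    public string? IssuedTokenType { get; set; }

    [JsonPropertyName("token_type")]
    public string? TokenType { get; set; }

    [JsonPropertyName("expires_in")]
    public int? ExpiresIn { get; set; }
}
```

All properties are nullable in the committed class, so a caller along these lines would need to check `AccessToken` before using it, as the sketch does.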

csharp/src/Drivers/BigQuery/BigQueryTableTypes.cs

Lines changed: 1 addition & 3 deletions
@@ -14,12 +14,10 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-using System.Collections.Generic;
-
 namespace Apache.Arrow.Adbc.Drivers.BigQuery
 {
     internal static class BigQueryTableTypes
     {
-        public static readonly string[] TableTypes = new string[]{ "BASE TABLE", "VIEW", "CLONE", "SNAPSHOT" };
+        public static readonly string[] TableTypes = new string[] { "BASE TABLE", "VIEW", "CLONE", "SNAPSHOT" };
     }
 }
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+using System;
+using Google;
+
+namespace Apache.Arrow.Adbc.Drivers.BigQuery
+{
+    internal class BigQueryUtils
+    {
+        public static bool TokenRequiresUpdate(Exception ex)
+        {
+            bool result = false;
+
+            if (ex is GoogleApiException gaex && gaex.HttpStatusCode == System.Net.HttpStatusCode.Unauthorized)
+            {
+                result = true;
+            }
+
+            return result;
+        }
+    }
+}
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+using System;
+using System.Threading.Tasks;
+
+namespace Apache.Arrow.Adbc.Drivers.BigQuery
+{
+    /// <summary>
+    /// Common interface for a token protected resource.
+    /// </summary>
+    internal interface ITokenProtectedResource
+    {
+        /// <summary>
+        /// The function to call when updating the token.
+        /// </summary>
+        Func<Task>? UpdateToken { get; set; }
+
+        /// <summary>
+        /// Determines the token needs to be updated.
+        /// </summary>
+        /// <param name="ex">The exception that occurs.</param>
+        /// <returns>True/False indicating a refresh is needed.</returns>
+        bool TokenRequiresUpdate(Exception ex);
+    }
+}
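
To make the contract of this interface concrete, the following is a small self-contained sketch of the refresh-on-failure pattern it enables. Everything in it is hypothetical: `ITokenProtectedResourceSketch` mirrors the two members above, `FakeResource` stands in for a real client, and `UnauthorizedAccessException` plays the role that an HTTP 401 `GoogleApiException` plays in the driver.

```csharp
using System;
using System.Threading.Tasks;

var resource = new FakeResource();

// The caller supplies the refresh logic; here it simply swaps in a valid token.
resource.UpdateToken = () =>
{
    resource.Token = "fresh";
    return Task.CompletedTask;
};

string result;
try
{
    result = resource.Query();
}
catch (Exception ex) when (resource.UpdateToken != null && resource.TokenRequiresUpdate(ex))
{
    // The failure looks like an expired token: refresh once, then try again.
    await resource.UpdateToken();
    result = resource.Query();
}

Console.WriteLine(result); // prints "ok"

// Local copy of the interface shape shown in the diff.
interface ITokenProtectedResourceSketch
{
    Func<Task>? UpdateToken { get; set; }
    bool TokenRequiresUpdate(Exception ex);
}

// Hypothetical resource whose queries fail until the token has been refreshed.
class FakeResource : ITokenProtectedResourceSketch
{
    public string Token { get; set; } = "expired";

    public Func<Task>? UpdateToken { get; set; }

    public bool TokenRequiresUpdate(Exception ex) => ex is UnauthorizedAccessException;

    public string Query() =>
        Token == "fresh" ? "ok" : throw new UnauthorizedAccessException("token expired");
}
```

The sketch retries only once to keep the flow readable; the commit's `RetryManager` wraps the same idea in a loop with a growing delay.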
Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+using System;
+using System.Threading.Tasks;
+
+namespace Apache.Arrow.Adbc.Drivers.BigQuery
+{
+    /// <summary>
+    /// Class that will retry calling a method with a backoff.
+    /// </summary>
+    internal class RetryManager
+    {
+        public static async Task<T> ExecuteWithRetriesAsync<T>(
+            ITokenProtectedResource tokenProtectedResource,
+            Func<Task<T>> action,
+            int maxRetries = 5,
+            int initialDelayMilliseconds = 200)
+        {
+            if (action == null)
+            {
+                throw new AdbcException("There is no method to retry", AdbcStatusCode.InvalidArgument);
+            }
+
+            int retryCount = 0;
+            int delay = initialDelayMilliseconds;
+
+            while (retryCount < maxRetries)
+            {
+                try
+                {
+                    T result = await action();
+                    return result;
+                }
+                catch (Exception ex)
+                {
+                    retryCount++;
+                    if (retryCount >= maxRetries)
+                    {
+                        if ((tokenProtectedResource?.UpdateToken != null))
+                        {
+                            if (tokenProtectedResource?.TokenRequiresUpdate(ex) == true)
+                            {
+                                throw new AdbcException($"Cannot update access token after {maxRetries} tries", AdbcStatusCode.Unauthenticated, ex);
+                            }
+                        }
+
+                        throw new AdbcException($"Cannot execute {action.Method.Name} after {maxRetries} tries", AdbcStatusCode.UnknownError, ex);
+                    }
+
+                    if ((tokenProtectedResource?.UpdateToken != null))
+                    {
+                        if (tokenProtectedResource.TokenRequiresUpdate(ex) == true)
+                        {
+                            await tokenProtectedResource.UpdateToken();
+                        }
+                    }
+
+                    await Task.Delay(delay);
+                    delay = Math.Min(2 * delay, 5000);
+                }
+            }
+
+            throw new AdbcException($"Could not successfully call {action.Method.Name}", AdbcStatusCode.UnknownError);
+        }
+    }
+}
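
Because the delay in `ExecuteWithRetriesAsync` doubles after each failed attempt (capped at 5000 ms), the total time spent waiting is not simply the retry count times the initial delay. The sketch below reproduces only the delay arithmetic from the loop above to estimate the worst case. `WorstCaseTotalDelayMs` is a hypothetical helper, and the estimate ignores the time taken by the action itself and by `UpdateToken`.

```csharp
using System;

// Hypothetical helper: sums the delays the retry loop would await if every attempt failed.
// A delay is awaited after each failed attempt except the last one, which throws instead.
static int WorstCaseTotalDelayMs(int maxRetries, int initialDelayMs)
{
    int total = 0;
    int delay = initialDelayMs;

    for (int failedAttempt = 1; failedAttempt < maxRetries; failedAttempt++)
    {
        total += delay;
        delay = Math.Min(2 * delay, 5000); // same doubling and cap as the loop in the diff
    }

    return total;
}

// With the method's defaults (5 attempts, 200 ms): 200 + 400 + 800 + 1600 = 3000 ms.
Console.WriteLine(WorstCaseTotalDelayMs(5, 200));
```

With the default arguments this comes to 3000 ms of waiting, which is worth keeping in mind when sizing `adbc.bigquery.maximum_retries` and `adbc.bigquery.retry_delay_ms`, assuming the driver passes those settings straight through to this method (the `BigQueryConnection.cs` diff is not rendered here, so that is an assumption).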
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+using System;
+using System.Threading.Tasks;
+using Google.Apis.Auth.OAuth2;
+using Google.Cloud.BigQuery.Storage.V1;
+
+namespace Apache.Arrow.Adbc.Drivers.BigQuery
+{
+    /// <summary>
+    /// Manages a <see cref="BigQueryReadClient"/> that is protected by a token.
+    /// </summary>
+    internal class TokenProtectedReadClientManger : ITokenProtectedResource
+    {
+        BigQueryReadClient bigQueryReadClient;
+
+        public TokenProtectedReadClientManger(GoogleCredential credential)
+        {
+            UpdateCredential(credential);
+
+            if (bigQueryReadClient == null)
+            {
+                throw new InvalidOperationException("could not create a read client");
+            }
+        }
+
+        public BigQueryReadClient ReadClient => bigQueryReadClient;
+
+        public void UpdateCredential(GoogleCredential? credential)
+        {
+            if (credential == null)
+            {
+                throw new ArgumentNullException(nameof(credential));
+            }
+
+            BigQueryReadClientBuilder readClientBuilder = new BigQueryReadClientBuilder();
+            readClientBuilder.Credential = credential;
+            this.bigQueryReadClient = readClientBuilder.Build();
+        }
+
+        public Func<Task>? UpdateToken { get; set; }
+
+        public bool TokenRequiresUpdate(Exception ex) => BigQueryUtils.TokenRequiresUpdate(ex);
+    }
+}

csharp/src/Drivers/BigQuery/readme.md

Lines changed: 32 additions & 6 deletions
@@ -34,13 +34,17 @@ The ADBC driver passes the configured credentials to BigQuery, but you may need
 
 The following parameters can be used to configure the driver behavior. The parameters are case sensitive.
 
+**adbc.bigquery.access_token**<br>
+&nbsp;&nbsp;&nbsp;&nbsp;Sets the access token to use as the credential. Currently, this is for Microsoft Entra, but this could be used for other OAuth implementations as well.
+
+**adbc.bigquery.audience_uri**<br>
+&nbsp;&nbsp;&nbsp;&nbsp;Sets the audience URI for the authentication token. Currently, this is for Microsoft Entra, but this could be used for other OAuth implementations as well.
+
 **adbc.bigquery.allow_large_results**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;Sets the [AllowLargeResults](https://cloud.google.com/dotnet/docs/reference/Google.Cloud.BigQuery.V2/latest/Google.Cloud.BigQuery.V2.QueryOptions#Google_Cloud_BigQuery_V2_QueryOptions_AllowLargeResults) value of the QueryOptions to `true` if configured; otherwise, the default is `false`.
 
 **adbc.bigquery.auth_type**<br>
-&nbsp;&nbsp;&nbsp;&nbsp;Required. Must be `user` or `service`
-
-https://cloud.google.com/dotnet/docs/reference/Google.Cloud.BigQuery.V2/latest/Google.Cloud.BigQuery.V2.QueryOptions#Google_Cloud_BigQuery_V2_QueryOptions_AllowLargeResults
+&nbsp;&nbsp;&nbsp;&nbsp;Required. Must be `user`, `aad` (for Microsoft Entra) or `service`.
 
 **adbc.bigquery.billing_project_id**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;The [Project ID](https://cloud.google.com/resource-manager/docs/creating-managing-projects) used for accessing billing BigQuery. If not specified, will default to the detected project ID.
@@ -60,6 +64,9 @@ https://cloud.google.com/dotnet/docs/reference/Google.Cloud.BigQuery.V2/latest/G
 **adbc.bigquery.get_query_results_options.timeout**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;Optional. Sets the timeout (in seconds) for the GetQueryResultsOptions value. If not set, defaults to 5 minutes. Similar to a CommandTimeout.
 
+**adbc.bigquery.maximum_retries**<br>
+&nbsp;&nbsp;&nbsp;&nbsp;Optional. The maximum number of retries. Defaults to 5.
+
 **adbc.bigquery.max_fetch_concurrency**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;Optional. Sets the [maxStreamCount](https://cloud.google.com/dotnet/docs/reference/Google.Cloud.BigQuery.Storage.V1/latest/Google.Cloud.BigQuery.Storage.V1.BigQueryReadClient#Google_Cloud_BigQuery_Storage_V1_BigQueryReadClient_CreateReadSession_System_String_Google_Cloud_BigQuery_Storage_V1_ReadSession_System_Int32_Google_Api_Gax_Grpc_CallSettings_) for the CreateReadSession method. If not set, defaults to 1.
 
@@ -75,18 +82,21 @@ https://cloud.google.com/dotnet/docs/reference/Google.Cloud.BigQuery.V2/latest/G
 **adbc.bigquery.include_constraints_getobjects**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;Optional. Some callers do not need the constraint details when they get the table information and can improve the speed of obtaining the results. Setting this value to `"false"` will not include the constraint details. The default value is `"true"`.
 
+**adbc.bigquery.include_public_project_id**<br>
+&nbsp;&nbsp;&nbsp;&nbsp;Include the `bigquery-public-data` project ID with the list of project IDs.
+
 **adbc.bigquery.large_results_destination_table**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;Optional. Sets the [DestinationTable](https://cloud.google.com/dotnet/docs/reference/Google.Cloud.BigQuery.V2/latest/Google.Cloud.BigQuery.V2.QueryOptions#Google_Cloud_BigQuery_V2_QueryOptions_DestinationTable) value of the QueryOptions if configured. Expects the format to be `{projectId}.{datasetId}.{tableId}` to set the corresponding values in the [TableReference](https://github.com/googleapis/google-api-dotnet-client/blob/6c415c73788b848711e47c6dd33c2f93c76faf97/Src/Generated/Google.Apis.Bigquery.v2/Google.Apis.Bigquery.v2.cs#L9348) class.
 
 **adbc.bigquery.project_id**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;The [Project ID](https://cloud.google.com/resource-manager/docs/creating-managing-projects) used for accessing BigQuery. If not specified, will default to detect the projectIds the credentials have access to.
 
-**adbc.bigquery.include_public_project_id**<br>
-&nbsp;&nbsp;&nbsp;&nbsp;Include the `bigquery-public-data` project ID with the list of project IDs.
-
 **adbc.bigquery.refresh_token**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;The refresh token used for when the generated OAuth token expires. Required for `user` authentication.
 
+**adbc.bigquery.retry_delay_ms**<br>
+&nbsp;&nbsp;&nbsp;&nbsp;Optional The delay between retries. Defaults to 200ms. The retries could take up to `adbc.bigquery.maximum_retries` x `adbc.bigquery.retry_delay_ms` to complete.
+
 **adbc.bigquery.scopes**<br>
 &nbsp;&nbsp;&nbsp;&nbsp;Optional. Comma separated list of scopes to include for the credential.
 
@@ -119,3 +129,19 @@ The following table depicts how the BigQuery ADBC driver converts a BigQuery typ
 +A JSON string
 
 See [Arrow Schema Details](https://cloud.google.com/bigquery/docs/reference/storage/#arrow_schema_details) for how BigQuery handles Arrow types.
+
+## Microsoft Entra
+The driver supports authenticating with a [Microsoft Entra](https://learn.microsoft.com/en-us/entra/fundamentals/what-is-entra) ID. For long running operations, the Entra token may timeout if the operation takes longer than the Entra token's lifetime. The driver has the ability to perform token refreshes by subscribing to the `UpdateToken` delegate on the `BigQueryConnection`. In this scenario, the driver will attempt to perform an operation. If that operation fails due to an Unauthorized error, then the token will be refreshed via the `UpdateToken` delegate.
+
+Sample code to refresh the token:
+
+```
+Dictionary<string,string> properties = ...;
+BigQueryConnection connection = new BigQueryConnection(properties);
+connection.UpdateToken = () => Task.Run(() =>
+{
+    connection.SetOption(BigQueryParameters.AccessToken, GetAccessToken());
+});
+```
+
+In the sample above, when a new token is needed, the delegate is invoked and updates the `adbc.bigquery.access_token` parameter on the connection object.
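
The README sample leaves the `properties` dictionary elided. As a hedged illustration, an Entra configuration could be assembled from the parameter keys documented in this diff as shown below. The token and audience values are placeholders, and whether further keys (such as a project ID) are required for `aad` authentication is not visible here. Since this commit also changes `BigQueryParameters` to `internal`, code outside the driver assembly would presumably need the literal key strings rather than the constants.

```csharp
using System;
using System.Collections.Generic;

// Keys are the parameter names documented in the readme; values are placeholders.
var properties = new Dictionary<string, string>
{
    ["adbc.bigquery.auth_type"] = "aad",                      // Microsoft Entra authentication
    ["adbc.bigquery.access_token"] = "<entra-access-token>",  // placeholder, not a real token
    ["adbc.bigquery.audience_uri"] = "<audience-uri>",        // placeholder
    ["adbc.bigquery.maximum_retries"] = "5",                  // documented default
    ["adbc.bigquery.retry_delay_ms"] = "200",                 // documented default
};

// Print only the keys so that a real token never ends up in logs.
foreach (string key in properties.Keys)
{
    Console.WriteLine(key);
}
```

All values are strings because ADBC connection properties are passed as a `Dictionary<string,string>`, as in the README sample.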
