Skip to content

chore: add support to use ManagedIdentityCredential when AZURE_CLIENT_ID env is specified#146

Open
mittachaitu wants to merge 2 commits into
run-ai:masterfrom
mittachaitu:sai/bump_azureclient_ver
Open

chore: add support to use ManagedIdentityCredential when AZURE_CLIENT_ID env is specified#146
mittachaitu wants to merge 2 commits into
run-ai:masterfrom
mittachaitu:sai/bump_azureclient_ver

Conversation

@mittachaitu
Copy link
Copy Markdown

@mittachaitu mittachaitu commented May 14, 2026

What This PR Does:
This PR add support to authenticate azure-blob-storage through ManagedIdentityCredential when AZURE_CLIENT_ID env is configured and DefaultAzureCredential is failed.

Summary by CodeRabbit

  • New Features

    • Added Managed Identity support for Azure Storage authentication with automatic fallback to a user-assigned identity when configured.
    • New configuration option to specify a client ID for user-assigned managed identities via environment variable or settings.
  • Bug Fixes

    • Credential verification updated to include client ID for more accurate authentication validation.

Review Change Stack

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR adds ManagedIdentityCredential as a fallback when DefaultAzureCredential fails, triggered only when AZURE_CLIENT_ID is set, and fixes verify_credentials to include client_id in the identity comparison.

  • Replaces the former static DefaultAzureCredential with a process-level singleton (GetSharedDefaultAzureCredential) that calls GetToken() once as a fail-fast probe with a 5-second deadline, then falls back to ManagedIdentityCredential(_client_id) on failure.
  • Propagates client_id through ClientConfiguration, AzureClient member, API parameter parsing, and verify_credentials so that clients with different managed identities are no longer considered equivalent.

Confidence Score: 3/5

The core auth fallback logic is architecturally sound, but the call_once failure-retry semantics cause a serial 5-second timeout chain for every concurrent client when DefaultAzureCredential consistently fails.

When AZURE_CLIENT_ID is set and DefaultAzureCredential is expected to fail, N concurrent workers serialize through N separate 5-second GetToken() timeouts before each falls back to ManagedIdentityCredential, directly undermining startup reliability.

cpp/azure/client/client.cc — specifically GetSharedDefaultAzureCredential and the call_once failure path

Important Files Changed

Filename Overview
cpp/azure/client/client.cc Introduces GetSharedDefaultAzureCredential() singleton with call_once + GetToken() fail-fast probe and a try/catch fallback to ManagedIdentityCredential. The call_once failure-retry semantics cause serial 5-second timeout chains when DefaultAzureCredential consistently fails (the primary use case). Missing <chrono> direct include.
cpp/azure/client/client.h Adds _client_id optional string member to AzureClient; straightforward header change with no issues.
cpp/azure/client_configuration/client_configuration.cc Reads AZURE_CLIENT_ID env var into client_id for managed-identity fallback; logic is correct but the debug log message is misleading.
cpp/azure/client_configuration/client_configuration.h Adds optional client_id field to ClientConfiguration struct; clean change with no issues.

Reviews (3): Last reviewed commit: "chore: address bot comments" | Re-trigger Greptile

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 68708a8d-3665-45ca-a369-3516867ea6e4

📥 Commits

Reviewing files that changed from the base of the PR and between 87eb066 and c3d83ec.

📒 Files selected for processing (1)
  • cpp/azure/client/client.cc
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp/azure/client/client.cc

Walkthrough

Azure client adds an optional client_id configuration, reads AZURE_CLIENT_ID from the environment, accepts client_id overrides, and changes Blob initialization to try DefaultAzureCredential then fall back to ManagedIdentityCredential(client_id) when the default fails.

Changes

Managed Identity authentication support

Layer / File(s) Summary
Configuration schema and environment loading
cpp/azure/client_configuration/client_configuration.h, cpp/azure/client_configuration/client_configuration.cc
ClientConfiguration adds an optional client_id field and reads AZURE_CLIENT_ID from the environment to enable user-assigned managed identity targeting.
Client: fields and parameter parsing
cpp/azure/client/client.h, cpp/azure/client/client.cc
AzureClient adds a private _client_id, initializes it from configuration, and accepts client_id overrides from config.initial_params.
Credential selection and BlobServiceClient initialization
cpp/azure/client/client.cc
BlobServiceClient initialization for account_name-only auth now tries DefaultAzureCredential (forcing token acquisition via a shared helper) and falls back to ManagedIdentityCredential constructed with the configured client_id on failure. Managed Identity header is imported.
verify_credentials parsing and equality check
cpp/azure/client/client.cc
verify_credentials now parses client_id from params into a temp field and includes client_id in the final equality comparison with _client_id.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A managed identity springs to light,
When defaults falter, I hold tight.
client_id hums the route to send,
Tokens found, the blob calls mend. ✨🔑

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding ManagedIdentityCredential support when AZURE_CLIENT_ID is specified, which aligns with the core functionality added across all modified files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread cpp/azure/client/client.cc
Comment thread cpp/azure/client_configuration/client_configuration.cc Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/azure/client_configuration/client_configuration.cc`:
- Around line 40-43: The code currently logs the raw AZURE_CLIENT_ID value
(variable cid) before assigning it to client_id; update the logging in the
client configuration and client code paths (where cid/client_id are used and
where LOG(DEBUG) prints the id) to avoid emitting the full identifier — either
log a generic message like "Using AZURE_CLIENT_ID for managed identity
(present)" or log a masked form (e.g., only last 4 chars) instead; locate usages
around the getenv<std::string>("AZURE_CLIENT_ID", "") call and the LOG(DEBUG)
statements in client_configuration.cc and client.cc and replace the direct cid
print with the non-sensitive alternative.

In `@cpp/azure/client/client.cc`:
- Around line 66-69: verify_credentials currently doesn't compare the newly
parsed _client_id, so an override can be ignored and an old client reused;
update the verify_credentials function to include _client_id in its matching
logic (alongside other credential fields it checks) so that any change to
client_id forces a mismatch and triggers client refresh; locate
verify_credentials and add comparison against the _client_id member (and ensure
any cache/key used to identify the active client incorporates client_id as
well).
- Around line 113-131: The current try/catch only surrounds BlobServiceClient
construction but Azure auth is deferred, so replace the approach by attempting
to acquire a token from DefaultAzureCredential before committing to the
credential: instantiate DefaultAzureCredential, call its token acquisition
method (e.g., GetToken / AcquireToken equivalent) for the storage scope and if
that call succeeds use that credential to construct BlobServiceClient; if the
token acquisition throws or fails and _client_id.has_value() create a
ManagedIdentityCredential, attempt token acquisition with it, and only then
construct BlobServiceClient; ensure exceptions are logged with e.what() and
rethrow if both token acquisitions fail; target symbols: DefaultAzureCredential,
ManagedIdentityCredential, BlobServiceClient, and the credential
GetToken/AcquireToken call.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 36de2ac3-453b-4286-a75d-71807b65fab7

📥 Commits

Reviewing files that changed from the base of the PR and between f73a19a and dca2f1b.

📒 Files selected for processing (4)
  • cpp/azure/client/client.cc
  • cpp/azure/client/client.h
  • cpp/azure/client_configuration/client_configuration.cc
  • cpp/azure/client_configuration/client_configuration.h

Comment thread cpp/azure/client_configuration/client_configuration.cc
Comment thread cpp/azure/client/client.cc
Comment thread cpp/azure/client/client.cc Outdated
…configured

This commit add supports for ManagedIdentityCredential
when AZURE_CLIENT_ID is configured

Signed-off-by: Mitta Sai Chaithanya <mittas@microsoft.com>
@mittachaitu mittachaitu force-pushed the sai/bump_azureclient_ver branch from dca2f1b to 87eb066 Compare May 14, 2026 17:43
Comment thread cpp/azure/client/client.cc Outdated
Comment on lines +117 to +123
auto default_cred = std::make_shared<Azure::Identity::DefaultAzureCredential>();
Azure::Core::Credentials::TokenRequestContext token_ctx;
token_ctx.Scopes = {"https://storage.azure.com/.default"};
Azure::Core::Context ctx;
(void)default_cred->GetToken(token_ctx, ctx); // force auth now
credential = default_cred;
LOG(DEBUG) << "Azure client initialized with DefaultAzureCredential for account: " << _account_name.value();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Token caching regression: removing static exposes IMDS throttling risk

The original code used a static DefaultAzureCredential instance specifically to share token caching across all clients and avoid overwhelming IMDS — the comment even warned about "fatal throttling errors." This PR removes static (correctly, to allow the fallback to work), but does not replace it with any caching mechanism. Combined with the new GetToken() call on line 121, every AzureClient construction now independently acquires a fresh token from IMDS/Azure AD. Under load (e.g., many concurrent model-streamer workers, each constructing its own AzureClient), this will hit IMDS with one token request per client, exactly the scenario the original author explicitly guarded against.

Consider hoisting the DefaultAzureCredential to a process-level singleton (e.g., a static local in a free function or a static class member) that is separate from the fallback path. When GetToken() succeeds on that shared instance, use it; when it throws, fall back to a per-client ManagedIdentityCredential.

Comment thread cpp/azure/client/client.cc Outdated
Comment on lines +118 to +121
Azure::Core::Credentials::TokenRequestContext token_ctx;
token_ctx.Scopes = {"https://storage.azure.com/.default"};
Azure::Core::Context ctx;
(void)default_cred->GetToken(token_ctx, ctx); // force auth now
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unbounded block in constructor due to missing cancellation deadline

DefaultAzureCredential probes every credential source in its chain — EnvironmentCredential, WorkloadIdentityCredential, ManagedIdentityCredential (via IMDS), AzureCliCredential, etc. — sequentially. The Azure::Core::Context ctx here is default-constructed with no deadline, so if the host is outside Azure or IMDS is throttled/unreachable, the constructor blocks for the full SDK timeout per credential type before finally throwing. In practice this can be tens of seconds, stalling whichever thread is constructing the client.

Pass a context with a deadline so the probe fails fast, e.g. Azure::Core::Context::ApplicationContext.WithDeadline(std::chrono::system_clock::now() + std::chrono::seconds(5)).

Signed-off-by: Mitta Sai Chaithanya <mittas@microsoft.com>
Comment on lines +35 to +49
std::shared_ptr<Azure::Identity::DefaultAzureCredential> GetSharedDefaultAzureCredential() {
static std::shared_ptr<Azure::Identity::DefaultAzureCredential> shared_credential;
static std::once_flag init_flag;
std::call_once(init_flag, []() {
auto cred = std::make_shared<Azure::Identity::DefaultAzureCredential>();
// Validate credential chain early via GetToken() as fail-fast mechanism.
Azure::Core::Credentials::TokenRequestContext token_ctx;
token_ctx.Scopes = {"https://storage.azure.com/.default"};
Azure::Core::Context ctx = Azure::Core::Context::ApplicationContext.WithDeadline(
std::chrono::system_clock::now() + std::chrono::seconds(5));
(void)cred->GetToken(token_ctx, ctx); // throws if credential chain fails
shared_credential = cred; // only assign if GetToken() succeeds
});
return shared_credential;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Serial retry storm when DefaultAzureCredential consistently fails

Per the C++ standard, when the callable passed to std::call_once exits via exception, the once_flag is not marked "done" and the next call will retry the callable. In the exact scenario this PR targets — AZURE_CLIENT_ID set, DefaultAzureCredential expected to fail — every concurrent AzureClient construction will serially queue through call_once and each wait up to 5 seconds before throwing. With N concurrent workers this produces O(N × 5 s) startup latency, which can be 40+ seconds for an 8-worker deployment.

A simple fix is to track the failure state in a static std::atomic<bool> so that after the first failed probe the function throws immediately without retrying the 5-second GetToken() timeout each time.

@kyleknap
Copy link
Copy Markdown
Contributor

@mittachaitu thanks for the PR. It is unfortunate that it looks like the default credential C++ SDK does not respect the AZURE_CLIENT_ID environment variable but the default credential in Python SDK however does.

In general, it would be interesting to see if we could just add support for it in the identity C++ SDK itself especially given this is a parity gap with the Python SDK. I'd be happy to help review that with Azure SDK with you if you have the bandwidth to send a PR up.

In terms of unblocking us now, I'm a little hesitant to try to try default credentials and then assume if it fails and AZURE_CLIENT_ID is set, then it is a managed identity credential. Mainly, there are some extra hops involved and there are other credential providers that use AZURE_CLIENT_ID so resolving to the managed identity provider might not be a safe assumption to make.

I think for now the safest thing for us to do until the upstream C++ SDK adds support, would be to define our own credential chain that forks the default azure credential chain with the added logic that it will inject a client_id for the managed identity provider based on the AZURE_CLIENT_ID environment variable. The additional benefit in this approach is if support does land in the upstream C++ SDK, we can just revert this back to what we had before.

Let me know what you think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants