[Feature Request] Add Compute Metadata Detection API for MSI mTLS PoP Capability on IMDS #5969

@gladjohn

The task

We need a way for higher-level SDKs (Azure SDK, Identity.Web) to ask MSAL: "can this VM do MSI mTLS Proof-of-Possession?" — independent of which IMDS endpoint happens to be deployed. Today, "is getPlatformMetadata reachable?" is being used as a proxy for that capability, but once IMDS v2 rolls out broadly to Gen1 VMs (which lack KeyGuard), endpoint-existence is no longer a reliable signal. The actual signal lives in the IMDS compute API under securityProfile.securityType.

The API also needs to be generic enough that future feature gating (OS-aware behavior, VM SKU checks, region detection) can reuse the same call without us shipping a new method each time.

What IMDS actually returns

Live capture from a TrustedLaunch VM, GET /metadata/instance/compute?api-version=2023-07-01 with Metadata: true:

{
  "azEnvironment": "AzurePublicCloud",
  "location": "westus2",
  "osType": "Windows",
  "vmId": "05320aba-...",
  "vmSize": "Standard_D16as_v5",
  "subscriptionId": "...", "resourceGroupName": "...", "resourceId": "...",
  "licenseType": "Windows_Client",
  "securityProfile": {
    "securityType":      "TrustedLaunch",   //  <-- the field that answers "PoP-capable?"
    "secureBootEnabled": "true",
    "virtualTpmEnabled": "true",
    "encryptionAtHost":  "true"
  },
  "storageProfile": { "osDisk": { "osType": "Windows" }, "imageReference": { ... } },
  "additionalCapabilities": { "hibernationEnabled": "true" },
  "tagsList": [ ... ]
}

Two things to note for the API design:

  • IMDS returns booleans as JSON strings ("true"). The public surface should normalize to bool.
  • securityType only appears in api-version 2022-08-01 and later — we should pin to that or newer.
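The string-to-bool normalization could look roughly like the sketch below. ImdsFlagParser and ReadFlag are hypothetical names used only for illustration, not existing MSAL types, and treating a missing or unparseable field as false is an assumption rather than a settled decision.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper, not part of MSAL: reads an IMDS flag that may arrive
// as the JSON string "true"/"false" or as a real JSON boolean.
public static class ImdsFlagParser
{
    public static bool ReadFlag(JsonElement parent, string name)
    {
        if (parent.ValueKind != JsonValueKind.Object ||
            !parent.TryGetProperty(name, out JsonElement value))
        {
            return false; // assumption: a missing field means "not enabled"
        }

        switch (value.ValueKind)
        {
            case JsonValueKind.True:
                return true;
            case JsonValueKind.String:
                // bool.TryParse ignores case and surrounding whitespace.
                return bool.TryParse(value.GetString(), out bool parsed) && parsed;
            default:
                return false; // False, Null, Number, and anything unexpected
        }
    }
}
```

Accepting a real JSON boolean as well as the string form keeps the parser working if IMDS ever changes the encoding.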

Design principles

A few things worth getting right up front, since this is a public API that Azure SDK and Identity.Web will take a hard dependency on:

  • Strongly typed flags, not strings. Callers shouldn't have to write x == "true".
  • Immutable result. A DTO returned from a static factory should have no public setters — caches and concurrent callers should be able to share a single instance safely.
  • Disambiguate failure modes. "Not on Azure" vs "transient IMDS blip" vs "parse failure" are operationally very different. Identity.Web wants to throw on a transient error at startup but log-and-skip when not on Azure. A bare null collapses all three into one.
  • One semantic capability accessor. Without it, every consumer re-implements the TrustedLaunch || ConfidentialVM + vTPM rule and they will drift. Internalize it once as SupportsMtlsPop.
  • Cache. Compute metadata is effectively immutable for a VM. Region detection and other internal callers should reuse the result without re-hitting IMDS (169.254.169.254) each time.
  • Forward-compatible. Adding new IMDS fields shouldn't require a new MSAL release. Surface a curated set of strongly-typed properties, plus an untyped dictionary for everything else.

Proposed shape

Two layers — explicit booleans + enum on top, dictionary underneath for forward-compat.

public sealed class AzureComputeMetadata
{
    // Cloud / location
    public string AzureCloudEnvironment { get; }
    public string Location { get; }
    public string Zone { get; }

    // Identity
    public string VmId { get; }
    public string VmName { get; }
    public string VmSize { get; }
    public string SubscriptionId { get; }
    public string ResourceGroupName { get; }
    public string ResourceId { get; }

    // OS — explicit per @christothes's ask
    public string OsType { get; }       // "Windows" / "Linux"
    public string OsVersion { get; }

    // Security (strongly typed)
    public AzureSecurityProfile SecurityProfile { get; }

    // The primary answer to the issue — rule lives here, not in every caller
    public bool SupportsMtlsPop { get; }

    // Forward-compat: any IMDS field MSAL hasn't promoted yet
    public IReadOnlyDictionary<string, JsonElement> AdditionalProperties { get; }
}

public sealed class AzureSecurityProfile
{
    public AzureVmSecurityType SecurityType { get; }   // enum, not magic string
    public bool IsTrustedLaunch          { get; }
    public bool IsConfidentialVm         { get; }
    public bool IsSecureBootEnabled      { get; }
    public bool IsVirtualTpmEnabled      { get; }
    public bool IsEncryptionAtHostEnabled{ get; }
}

public enum AzureVmSecurityType { Unknown = 0, Standard, TrustedLaunch, ConfidentialVM }

This shape lands both asks at once:

  • @christothes: explicit IsTrustedLaunch, IsSecureBootEnabled, OsType, etc., with no string comparisons.
  • Nidhi — the API stays generic; new IMDS fields land as new strongly-typed properties (non-breaking) and unknown fields are still accessible via AdditionalProperties.
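For illustration, the rule behind SupportsMtlsPop might be internalized along these lines. This is a sketch under one assumption: that "TrustedLaunch || ConfidentialVM + vTPM" means (TrustedLaunch or ConfidentialVM) and vTPM enabled. If the intended rule is different, only this one method changes. MtlsPopRule is a hypothetical name; the enum is repeated so the snippet stands alone.

```csharp
public enum AzureVmSecurityType { Unknown = 0, Standard, TrustedLaunch, ConfidentialVM }

// Hypothetical home for the capability rule, so callers never re-implement it.
public static class MtlsPopRule
{
    // Assumed reading: (TrustedLaunch OR ConfidentialVM) AND vTPM enabled.
    public static bool SupportsMtlsPop(AzureVmSecurityType securityType, bool isVirtualTpmEnabled)
    {
        bool hardenedVm =
            securityType == AzureVmSecurityType.TrustedLaunch ||
            securityType == AzureVmSecurityType.ConfidentialVM;

        return hardenedVm && isVirtualTpmEnabled;
    }
}
```

Under this reading, a Gen1 VM with no securityProfile maps to Unknown and therefore evaluates to false.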

Static API + result envelope

Answers @bgavrilMS's "what happens on non-IMDS?" without overloading null:

public static Task<ComputeMetadataResult> GetComputeMetadataAsync(
    CancellationToken cancellationToken = default,
    bool forceRefresh = false);

public sealed class ComputeMetadataResult
{
    public ComputeMetadataStatus Status { get; }       // Available | NotAvailable | Error
    public AzureComputeMetadata  Metadata { get; }     // null unless Available
    public string                FailureReason { get; }
    public Exception             Exception { get; }
}

public enum ComputeMetadataStatus { Available, NotAvailable, Error }

Mirrors ManagedIdentitySourceResult.ImdsV1FailureReason — same pattern, same vocabulary across the ManagedIdentity surface.
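To show why the three-way status matters, here is a sketch of how a consumer might branch on it. StartupAction and PopStartupPolicy are hypothetical, and the policy itself (fail on Error only when PoP is required, skip quietly when not on Azure) is one possible reading of the Identity.Web requirement above, not an agreed behavior. The status enum is repeated so the snippet stands alone.

```csharp
public enum ComputeMetadataStatus { Available, NotAvailable, Error }

// Hypothetical outcome type for a consumer's startup decision.
public enum StartupAction { UsePop, UseBearer, Fail }

public static class PopStartupPolicy
{
    public static StartupAction Decide(
        ComputeMetadataStatus status, bool supportsMtlsPop, bool popRequired)
    {
        switch (status)
        {
            case ComputeMetadataStatus.Available when supportsMtlsPop:
                return StartupAction.UsePop;

            case ComputeMetadataStatus.Available:
                // On Azure, but the VM cannot do PoP (for example, Gen1).
                return popRequired ? StartupAction.Fail : StartupAction.UseBearer;

            case ComputeMetadataStatus.NotAvailable:
                // Not on Azure: log and skip rather than throw.
                return StartupAction.UseBearer;

            default:
                // Error: a transient IMDS failure should surface when PoP is required.
                return popRequired ? StartupAction.Fail : StartupAction.UseBearer;
        }
    }
}
```

A bare null return could not distinguish the NotAvailable and Error branches, which is the case for the envelope.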

Caching policy

  • Available → cache for process lifetime (compute metadata is effectively immutable for a VM).
  • Error → cache ~30s to avoid thundering-herd, then retry.
  • NotAvailable → cache for process lifetime (we're not on Azure; that's not changing).

Single in-flight task so concurrent callers share one IMDS hit.
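A minimal sketch of that policy, assuming the fetch delegate maps every failure to an Error result instead of throwing. SingleFlightCache and its constructor parameters are hypothetical, not existing MSAL types. The clock is injected only so the error TTL can be exercised in tests.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical process-wide cache: one in-flight fetch shared by all callers,
// successful results kept indefinitely, error results retried after a TTL.
public sealed class SingleFlightCache<T>
{
    private sealed class Entry
    {
        public Task<T> Task;
        public DateTimeOffset Expiry = DateTimeOffset.MaxValue; // never, unless an error is stamped
    }

    private readonly Func<Task<T>> _fetch;
    private readonly Func<T, bool> _isError;
    private readonly TimeSpan _errorTtl;
    private readonly Func<DateTimeOffset> _now;
    private readonly object _gate = new object();
    private Entry _entry;

    public SingleFlightCache(Func<Task<T>> fetch, Func<T, bool> isError,
                             TimeSpan errorTtl, Func<DateTimeOffset> now)
    {
        _fetch = fetch;
        _isError = isError;
        _errorTtl = errorTtl;
        _now = now;
    }

    public Task<T> GetAsync(bool forceRefresh = false)
    {
        lock (_gate)
        {
            bool stale = _entry != null && _now() >= _entry.Expiry;
            if (_entry == null || forceRefresh || stale)
            {
                var entry = new Entry();
                entry.Task = RunAsync(entry); // concurrent callers await this same task
                _entry = entry;
            }
            return _entry.Task;
        }
    }

    private async Task<T> RunAsync(Entry entry)
    {
        // Assumption: _fetch never throws; failures come back as an error result.
        T result = await _fetch().ConfigureAwait(false);
        if (_isError(result))
        {
            lock (_gate) { entry.Expiry = _now() + _errorTtl; }
        }
        return result;
    }
}
```

Each fetch stamps its own entry, so a slow fetch that was superseded by forceRefresh cannot expire the newer result.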

Worked examples

// Azure SDK — auto-switch between Bearer and PoP
var r = await ManagedIdentityApplication.GetComputeMetadataAsync(ct);
if (r.Status == ComputeMetadataStatus.Available && r.Metadata.SupportsMtlsPop)
    builder.WithMtlsProofOfPossession();
// else fall back to Bearer

// Identity.Web — fail fast at startup with an actionable error
if (options.UseMtlsPop && r.Metadata?.SupportsMtlsPop != true)
    throw new InvalidOperationException(
        $"mTLS PoP requested but VM does not support it. " +
        $"Status={r.Status}, SecurityType={r.Metadata?.SecurityProfile.SecurityType}.");

// Direct property style
var result = await ManagedIdentityApplication.GetComputeMetadataAsync();
var m = result.Metadata;               // null unless Status == Available
if (m != null)
{
    bool isTrustedLaunch = m.SecurityProfile.IsTrustedLaunch;
    bool isSecureBoot    = m.SecurityProfile.IsSecureBootEnabled;
    string osType        = m.OsType;
}

Relationship to GetManagedIdentitySource()

Keeping them separate. They answer orthogonal questions:

  • GetSource: "where will my MI token come from?" (IMDS / AppService / Arc / …)
  • GetComputeMetadata: "what can the hardware actually do?" (KeyGuard / vTPM / TrustedLaunch)

A Gen1 IMDS VM proves the point: Source = Imds but SupportsMtlsPop = false. Both signals are independently needed. We'll cross-link the two in XML docs so users land on the right one.

Implementation outline

  1. Public types: AzureComputeMetadata, AzureSecurityProfile, AzureVmSecurityType, ComputeMetadataResult, ComputeMetadataStatus — all immutable, internal ctors.
  2. Parser normalizes IMDS string "true"/"false" to bool and maps securityType to the enum (Unknown for missing / unrecognized).
  3. Static API returns the ComputeMetadataResult envelope.
  4. Process-wide cache with the policy above; single in-flight task.
  5. IMDS api-version pinned to one that supports securityType (2022-08-01+).
  6. PublicAPI.Unshipped.txt across all six TFMs.
  7. Tests: TrustedLaunch / ConfidentialVM / Gen1-no-profile / IMDS-unreachable / IMDS-500 / cache-hit / forceRefresh.
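For step 2, the securityType mapping could be sketched as below. SecurityTypeParser is a hypothetical name, and the assumption that IMDS emits the literal "Standard" for non-hardened VMs has not been verified here. An explicit comparison is used instead of Enum.TryParse because the latter would also accept numeric strings such as "2". The enum is repeated so the snippet stands alone.

```csharp
using System;

public enum AzureVmSecurityType { Unknown = 0, Standard, TrustedLaunch, ConfidentialVM }

// Hypothetical parser: maps the raw IMDS securityType string to the enum,
// falling back to Unknown for missing or unrecognized values.
public static class SecurityTypeParser
{
    public static AzureVmSecurityType Parse(string value)
    {
        string v = value?.Trim();
        if (string.IsNullOrEmpty(v))
        {
            return AzureVmSecurityType.Unknown; // Gen1 / no securityProfile
        }

        if (v.Equals("TrustedLaunch", StringComparison.OrdinalIgnoreCase))
            return AzureVmSecurityType.TrustedLaunch;
        if (v.Equals("ConfidentialVM", StringComparison.OrdinalIgnoreCase))
            return AzureVmSecurityType.ConfidentialVM;
        if (v.Equals("Standard", StringComparison.OrdinalIgnoreCase)) // assumed literal
            return AzureVmSecurityType.Standard;

        return AzureVmSecurityType.Unknown; // future values stay non-breaking
    }
}
```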
