Skip to content

perf: Add caching for Hasher hash algorithms and Azure KeyVault clients#1685

Merged
thomhurst merged 3 commits into
mainfrom
fix/caching-improvements
Dec 30, 2025
Merged

perf: Add caching for Hasher hash algorithms and Azure KeyVault clients#1685
thomhurst merged 3 commits into
mainfrom
fix/caching-improvements

Conversation

@thomhurst

@thomhurst thomhurst commented Dec 30, 2025

Copy link
Copy Markdown
Owner

Summary

  • Uses static HashData() methods (.NET 5+) in Hasher.cs for thread-safe hashing without disposal concerns
  • Adds ConcurrentDictionary caching for Azure KeyVault SecretClient, CertificateClient, and KeyClient instances

Note: Azure clients are cached by vault URI only. Since AzureKeyVault is registered as Scoped, the same credential is typically used throughout a pipeline execution scope. This avoids cache misses from TokenCredential's lack of value equality.

Fixes #1596, #1594

Test plan

  • Build passes
  • All tests pass

🤖 Generated with Claude Code

Fixes #1594 and #1596

- Use ThreadLocal<T> for hash algorithm instances in Hasher class
  since HashAlgorithm is not thread-safe but creating new instances
  per call is wasteful
- Cache Azure KeyVault clients (SecretClient, CertificateClient,
  KeyClient) in ConcurrentDictionary by vault URI since Azure SDK
  clients are thread-safe and designed for reuse

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@thomhurst

Copy link
Copy Markdown
Owner Author

Summary

Adds ThreadLocal caching for hash algorithm instances and ConcurrentDictionary caching for Azure KeyVault clients to improve performance.

Critical Issues

1. Resource Leak in Hasher.cs

The ThreadLocal instances and their underlying HashAlgorithm objects are never disposed, causing a resource leak.

Problem:

  • ThreadLocal, ThreadLocal, etc. are static fields that are never disposed
  • The HashAlgorithm instances they contain are never disposed
  • HashAlgorithm implements IDisposable and should be properly disposed

Better Fix: Use SHA256.HashData() static methods available in .NET 5+ which don't require disposal

2. Incorrect Cache Key in AzureKeyVault.cs

The cache key only includes the vault URI but ignores the TokenCredential parameter. This causes incorrect behavior when the same vault is accessed with different credentials.

Problem: If GetSecretClient(uri, credential1) is called, then GetSecretClient(uri, credential2) is called, the second call will return the client created with credential1, not credential2.

Fix: Include the credential in the cache key using a composite key: (vaultUri.ToString(), tokenCredential)

Verdict

⚠️ REQUEST CHANGES - Critical issues found

- Hasher: Use static HashData methods (.NET 5+) instead of ThreadLocal
  to avoid resource leaks from undisposed HashAlgorithm instances
- AzureKeyVault: Use composite key (vaultUri, tokenCredential) to ensure
  different credentials for the same vault return different clients

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@thomhurst

Copy link
Copy Markdown
Owner Author

Summary

This PR improves performance by eliminating repeated object creation in two areas: modernizing hash computation to use static HashData methods and caching Azure KeyVault SDK clients.

Critical Issues

1. TokenCredential reference equality breaks cache effectiveness (src/ModularPipelines.Azure/AzureKeyVault.cs:18-20)

The composite key (vaultUri.ToString(), tokenCredential) uses TokenCredential as part of the dictionary key. Since TokenCredential uses reference equality by default, two logically identical credentials (e.g., two DefaultAzureCredential instances) will be treated as different keys, causing cache misses and duplicate client instances.

Example scenario:

var cred1 = new DefaultAzureCredential();
var cred2 = new DefaultAzureCredential();
var client1 = keyVault.GetSecretClient(uri, cred1); // Creates new client
var client2 = keyVault.GetSecretClient(uri, cred2); // Creates ANOTHER new client (cache miss)

Impact: The caching won't work as intended when different TokenCredential instances are passed, even if they're functionally equivalent.

Suggested fix: Either:

  • Key only by vaultUri if the same credential is typically used throughout a scope
  • Document that callers should reuse TokenCredential instances
  • Use a different caching strategy

Suggestions

1. PR description mentions ThreadLocal but code uses HashData

The PR description states "Adds ThreadLocal<T> caching for hash algorithm instances" but the actual implementation uses static HashData methods instead. While the HashData approach is actually superior (modern .NET 5+ API, thread-safe, no disposal needed), the description should match the implementation.

Suggested fix: Update PR description to reflect the actual implementation.

2. Consider documenting the scoped lifetime limitation (src/ModularPipelines.Azure/AzureKeyVault.cs:9-18)

Since AzureKeyVault is registered as Scoped (src/ModularPipelines.Azure/Extensions/AzureExtensions.cs:46), the caching only benefits multiple calls within the same scope. This is different from a Singleton pattern where caching would persist across the entire application lifetime.

This is actually fine for resource management (clients are cleaned up when scope ends), but the limited cache lifetime reduces the performance benefit.

Suggested approach: Either accept this trade-off or add a code comment explaining the scoped caching behavior.

Verdict

⚠️ REQUEST CHANGES - Critical issue with TokenCredential reference equality must be addressed.

TokenCredential lacks value equality, so including it in the cache key
causes cache misses for logically identical credentials. Since this
service is Scoped, the same credential is typically used throughout
a pipeline execution. Added documentation explaining this behavior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@thomhurst

Copy link
Copy Markdown
Owner Author

Summary

Adds client caching to AzureKeyVault and modernizes Hasher implementation to use static HashData methods.

Critical Issues

None found ✅

Suggestions

AzureKeyVault.cs - Credential Change Behavior

The caching implementation is correct for typical pipeline scenarios, but the documentation could be more explicit about one edge case:

Current behavior: Once a vault URI is cached, subsequent calls with different TokenCredentials for the same vault will reuse the first credential.

Example scenario:

// First call - credential1 is cached
var client1 = azureKeyVault.GetSecretClient(vaultUri, credential1);

// Second call - still uses credential1, not credential2!
var client2 = azureKeyVault.GetSecretClient(vaultUri, credential2);

This is intentional and safe because:

  1. AzureKeyVault is registered as Scoped (per pipeline execution)
  2. Most pipelines use a single credential throughout
  3. The docs explain this in the remarks section

Optional improvement: Consider adding a log warning if a different credential instance is passed for an already-cached vault URI. This would make the behavior more discoverable during debugging without adding runtime overhead.

Example:

return _secretClients.GetOrAdd(key, _ => 
{
    _logger?.LogDebug("Creating new SecretClient for vault {VaultUri}", vaultUri);
    return new SecretClient(vaultUri, tokenCredential);
});

This is purely optional - the current implementation is correct and well-documented.

Hasher.cs - Looks Perfect

The migration to static HashData() methods is exactly right:

  • ✅ No resource leaks (no disposal needed)
  • ✅ Thread-safe
  • ✅ More efficient (no object allocation)
  • ✅ Modern .NET best practice
  • ✅ Clean removal of unnecessary helper method

Verdict

APPROVE - No critical issues

Both changes are solid improvements. The caching pattern in AzureKeyVault is appropriate for the scoped lifetime, and the Hasher modernization follows .NET best practices perfectly.

@thomhurst thomhurst enabled auto-merge (squash) December 30, 2025 16:56
@thomhurst thomhurst disabled auto-merge December 30, 2025 17:08
@thomhurst thomhurst merged commit 47b7c10 into main Dec 30, 2025
12 checks passed
@thomhurst thomhurst deleted the fix/caching-improvements branch December 30, 2025 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Code smell: Hasher class creates new hash algorithm instances without static factory caching

1 participant