Fix HttpClient object leak described in #5948 (no socket exhaustion)#5949
Fix HttpClient object leak described in #5948 (no socket exhaustion)#5949
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes an unnecessary HttpClient allocation pattern in SimpleHttpClientFactory caused by eager evaluation of ConcurrentDictionary.GetOrAdd value arguments, and adds a regression test to ensure repeated lookups do not create throwaway clients.
Changes:
- Update
ConcurrentDictionary.GetOrAddusage to use value-factory delegates (lazy creation) for both non-mTLS and mTLS clients. - Add unit test coverage to assert
CreateHttpClientis only invoked once for repeated default-key lookups. - Add test instrumentation (
HttpClientCreationCount+ reset helper) to support the new regression test.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/Microsoft.Identity.Test.Unit/CoreTests/HttpTests/HttpClientFactoryTests.cs | Adds regression test validating no repeated HttpClient creation for repeated GetHttpClient() calls. |
| src/client/Microsoft.Identity.Client/PlatformsCommon/Shared/SimpleHttpClientFactory.cs | Fixes eager GetOrAdd evaluation by switching to delegates; adds test-only instrumentation and reset helper. |
| internal static void ResetForTest() | ||
| { | ||
| s_httpClientPool.Clear(); | ||
| HttpClientCreationCount = 0; | ||
| } |
There was a problem hiding this comment.
ResetForTest clears the shared HttpClient pool without disposing the existing HttpClient instances. Since this helper is intended for tests starting from a clean slate, consider disposing the pooled clients before clearing to avoid leaving handlers/resources alive for the remainder of the test run.
There was a problem hiding this comment.
@copilot apply changes based on this feedback
There was a problem hiding this comment.
Applied in commit 30bec75. ResetStaticStateForTest now iterates over the pooled values and calls Dispose() on each HttpClient before clearing the dictionary.
|
Nit: PR title references #5984 but should be #5948. |
…entFactoryTests.cs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
41e5b4c to
e5e0676
Compare
…-only property Agent-Logs-Url: https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/sessions/a79a3efc-2df8-40a0-af39-3ec2083e19ac Co-authored-by: bgavrilMS <12273384+bgavrilMS@users.noreply.github.com>
…setForTest Agent-Logs-Url: https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/sessions/0be221f9-e45c-4835-b968-585fb1aa0082 Co-authored-by: bgavrilMS <12273384+bgavrilMS@users.noreply.github.com>
|
|
||
| // referenced in unit tests | ||
| internal static void ResetStaticStateForTest() | ||
| { | ||
| foreach (var client in s_httpClientPool.Values) | ||
| { | ||
| client.Dispose(); |
There was a problem hiding this comment.
These members are explicitly test-only hooks but live in the production assembly/class. To reduce long-term maintenance risk and avoid accidental non-test usage, consider isolating them behind conditional compilation (e.g., only included in test builds) or moving them into a test-only partial class/file compiled only for tests.
| // referenced in unit tests | |
| internal static void ResetStaticStateForTest() | |
| { | |
| foreach (var client in s_httpClientPool.Values) | |
| { | |
| client.Dispose(); | |
| #if DEBUG | |
| // referenced in unit tests | |
| internal static void ResetForTest() | |
| { | |
| s_httpClientPool.Clear(); | |
| s_httpClientCreationCount = 0; | |
| } | |
| #endif |
|
The Service Fabric overload still has the unfixed eager GetOrAdd pattern: // GetHttpClient(Func<...> validateServerCert) — line ~111
return s_httpClientPool.GetOrAdd(key, new HttpClient(handler));But there's a deeper problem here: the key is handler.GetHashCode().ToString(). Since GetHashCode() on a reference type returns the object's identity hash, and a new HttpClientHandler is created on every call, the key is always unique — the pool never actually hits. Every call adds a new HttpClient to the pool permanently (until the 1000-entry eviction), which is worse than the bug being fixed. The comment above it even acknowledges this: "The server cert rotates so we need a new HttpClient for each call." If that's the intent, the GetOrAdd should be removed entirely and the method should just return new HttpClient(handler) directly, skipping the pool. |
| } | ||
|
|
||
| } | ||
| } |
There was a problem hiding this comment.
The regression test only covers GetHttpClient() (the "non_mtls" key). GetHttpClient(X509Certificate2) received the same factory-delegate fix but has no corresponding test. Consider adding a parallel test calling GetHttpClient(cert) multiple times with the same cert and asserting the HttpClientCreationCount only increments once.
There was a problem hiding this comment.
Added in commit f212029. TestGetHttpClientWithMtlsCert_DoesNotLeakHttpClients calls GetHttpClient(cert) three times with the same certificate and asserts HttpClientCreationCount == 1. Also added Interlocked.Increment to CreateMtlsHttpClient so the counter tracks mTLS client creation too.
…mpleHttpClientFactory.cs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| client.Dispose(); | ||
| } | ||
| s_httpClientPool.Clear(); | ||
| s_httpClientCreationCount = 0; |
There was a problem hiding this comment.
ResetStaticStateForTest contains leftover/duplicated code after the lock block (client.Dispose();, extra braces, and an extra s_httpClientPool.Clear() / s_httpClientCreationCount = 0). As written it will not compile (client is undefined) and the braces are unbalanced. Please remove the stray lines and keep the reset logic inside the lock (disposing pooled clients, clearing the pool, and resetting the counter).
| client.Dispose(); | |
| } | |
| s_httpClientPool.Clear(); | |
| s_httpClientCreationCount = 0; |
| foreach (KeyValuePair<string, HttpClient> pooledClient in s_httpClientPool) | ||
| { | ||
| pooledClient.Value?.Dispose(); | ||
| } | ||
|
|
||
| s_httpClientPool.Clear(); | ||
| Interlocked.Exchange(ref s_httpClientCreationCount, 0); | ||
| } | ||
| client.Dispose(); | ||
| } | ||
| s_httpClientPool.Clear(); | ||
| s_httpClientCreationCount = 0; |
There was a problem hiding this comment.
ResetStaticStateForTest uses KeyValuePair<string, HttpClient> but this file doesn’t import System.Collections.Generic, and the main library project doesn’t enable implicit usings. This should be a compile error; either add the missing using or iterate s_httpClientPool.Values (or use var) to avoid the extra type reference.
| foreach (KeyValuePair<string, HttpClient> pooledClient in s_httpClientPool) | |
| { | |
| pooledClient.Value?.Dispose(); | |
| } | |
| s_httpClientPool.Clear(); | |
| Interlocked.Exchange(ref s_httpClientCreationCount, 0); | |
| } | |
| client.Dispose(); | |
| } | |
| s_httpClientPool.Clear(); | |
| s_httpClientCreationCount = 0; | |
| foreach (HttpClient pooledClient in s_httpClientPool.Values) | |
| { | |
| pooledClient?.Dispose(); | |
| } | |
| s_httpClientPool.Clear(); | |
| Interlocked.Exchange(ref s_httpClientCreationCount, 0); | |
| } |
…LS cert test; fix ResetStaticStateForTest Agent-Logs-Url: https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/sessions/680737a8-8460-40bd-8a4a-9e32d96560f0 Co-authored-by: bgavrilMS <12273384+bgavrilMS@users.noreply.github.com>
... Fixed in commit |
…truction under contention Agent-Logs-Url: https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/sessions/4cfee597-417f-4608-acad-96814445f7a1 Co-authored-by: bgavrilMS <12273384+bgavrilMS@users.noreply.github.com>
Head branch was pushed to by a user without write access
| @@ -101,14 +130,13 @@ | |||
| #if NET471_OR_GREATER || NETSTANDARD || NET | |||
| var handler = new HttpClientHandler | |||
| { | |||
| ServerCertificateCustomValidationCallback = (message, cert, chain, sslPolicyErrors) => | |||
| { | |||
| return validateServerCert(message, cert, chain, sslPolicyErrors); | |||
| } | |||
| }; | |||
|
|
|||
| string key = handler.GetHashCode().ToString(); | |||
| return s_httpClientPool.GetOrAdd(key, new HttpClient(handler)); | |||
| return new HttpClient(handler); | |||
There was a problem hiding this comment.
GetHttpClient(validateServerCert) now always constructs a new HttpClient per call. This conflicts with IMsalSFHttpClientFactory guidance (“Do not create a new HttpClient for each call … this leads to socket exhaustion”) and can cause connection/socket exhaustion in Service Fabric MI scenarios. Consider caching/reusing an HttpClient per stable callback key (similar to the non-mTLS/mTLS pools) so repeated calls with the same delegate reuse the same client/handler.
There was a problem hiding this comment.
The expensive stuff is not the creation of a new HttpClient, it's the creation of the handler? And that code hasn't changed.
Service Fabric scenario should not be HTTP heavy. What is your suggestion?
There was a problem hiding this comment.
Fixed in commit af9d6ca. GetHttpClient(validateServerCert) now caches HttpClient instances in s_httpClientPool keyed by the delegate's method handle + target object identity (RuntimeHelpers.GetHashCode). This is stable even when a new delegate wrapper is created per call (the common Service Fabric case — return ValidateServerCertificateCallback on the same source instance). Also added HttpClientConfig.ConfigureRequestHeadersAndSize for consistency with the other overloads, and updated the tests to assert the new caching semantics.
| #if NET471_OR_GREATER || NETSTANDARD || NET | ||
| var handler = new HttpClientHandler | ||
| { | ||
| ServerCertificateCustomValidationCallback = (message, cert, chain, sslPolicyErrors) => | ||
| { | ||
| return validateServerCert(message, cert, chain, sslPolicyErrors); | ||
| } | ||
| }; | ||
|
|
||
| string key = handler.GetHashCode().ToString(); | ||
| return s_httpClientPool.GetOrAdd(key, new HttpClient(handler)); | ||
| return new HttpClient(handler); | ||
| #else |
There was a problem hiding this comment.
GetHttpClient(validateServerCert) returns a new HttpClient on every call and the caller (HttpManager) does not dispose it, which can lead to socket exhaustion under the Service Fabric managed identity flow. Consider caching this client (e.g., in s_httpClientPool keyed by the callback/delegate identity) or updating the call chain to deterministically dispose per-call clients; also apply HttpClientConfig.ConfigureRequestHeadersAndSize for consistency with the other factory methods.
| internal static void ResetStaticStateForTest() | ||
| { | ||
| lock (s_cacheLock) | ||
| { | ||
| foreach (Lazy<HttpClient> lazy in s_httpClientPool.Values) | ||
| { | ||
| if (lazy.IsValueCreated) | ||
| lazy.Value?.Dispose(); | ||
| } | ||
|
|
||
| s_httpClientPool.Clear(); | ||
| Interlocked.Exchange(ref s_httpClientCreationCount, 0); | ||
| } |
There was a problem hiding this comment.
When the pool reaches the threshold, CheckAndManageCache clears s_httpClientPool without disposing the created HttpClient instances. Since you already have disposal logic here for tests, consider extracting/reusing it so cache eviction also disposes any Lazy<HttpClient> values that were created, otherwise handlers/sockets can linger until GC/finalization.
… avoid socket exhaustion Agent-Logs-Url: https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/sessions/97bbfa6a-513f-4c41-a80c-f8ee33f6e31f Co-authored-by: bgavrilMS <12273384+bgavrilMS@users.noreply.github.com>
... The broken
The PR title can be edited directly in the GitHub PR interface — I'm not able to update it from code changes. The correct issue number is #5948 as noted. |
gladjohn
left a comment
There was a problem hiding this comment.
clean fix for the eager GetOrAdd leak, Lazy with ExecutionAndPublication is the right call
| // This correctly caches clients when the same method on the same object instance | ||
| // is used as the callback (the common Service Fabric case), while still creating | ||
| // distinct clients for genuinely different callbacks. | ||
| string key = "sf_" + |
There was a problem hiding this comment.
@copilot - this is too complicated. Please revert everything related to Service Fabric
There was a problem hiding this comment.
Reverted in commit 9e7d2e6. The Service Fabric overload is back to returning new HttpClient(handler) directly with no caching or delegate-identity key logic.
…pClient per call Agent-Logs-Url: https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/sessions/55a48d86-6457-42b4-9fd6-d21d0c5683d2 Co-authored-by: bgavrilMS <12273384+bgavrilMS@users.noreply.github.com>
Fixes #5948
Fix a leak of HttpCLient objects from improper use of ConcurrentDictionary