Conversation
|
Tagging subscribers to this area: @agocke |
…superlinear behavior Fixes exponential time complexity in GetVersionResilientTypeHashCode for deeply nested generic type instantiations (e.g., polymorphic recursion). The hash code is now lazily computed and stored in m_cachedVersionResilientHashCode on MethodTableAuxiliaryData. A stored value of 0 indicates the field hasn't been initialized yet. The memory is zero-initialized by the loader heap, so no explicit initialization is required for new instantiations. Before fix: n=31 took ~15s (exponential growth) After fix: n=31 takes ~4ms (constant time) Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/3a3fba93-8075-497c-bb07-a3fe274858a9 Co-authored-by: davidwrighton <10779849+davidwrighton@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/3a3fba93-8075-497c-bb07-a3fe274858a9 Co-authored-by: davidwrighton <10779849+davidwrighton@users.noreply.github.com>
|
@copilot This increases the memory needed on the loader heap. Please measure the memory increase for this scenario, and present the increase both in bytes and by percentage. Also, this change allows us to consider more capable hash combining algorithms. Please measure how effective the current hash combining algorithm is for generic instatiations. |
|
@copilot arrays are also MethodTable and can use the cache. |
… slot Place the int field immediately after the union to fill the 4-byte implicit alignment padding that already exists between the union (4 bytes) and m_pLoaderModule (8-byte aligned pointer) on 64-bit platforms. sizeof(MethodTableAuxiliaryData) remains 24 bytes before and after this change on 64-bit release builds, so the memory overhead is 0 bytes / 0%. Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/c5ceb726-3055-4d02-a92f-e0d52438fab9 Co-authored-by: davidwrighton <10779849+davidwrighton@users.noreply.github.com>
Memory impact (commit d9e3911): On 32-bit, all pointers are 4 bytes so no alignment padding exists; the struct would grow by 4 bytes (12 → 16 bytes, 33%). Hash algorithm effectiveness: I simulated the hash algorithm on the 32 distinct generic types produced by the polymorphic recursion benchmark and found zero collisions with good bit distribution across the full 32-bit hash space. I also checked common .NET generic instantiations ( |
Arrays are also backed by MethodTable (TypeHandle::IsArray() returns !IsTypeDesc() && AsMethodTable()->IsArray()), so they have the same MethodTableAuxiliaryData and can benefit from the same caching mechanism. Restructure GetVersionResilientTypeHashCode to handle both arrays and non-array types in the single !type.IsTypeDesc() branch, checking/storing the cache before and after the type-specific hash computation. Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/1e4f951d-0276-47ac-8fc3-7d5609071844 Co-authored-by: davidwrighton <10779849+davidwrighton@users.noreply.github.com>
Done in commit f8a7695. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
This PR adds memoization for GetVersionResilientTypeHashCode(TypeHandle) by caching the computed hash code on MethodTableAuxiliaryData, addressing exponential recomputation in polymorphic recursion scenarios.
Changes:
- Added a lazily-initialized cached version-resilient hash code field to
MethodTableAuxiliaryData. - Updated
GetVersionResilientTypeHashCode(TypeHandle)to use the cache for all MethodTable-backed types (including arrays), computing and storing the hash once perMethodTable.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/coreclr/vm/versionresilienthashcode.cpp | Uses a per-MethodTable cached hash to avoid repeated recursive recomputation; unifies array/non-array MethodTable handling under the same cache path. |
| src/coreclr/vm/methodtable.h | Adds m_cachedVersionResilientHashCode to MethodTableAuxiliaryData to store the memoized result (intended to fit existing x64 alignment padding). |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
@copilot A lot of tests are failing on all 32-bit platforms (x64, arm, wasm). The extra field likely broke an hidden assumption in the code about the layout of MethodTableAuxiliaryData structure. Please investigate. |
…uxiliaryData The managed mirror of MethodTableAuxiliaryData in RuntimeHelpers.CoreCLR.cs was missing the new m_cachedVersionResilientHashCode field. On 32-bit platforms (x86, ARM32, WASM), pointer fields are 4 bytes and there is no alignment padding between the flags union and the pointer fields. The extra 4-byte field in the C++ struct shifted LoaderModule and ExposedClassObjectRaw by 4 bytes relative to what the managed struct expected, corrupting all access to these fields on 32-bit platforms. Fix by adding the corresponding private field to the managed struct so its layout exactly matches the C++ struct on all platforms: - 64-bit: same as before (fills existing alignment padding, size unchanged) - 32-bit: struct grows from 12 to 16 bytes on both sides, offsets now agree Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/71f12275-cf7d-46ac-a8cd-53247cc1990d Co-authored-by: jkotas <6668460+jkotas@users.noreply.github.com>
Fixed in commit ee4e1b8. The root cause was that |
...coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.CoreCLR.cs
Outdated
Show resolved
Hide resolved
|
Note This review was generated by Copilot and may contain inaccuracies. A human reviewer should verify the findings. 🤖 Copilot Code Review — PR #126534Holistic AssessmentMotivation: This PR addresses a real and significant performance problem — exponential time complexity in Approach: Caching the hash code on Summary: ✅ LGTM. The change is well-structured, the thread-safety approach is correct for this scenario, the managed/native layout is consistent, and the caching logic handles all type paths correctly. The only minor observation is advisory (no blocking issues found). Detailed Findings✅ Correctness — Hash caching logic is soundThe caching logic in
✅ Thread Safety — Volatile pattern is appropriateThe
✅ Memory Layout — Native and managed structs match
✅ Zero-initialization — Loader heap guarantees initial sentinel valueThe loader heap ( 💡 Sentinel value 0 — Negligible edge case, well-documentedTypes with a version-resilient hash code of exactly 0 will not be cached and will be recomputed on every call. This is:
No action needed — this is the standard sentinel-value pattern used throughout the runtime.
|
|
/ba-g infrastructure timeouts |
GetVersionResilientTypeHashCoderecomputes hash codes recursively without memoization. With polymorphic recursion, each level doubles the number of unique generic instantiations, causing exponential blowup —n=31took ~15s before this fix.Description
src/coreclr/vm/methodtable.h: Addedint m_cachedVersionResilientHashCodetoMethodTableAuxiliaryData. The field is placed immediately after the 4-bytem_dwFlagsunion, filling the 4 bytes of implicit alignment padding that already exists before the 8-byte-alignedm_pLoaderModulepointer on 64-bit platforms. This means the struct size is unchanged on 64-bit (0 bytes / 0% increase). The field is lazily initialized —0means not yet set, and is zero-initialized for free by the loader heap so no explicit init is needed for new instantiations.src/coreclr/vm/versionresilienthashcode.cpp: InGetVersionResilientTypeHashCode, the separate early-return branch for array types has been removed. SinceTypeHandle::IsArray()is defined as!IsTypeDesc() && AsMethodTable()->IsArray(), arrays are always MethodTable-backed and share the sameMethodTableAuxiliaryData. Both array and non-array MethodTable-backed types are now handled in a single!type.IsTypeDesc()branch, with the cache check and store applied uniformly to all of them. The type-specific hash computation (ComputeArrayTypeHashCodevs. name/generic logic) is selected inside the branch after the cache check. Types whose hash code is exactly0are not cached (sentinel collision), but this is vanishingly rare given the non-zero initial state of the hash function (hash1 = 0x6DA3B944).src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.CoreCLR.cs: Added the matching_cachedVersionResilientHashCodefield to the managed mirror ofMethodTableAuxiliaryData. This struct uses[StructLayout(LayoutKind.Sequential)]and must exactly mirror the C++ layout. Without this fix, the new C++ field shiftedLoaderModuleandExposedClassObjectRawby 4 bytes on 32-bit platforms (x86, ARM32, WASM) where there is no alignment padding — corrupting all accesses to those fields. On 64-bit, the C# compiler's natural alignment padding incidentally covered the gap, so 64-bit was unaffected.Memory Impact
On 64-bit release builds,
MethodTableAuxiliaryDataalready had 4 bytes of implicit alignment padding between the union andm_pLoaderModule. The new field fills that slot exactly:sizeof(MethodTableAuxiliaryData)(release x64)On 32-bit, all pointers are 4 bytes so no alignment gap exists; the struct grows by 4 bytes (12 → 16 bytes, 33%). The managed mirror in
RuntimeHelpers.CoreCLR.csis updated to match this growth on both sides.Hash Algorithm Effectiveness
The existing
ComputeGenericInstanceHashCodealgorithm was evaluated against:List<T>,Dictionary<K,V>,Task<T>,Nullable<T>,IEnumerable<T>withint/string/bool/double): 0 collisionsThe existing algorithm produces high-quality hashes for generic instantiations; the exponential blowup was purely a memoization problem, not a hash quality issue.
Performance
Benchmark from the issue (polymorphic recursion across two type parameters):
Constant time after the first instantiation at each depth.