Unoptimal hash code combine in TypeHashingAlgorithms.ComputeMethodHashCode #103070

filipnavara · 2024-06-05T09:34:17Z

There's a TODO in the code:

runtime/src/coreclr/tools/Common/TypeSystem/Common/TypeHashingAlgorithms.cs

Lines 229 to 234 in b438888

    
           public static int ComputeMethodHashCode(int typeHashCode, int nameOrNameAndGenericArgumentsHashCode) 
        
           { 
        
               // TODO! This hash combining function isn't good, but it matches logic used in the past 
        
               // consider changing to a better combining function once all uses use this function 
        
               return typeHashCode ^ nameOrNameAndGenericArgumentsHashCode; 
        
           }

Notably, this comes up as a sore spot in my profiles:

Changing to HashCode.Combine yields about >20% less hash collisions in this method alone:

This compilation speed improvement is quite noticable.

Unfortunately, there seems to be some dependency on the hash combining algorithm somewhere in the code because this produces an executable that fails to run at runtime:

Process terminated. Failed to create generic virtual method implementation

Declaring type: Microsoft.Extensions.Options.OptionsBuilder`1<Microsoft.Extensions.Options.StartupValidatorOptions>
Method name: Configure
Instantiation:
  Argument 00000000: Microsoft.Extensions.Options.IOptionsMonitor`1<Microsoft.Extensions.DependencyInjection.MetricsServiceExtensions+NoOpOptions>

   at System.RuntimeExceptionHelpers.FailFast(String, Exception, String, RhFailFastReason, IntPtr, IntPtr) + 0x2b0
   at Internal.Runtime.TypeLoader.TypeLoaderEnvironment.ResolveGenericVirtualMethodTarget(RuntimeTypeHandle, RuntimeMethodHandle) + 0x210
   at System.Runtime.TypeLoaderExports.<>c.<GVMLookupForSlotSlow>b__8_0(IntPtr context, IntPtr signature, Object contextObject, IntPtr& auxResult) + 0x31
   at System.Runtime.TypeLoaderExports.CacheMiss(IntPtr, IntPtr, RuntimeObjectFactory, Object) + 0x28
   at System.Runtime.TypeLoaderExports.GVMLookupForSlotSlow(Object, RuntimeMethodHandle) + 0x4e
   at System.Runtime.TypeLoaderExports.GVMLookupForSlot(Object, RuntimeMethodHandle) + 0x9a
   at Microsoft.Extensions.DependencyInjection.OptionsBuilderExtensions.ValidateOnStart[TOptions](OptionsBuilder`1) + 0x6e
   at Microsoft.Extensions.DependencyInjection.MetricsServiceExtensions.AddMetrics(IServiceCollection) + 0x47
   at Microsoft.Extensions.DependencyInjection.HttpClientFactoryServiceCollectionExtensions.AddHttpClient(IServiceCollection) + 0x26
   at MailClient.Program.SetUpDependencyInjection() + 0x2a9
   at MailClient.Program.Main(String[] args) + 0xd2

The text was updated successfully, but these errors were encountered:

dotnet-policy-service · 2024-06-05T09:34:33Z

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

huoyaoyuan · 2024-06-05T11:05:06Z

At which level are you replacing with HashCode.Combine? I can't imagine anything faster than XOR if you only replace the ComputeMethodHashCode method.

filipnavara · 2024-06-05T11:44:49Z

I can't imagine anything faster than XOR if you only replace the ComputeMethodHashCode method.

The problem is not the speed of the hashing itself. That's done only once per each method. The problem are the collisions that are caused by matching hashes and the simple XOR is extremely bad at that.

The profile above is with TypeHashingAlgorithms.ComputeMethodHashCode replaced to use HashCode.Combine. There are places in unmanaged CLR/R2R code that depend on the stable hash algorithm at the moment. I expected these places not to exist on NativeAOT since the relevant part of the code links to System.Private.TypeLoader.dll which uses the shared TypeHashingAlgorithms implementation, but apparently there is still some dependency.

jkotas · 2024-06-05T14:04:22Z

HashCode.Combine is randomized. It does not produce a stable hashcode.

NativeAOT file format depends on the stable hashcodes for these things. It should be fine to improve this combining function, but it needs be stable.

filipnavara · 2024-06-05T14:09:49Z

HashCode.Combine is randomized. It does not produce a stable hashcode.

Huh, that explains why it didn't work. I guess I should file a documentation issue since the current documentation doesn't mention this.

filipnavara · 2024-06-05T14:23:09Z

I tried using stable combination function:

        public static int ComputeMethodHashCode(int typeHashCode, int nameOrNameAndGenericArgumentsHashCode)
        {
            return unchecked((typeHashCode * (int)0xA5555529) + nameOrNameAndGenericArgumentsHashCode);
        }

There's likely still some dependency on the older algorithm. The app crashes at runtime at different place though:

System.NotSupportedException: 'Microsoft.Extensions.Options.OptionsMonitor`1[Microsoft.Extensions.Http.HttpClientFactoryOptions]' is missing native code or metadata. This can happen for code that is not compatible with trimming or AOT. Inspect and fix trimming and AOT related warnings that were generated when the app was published. For more information see https://aka.ms/nativeaot-compatibility
   at System.Reflection.Runtime.General.TypeUnifier.WithVerifiedTypeHandle(RuntimeConstructedGenericTypeInfo, RuntimeTypeInfo[]) + 0x7f
   at System.Reflection.Runtime.TypeInfos.RuntimeTypeInfo.MakeGenericType(Type[]) + 0x218
   at System.RuntimeType.MakeGenericType(Type[]) + 0x28
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateOpenGeneric(ServiceDescriptor, ServiceIdentifier, CallSiteChain, Int32, Boolean) + 0xc0
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateOpenGeneric(ServiceIdentifier, CallSiteChain) + 0xf9
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(ServiceIdentifier serviceIdentifier, CallSiteChain callSiteChain) + 0xb5
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateArgumentCallSites(ServiceIdentifier, Type, CallSiteChain, ParameterInfo[], Boolean) + 0x143
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateConstructorCallSite(ResultCache, ServiceIdentifier, Type, CallSiteChain) + 0x224
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(ServiceDescriptor, ServiceIdentifier, CallSiteChain, Int32) + 0xe9
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(ServiceIdentifier, CallSiteChain) + 0xd8
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(ServiceIdentifier serviceIdentifier, CallSiteChain callSiteChain) + 0xa0
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.GetCallSite(ServiceIdentifier, CallSiteChain) + 0x5d
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x5a
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngineScope.GetService(Type) + 0x23
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
   at Microsoft.Extensions.DependencyInjection.HttpClientFactoryServiceCollectionExtensions.<>c.<AddHttpClient>b__0_0(IServiceProvider serviceProvider) + 0xc
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitFactory(FactoryCallSite, RuntimeResolverContext) + 0xd
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite, TArgument) + 0xd5
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite, RuntimeResolverContext) + 0x4b
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument) + 0x91
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite, ServiceProviderEngineScope) + 0x1c
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x151
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngineScope.GetService(Type) + 0x23
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
   at MailClient.Program.<>c.<SetUpDependencyInjection>b__290_9(IServiceProvider sp) + 0x1f
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitFactory(FactoryCallSite, RuntimeResolverContext) + 0xd
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite, TArgument) + 0xd5
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite, RuntimeResolverContext) + 0x4b
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument) + 0x91
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite, ServiceProviderEngineScope) + 0x1c
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x151
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(Type) + 0xb
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
   at MailClient.Program.Main(String[] args) + 0x4f3

The hash performs mildly better than the XOR, worse than the first attempt with HashCode.Combine:

vcsjones · 2024-06-05T17:40:42Z

. I guess I should file a documentation issue since the current documentation doesn't mention this.

It is, but in the class remarks. https://learn.microsoft.com/en-us/dotnet/api/system.hashcode?view=net-8.0#remarks

filipnavara · 2024-06-06T16:11:21Z

@vcsjones Thanks! I just searched for "HashCode.Combine" directly.

I went a bit more down the rabbit hole analyzing this and I am not necessarily sure that changing the hash function is a good enough fix.

In this particular instance we are running into a huge instance of LockFreeReaderHashtable that is particularly big and also quite full. It uses an 2^n sized hashtable with open addressing and double hashing in case of collision. The table is resized when a predetermined fill threshhold is reached. Notably, this is different from regular Dictionary<K, V> that uses prime-sized hash table, open addressing, linear probing, and detects high collision rates and resizes.

I don't have the numbers off-hand, but in the pessimistic case where the hashtable is nearly full, and you search for an element that is NOT in the hashtable (yet), you can end up searching linearly up to 60% (resize threshhold) of the table before finding out that the element is not there.

NodeFactory overwhelmingly uses the LockFreeReaderHashtable in a GetOrAdd/GetOrCreateValue fashion which lazily creates the values if they are not found. That ends up hitting the pessimistic lookup case rather often and badly no matter the quality of the hash code functions.

Overall, this data structure with these parameters doesn't scale up well. We can likely tweak some parameters of the
Hashtable algorithm to get significant improvements but maybe a different approach is in order. Ideas that come to mind:

Keep track of the maximum collision steps when adding an item. When looking up an item, stop the open addressing lookup after the said number of steps.
Add a bloom filter if the hashtable gets too big to quickly answer that a given item is not in the set.

Both of these algorithms may be difficult to implement with the required concurrency requirements. That said, my benchmarks show the bottlenecks overwhelmingly happening in parts of the execution which are single-threaded. I am not necessarily good at algorithms, so any additional input is welcome.

jkotas · 2024-06-06T18:38:56Z

you can end up searching linearly up to 60% (resize threshhold)

This can only happen when the hashcodes are poorly distributed or when you are really unlucky.

I think fixing the hashcodes should take care of this problem.

filipnavara · 2024-06-07T09:17:29Z

This can only happen when the hashcodes are poorly distributed or when you are really unlucky.

They are definitely poorly distributed. There's no question about that. It may not be the only issue though and I don't have enough data to confirm/refute that yet.

I started tracking the collision numbers on Add (on a fairly large app build):

RuntimeDeterminedTypeKeyHashtable gets over 2000 collisions but they do go down during expansion which suggests that they are not identical hash codes.
MethodEntrypointHashtable get over 14000 collisions and they only go up.

For scale, the regular Dictionary uses a 100 as collision threshold for rehashing.

filipnavara · 2024-06-07T10:36:35Z

I may have found one of the root causes. WinForms generates a large number of [System.Windows.Forms]SourceGenerated.EnumValidator Validate methods. They differ with the parameters signature but that's never taken into account in the MethodDesc hash, so they all cause a collision.

jkotas · 2024-06-07T11:56:50Z

For scale, the regular Dictionary uses a 100 as collision threshold for rehashing.

Nit: This threshold is used to switch to more expensive randomized string hash code and only for string keys. It is done for security reasons (protection against DoS attacks).

filipnavara · 2024-06-07T12:13:35Z

Few other notable collisions (same hash, but also same type and name):

    104 [System.Windows.Forms]SourceGenerated.EnumValidator Validate = 165151430
    107 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe Add = -821482047
    119 [S.P.CoreLib]Internal.Metadata.NativeFormat.MdBinaryReader Read = -456562824
    132 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe As = 1267830177
    174 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe AsRef = -693195293
    196 [S.P.CoreLib]Internal.Runtime.MethodTable Of = -396193247
    232 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe As = -740372691
    392 [S.P.CompilerGenerated]Internal.CompilerGenerated.<Module> CalliMarshallingMethodThunk = 1205279480
  21913 [S.P.CompilerGenerated]Internal.CompilerGenerated.<Module> DynamicInvoke = 292316103

First column is the number of such MethodDescs created.

jkotas · 2024-06-07T12:33:06Z

For MethodEntrypointHashtable and similar hashtables that use MethodDesc as the key, the problem can be worked around by switching to object hashcode. Change these two lines

runtime/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/NodeFactory.cs

Lines 922 to 923 in fb1fe7f

    
           protected override int GetKeyHashCode(MethodDesc key) => key.GetHashCode(); 
        
           protected override int GetValueHashCode(IMethodNode value) => value.Method.GetHashCode();

to use RuntimeHelpers.GetHashCode. It is not the proper fix, but it would be better than nothing.

filipnavara · 2024-06-07T15:03:31Z

Is there runtime dependency on stability of the DynamicInvokeMethodThunk hash code?

This nearly removes the bottleneck:

--- a/src/coreclr/tools/Common/TypeSystem/IL/Stubs/DynamicInvokeMethodThunk.cs
+++ b/src/coreclr/tools/Common/TypeSystem/IL/Stubs/DynamicInvokeMethodThunk.cs
@@ -112,6 +112,11 @@ public override MethodSignature Signature
             }
         }

+        protected override int ComputeHashCode()
+        {
+            return base.ComputeHashCode() ^ _targetSignature.GetHashCode();
+        }
+
         public override string Name => "DynamicInvoke";

         public override string DiagnosticName => "DynamicInvoke";

I tried using RuntimeHelpers.GetHashCode but it seemingly didn't have the expected impact (measured by wall clock). I'll run it through profiler later on.

filipnavara added the area-NativeAOT-coreclr label Jun 5, 2024

github-project-automation bot added this to AppModel Jun 5, 2024

dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jun 5, 2024

filipnavara mentioned this issue Jun 11, 2024

Fix DynamicInvokeMethodThunk hash code collisions #103274

Merged

jkotas closed this as completed in #103274 Jun 17, 2024

dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Jun 17, 2024

github-actions bot locked and limited conversation to collaborators Jul 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unoptimal hash code combine in TypeHashingAlgorithms.ComputeMethodHashCode #103070

Unoptimal hash code combine in TypeHashingAlgorithms.ComputeMethodHashCode #103070

filipnavara commented Jun 5, 2024 •

edited

Loading

dotnet-policy-service bot commented Jun 5, 2024

huoyaoyuan commented Jun 5, 2024

filipnavara commented Jun 5, 2024

jkotas commented Jun 5, 2024

filipnavara commented Jun 5, 2024

filipnavara commented Jun 5, 2024 •

edited

Loading

vcsjones commented Jun 5, 2024

filipnavara commented Jun 6, 2024 •

edited

Loading

jkotas commented Jun 6, 2024

filipnavara commented Jun 7, 2024

filipnavara commented Jun 7, 2024

jkotas commented Jun 7, 2024

filipnavara commented Jun 7, 2024

jkotas commented Jun 7, 2024

filipnavara commented Jun 7, 2024

Unoptimal hash code combine in TypeHashingAlgorithms.ComputeMethodHashCode #103070

Unoptimal hash code combine in TypeHashingAlgorithms.ComputeMethodHashCode #103070

Comments

filipnavara commented Jun 5, 2024 • edited Loading

dotnet-policy-service bot commented Jun 5, 2024

huoyaoyuan commented Jun 5, 2024

filipnavara commented Jun 5, 2024

jkotas commented Jun 5, 2024

filipnavara commented Jun 5, 2024

filipnavara commented Jun 5, 2024 • edited Loading

vcsjones commented Jun 5, 2024

filipnavara commented Jun 6, 2024 • edited Loading

jkotas commented Jun 6, 2024

filipnavara commented Jun 7, 2024

filipnavara commented Jun 7, 2024

jkotas commented Jun 7, 2024

filipnavara commented Jun 7, 2024

jkotas commented Jun 7, 2024

filipnavara commented Jun 7, 2024

filipnavara commented Jun 5, 2024 •

edited

Loading

filipnavara commented Jun 5, 2024 •

edited

Loading

filipnavara commented Jun 6, 2024 •

edited

Loading