PerformanceCounters causes hang #587

Closed
AlexanderKot opened this issue Mar 3, 2017 · 7 comments

Comments

@AlexanderKot

Hello,
I am using StackExchange.Redis with NHibernate.Caches.Redis.
I have been testing how our system behaves when it loses its connection to Redis (launch the application, then stop the Redis server).
In my environment (Windows 7 x64, VS 2015 Update 3, IIS Express 10.0.14358, .NET 4.6) I get a reproducible hang when PerformanceCounter is accessed while the exception is being prepared after the connection is lost.
After this happens, IIS Express is locked and does not accept new requests.

I cannot reproduce the same behavior on my colleagues' PCs.
I have found similar bugs:
Particular/NServiceBus#2047
SignalR/SignalR#3414

I have tried the solution used in SignalR:
SignalR/SignalR@23e5b1b
I added an EnsureValidCulture() method to StackExchange.Redis and called it in the TryGetSystemCPU method.
Unfortunately, it does not help.

There is currently a workaround:
ConnectionMultiplexer.IncludeDetailInExceptions must be set to false.
This flag is true by default.
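
For reference, a minimal sketch of applying this workaround. The endpoint `localhost:6379` and the `RedisSetup` class name are assumptions for illustration only; `IncludeDetailInExceptions` is the property named above, set on the multiplexer instance.

```csharp
using StackExchange.Redis;

public static class RedisSetup
{
    public static ConnectionMultiplexer Create()
    {
        // "localhost:6379" is a placeholder endpoint, not taken from this report.
        var muxer = ConnectionMultiplexer.Connect("localhost:6379");

        // Turn off the detailed exception data, so the performance counter
        // lookup seen in the call stack is not reached when an exception is built.
        muxer.IncludeDetailInExceptions = false;
        return muxer;
    }
}
```

The trade-off is that connection exceptions then carry less diagnostic detail.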

Does anybody have any ideas?

PS
Part of call stack from my environment:
[Managed to Native Transition]
mscorlib.dll!Microsoft.Win32.RegistryKey.InternalGetValue(string name, object defaultValue, bool doNotExpand, bool checkSecurity) Unknown
mscorlib.dll!Microsoft.Win32.RegistryKey.GetValue(string name) Unknown
System.dll!System.Diagnostics.PerformanceMonitor.GetData(string item) Unknown
System.dll!System.Diagnostics.PerformanceCounterLib.GetPerformanceData(string item) Unknown
System.dll!System.Diagnostics.PerformanceCounterLib.CategoryTable.get() Unknown
System.dll!System.Diagnostics.PerformanceCounterLib.CounterExists(string category, string counter, ref bool categoryExists) Unknown
System.dll!System.Diagnostics.PerformanceCounterLib.CounterExists(string machine, string category, string counter) Unknown
System.dll!System.Diagnostics.PerformanceCounter.InitializeImpl() Unknown
System.dll!System.Diagnostics.PerformanceCounter.PerformanceCounter(string categoryName, string counterName, string instanceName, bool readOnly) Unknown
System.dll!System.Diagnostics.PerformanceCounter.PerformanceCounter(string categoryName, string counterName, string instanceName) Unknown

StackExchange.Redis.dll!StackExchange.Redis.PerfCounterHelper.TryGetSystemCPU(out float value)

@niemyjski

Any news on this?

@Kesmy

Kesmy commented Mar 31, 2017

I have the exact same issue. I thought it was just my machine, but I guess there are at least three of us.

@NickCraver
Collaborator

I believe this was fixed in #589 - @mgravell might you have time for a build?

@AlexanderKot
Author

Hello,
Unfortunately this fix does not help in my case.
UnauthorizedAccessException is not raised and the _disabled flag is not set. The calling thread is blocked somewhere inside .NET internals:
mscorlib.dll!Microsoft.Win32.RegistryKey.InternalGetValue(string name, object defaultValue, bool
(inside this method)

I have found some related topics, but have not tried their suggestions yet:
https://stackoverflow.com/questions/4209366/what-would-make-performancecountercategory-exists-hang-indefinitely
https://stackoverflow.com/questions/2868068/performancecounter-nextvalue-hangs-on-some-machines
https://support.microsoft.com/en-us/help/300956/how-to-manually-rebuild-performance-counter-library-values

@mgravell
Collaborator

mgravell commented Apr 3, 2017 via email

@niemyjski

Yes, that sounds like a good plan :). The less work I have to do the better, if I need diagnostics I'll enable it and take the hit.

@AlexanderKot
Author

For me, making IncludeDetailInExceptions = false the default is the only reasonable and safe solution in the current situation.
I have already set this value in our project.
The bug itself is interesting and seems to have many possible causes (see the CurrentCulture discussion in the SignalR thread, printers, access rights, etc.).
So it will happen again (nobody can guarantee that it will not happen in production), and it will be hard to reproduce.
PS
If a way of obtaining these counters in any environment is needed, one option is to collect them in a child process, kill that process after a timeout if it hangs, and not try again.
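
The child-process idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from StackExchange.Redis: the `ChildProcessCpuProbe` class is hypothetical, and so is the helper executable, which is assumed to print a single CPU percentage to stdout and exit.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Globalization;

public static class ChildProcessCpuProbe
{
    // Set after the first failure or timeout, so later calls return immediately.
    private static volatile bool _disabled;

    // helperPath: hypothetical executable that prints one number and exits.
    public static bool TryGetSystemCpu(string helperPath, TimeSpan timeout, out float value)
    {
        value = -1;
        if (_disabled) return false;

        var startInfo = new ProcessStartInfo(helperPath)
        {
            UseShellExecute = false,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        try
        {
            using (var child = Process.Start(startInfo))
            {
                if (child == null) { _disabled = true; return false; }

                // Read asynchronously so a hung child cannot block this thread.
                var readTask = child.StandardOutput.ReadToEndAsync();

                if (!child.WaitForExit((int)timeout.TotalMilliseconds))
                {
                    // The child is stuck (for example inside the registry call): kill it.
                    try { child.Kill(); } catch (InvalidOperationException) { }
                    _disabled = true;
                    return false;
                }

                string output = readTask.Wait(timeout) ? readTask.Result : string.Empty;
                if (float.TryParse(output.Trim(), NumberStyles.Float,
                                   CultureInfo.InvariantCulture, out value))
                {
                    return true;
                }
            }
        }
        catch (Win32Exception)
        {
            // The helper could not be started or killed; treat it as a failed probe.
        }

        value = -1;
        _disabled = true;
        return false;
    }
}
```

The hang stays confined to the child process, and the calling thread waits no longer than the timeout. The cost is one process launch per probe, which is why the sketch stops trying after the first failure.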
