Millions/Billions of HangFire queries being highlighted in SSMS Activity Monitor #2120
Please try installing the latest version of the Microsoft.Data.SqlClient package (the latest stable version at the moment is 5.0.1) and use the following connection factory in your Hangfire configuration logic:

.UseSqlServerStorage(() => new Microsoft.Data.SqlClient.SqlConnection(connectionString));

Please note that the Microsoft.Data.SqlClient package uses encryption by default since version 4.0.0, so in some cases you might be required to add …
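For illustration only, a minimal sketch of the suggested wiring. The connection string, including the Encrypt/TrustServerCertificate values, is an assumption for environments where the server certificate is not trusted; it is not part of the comment above.

// Hedged sketch: route Hangfire's SQL connections through Microsoft.Data.SqlClient via a connection factory.
// The connection string below is hypothetical; Encrypt defaults to true since Microsoft.Data.SqlClient 4.0.0,
// so TrustServerCertificate=True is shown only as an assumed workaround for untrusted dev/test certificates.
var connectionString =
    "Server=.;Database=Hangfire;Integrated Security=SSPI;" +
    "Encrypt=True;TrustServerCertificate=True";

GlobalConfiguration.Configuration
    .UseSqlServerStorage(() => new Microsoft.Data.SqlClient.SqlConnection(connectionString));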
Thanks for the reply @odinserj. However, before I make this change, could you please explain what it's doing and why it'll fix the issue? Also, why do I need to remove the SlidingInvisibilityTimeout and QueuePollInterval that other posts mention should be added?
I have been pursuing this issue for years, as it happens from time to time in specific environments under specific conditions. I've added protection to almost every layer of retries in Hangfire to avoid running them in an uncontrolled fashion, e.g. without any delay, but nothing worked. For such a high number of queries there must be a retry loop with no delay between attempts for some reason. This summer I noticed that with …
No, you don't need to remove them at all.
OK, I've left in …
Unfortunately the same issue is still present. With the suggested changes, it's still hitting billions of calls. We've been monitoring it for a few weeks. Occasionally it's around 15,000, but it peaks to greater than one billion again:

GlobalConfiguration.Configuration.UseSqlServerStorage(
    () => new Microsoft.Data.SqlClient.SqlConnection(RP_Globals.ConnectionString),
    new SqlServerStorageOptions
    {
        SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5),
        QueuePollInterval = TimeSpan.Zero
    });
Thanks for your feedback @djdd87, working on it. By the way, do you notice high incoming network traffic to your SQL Server when facing this issue?
Also, does the CPU consumption shown by SSMS match the same metric in the Task Manager dialog?
Thanks for getting back to me! I'll speak to our production team and see if I can get that information. What does setting SlidingInvisibilityTimeout and QueuePollInterval actually do? It's not business-critical for us that our CRON tasks run exactly at 06:00:00, i.e. they could run at 06:00:15 and it'll still work perfectly. Am I right in thinking that QueuePollInterval set to 0 just constantly spams SQL Server as fast as it can, and that I could just change it to 15 seconds if I'm not bothered whether it starts at 06:00:00 or 06:00:15?
Thank you, I will appreciate any additional information regarding this problem.
You can increase the value to 15 seconds; in this case I expect the probability of such an issue will decrease. But nevertheless, something strange sometimes happens with the connection pool itself, so that a lot of open connections (much higher than the maximum configured number) are observed on one side, and timeout exceptions with an inability to get a connection from the pool on the other. The current fetching implementation can be simplified and waits on SQL Server's side can be avoided, so I will make these changes in 1.7.33.
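As a rough sketch of that advice, the reporter's configuration from earlier in the thread could be relaxed like this; only the QueuePollInterval value changes, everything else is copied from their snippet:

// Hedged sketch: poll every 15 seconds instead of as fast as possible.
// Recurring jobs may then start up to ~15 seconds late, which the reporter said is acceptable.
GlobalConfiguration.Configuration.UseSqlServerStorage(
    () => new Microsoft.Data.SqlClient.SqlConnection(RP_Globals.ConnectionString),
    new SqlServerStorageOptions
    {
        SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5),
        QueuePollInterval = TimeSpan.FromSeconds(15) // instead of TimeSpan.Zero
    });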
I have installed version 1.8.12, the latest one, and the issue persists. By setting QueuePollInterval = TimeSpan.MaxValue, the queries are executed every 7-8 seconds. I don't have any queues installed and only have one recurring job. I would like to completely disable polling because it's unnecessary. Is it possible?
I'm in a similar boat. Regardless of the QueuePollInterval value, I can see at least 3 polls per second. I've tried many different combinations of intervals and values, and it seems to ignore the values and poll every 200ms. I've tried the Microsoft SQL Client and found no relief there either. My setup is an ASP.NET (MVC5) application with SimpleInjector IoC containers. I'm not sure what version of Hangfire first introduced the issue, but it still exists in current releases. I've ensured the schemas are at the latest version. No other issues are being experienced, other than being unable to control the frequency of database polling (SqlServer).
We've just moved to Redis instead.
Ironic that the interval is exactly 200ms, regardless of what I set it to. Not sure why the value is ignored, even with primarily default settings.
@fiidim can you post your settings related to Hangfire and Hangfire.SqlServer here?
Queries used for sliding invisibility timeout-based fetching are unified now, and we can use the same implementation whether a sub-second or an above-second polling delay is configured. Possibly relates to #2120.
I made some headway on this. Seems that the …
Further strange behaviour occurs when using a …
Yes, historically the semaphore that limits polling to a single worker was applied only to sub-second intervals due to their different implementation. Since the implementation is now the same for sub-second and above-second intervals, I have changed this logic, and since 1.8.14, released today, the semaphore is applied in both cases. Please upgrade to the newest version and let me know if it solves the issue.
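To illustrate the idea described above, here is a conceptual sketch of limiting polling to a single concurrent poller with a semaphore. This is not Hangfire's actual implementation, and the type and delegate names are hypothetical:

using System;
using System.Threading;
using System.Threading.Tasks;

class SinglePollerSketch
{
    // At most one worker polls at a time; the rest skip instead of issuing duplicate queries.
    private static readonly SemaphoreSlim PollSemaphore = new SemaphoreSlim(1, 1);

    public static async Task<string?> TryFetchAsync(
        Func<CancellationToken, Task<string?>> queryQueue, // the actual fetch query (hypothetical delegate)
        CancellationToken token)
    {
        // Try to enter without waiting; if another worker holds the semaphore, skip this poll.
        if (!await PollSemaphore.WaitAsync(TimeSpan.Zero, token))
        {
            return null;
        }

        try
        {
            return await queryQueue(token);
        }
        finally
        {
            PollSemaphore.Release();
        }
    }
}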
Yes! That worked perfectly (1.8.14). I am very pleased with the behaviour! Thank you for taking a look and resolving my issue.
Hi, I'm trying to get to the bottom of some performance issues in our production environment, and SSMS is highlighting the following query in the Activity Monitor under expensive queries:
We are using the following settings, which similar issues already suggest:
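(The settings block itself did not survive extraction; judging by the configuration the reporter posted elsewhere in this thread, it was presumably along these lines:)

// Presumed settings, reconstructed from the reporter's snippet elsewhere in this thread.
GlobalConfiguration.Configuration.UseSqlServerStorage(RP_Globals.ConnectionString, new SqlServerStorageOptions
{
    SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5),
    QueuePollInterval = TimeSpan.Zero
});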
The query count seems to vary between about 23,000 and >1.2 billion (as per the above screenshot). Why is this, and have I misconfigured something? Here's a screenshot of all our recurring jobs (we only use HangFire for these recurring jobs):