Parallelise statistics DB collector #41
Comments
Some early thoughts on what we can do here. We do have order-dependent bits: for example, if a connection is opened and immediately closed, we'd emit two stats events. They have to be processed in order. However, such events for separate connections can be processed in parallel. Our generic work pool won't be a good fit here: it has no hashing of any kind. So we need a work pool with hashing. We identify things by
Our list of worker processes will be fixed in size, so we can use either consistent hashing or modulo-based hashing. ETS may still be our ultimate bottleneck: it only has basic concurrency controls. We need to profile to see whether all of the above would make much difference, or whether it would need combining with a different key/value store backend.
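A minimal sketch of the modulo-based variant, assuming a fixed tuple of worker pids (the module name and message shape are illustrative, not the actual work pool API): hashing the emitting entity's identity keeps events for one entity ordered on one worker, while different entities spread across workers.

```erlang
%% Illustrative sketch only: route each stats event to one of a fixed set of
%% workers based on a hash of the emitter's identity (e.g. a connection or
%% channel pid), so per-entity ordering is preserved.
-module(hashed_dispatch_sketch).
-export([worker_for/2, dispatch/3]).

%% Pick the worker that owns this key; erlang:phash2/2 maps the key to
%% 0..N-1, and element/2 is 1-indexed, hence the +1.
worker_for(Key, Workers) when is_tuple(Workers), tuple_size(Workers) > 0 ->
    element(erlang:phash2(Key, tuple_size(Workers)) + 1, Workers).

%% Send the event to the owning worker's mailbox.
dispatch(Key, Event, Workers) ->
    worker_for(Key, Workers) ! {stats_event, Key, Event},
    ok.
```

The same worker always receives events for a given key, which is what preserves the per-connection ordering mentioned above.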
Another idea: we could use one collector per vhost. Aggregation then becomes more involved. Cluster-wide supervision may get more complicated as well.
Re one collector per vhost -- that doesn't buy you much if you only use one. I'm running into rate issues at the moment with only one.
This is true. One collector per vhost could still be combined with other parallelisation strategies, though.
Agreed, we're running hundreds of vhosts on some clusters, so for us it would help.
Some initial results from profiling the stats db:
The first point essentially means that having a Mgmt UI up makes the problem worse; having multiple Mgmt UIs up would make things even worse. So one thing we can look at is to split the event handling and the metric queries into two separate processes. Next, do the same thing for the GC. A final point is about the hibernation.

Any further thoughts from anyone? Next step for me is to get a better handle on the split of timings between metric queries, GC and event handling, and how they can be made into independent processes (ultimately making ETS the bottleneck; that's my plan anyway). Additionally, the suggestion of splitting into per-vhost collection would be good to get in as well.
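A rough sketch of that event/query split, assuming a public ETS table with read_concurrency (module and function names here are hypothetical, not the real rabbit_mgmt_db API): one gen_server only consumes events, while queries read the table directly and never queue up behind event handling.

```erlang
%% Illustrative only: decouple event handling from metric queries by sharing
%% a public ETS table, so slow queries no longer block the event consumer.
-module(split_collector_sketch).
-behaviour(gen_server).
-export([start_link/0, handle_event/2, lookup/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Writer side: stats events are casts handled only by this process,
%% which keeps them ordered per sender.
handle_event(Key, Stats) ->
    gen_server:cast(?MODULE, {event, Key, Stats}).

%% Reader side: metric queries read the public ETS table directly,
%% bypassing the writer process entirely.
lookup(Key) ->
    case ets:lookup(metrics, Key) of
        [{Key, Stats}] -> {ok, Stats};
        []             -> not_found
    end.

init([]) ->
    Tab = ets:new(metrics, [set, public, named_table,
                            {read_concurrency, true}]),
    {ok, Tab}.

handle_cast({event, Key, Stats}, Tab) ->
    ets:insert(Tab, {Key, Stats}),
    {noreply, Tab}.

handle_call(_Req, _From, Tab)  -> {reply, ok, Tab}.
handle_info(_Msg, Tab)         -> {noreply, Tab}.
terminate(_Reason, _Tab)       -> ok.
code_change(_Old, Tab, _Extra) -> {ok, Tab}.
```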
Some enhancements are available in #89, which depends on rabbitmq/rabbitmq-management-agent#8.
Sorry, moved QA discussions to the pull request.
The problem in #89 is that it creates one ETS table per queue per stat (i.e. messages, messages_ready...), so it quickly reaches the system limit on the number of ETS tables. We are now validating the alternative of creating one ETS table per event type (i.e. queue_stats, vhost_stats, etc.). The current implementation is functionally tested, but the performance is a bit behind, as the new ETS tables are very large. We are investigating how to aggregate the data within the ETS tables.
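For illustration, a hypothetical keying scheme along those lines (table and key names are made up, not the plugin's actual schema): one table per event type, with rows keyed by {QueueId, StatName}, so the table count stays constant no matter how many queues exist. Pasted into an erl shell:

```erlang
%% One table for all per-queue stats; the compound key replaces
%% the table-per-queue-per-stat layout described above.
Tab = ets:new(queue_stats, [set, public]),
ets:insert(Tab, {{<<"vhost1/orders">>, messages_ready}, 42}),
ets:insert(Tab, {{<<"vhost1/orders">>, messages}, 100}),
ets:lookup(Tab, {<<"vhost1/orders">>, messages_ready}).
```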
Issues in #89 are solved by aggregating the ETS tables as explained in #101. It also depends on rabbitmq/rabbitmq-management-agent#10.
Hi @michaelklishin, we have a new version in #101. Tested with 56484 queues on a MacBook.
Branch #41
Branch stable
Additional testing by @Gsantomaggio: https://gist.github.com/Gsantomaggio/0b32a0eb9a08e2316051
@dcorbacho things look better in terms of query efficiency, and with some modifications to account for the higher concurrency of the collectors, rabbithole tests pass most of the time. However, sometimes there are failures, typically in channel listing (I may be wrong about this), and there's a gen_server crash logged:
Yes, so it seems to affect all ETS table GC processes:
(for the record, not much new info there)
I can confirm that whenever there are no crashes during a rabbithole test suite run, all tests pass (I'm about to push my test suite changes). Here's how I run the suite multiple times in a row:
See rabbitmq/rabbitmq-management#41. With the parallel collector there, there's a natural race condition between certain events recorded by the management DB and queries. While sleeps aren't great, developing an awaiting version of every read function in the client is overkill. While at it, don't assert on more volatile node metrics.
The previous crash is solved by using ordered_set for the key indexes; thus, if an entry has been deleted while in the loop, the GC can still get the next element (ets:next/2 always succeeds on ordered sets). Fixed in f9fd8b2. While testing options for this fix, I tried ets:select/3 and ets:select/1, which are guaranteed to always succeed. However, ets:select/1 would occasionally fail with a badarg while running the rabbit_hole suite. This happens in Erlang 17.5 but not 18.x. After checking with a member of the OTP team, such a fix doesn't seem to be in the release notes. They'll check it and may create a regression test from our call sequence. @michaelklishin, we may have found another OTP bug ¯\_(ツ)_/¯
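A tiny standalone sketch of why ordered_set helps here (module name and flow are illustrative, not the actual GC code): on an ordered_set, ets:next/2 still returns the following key even when the key we hold has just been deleted, whereas on a plain set it can fail with badarg unless the table is fixed with safe_fixtable.

```erlang
%% Sweep an ordered_set index, deleting the entry we stand on before asking
%% for the next key; ets:next/2 keeps working because the table is ordered.
-module(gc_sweep_sketch).
-export([demo/0]).

demo() ->
    Tab = ets:new(index, [ordered_set, public]),
    [ets:insert(Tab, {N, value}) || N <- lists:seq(1, 5)],
    sweep(Tab, ets:first(Tab)).

sweep(_Tab, '$end_of_table') ->
    done;
sweep(Tab, Key) ->
    ets:delete(Tab, Key),          %% simulate the entry disappearing mid-loop
    case ets:next(Tab, Key) of     %% safe on ordered_set even though Key is gone
        '$end_of_table' -> done;
        Next            -> sweep(Tab, Next)
    end.
```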
Any timeframe for a release with this fix?
@jippi no promises. Several milestones have been released.
Okay, is it intended that the 3.6.2 M5 release got all the files named
@jippi err, I meant to say that
@michaelklishin got it! I've upgraded my cluster to M5 and will continue to monitor it for quirks :) Thanks for the help and speedy replies - much appreciated.
@jippi Please report back and let me know how it goes. I suspect this issue was somehow introduced in 3.6.x - as we ran into it immediately with no other changes besides the upgrade, as you did - but @michaelklishin does not seem to share my opinion. There has been a lot of work in that area.
@noahhaon it has run for ~15h now without issues. It's been through the busiest time of day (morning) without breaking a sweat, so I would say it totally fixed the issue for me running M5 :) I saw the issue across all 3.6.x releases, and no problems at all under 3.5.7 (or below). cc @michaelklishin TL;DR: the issue seems to have been resolved.
Thank you for the update, Christian!
@michaelklishin one thing I just noticed - 100k messages got (successfully) consumed, but don't show up as any kind of
Please post questions to rabbitmq-users or Stack Overflow. RabbitMQ uses GitHub issues for specific actionable items engineers can work on, not questions. Thank you. Messages could expire due to TTL, and so on. Unless this can be reproduced, I doubt it is a bug.
3.6.2 RC1 is out. Let's move all support questions there; this thread is already long and discusses all kinds of things.
This issue is worth an update. We've fixed a couple of leaks. So we will be looking into a completely new plugin.
We have followed both of the workarounds on version 3.5.7 but did not see a difference. We set rates_mode=none and the stats collection interval to 60,000 with no improvement. Currently we have disabled the management plugin to work around the issue. Is there any other way to work around it?
I have the same issue on version 3.6.5 but have not found the right solution yet :(
FWIW, I have a crontab terminating stats every 24hrs. The node assigned to stats has been up for 21d now instead of the usual 5-7d.
The command is documented. Let's not turn issues into yet another support channel.
Same problem here (3.6.5, Erlang 18.3.4): stats slowly grow until memory is exhausted. I tried lowering the stats_event_max_backlog and the collect_statistics_interval. For now I will use @nickjones's solution.
Is there a permanent fix for this anywhere, apart from @nickjones's workaround? We have been facing this issue for a while.
@beasurajitroy this is not a support venue. Please direct questions to rabbitmq-users. #236 is a more fundamental solution and it will be available in 3.6.7. I don't know if it's "permanent enough", as there are always ways to improve stats storage and aggregation, but it avoids the problem of a single node taking all the stats-related load.
@nickjones May I ask which process exactly you reset every 24hrs in your crontab?
@chiarishow May I ask which process you terminate when following @nickjones's workaround? Thank you.
@michaelklishin Is it not possible to share the command here?
With a high number of stats-emitting entities in a cluster (primarily connections, channels, and queues), the statistics collector process gets overwhelmed, begins to use a higher-than-usual amount of RAM, and (intentionally) drops stats. This has no correlation with message rates and is only triggered by a large number of stats-emitting entities.
There are two workarounds (see the config sketch below):
- Disable rate calculations (rates_mode = none)
- Increase the stats collection interval (collect_statistics_interval)
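For reference, a hedged sketch of what those two settings might look like in the classic rabbitmq.config format; the 60000 ms interval is just an example value, not a recommendation.

```erlang
%% Example only: disable rate calculations in the management plugin and make
%% stats emission less frequent (interval is in milliseconds).
[
  {rabbit, [
    {collect_statistics_interval, 60000}
  ]},
  {rabbitmq_management, [
    {rates_mode, none}
  ]}
].
```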
We should look into parallelising it, or at least the parts that are not order-dependent.
A longer term plan is to make the stats DB distributed, store and aggregate results from multiple nodes. This is out of scope of this issue.
UPD: this has a couple of known issues leading to high[-er] RAM usage in 3.6.2. Even though they are fixed in later releases, the single stats node architecture can only go so far, so we've started working on #236.