Reduce default concurrency #3892
I'd be curious to see the hard data. I don't think it would hurt to lower it, and my guess is that anyone who needs more than 15 will knowingly tune it to a much higher number in an environment where they have more processing power (i.e. not most Heroku dynos).
Usually using between 5 and 10 myself.
My experience with Rails (maybe similar to your workload, maybe not) is that you need about 6 threads per hyperthreaded CPU core to approximately fully saturate the CPU. Background threads might trend fewer if CPU-heavy, or potentially even more threads if very I/O-heavy. But 5 or so is probably about right for many/most Ruby tasks. So 25 would be enough to fully saturate a medium AWS instance. 15 would still run decently, but probably leave a bit of CPU idle - around 5%-10% with the workloads I run. Obviously depends on the task. If you're just calculating giant Mandelbrot sets or late digits of Pi, 5-6 threads would saturate it just fine :-)
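The saturation point being discussed depends heavily on how often the GIL is released. A minimal, non-Sidekiq sketch in plain Ruby (using `sleep` as a stand-in for blocking I/O) of why I/O-heavy jobs tolerate many threads:

```ruby
require "benchmark"

# CRuby's GIL serializes pure-Ruby execution, but it is released during
# blocking I/O. Here sleep stands in for an I/O-bound job (HTTP call,
# DB query), so ten 0.2s jobs overlap instead of running back to back.
io_time = Benchmark.realtime do
  threads = 10.times.map { Thread.new { sleep 0.2 } }
  threads.each(&:join)
end

# Wall time stays near 0.2s rather than 2.0s because the waits overlap.
puts format("10 overlapping I/O-style jobs took %.2fs", io_time)
```

With CPU-bound Ruby work in place of the `sleep`, the same ten threads would take roughly ten times the single-thread time, which is why the useful thread count tracks the I/O fraction of the workload.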
Don't think I've ever seen a need for >10. 4-6 workers per process, with 1 process per core, is far more common in my experience.
Heroku has plenty of processing power, thank you very much. The "performance" and "private" large dyno has 14GB of RAM and 8 dedicated vCPUs (8 hyperthreads backed by 4 real cores on top of a hypervisor). Though I do realize you said "most". FWIW I think 25 is pretty high. Puma's default is 16. Even then most people tune it down on the web. It would be helpful to get some kind of a standardized metric around when it is helpful to add extra Sidekiq workers on a box.
I can't tell here if you mean per process, or total. For total across multiple processes, 25 is probably great. Per process, yeah, 5 is reasonable, 10 is high and 25 is very high. Given the GIL, it's very hard to get CRuby to productively use more than 10-ish threads for real tasks -- and in cases where you can, it's because something like EventMachine or Node.js would have been a better choice than Ruby threads.
I could be talked down to 10. I think 5 is too low; most business apps are I/O-heavy, allowing pretty decent concurrency even with the GIL.
10-15 is also a reasonable insurance policy against pathological cases that really wish they were evented, but got written with threads anyway.
10 sounds about right to me, and even 15 would be better than 25.
Is there a nice way to use the number of processors to pick a concurrency value?
@zachmccormick number of processors is irrelevant for MRI (i.e. what the vast majority of the community uses AFAICT); each Sidekiq process can be handled by only 1 processor due to the GIL, since Sidekiq achieves concurrency via threads. You need to run multiple Sidekiq processes to achieve true parallel processing.
Ah I see - didn't realize that! Thanks!
Sidekiq doesn't fork or scale processes, only threads. You need to start multiple Sidekiqs yourself, using the tool/init of your choice. Sidekiq Enterprise has a multi-process option: https://github.com/mperham/sidekiq/wiki/Ent-Multi-Process
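As a concrete sketch of the "start multiple Sidekiqs yourself" approach (the process names and the use of foreman here are hypothetical, not something Sidekiq itself provides):

```
# Procfile (for foreman or a similar process manager) — hypothetical
# example running two Sidekiq processes so two cores can do Ruby work
# in parallel despite the GIL, each with a modest thread count.
worker1: bundle exec sidekiq -c 10
worker2: bundle exec sidekiq -c 10
```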
@amcaplan @zachmccormick The number of processors matters if you add processes. But as @mperham says, you'd have to do that yourself - Sidekiq won't do that automatically. Also, while the GIL means each Ruby thread blocks all others when running Ruby, there are some non-Ruby operations (e.g. network or disk I/O, database, some parts of garbage collection, many things done with native extensions) which can happen on a background thread and don't block your Ruby process. Those things can happen in parallel if you have more than one processor, but not if you don't. That's a lot of what I was talking about above with "saturating" a processor with 6+ threads per process - that makes sure that even when most of your Ruby code is blocked, something is running and making forward progress.
Hehe, I gave a talk about this once (https://speakerdeck.com/amcaplan/threads-and-processes-lightning-talk-given-at-rails-israel-2015), maybe the slides will be useful to future issue watchers... Obviously it's a bit oversimplified, but pretty good as a round estimate. Most Rails jobs I've seen hover around that figure. Also worth noting the first 10 minutes of this 2015 talk by @schneems (a personal favorite - first time we were in the same room!) where playing with the setting led them to change concurrency from 30 to 4.
Yup! That's a great summary. But since the CPU percentage for a given task can vary, there's a bit of an asymptote as far as how many threads are necessary to saturate...
It's obvious that glibc's memory bloat, as discussed on my blog, gets worse as concurrency increases. I think reducing concurrency from 25 to 10 will reduce memory usage AND bloat, giving us a double win in memory. Pro tip: you can get the old behavior by adding `-c 25` to your command line.
The most significant change in this version is that the default concurrency has been lowered from 25 to 10 (sidekiq/sidekiq#3892). This doesn't affect omnibus-gitlab because the concurrency is controlled via a setting that defaults to 25 anyway and is passed in via the `-c` command-line parameter. However, source installations (including the GDK) will have to either specify the concurrency in `sidekiq.yml` or use the `-c` option. Full list of changes: https://github.com/mperham/sidekiq/blob/master/Changes.md
By default Sidekiq 3.x spawns 25 threads per worker. This causes sporadic job failures on Heroku in staging because of the 20-connection limit we currently have there. Why is it good to lower concurrency _anyway_? Because higher concurrency doesn't mean higher throughput. This has been discussed in an issue by Mike Perham here: sidekiq/sidekiq#3892, and as a result the most recent versions of Sidekiq have the default concurrency set to 10.
Today Sidekiq uses a default concurrency of 25. This means Sidekiq will spawn 25 worker threads and execute up to 25 jobs concurrently in a process.
glibc has a major memory fragmentation issue which gets worse with more threads, causing many people to move to jemalloc.
I also happen to think, with time and experience but no hard data, that 25 is pretty aggressive and most apps can peg a CPU with far fewer threads. Developers testing locally on macOS rarely need such large concurrency.
I'd suggest we reduce the default concurrency from 25 to 15 in Sidekiq 5.2.0. This will save memory and reduce fragmentation and bloat on Linux. Anyone who wants to retain the old value can add
-c 25
to their command line. WDYT?
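For completeness, the old value can also be pinned in the YAML config instead of the command line; a minimal sketch, assuming the standard `config/sidekiq.yml` location:

```
# config/sidekiq.yml — pin concurrency explicitly so a change in the
# shipped default (25 -> 10) doesn't silently alter throughput.
:concurrency: 25
```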