
Better available CPU count determination on Linux #1383

Merged: 2 commits into nginx:master on Aug 19, 2024
Conversation

@ac000 (Member) commented Aug 12, 2024

This pull request improves how we determine the number of CPUs available for Unit to run on under Linux.

In a recent discussion it became clear there are cases where we can end up creating far too many router threads (these are what handle client connections).

We currently try to create a number of router threads (not including the main one) to match the number of online CPUs. This is currently determined with

n = sysconf(_SC_NPROCESSORS_ONLN);

While that generally works, there are cases where it can give the wrong answer, for us anyway. I'll use the nproc(1) utility to demonstrate:

$ nproc
4
# 4 CPU system, 4 online CPUs
# Let's limit nproc to just 2 CPUs
$ taskset -c 0-1 nproc
2
# Let's ignore the CPU allowed mask
$ taskset -c 0-1 nproc --all
4

nproc with no arguments will (on Linux) use sched_getaffinity(2) to determine the number of CPUs that it's allowed to run on. Passing --all makes it use sysconf(3) as above.
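
As an illustration, here is a minimal standalone sketch of the sched_getaffinity(2) approach (this is not the patch itself; it uses a fixed-size cpu_set_t, which only covers up to 1024 CPUs):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;

    /* Query the calling process's CPU allowed mask. */
    if (sched_getaffinity(0, sizeof(set), &set) == 0) {
        /* Count the CPUs actually set in the mask. */
        printf("%d\n", CPU_COUNT(&set));
    }

    return 0;
}

Run under taskset -c 0-1, this should print 2, matching the nproc output above.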

That last example is what currently happens with Unit: even if it has been limited in the number of CPUs it's actually allowed to run on, it will still create a number of router threads matching the result of the sysconf(3) call.

This pull request thus comprises two patches to remedy the situation.

The first adds a new auto check for the sched_getaffinity(2) system call on Linux.

The second uses this to determine the number of router threads to create; on Linux we still use sysconf(3) as a fallback.

This could also help with containers that are limited in the number of CPUs they can use on high-CPU-count machines.

@ac000 requested a review from @hongzhidao on Aug 12, 2024 14:55
@ac000 (Member, Author) commented Aug 12, 2024

@hongzhidao Welcome back!

@hongzhidao (Contributor) commented

Hi @ac000,
This looks like a new feature, but it doesn't change the existing behavior.

$ taskset -c 0-1 nproc

It's possible to change the number of available CPUs at runtime, right?
If so, I feel we should take adding a runtime version of this feature into account, i.e. users could change it on the fly. It also seems related to the feature you're working on in #1385.

@ac000 (Member, Author) commented Aug 13, 2024

> Hi @ac000, This looks like a new feature, but it doesn't change the existing behavior.
>
> $ taskset -c 0-1 nproc
>
> It's possible to change the number of available CPUs at runtime, right? If so, I feel we should take adding a runtime version of this feature into account, i.e. users could change it on the fly. It also seems related to the feature you're working on in #1385.

Yes, you can change the process's CPU allowed mask at runtime.

However, doing automatic dynamic runtime reconfiguration is a much larger topic.

It may not be too much work to hook it up so it gets run at reconfigure time. Currently, when doing a reconfigure, nxt_router_conf_create() is called, which uses rtcf->threads or nxt_ncpu to create the desired number of threads. However, nxt_lib_start(), which is where we determine the number of available CPUs, is not called.

It's probably not too much effort to factor out the CPU detection code into its own function and have it run from nxt_router_conf_create() or something...

But none of that really changes this patch, which just makes the current situation better and would still be needed for doing the dynamic reconfiguration.
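
A hypothetical shape for that factored-out detection (the helper name is illustrative, not actual Unit code, and it assumes the usual Unit headers; unlike the patch, this simple version uses a fixed-size cpu_set_t):

/* Hypothetical: re-detect the allowed CPU count, callable at reconfigure time. */
static nxt_uint_t
nxt_ncpu_detect(void)
{
    long  n;

    n = sysconf(_SC_NPROCESSORS_ONLN);

#if (NXT_HAVE_LINUX_SCHED_GETAFFINITY)
    {
        cpu_set_t  set;

        /* Prefer the CPU allowed mask when we can read it. */
        if (sched_getaffinity(0, sizeof(set), &set) == 0) {
            n = CPU_COUNT(&set);
        }
    }
#endif

    return (n > 0) ? (nxt_uint_t) n : 1;
}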

@hongzhidao (Contributor) commented Aug 13, 2024

> But none of that really changes this patch which just makes the current situation better

Agree.

My thinking is more conservative. I'm considering the following points:

1. If in the future we implement this:

   > However, doing automatic dynamic runtime reconfiguration is a much larger topic.

   it would be more powerful and could do the same thing as the current change.

2. Supporting CPU affinity would make it sensible to extend the "threads" option in the "settings" object.

3. It's not yet clear to me what happens if the number of available CPUs is 0.

@ac000 (Member, Author) commented Aug 13, 2024

> > But none of that really changes this patch which just makes the current situation better
>
> Agree.
>
> My thinking is more conservative. I'm considering the following points:
>
> 1. If in the future we implement this:
>
>    > However, doing automatic dynamic runtime reconfiguration is a much larger topic.
>
>    it would be more powerful and could do the same thing as the current change.

We would still want this change though... so we do the right thing by default in more cases...

> 2. Supporting CPU affinity would make it sensible to extend the "threads" option in the "settings" object.

Possibly, or it might make sense as a different setting... However, the ability to specify CPU affinity within Unit is a separate, unrelated change from this.

This patch isn't really about CPU affinity per se...

The traditional way to determine the number of available CPUs was with _SC_NPROCESSORS_ONLN.

On Linux, sched_getaffinity(2) is a better solution as it does take the per-process (well, it's actually per-thread) CPU allowed masks into account.

There is also pthread_getaffinity_np(3), which does a similar thing and could perhaps be used on the BSDs to achieve the same.
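
A sketch of that idea in its glibc flavor (the helper is hypothetical; on FreeBSD, I believe the call lives in <pthread_np.h> and operates on cpuset_t, so a real port would need some #ifdef'ing):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: count the CPUs the calling thread may run on. */
static int
thread_ncpu(void)
{
    cpu_set_t  set;

    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0) {
        return -1;  /* unknown */
    }

    return CPU_COUNT(&set);
}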

Say you have a 128 CPU machine and you have a number of docker instances running Unit, each being limited to two CPUs via something like --cpuset-cpus="0,1".

What currently happens is that each instance of Unit will start 128 router threads; after this patch, each instance will start only 2.
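
For illustration (the image name/tag here is just a placeholder):

$ docker run -d --cpuset-cpus="0,1" unit:latest

Inside such a container, sched_getaffinity(2) reports only the two allowed CPUs, so with this patch the router creates two extra threads instead of 128.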

> 3. It's not yet clear to me what happens if the number of available CPUs is 0.

That would be an invalid setting:

$ taskset 0x0 bash
taskset: failed to set pid 518501's affinity: Invalid argument

@ac000 (Member, Author) commented Aug 13, 2024

Personally I think this and #1385 are separate, valid changes, each standing on its own merit.

1. Better available CPU count determination on Linux (#1383): it's good to try to do the right thing by default where possible.
2. router: Make the number of router threads configurable (#1385): however, that will not always be possible, so let people fine-tune things.

@ac000 (Member, Author) commented Aug 13, 2024

Put it this way: at startup we create nr router threads == nr runnable cpus; however, there are currently cases where this equation breaks down.

Currently we have the following situation:

128 CPU box, Unit has the whole thing

nr router threads == nr runnable cpus

              128 == 128

128 CPU box, Unit is constrained to just 2 CPUs

nr router threads == nr runnable cpus

              128 == 2

There's the problem... forget about setting CPU affinities or things changing at runtime (that problem is the same with or without this patch).

@hongzhidao (Contributor) commented

> Personally I think this and #1385 are separate, valid changes, each standing on its own merit.
>
> 1. Better available CPU count determination on Linux (#1383): it's good to try to do the right thing by default where possible.
> 2. router: Make the number of router threads configurable (#1385): however, that will not always be possible, so let people fine-tune things.

Makes sense.

(Review comment on src/nxt_lib.c: outdated, resolved.)
@ac000 (Member, Author) commented Aug 13, 2024

A couple of minor code-style tweaks:

$ git range-diff 4b7e5136...446db7ac
1:  4b7e5136 ! 1:  446db7ac lib: Better available cpu count determination on Linux
    @@ src/nxt_lib.c: nxt_lib_start(const char *app, char **argv, char ***envp)
     +#if (NXT_HAVE_LINUX_SCHED_GETAFFINITY)
     +    if (n > 0) {
     +        int        err;
    -+        size_t     setsize;
    ++        size_t     size;
     +        cpu_set_t  *set;
     +
     +        set = CPU_ALLOC(n);
     +        if (set == NULL) {
     +            return NXT_ERROR;
     +        }
    -+        setsize = CPU_ALLOC_SIZE(n);
     +
    -+        err = sched_getaffinity(0, setsize, set);
    ++        size = CPU_ALLOC_SIZE(n);
    ++
    ++        err = sched_getaffinity(0, size, set);
     +        if (err == 0) {
    -+            n = CPU_COUNT_S(setsize, set);
    ++            n = CPU_COUNT_S(size, set);
     +        }
     +
     +        CPU_FREE(set);

@callahad callahad added this to the 1.33 milestone Aug 19, 2024
This will help to better determine the number of router threads to
create in certain situations.

Unlike sysconf(_SC_NPROCESSORS_ONLN), this takes into account per-process
cpu allowed masks as set by sched_setaffinity(2), cpusets, etc.

So while a system may have 64 online cpus, Unit itself may be limited
to using just four of them, in which case we should create four extra
router threads, not sixty-four!

Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
At startup, the unit router process creates a number of threads; it
tries to create the same number of threads (not including the main
thread) as there are 'cpus' in the system.

On Linux the number of available cpus is determined via a call to

  sysconf(_SC_NPROCESSORS_ONLN);

In a lot of cases this produces the right result, i.e. on a four cpu
system this will return 4.

However, this can break down if unit has been restricted in the cpus
it's allowed to run on via something like cpusets and/or
sched_setaffinity(2).

For example, on a four 'cpu' system, starting unit will create an extra
4 router threads

  $ /opt/unit/sbin/unitd
  $ ps -efL | grep router
  andrew   234102 234099 234102  0    5 17:00 pts/10   00:00:00 unit: router
  andrew   234102 234099 234103  0    5 17:00 pts/10   00:00:00 unit: router
  andrew   234102 234099 234104  0    5 17:00 pts/10   00:00:00 unit: router
  andrew   234102 234099 234105  0    5 17:00 pts/10   00:00:00 unit: router
  andrew   234102 234099 234106  0    5 17:00 pts/10   00:00:00 unit: router

Say we want to limit unit to two cpus, i.e.

  $ taskset -a -c 2-3 /opt/unit/sbin/unitd
  $ ps -efL | grep router
  andrew   235772 235769 235772  0    5 17:08 pts/10   00:00:00 unit: router
  andrew   235772 235769 235773  0    5 17:08 pts/10   00:00:00 unit: router
  andrew   235772 235769 235774  0    5 17:08 pts/10   00:00:00 unit: router
  andrew   235772 235769 235775  0    5 17:08 pts/10   00:00:00 unit: router
  andrew   235772 235769 235776  0    5 17:08 pts/10   00:00:00 unit: router

So despite limiting unit to two cpus

  $ grep Cpus_allowed_list /proc/235772/status
  Cpus_allowed_list:	2-3

It still created 4 threads. Probably not such an issue in this case,
but if we had a 64 'cpu' system and wanted to limit unit to two cpus,
then we'd have 64 threads vying to run on two cpus, and with our
spinlock implementation this can cause a lot of thread scheduling and
contention overhead.

Besides, our intention is currently to create nr router threads == nr
cpus.

To resolve this, on Linux at least, this patch makes use of
sched_getaffinity(2) to determine what cpus unit is actually allowed to
run on.

We still use the result of

  sysconf(_SC_NPROCESSORS_ONLN);

as a fallback; we also use its result to allocate the required cpuset
size (where sched_getaffinity() will store its result), as the standard
cpu_set_t only has space to store 1024 cpus.

So with this patch if we try to limit unit to two cpus we now get

  $ taskset -a -c 2-3 /opt/unit/sbin/unitd
  $ ps -efL | grep router
  andrew   236887 236884 236887  0    3 17:20 pts/10   00:00:00 unit: router
  andrew   236887 236884 236888  0    3 17:20 pts/10   00:00:00 unit: router
  andrew   236887 236884 236889  0    3 17:20 pts/10   00:00:00 unit: router

This also applies to the likes of docker: if you run docker with the
--cpuset-cpus="" option, unit will now create a number of router threads
that matches the cpu count specified.

Perhaps useful if you are running a number of unit docker instances on a
high cpu count machine.

Link: <nginx#1042>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
@ac000 (Member, Author) commented Aug 19, 2024

Rebased against master.

@ac000 merged commit 2444d45 into nginx:master on Aug 19, 2024
24 checks passed
@ac000 deleted the ncpu branch on August 19, 2024 22:31