How to manually set the number of router threads #1042
On Sat, 30 Dec 2023 06:46:06 -0800, Paweł B wrote:
Hello,
I am running an application using Nginx Unit in a Kubernetes environment.
The `unit: router` process automatically starts with a number of threads equal to the number of CPU cores on the physical server.
### src/nxt_router.c
```
    if (rtcf->threads == 0) {
        rtcf->threads = nxt_ncpu;
    }
```
in the code I also found
### src/nxt_router.c
```
static nxt_conf_map_t  nxt_router_conf[] = {
    {
        nxt_string("listeners_threads"),
        NXT_CONF_MAP_INT32,
        offsetof(nxt_router_conf_t, threads),
    },
};
```
but I was not able to set this from the configuration file.
Is it possible to set this value?
Looks like it isn't currently hooked up to the configuration system...
There needs to be a way to configure the number of threads to use, in particular because Unit uses spinlocks that rely on sched_yield with no sleeping, which perform extremely badly when overcommitted. Running Unit on a host with 64 cores (including hyperthreads) under a CPU limit of 2 will, under load, cause the system to spend 90% or more of its time rescheduling threads that are waiting for the lock, while the thread that actually holds the lock has only a very small chance of running and releasing it. In a sample profile of an application under this load, pretty much all of the CPU usage ends up in the userspace part of pthread_yield and in the kernel scheduler.

In general, using spinlocks from userspace is very dangerous: userspace cannot portably determine whether the lock holder is running and go to sleep if it is not, so a preempted lock holder can be prevented from releasing the lock by the threads spinning on it, and you also cannot give the lock holder priority in the rescheduling triggered by sched_yield. The discrepancy between the CPU available to the process group and the number of threads that can contend for the locks, which can exist on Kubernetes/cfs_quota-using systems, amplifies these problems massively, and having no option to reduce the thread count is a large problem. An environment variable that allows configuring the thread count would be a good solution.
I hear you regarding the use of spinlocks. We had an internal discussion a while back about the use of sched_yield(2), which it seems should, at the very least, only be used in conjunction with real-time scheduling. It would be interesting to see what effect this patch has, whether it's good, bad or indifferent (yes, it may likely fubar things)...

```
diff --git ./src/nxt_spinlock.c ./src/nxt_spinlock.c
index 940be724..78636fc2 100644
--- ./src/nxt_spinlock.c
+++ ./src/nxt_spinlock.c
@@ -82,8 +82,6 @@ nxt_thread_spin_lock(nxt_thread_spinlock_t *lock)
                 goto again;
             }
         }
-
-        nxt_thread_yield();
     }
 }
```
Then there is this code (in src/nxt_spinlock.c):

```
/* It should be adjusted with the "spinlock_count" directive. */
static nxt_uint_t  nxt_spinlock_count = 1000;


void
nxt_thread_spin_init(nxt_uint_t ncpu, nxt_uint_t count)
{
    switch (ncpu) {

    case 0:
        /* Explicit spinlock count. */
        nxt_spinlock_count = count;
        break;

    case 1:
        /* Spinning is useless on UP. */
        nxt_spinlock_count = 0;
        break;

    default:
        /*
         * SMP.
         *
         * TODO: The count should be 10 on a virtualized system
         * since virtualized CPUs may share the same physical CPU.
         */
        nxt_spinlock_count = 1000;
        break;
    }
}
```

(I'm not sure in what cases ncpu would be 0.) This function is called once from Unit with the arguments (nr_cpus, 0). It may be worth trying to lower nxt_spinlock_count (I guess).
Decreasing nxt_spinlock_count would likely have a detrimental effect, as it would cause sched_yield to be called more often, which triggers more kernel rescheduling; that is very costly, especially on larger systems. The yielding approach is somewhat outdated: the idea is that the spinning thread gives up its timeslice and then, hopefully, the thread holding the lock runs and releases it. What spinlocks really need to do is go to sleep when the holder is not currently running, so that the spinning waiters don't prevent the lock holder from running and freeing the lock, but that is not trivial to do from userspace. A solution for Linux using restartable sequences was proposed not long ago, and I think it was even merged (see https://lwn.net/Articles/944895/), but it would require a very recent kernel and is of course not portable.
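To sketch the "spin briefly, then sleep in the kernel" alternative to spinning and yielding indefinitely, here is a minimal futex-based lock for Linux. This is illustrative only, not Unit code, and it is weaker than the ideal described above: it sleeps after a bounded spin rather than detecting whether the holder is actually running (which is what the restartable-sequences work would enable). The names and the spin count are made up for the example.

```
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = unlocked, 1 = locked, 2 = locked with (possible) waiters. */
typedef _Atomic uint32_t  demo_lock_t;

static long
demo_futex(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void
demo_lock(demo_lock_t *lock)
{
    uint32_t  c;

    /* Bounded spin: cheap when the holder releases quickly. */
    for (int i = 0; i < 100; i++) {
        c = 0;
        if (atomic_compare_exchange_strong(lock, &c, 1)) {
            return;
        }
    }

    /* Otherwise sleep in the kernel until the unlocker wakes us. */
    c = atomic_exchange(lock, 2);

    while (c != 0) {
        demo_futex((uint32_t *) lock, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(lock, 2);
    }
}

static void
demo_unlock(demo_lock_t *lock)
{
    if (atomic_exchange(lock, 0) == 2) {
        demo_futex((uint32_t *) lock, FUTEX_WAKE_PRIVATE, 1);
    }
}
```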
I have been testing our spinlock implementation in an (albeit simple) test program where a number of threads try to increment a counter. By decreasing ...
Just to clarify:

- What is meant by 'cpu limit of 2'?
- How many router threads were there?
- Is this bare metal or a VM?
- Was this with an application or just static file serving?
- Is the above profile for a particular lock or an amalgamation of all the spin-locks?
It's a containerized workload: the host has 64-256 cores (including hyperthreads) and the application has access to two of them (via cgroups CFS quotas). The profile is just a snapshot of what was running at the time; it could be any of the locks.
Ah, OK, so the system may have 64 'cpus', but you're limiting unit to only use two of those, and we still create a router thread for every one of them. On Linux at least (need to check other systems) we could use sched_getaffinity(2) instead to get the number of 'cpus' that we are allowed to run on.
Affinity is only one of the options; more commonly, CPU bandwidth is limited using cgroups instead of pinning to cores. There are two versions: cgroup v2 uses /sys/fs/cgroup/cpu.max, which contains the timeslice the process may run and the period that timeslice applies to, and the available CPU is the timeslice divided by the period (the result can be fractional). The older variant, cgroup v1, uses the two pseudo-files cfs_quota_us and cfs_period_us in the cgroup filesystem.
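To make that arithmetic concrete, here is a minimal sketch for the cgroup v2 case. It assumes the unified hierarchy is mounted at /sys/fs/cgroup and that the limit applies to the group whose cpu.max sits there; a real implementation would resolve the process's own cgroup from /proc/self/cgroup and also handle the v1 cfs_quota_us/cfs_period_us files.

```
#include <stdio.h>
#include <string.h>

/*
 * Sketch: cpu.max contains "<quota> <period>" (or "max <period>" when
 * unlimited), and the available CPU is quota / period, which may be
 * fractional.  Returns that ratio rounded up to whole CPUs, or -1 if
 * no limit could be determined.
 */
static long
cgroup_cpu_limit(void)
{
    long   quota, period;
    char   buf[64];
    FILE  *fp;

    fp = fopen("/sys/fs/cgroup/cpu.max", "r");
    if (fp == NULL) {
        return -1;                     /* not cgroup v2, or no access */
    }

    if (fgets(buf, sizeof(buf), fp) == NULL) {
        fclose(fp);
        return -1;
    }

    fclose(fp);

    if (strncmp(buf, "max", 3) == 0) {
        return -1;                     /* no bandwidth limit set */
    }

    if (sscanf(buf, "%ld %ld", &quota, &period) != 2 || period <= 0) {
        return -1;
    }

    /* Round the fractional quota/period ratio up to a whole CPU count. */
    return (quota + period - 1) / period;
}
```

The router thread count could then be capped at this value whenever it is lower than the detected CPU count.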
Yes, but trying to determine the number of threads to run based on things like CPU quota is a slightly different problem from just getting the number of threads to match the available number of CPUs. Maybe at some point it makes sense to make it configurable, but at the very least we should try to get it basically right by default...
It's the same problem: in particular, with your choice of lock and implementation you must never start more threads than can actually run, and how the application is limited does not matter. Making it configurable is preferable, though, as it allows users to override bad defaults when the environment isn't properly detected; it's also about a three-line patch if you use an environment variable.
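For illustration, the kind of override being suggested might look like the sketch below. The variable name NXT_ROUTER_THREADS is hypothetical; Unit does not currently read any such variable.

```
#include <stdlib.h>

/*
 * Hypothetical sketch of an environment-variable override for the
 * router thread count.  "NXT_ROUTER_THREADS" is an invented name;
 * ncpu_default stands for the detected CPU count (nxt_ncpu in the
 * code quoted at the top of the issue).
 */
static int
router_thread_count(int ncpu_default)
{
    int          n;
    const char  *env;

    env = getenv("NXT_ROUTER_THREADS");

    if (env != NULL) {
        n = atoi(env);

        if (n > 0) {
            return n;
        }
    }

    return ncpu_default;
}
```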
Yes, this is a whole bigger discussion.
I have a patch to make use of sched_getaffinity(2) on Linux. This will help if Unit is being run under cpuset(7)s or started via something like taskset(1). It will also help with things like docker where the --cpuset-cpus option is used. That's a simple, easy win.

Determining the number of threads to create based on CPU quotas is a slightly different problem that will take longer to develop and test etc. (if it's something we even wish to do...).
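Not the patch itself, but a rough sketch of the sched_getaffinity(2) approach just mentioned, falling back to sysconf() when the affinity call fails, might look like this (CPU_ALLOC is used so hosts with very large CPU counts are handled):

```
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/*
 * Sketch: prefer the size of the CPU affinity mask (the CPUs we are
 * actually allowed to run on) over the online CPU count.  Simplified;
 * not the actual Unit patch.
 */
static long
effective_ncpu(void)
{
    long        n, online;
    size_t      size;
    cpu_set_t  *set;

    online = sysconf(_SC_NPROCESSORS_ONLN);
    if (online < 1) {
        online = 1;
    }

    set = CPU_ALLOC(online);
    if (set == NULL) {
        return online;
    }

    size = CPU_ALLOC_SIZE(online);
    n = online;

    if (sched_getaffinity(0, size, set) == 0) {
        n = CPU_COUNT_S(size, set);
        if (n < 1) {
            n = online;
        }
    }

    CPU_FREE(set);

    return n;
}
```

Note that this only helps with cpusets/affinity (taskset, docker --cpuset-cpus); a pure CFS bandwidth limit leaves the affinity mask untouched, which is the quota point raised above.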
Hooking it up to the config system is relatively simple and we may do that, but we should first try to get things right by default. I'll bring the question of making it configurable up with the wider team...
At startup, the unit router process creates a number of threads; it tries to create the same number of threads (not including the main thread) as there are 'cpus' in the system. On Linux the number of available cpus is determined via a call to sysconf(_SC_NPROCESSORS_ONLN); in a lot of cases this produces the right result, i.e. on a four cpu system this will return 4. However, this can break down if unit has been restricted in the cpus it's allowed to run on via something like cpuset()'s and/or sched_setaffinity(2).

For example, on a four 'cpu' system, starting unit will create an extra 4 router threads

```
$ /opt/unit/sbin/unitd
$ ps -efL | grep router
andrew    234102  234099  234102  0    5 17:00 pts/10   00:00:00 unit: router
andrew    234102  234099  234103  0    5 17:00 pts/10   00:00:00 unit: router
andrew    234102  234099  234104  0    5 17:00 pts/10   00:00:00 unit: router
andrew    234102  234099  234105  0    5 17:00 pts/10   00:00:00 unit: router
andrew    234102  234099  234106  0    5 17:00 pts/10   00:00:00 unit: router
```

Say we want to limit unit to two cpus, i.e.

```
$ taskset -a -c 2-3 /opt/unit/sbin/unitd
$ ps -efL | grep router
andrew    235772  235769  235772  0    5 17:08 pts/10   00:00:00 unit: router
andrew    235772  235769  235773  0    5 17:08 pts/10   00:00:00 unit: router
andrew    235772  235769  235774  0    5 17:08 pts/10   00:00:00 unit: router
andrew    235772  235769  235775  0    5 17:08 pts/10   00:00:00 unit: router
andrew    235772  235769  235776  0    5 17:08 pts/10   00:00:00 unit: router
```

So despite limiting unit to two cpus

```
$ grep Cpus_allowed_list /proc/235772/status
Cpus_allowed_list:      2-3
```

it still created 4 threads. That's probably not such an issue in this case, but if we had a 64 'cpu' system and wanted to limit unit to two cpus, then we'd have 64 threads vying to run on two cpus, and with our spinlock implementation this can cause a lot of thread scheduling and congestion overhead. Besides, our intention is currently to create nr router threads == nr cpus.

To resolve this, on Linux at least, this patch makes use of sched_getaffinity(2) to determine what cpus unit is actually allowed to run on. We still use the result of sysconf(_SC_NPROCESSORS_ONLN); as a fallback, and we also use its result to allocate the required cpuset size (where sched_getaffinity() will store its result), as the standard cpu_set_t only has space to store 1023 cpus.

So with this patch, if we try to limit unit to two cpus we now get

```
$ taskset -a -c 2-3 /opt/unit/sbin/unitd
$ ps -efL | grep router
andrew    236887  236884  236887  0    3 17:20 pts/10   00:00:00 unit: router
andrew    236887  236884  236888  0    3 17:20 pts/10   00:00:00 unit: router
andrew    236887  236884  236889  0    3 17:20 pts/10   00:00:00 unit: router
```

This also applies to the likes of docker: if you run docker with the --cpuset-cpus="" option, unit will now create a number of router threads that matches the cpu count specified. Perhaps useful if you are running a number of unit docker instances on a high cpu count machine.

Link: <nginx#1042>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
Here's a patch to enable configuring the number of router threads...

```
diff --git ./src/nxt_conf_validation.c ./src/nxt_conf_validation.c
index 04091745..86199f84 100644
--- ./src/nxt_conf_validation.c
+++ ./src/nxt_conf_validation.c
@@ -240,6 +240,7 @@ static nxt_int_t nxt_conf_vldt_js_module_element(nxt_conf_validation_t *vldt,
 
 static nxt_conf_vldt_object_t  nxt_conf_vldt_setting_members[];
 static nxt_conf_vldt_object_t  nxt_conf_vldt_http_members[];
+static nxt_conf_vldt_object_t  nxt_conf_vldt_router_members[];
 static nxt_conf_vldt_object_t  nxt_conf_vldt_websocket_members[];
 static nxt_conf_vldt_object_t  nxt_conf_vldt_static_members[];
 static nxt_conf_vldt_object_t  nxt_conf_vldt_forwarded_members[];
@@ -309,6 +310,11 @@ static nxt_conf_vldt_object_t  nxt_conf_vldt_setting_members[] = {
         .type       = NXT_CONF_VLDT_OBJECT,
         .validator  = nxt_conf_vldt_object,
         .u.members  = nxt_conf_vldt_http_members,
+    }, {
+        .name       = nxt_string("router"),
+        .type       = NXT_CONF_VLDT_OBJECT,
+        .validator  = nxt_conf_vldt_object,
+        .u.members  = nxt_conf_vldt_router_members,
 #if (NXT_HAVE_NJS)
     }, {
         .name       = nxt_string("js_module"),
@@ -377,6 +383,16 @@ static nxt_conf_vldt_object_t  nxt_conf_vldt_http_members[] = {
 };
 
 
+static nxt_conf_vldt_object_t  nxt_conf_vldt_router_members[] = {
+    {
+        .name       = nxt_string("threads"),
+        .type       = NXT_CONF_VLDT_INTEGER,
+    },
+
+    NXT_CONF_VLDT_END
+};
+
+
 static nxt_conf_vldt_object_t  nxt_conf_vldt_websocket_members[] = {
     {
         .name       = nxt_string("read_timeout"),
diff --git ./src/nxt_router.c ./src/nxt_router.c
index 43209451..2676402b 100644
--- ./src/nxt_router.c
+++ ./src/nxt_router.c
@@ -1412,7 +1412,7 @@ nxt_router_conf_send(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
 
 static nxt_conf_map_t  nxt_router_conf[] = {
     {
-        nxt_string("listeners_threads"),
+        nxt_string("threads"),
         NXT_CONF_MAP_INT32,
         offsetof(nxt_router_conf_t, threads),
     },
@@ -1630,7 +1630,7 @@ nxt_router_conf_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
     nxt_conf_value_t            *js_module;
 #endif
     nxt_conf_value_t            *root, *conf, *http, *value, *websocket;
-    nxt_conf_value_t            *applications, *application;
+    nxt_conf_value_t            *applications, *application, *router_conf;
     nxt_conf_value_t            *listeners, *listener;
     nxt_socket_conf_t           *skcf;
     nxt_router_conf_t           *rtcf;
@@ -1640,6 +1640,7 @@ nxt_router_conf_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
     nxt_router_app_conf_t       apcf;
     nxt_router_listener_conf_t  lscf;
 
+    static const nxt_str_t  router_path = nxt_string("/settings/router");
    static const nxt_str_t  http_path = nxt_string("/settings/http");
    static const nxt_str_t  applications_path = nxt_string("/applications");
    static const nxt_str_t  listeners_path = nxt_string("/listeners");
@@ -1673,11 +1674,14 @@ nxt_router_conf_create(nxt_task_t *task, nxt_router_temp_conf_t *tmcf,
     rtcf = tmcf->router_conf;
     mp = rtcf->mem_pool;
 
-    ret = nxt_conf_map_object(mp, root, nxt_router_conf,
-                              nxt_nitems(nxt_router_conf), rtcf);
-    if (ret != NXT_OK) {
-        nxt_alert(task, "root map error");
-        return NXT_ERROR;
+    router_conf = nxt_conf_get_path(root, &router_path);
+    if (router_conf != NULL) {
+        ret = nxt_conf_map_object(mp, router_conf, nxt_router_conf,
+                                  nxt_nitems(nxt_router_conf), rtcf);
+        if (ret != NXT_OK) {
+            nxt_alert(task, "rooter_conf map error");
+            return NXT_ERROR;
+        }
     }
 
     if (rtcf->threads == 0) {
```

Configure like

```
{
    "listeners": {
    },
    "settings": {
        "router": {
            "threads": 2
        }
    },
}
```

Hopefully something like this can get in the next release of Unit...
Unit generally creates an extra number of router threads (to handle client connections, not including the main thread) to match the number of available CPUs. There are cases when this can go wrong, e.g. on a high CPU count machine where Unit is being effectively limited to a few CPUs via the cgroups cpu controller. So Unit may create a large number of router threads when they are only going to effectively run on a couple of CPUs or so. There may be other cases where you would like to tweak the number of router threads, depending on your workload.

As it turns out, it looks like this was intended to be made configurable but was just never hooked up to the config system. This adds a new '/settings/router/threads' config option which can be set like

```
{
    "listen": {
        ...
    },
    "settings": {
        "router": {
            "threads": 2
        }
    },
    ...
}
```

Before this patch (on a four cpu system)

```
$ ps -efL | grep router
andrew    419832  419829  419832  0    5 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  419833  0    5 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  419834  0    5 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  445145  0    5 03:31 pts/10   00:00:00 unit: router
andrew    419832  419829  445146  0    5 03:31 pts/10   00:00:00 unit: router
```

After, with a threads setting of 2

```
$ ps -efL | grep router
andrew    419832  419829  419832  0    3 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  419833  0    3 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  419834  0    3 Aug12 pts/10   00:00:00 unit: router
```

Closes: nginx#1042
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
An earlier revision of the same change used a different option name. Its commit message:

Unit generally creates an extra number of router threads (to handle client connections, not including the main thread) to match the number of available CPUs. There are cases when this can go wrong, e.g. on a high CPU count machine where Unit is being effectively limited to a few CPUs via the cgroups cpu controller. So Unit may create a large number of router threads when they are only going to effectively run on a couple of CPUs or so. There may be other cases where you would like to tweak the number of router threads, depending on your workload.

As it turns out, it looks like this was intended to be made configurable but was just never hooked up to the config system. This adds a new '/settings/listen_threads' config option which can be set like

```
{
    "listen": {
        ...
    },
    "settings": {
        "listen_threads": 2,
        ...
    },
    ...
}
```

Before this patch (on a four cpu system)

```
$ ps -efL | grep router
andrew    419832  419829  419832  0    5 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  419833  0    5 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  419834  0    5 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  445145  0    5 03:31 pts/10   00:00:00 unit: router
andrew    419832  419829  445146  0    5 03:31 pts/10   00:00:00 unit: router
```

After, with a threads setting of 2

```
$ ps -efL | grep router
andrew    419832  419829  419832  0    3 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  419833  0    3 Aug12 pts/10   00:00:00 unit: router
andrew    419832  419829  419834  0    3 Aug12 pts/10   00:00:00 unit: router
```

Closes: nginx#1042
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>