Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow setting the priority of spawned subprocesses #2216

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

clemenswasser
Copy link

This allows setting the priority/niceness of all spawned subprocesses via a new -p PRIORITY parameter.
The flag accepts POSIX style integers (typical range is highest: -20 to lowest: 19), on Windows they get translated into PriorityClasses.

Fixes: #2176

@clemenswasser
Copy link
Author

clemenswasser commented Nov 20, 2022

When not specifying a priority, now no default priority is set. With the previous commit, a ninja process with a lower than default priority would try to spawn processes with the default priority of 0.

@giraldeau
Copy link

I just tested this branch on Linux and I can say that this patch does the work and looks good.

Setting low priority to build processes is great and doing it from ninja makes it easier and portable compared to using some launcher, like nice on Linux.

@@ -123,6 +127,12 @@ bool Subprocess::Start(SubprocessSet* set, const string& command) {
if (err != 0)
Fatal("posix_spawn: %s", strerror(err));

if(priority != INT_MIN) {
err = setpriority(PRIO_PROCESS, pid_, priority);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a race condition where the spawned process may have already completed before this code is run. The call would return an error and this would crash Ninja unexpectedly.

Can you try to set the priority before the posix_spawn() call with posix_spawnattr_setschedparam() instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow, using posix_spawnattr_setschedparam doesn't seem to work. When checking in htop I don't see any change in niceness of the spawned processes.

src/subprocess.h Outdated
@@ -52,7 +52,7 @@ struct Subprocess {

private:
Subprocess(bool use_console);
bool Start(struct SubprocessSet* set, const std::string& command);
bool Start(struct SubprocessSet* set, const std::string& command, int priority);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document the meaning of priority here (including range of valid values and their interpretation, in particular for Posix and Windows).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the documentation I added sufficient?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you.

src/ninja.cc Outdated
" -k N keep going until N jobs fail (0 means infinity) [default=1]\n"
" -l N do not start new jobs if the load average is greater than N\n"
" -n dry run (don't run commands but act like they succeeded)\n"
" -p PRIORITY set the priority/niceness of spawned processes\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document possible values and their meaning + perform range checks in the parsing code below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this highly dependent on your scheduler implementation? On Linux this varies based on the scheduling policy -- in Android builds we run ninja under SCHED_BATCH, which doesn't allow changing this.

It looks like the range checks would need to use sched_get_priority_min and sched_get_priority_max to check the current policy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is so platform and scheduler dependent, should I still do the range checks?

@digit-google
Copy link
Contributor

Thank you, this looks very promising :-)

err = posix_spawnattr_setschedparam(&attr, &sp);
if(err != 0)
Fatal("setpriority: %s", strerror(errno));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct, but I believe this will be ignored if you do not add POSIX_SPAWN_SETSCHEDPARAM to the flags value below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the latest commit, I set the POSIX_SPAWN_SETSCHEDPARAM flag, but when setting any priority other than 0 I get fatal: posix_spawn: Invalid argument

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Maybe initializing the value by calling pthread_getschedparam() instead of zero-ing the struct will solve it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, this doesn't fix it 😢 I've now also looked into the man pages for some of these functions but didn't find anything obviously wrong with my implementation...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see this latest commit you are referencing. Can you upload a new version of this commit that sets the flag properly and initialized the params structure with pthread_getschedparam() so we can have a look at it?

Otherwise, the GLibc sources for posix_spawn() do not seem to return EINVAL explicitly, which suggest that one system call is returning this value.

This maybe the call to __sched_setparam() in that source file, which according to the manpage should return EINVAL only if "the argument param does not make sense for the current scheduling policy" !?

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/spawni.c;h=e8ed2babb9b1396903466f631da0571504896aa5;hb=143ef68b2aded7c794956beddad495af8c7d3251

[2] https://man7.org/linux/man-pages/man2/sched_setparam.2.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, the manpage for sched_get_priority_min() says:

Linux allows the static priority value range 1 to 99 for SCHED_FIFO and SCHED_RR and the priority 0 for SCHED_OTHER and SCHED_BATCH. Scheduling priority ranges for the various policies are not alterable.

So it looks like 0 is the only valid value for SCHED_OTHER or SCHED_BATCH. I think the solution is simply to restrict the priority passed to the spawn call to the valid range for the current policy in the code. I.e. something like:

    int policy = -1;
    err = pthread_getschedparam(pthread_self(), &policy, &sp);
    if (err != 0)
      Fatal("pthread_getschedparam: %s", strerror(err));

   int min_priority = sched_get_priority_min(policy);
   int max_priority = sched_get_priority_max(policy);
   if (priority < min_priority)
       priority = min_priority;
   else if (priority > max_priority)
       priority = max_priority;

With this, all tests pass, though I assume the priority value is essentially ignored as it is always clamped to 0 :-/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right __pthread_setcancelstate returns EINVAL, but isn't the actual cause for the error of posix_spawn. This seems to happen in the clone3 syscall.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if we can't set the priority to something other than 0, why can I set the priority to other values with the nice tool (GNU coreutils)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in conclusion, there is a drastic difference between setpriority() and sched_setparam(), and I'm not sure to know why. This Linux source file seems to indicate that the kernel scheduler differentiates between the two:

https://github.com/torvalds/linux/blob/dccb07f2914cdab2ac3a5b6c98406f765acab803/kernel/sched/core.c#L7935

So I guess we should forget about spawn_attr_set_schedparam(), sorry for giving you this idea. I recommend you use getpriority() / setpriority() before the spawn() to save then change the niceness of the current (Ninja process) before the fork (so it is inherited by the child one), then call setpriority() just after that to restore the original niceness. And add one big comment to explain the reason!

What do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented that, but somehow restoring the previous priority after the posix_spawn results in a "Permission denied" error 😢

@clemenswasser clemenswasser force-pushed the priority branch 2 times, most recently from b25f446 to 98594f5 Compare May 7, 2024 13:03
@giraldeau
Copy link

pthread_setschedparam() is not intended for normal process, but for real-time and other secialized scheduling algorithm. To change niceness for normal process, we should use setpriority() system call.

@clemenswasser
Copy link
Author

Ok, I've now reverted to using setpriority but ignore ESRCH as it could be returned when the process exited before the priority could be set and therefore avoids the previously mentioned race window.

@clemenswasser clemenswasser force-pushed the priority branch 2 times, most recently from 51c2682 to 833033f Compare May 7, 2024 15:34
@digit-google
Copy link
Contributor

Thank you, thank you for your patience. The latest commit (833033f) looks good to me. @jhasse, wdyt?

@jhasse jhasse added this to the 1.13.0 milestone May 7, 2024
@jhasse
Copy link
Collaborator

jhasse commented May 7, 2024

LGTM, I'll have a close look in a few months, but in general this is a good feature for 1.13.0.

For the CI failure the package that provides ps would need to be installed in the Fedora Docker.

@giraldeau
Copy link

giraldeau commented May 10, 2024

I did test the feature on Linux. For me the main goal is to launch build jobs in the background, such that my desktop apps are responsive, and it works fine. I don't see a use case for increasing the priority, but I tried it nontheless (requires admin privilege) and it works. I was worried that maybe the minus sign for the negative number could confuse the argument parser, but it's not the case and everything is fine.

Thanks!

# launch ninja with priority 99 and -99
# properly parsed and clamped
ps -eo ni,pid,cmd
 0  461034 ninja -p 99
19  461035 [sh] <defunct>
19  461036 [sh] <defunct>
19  461038 [sh] <defunct>
19  461040 [sh] <defunct>

  0  461466 ninja -p -99
-20  461467 [sh] <defunct>
-20  461470 [sh] <defunct>
-20  461471 [sh] <defunct>
ninja/build$ ninja test 
[0/1] Running tests...
Test project /home/uqam/git/ninja/build
    Start 1: NinjaTest
1/1 Test #1: NinjaTest ........................   Passed    1.74 sec

@giraldeau
Copy link

Tests are failing on Fedora because of missing utility /bin/ps.

[ RUN      ] SubprocessTest.Priority
/__w/ninja/ninja/src/subprocess_test.cc:172: Failure
Expected equality of these values:
  " NI\n 10\n"
  subproc->GetOutput()
    Which is: "/bin/sh: line 1: ps: command not found\n"
With diff:
@@ -1,2 +1,1 @@
- NI
- 10\n
+/bin/sh: line 1: ps: command not found\n
[  FAILED  ] SubprocessTest.Priority (1 ms)

We need to install the package procps-ng in the build image, or maybe change the test such that it works without using ps.

@jhasse
Copy link
Collaborator

jhasse commented Aug 6, 2024

add the package here:

run: dnf install -y ninja-build cmake gtest-devel re2c clang util-linux

@clemenswasser
Copy link
Author

I've now opted to use the /proc fs directly. procps-ng uses the same anyway, and with that we don't require a new build time dependency.

@mcprat
Copy link
Contributor

mcprat commented Aug 22, 2024

I'd like to give posix_spawnattr_setschedparam() another try...

@giraldeau
Copy link

I'd like to give posix_spawnattr_setschedparam() another try...

pthread_setschedparam() is not intended for normal process, but for real-time and other secialized scheduling algorithm. To change niceness for normal process, we should use setpriority() system call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow setting the priority of spawned processes
6 participants