Implement jobserver pool in Ninja.
This allows Ninja to implement a jobserver-style pool of job slots,
to better coordinate parallel jobs between spawned processes that
compete for CPU cores/threads. With this feature, Ninja no longer
needs to be invoked from GNU Make or from a script like
misc/jobserver_pool.py.

NOTE: This implementation is basic and doesn't support broken
      protocol clients that release more tokens than they acquired.
      If your build includes these, expect severe build performance
      degradation.

To enable this, use --jobserver or --jobserver-mode=MODE on the
command line, where MODE is one of the following values:

  0     Do not enable the feature (the default).
  1     Enable the feature, using the best mode for the current system.
  pipe  Implement the pool with an anonymous pipe (Posix only).
  fifo  Implement the pool with a FIFO file (Posix only).
  sem   Implement the pool with a Win32 semaphore (Windows only).

NOTE: The `fifo` mode is only implemented by GNU Make 4.4 and later,
      so many older clients may not support it.

Alternatively, set the NINJA_JOBSERVER environment variable to
one of these values to activate it without a command-line option.

Note that if MAKEFLAGS is set in the environment, Ninja assumes
that it is already running in the context of another jobserver
and will not try to create its own pool.
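
To illustrate the `pipe` mode described above, here is a minimal Python sketch of a token pool backed by an anonymous pipe. It is not Ninja's `Jobserver::Pool` implementation: the class name `PipeTokenPool` and its methods are hypothetical, and the sketch assumes the GNU Make convention that the pool owner keeps one implicit job slot, so only N-1 tokens are written to the pipe. It is Posix only and assumes a small job count that fits in the pipe buffer.

```python
import os


class PipeTokenPool:
    """Toy jobserver-style pool backed by an anonymous pipe (Posix only).

    Hypothetical illustration; not the code added by this commit.
    """

    def __init__(self, job_count: int) -> None:
        if job_count < 1:
            raise ValueError("job_count must be >= 1")
        self.read_fd, self.write_fd = os.pipe()
        # Assumption (GNU Make convention): the owner keeps one implicit
        # slot, so only job_count - 1 tokens go into the pipe.
        os.write(self.write_fd, b"+" * (job_count - 1))

    def acquire(self) -> bytes:
        # Blocks until a token byte is available in the pipe.
        return os.read(self.read_fd, 1)

    def release(self, token: bytes) -> None:
        # A well-behaved client writes back exactly the byte it read.
        os.write(self.write_fd, token)

    def auth_flag(self) -> str:
        # GNU Make's documented pipe form is --jobserver-auth=R,W.
        return f"--jobserver-auth={self.read_fd},{self.write_fd}"

    def close(self) -> None:
        os.close(self.read_fd)
        os.close(self.write_fd)
```

A client that releases more tokens than it acquired would inflate such a pool, which is the broken behavior the NOTE above warns about.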
digit-google committed Oct 18, 2024
1 parent 334282a commit 146c55d
Showing 5 changed files with 230 additions and 41 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/linux.yml
@@ -27,6 +27,7 @@ jobs:
run: |
./ninja_test
../../misc/output_test.py
../../misc/jobserver_test.py
- name: Build release ninja
run: ninja -f build-Release.ninja
working-directory: build
@@ -35,6 +36,7 @@ jobs:
run: |
./ninja_test
../../misc/output_test.py
../../misc/jobserver_test.py
build:
runs-on: [ubuntu-latest]
@@ -170,6 +172,7 @@ jobs:
./ninja all
python3 misc/ninja_syntax_test.py
./misc/output_test.py
./misc/jobserver_test.py
build-aarch64:
name: Build Linux ARM64
47 changes: 32 additions & 15 deletions doc/manual.asciidoc
@@ -190,31 +190,49 @@ you don't need to pass `-j`.)
GNU Jobserver support
~~~~~~~~~~~~~~~~~~~~~
Since version 1.13., Ninja builds can follow the
Since version 1.13., Ninja builds support the
https://www.gnu.org/software/make/manual/html_node/Job-Slots.html[GNU Make jobserver]
client protocol (on Posix systems). This is useful when Ninja
is invoked as part of a larger build system controlled by a top-level
GNU Make instance, as it allows better coordination between
concurrent build tasks.
protocol (on Posix systems). It supports both client and
server modes.
This feature is automatically enabled under the following
conditions:
Client mode is useful when Ninja is invoked as part of a larger
build system controlled by a top-level GNU Make instance, as it
allows better coordination between concurrent build tasks.
Server mode is useful when Ninja is the top-level build tool that
invokes sub-builds recursively in a similar setup.
To enable server mode, use `--jobserver` or `--jobserver-mode=MODE`
on the command line, or set `NINJA_JOBSERVER=MODE` in your
environment, where `MODE` can be one of the following values:
`0`: Do not enable the feature (the default)
`1`: Enable the feature, using the best mode for the current system.
`pipe`: Enable the feature, implemented with an anonymous pipe (Posix only).
`fifo`: Enable the feature, implemented with a FIFO file path (Posix only).
`sem`: Enable the feature, implemented with a Win32 semaphore (Windows only).
Note that `--jobserver` is equivalent to `--jobserver-mode=1`.
Otherwise, the client feature is automatically enabled for builds
(not tools) under the following conditions:
- Dry-run (i.e. `-n` or `--dry-run`) is not enabled.
- Neither `-j1` (no parallelism) or `-j0` (infinite parallelism)
are specified on the Ninja command line.
- `-j1` (no parallelism) is not used on the command line.
Note that `-j0` means "infinite" parallelism and does not
disable client mode.
- The `MAKEFLAGS` environment variable is defined and
describes a valid jobserver mode using `--jobserver-auth` or
even `--jobserver-fds`.
In this case, Ninja will use the jobserver pool of job slots
In this case, Ninja will use the shared pool of job slots
to control parallelism, instead of its default implementation
of `-j<count>`.
Note that load-average limitations (i.e. when using `-l<count>`)
are still being enforced in this mode.
Note, however, that other parallelism limitations (such as `-l<count>`)
are *still* enforced in this mode.
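
The `MAKEFLAGS` condition above can be sketched in Python as follows. This is only an illustration, not Ninja's parser: the function names `find_jobserver_auth` and `classify_auth` are hypothetical, and using the last matching flag is an assumption. The value forms (`R,W` for a pipe, `fifo:PATH` for a FIFO) are the ones documented in the GNU Make manual; anything else is reported as `other` here.

```python
import re
from typing import Optional, Tuple, Union


def find_jobserver_auth(makeflags: str) -> Optional[str]:
    """Return the value of the last jobserver flag in MAKEFLAGS, or None."""
    value = None
    for word in makeflags.split():
        # --jobserver-fds is the older spelling of --jobserver-auth.
        for prefix in ("--jobserver-auth=", "--jobserver-fds="):
            if word.startswith(prefix):
                value = word[len(prefix):]
    return value


def classify_auth(value: str) -> Tuple[str, Union[str, Tuple[int, int]]]:
    """Classify a jobserver auth value as 'fifo', 'pipe' or 'other'."""
    if value.startswith("fifo:"):
        return ("fifo", value[len("fifo:"):])
    match = re.fullmatch(r"(\d+),(\d+)", value)
    if match:
        return ("pipe", (int(match.group(1)), int(match.group(2))))
    # For example a Win32 semaphore name, or a value this sketch ignores.
    return ("other", value)
```

With this sketch, an empty or flag-less `MAKEFLAGS` yields `None`, which corresponds to the client feature staying disabled.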
Environment variables
~~~~~~~~~~~~~~~~~~~~~
@@ -244,9 +262,8 @@ The default progress status is `"[%f/%t] "` (note the trailing space
to separate from the build rule). Another example of possible progress status
could be `"[%u/%r/%f] "`.
If `MAKEFLAGS` is defined in the environment, if may alter how
Ninja dispatches parallel build commands. See the GNU Jobserver support
section for details.
`NINJA_JOBSERVER` and `MAKEFLAGS` may impact how Ninja dispatches
parallel jobs, as described in the "GNU Jobserver support" section.
Extra tools
~~~~~~~~~~~
62 changes: 62 additions & 0 deletions misc/jobserver_test.py
@@ -241,6 +241,68 @@ def test_client_passes_MAKEFLAGS(self):
prefix_args=[sys.executable, "-S", _JOBSERVER_POOL_SCRIPT, "--check"]
)

    def _run_pool_test(self, mode: str) -> None:
        task_count = 10
        build_plan = generate_build_plan(task_count)
        extra_env = {"NINJA_JOBSERVER": mode}
        with BuildDir(build_plan) as b:
            # First, run the full 10 tasks with 10 tokens; this should allow all
            # tasks to run in parallel.
            b.ninja_run([f"-j{task_count}", "all"], extra_env=extra_env)
            max_overlaps = compute_max_overlapped_spans(b.path, task_count)
            self.assertEqual(max_overlaps, 10)

            # Second, use 4 tokens only, and verify that this was enforced by Ninja.
            b.ninja_clean()
            b.ninja_run(["-j4", "all"], extra_env=extra_env)
            max_overlaps = compute_max_overlapped_spans(b.path, task_count)
            self.assertEqual(max_overlaps, 4)

            # Finally, verify that -j1 serializes all tasks.
            b.ninja_clean()
            b.ninja_run(["-j1", "all"], extra_env=extra_env)
            max_overlaps = compute_max_overlapped_spans(b.path, task_count)
            self.assertEqual(max_overlaps, 1)

    def test_jobserver_pool_with_default_mode(self):
        self._run_pool_test("1")

    def test_server_passes_MAKEFLAGS(self):
        self._test_MAKEFLAGS_value(ninja_args=["--jobserver"])

    def _verify_NINJA_JOBSERVER_value(
        self, expected_value, ninja_args=[], env_vars={}, msg=None
    ):
        build_plan = r"""
rule print
  command = echo NINJA_JOBSERVER="[$$NINJA_JOBSERVER]"
build all: print
"""
        env = dict(os.environ)
        env.update(env_vars)

        with BuildDir(build_plan) as b:
            extra_env = {"NINJA_JOBSERVER": "1"}
            ret = b.ninja_spawn(["--quiet"] + ninja_args + ["all"], extra_env=extra_env)
            self.assertEqual(ret.returncode, 0)
            self.assertEqual(
                ret.stdout.strip(), f"NINJA_JOBSERVER=[{expected_value}]", msg=msg
            )

    def test_server_unsets_NINJA_JOBSERVER(self):
        env_jobserver_1 = {"NINJA_JOBSERVER": "1"}
        self._verify_NINJA_JOBSERVER_value("", env_vars=env_jobserver_1)
        self._verify_NINJA_JOBSERVER_value("", ninja_args=["--jobserver"])

    @unittest.skipIf(_PLATFORM_IS_WINDOWS, "These test methods do not work on Windows")
    def test_jobserver_pool_with_posix_pipe(self):
        self._run_pool_test("pipe")

    @unittest.skipIf(_PLATFORM_IS_WINDOWS, "These test methods do not work on Windows")
    def test_jobserver_pool_with_posix_fifo(self):
        self._run_pool_test("fifo")


if __name__ == "__main__":
    unittest.main()
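
The tests above rely on `compute_max_overlapped_spans`, whose body is not part of this diff. As a rough idea of the underlying technique, here is a hedged sketch that computes the maximum number of simultaneously running tasks from a list of `(start, end)` timestamps. The function name `max_overlapped_spans` and its list-of-tuples input are assumptions for illustration; the real helper takes a build directory path and a task count instead.

```python
from typing import Iterable, Tuple


def max_overlapped_spans(spans: Iterable[Tuple[float, float]]) -> int:
    """Return the largest number of spans that are active at the same time."""
    events = []
    for start, end in spans:
        events.append((start, 1))   # a task starts
        events.append((end, -1))    # a task ends
    # Sort by time. At equal timestamps, ends (-1) sort before starts (+1),
    # so a task that starts exactly when another ends is not counted as
    # overlapping with it.
    events.sort()
    current = best = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best
```

Under this model, ten tasks that all run during the same interval give 10, and a chain of back-to-back tasks gives 1, which matches what the `-j10` and `-j1` assertions above check for.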
1 change: 1 addition & 0 deletions src/build.h
@@ -184,6 +184,7 @@ struct BuildConfig {
/// means that we do not have any limit.
double max_load_average;
DepfileParserOptions depfile_parser_options;
Jobserver::Config::Mode jobserver_mode = Jobserver::Config::kModeNone;
};

/// Builder wraps the build process: starting commands, updating status.
158 changes: 132 additions & 26 deletions src/ninja.cc
@@ -215,29 +215,39 @@ struct Tool {

/// Print usage information.
void Usage(const BuildConfig& config) {
fprintf(stderr,
"usage: ninja [options] [targets...]\n"
"\n"
"if targets are unspecified, builds the 'default' target (see manual).\n"
"\n"
"options:\n"
" --version print ninja version (\"%s\")\n"
" -v, --verbose show all command lines while building\n"
" --quiet don't show progress status, just command output\n"
"\n"
" -C DIR change to DIR before doing anything else\n"
" -f FILE specify input build file [default=build.ninja]\n"
"\n"
" -j N run N jobs in parallel (0 means infinity) [default=%d on this system]\n"
" -k N keep going until N jobs fail (0 means infinity) [default=1]\n"
" -l N do not start new jobs if the load average is greater than N\n"
" -n dry run (don't run commands but act like they succeeded)\n"
"\n"
" -d MODE enable debugging (use '-d list' to list modes)\n"
" -t TOOL run a subtool (use '-t list' to list subtools)\n"
" terminates toplevel options; further flags are passed to the tool\n"
" -w FLAG adjust warnings (use '-w list' to list warnings)\n",
kNinjaVersion, config.parallelism);
fprintf(
stderr,
"usage: ninja [options] [targets...]\n"
"\n"
"if targets are unspecified, builds the 'default' target (see manual).\n"
"\n"
"options:\n"
" --version print ninja version (\"%s\")\n"
" -v, --verbose show all command lines while building\n"
" --quiet don't show progress status, just command output\n"
"\n"
" -C DIR change to DIR before doing anything else\n"
" -f FILE specify input build file [default=build.ninja]\n"
"\n"
" -j N run N jobs in parallel (0 means infinity) [default=%d on "
"this system]\n"
" -k N keep going until N jobs fail (0 means infinity) [default=1]\n"
" -l N do not start new jobs if the load average is greater than N\n"
" -n dry run (don't run commands but act like they succeeded)\n"
"\n"
" -d MODE enable debugging (use '-d list' to list modes)\n"
" -t TOOL run a subtool (use '-t list' to list subtools)\n"
" terminates toplevel options; further flags are passed to the tool\n"
" -w FLAG adjust warnings (use '-w list' to list warnings)\n"
"\n"
" --jobserver-mode MODE\n"
" Start a GNU Make jobserver protocol pool.\n"
" MODE can be one of the following values: %s\n"
"\n"
" --jobserver\n"
" Convenience flag, equivalent to --jobserver-mode=1\n\n",
kNinjaVersion, config.parallelism,
Jobserver::Config::GetValidNativeModesListAsString(", ").c_str());
}

/// Choose a default value for the -j (parallelism) flag.
@@ -1372,8 +1382,8 @@ int NinjaMain::RunBuild(int argc, char** argv, Status* status) {
Builder builder(&state_, config_, &build_log_, &deps_log_, &disk_interface_,
status, start_time_millis_);

// Detect jobserver context and inject Jobserver::Client into the builder
// if needed.
// If MAKEFLAGS is set, only setup a Jobserver client if needed.
// (this means that an empty MAKEFLAGS value disables the feature).
std::unique_ptr<Jobserver::Client> jobserver_client;

// Determine whether to use a Jobserver client in this build.
@@ -1502,15 +1512,23 @@ int ReadFlags(int* argc, char*** argv,
Options* options, BuildConfig* config) {
DeferGuessParallelism deferGuessParallelism(config);

enum { OPT_VERSION = 1, OPT_QUIET = 2 };
enum {
OPT_VERSION = 1,
OPT_QUIET = 2,
OPT_JOBSERVER = 3,
OPT_JOBSERVER_MODE = 4
};
const option kLongOptions[] = {
{ "help", no_argument, NULL, 'h' },
{ "version", no_argument, NULL, OPT_VERSION },
{ "verbose", no_argument, NULL, 'v' },
{ "quiet", no_argument, NULL, OPT_QUIET },
{ "jobserver", no_argument, NULL, OPT_JOBSERVER },
{ "jobserver-mode", required_argument, NULL, OPT_JOBSERVER_MODE },
{ NULL, 0, NULL, 0 }
};

const char* jobserver_mode = nullptr;
int opt;
while (!options->tool &&
(opt = getopt_long(*argc, *argv, "d:f:j:k:l:nt:vw:C:h", kLongOptions,
@@ -1579,6 +1597,12 @@ int ReadFlags(int* argc, char*** argv,
case OPT_VERSION:
printf("%s\n", kNinjaVersion);
return 0;
case OPT_JOBSERVER:
jobserver_mode = "1";
break;
case OPT_JOBSERVER_MODE:
jobserver_mode = optarg ? optarg : "1";
break;
case 'h':
default:
deferGuessParallelism.Refresh();
@@ -1589,6 +1613,22 @@ int ReadFlags(int* argc, char*** argv,
*argv += optind;
*argc -= optind;

// If neither --jobserver nor --jobserver-mode was used, look up the
// NINJA_JOBSERVER environment variable. Warn if the resulting mode string
// is invalid (unless this is a dry run or output is quiet).
if (jobserver_mode == nullptr) {
jobserver_mode = getenv("NINJA_JOBSERVER");
}
if (jobserver_mode) {
auto ret = Jobserver::Config::ModeFromString(jobserver_mode);
config->jobserver_mode = ret.second;
if (!ret.first && !config->dry_run &&
config->verbosity > BuildConfig::QUIET) {
Warning("Invalid jobserver mode '%s': Must be one of: %s", jobserver_mode,
Jobserver::Config::GetValidNativeModesListAsString(", ").c_str());
}
}

return -1;
}
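
The precedence implemented at the end of `ReadFlags()` (command-line flag first, then `NINJA_JOBSERVER`) can be sketched in Python. The names `resolve_jobserver_mode` and `VALID_MODES` are hypothetical. Two assumptions are made that the diff does not show: the real list of valid modes is platform dependent (see `GetValidNativeModesListAsString`), while this sketch accepts all five, and an invalid string is assumed to fall back to the disabled mode.

```python
from typing import Mapping, Optional, Tuple

# Assumption: platform filtering of modes is ignored in this sketch.
VALID_MODES = ("0", "1", "pipe", "fifo", "sem")


def resolve_jobserver_mode(
    cli_mode: Optional[str], env: Mapping[str, str]
) -> Tuple[bool, str]:
    """Return (is_valid, mode), mirroring the pair used in ReadFlags()."""
    # An explicit command-line value wins over the environment.
    raw = cli_mode if cli_mode is not None else env.get("NINJA_JOBSERVER")
    if raw is None:
        return (True, "0")  # nothing requested: feature stays disabled
    if raw in VALID_MODES:
        return (True, raw)
    # Assumption: an unrecognized value disables the feature; the caller
    # is expected to print a warning in that case.
    return (False, "0")
```

For example, `resolve_jobserver_mode("pipe", {"NINJA_JOBSERVER": "fifo"})` selects `pipe`, since the flag takes priority over the environment variable.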

@@ -1628,6 +1668,72 @@ NORETURN void real_main(int argc, char** argv) {
exit((ninja.*options.tool->func)(&options, argc, argv));
}

// Determine whether to setup a Jobserver pool. This depends on
// --jobserver or --jobserver-mode=MODE being passed on the command-line,
// or NINJA_JOBSERVER=MODE being set in the environment.
//
// This must be ignored if a tool is being used, or no/infinite
// parallelism is being asked.
//
// At the moment, this overrides any MAKEFLAGS definition in
// the environment.
std::unique_ptr<Jobserver::Pool> jobserver_pool;

do {
if (options.tool) // Do not setup pool when a tool is used.
break;

if (config.parallelism == 1 || config.parallelism == INT_MAX) {
// No-parallelism (-j1) or infinite parallelism (-j0) was specified.
break;
}

if (config.jobserver_mode == Jobserver::Config::kModeNone) {
// --jobserver was not used, and NINJA_JOBSERVER is not set.
break;
}

if (config.verbosity >= BuildConfig::VERBOSE)
status->Info("Creating jobserver pool for %d parallel jobs",
config.parallelism);

std::string err;
jobserver_pool = Jobserver::Pool::Create(
static_cast<size_t>(config.parallelism), config.jobserver_mode, &err);
if (!jobserver_pool.get()) {
if (config.verbosity > BuildConfig::QUIET)
status->Warning("Jobserver pool creation failed: %s", err.c_str());
break;
}

std::string makeflags = jobserver_pool->GetEnvMakeFlagsValue();

// Set or override the MAKEFLAGS environment variable in
// the current process. This ensures it is passed to sub-commands
// as well.
#ifdef _WIN32
// TODO(digit): Verify that this works correctly on Win32.
// this code assumes that _putenv(), unlike Posix putenv()
// does create a copy of the input string, and that the
// resulting environment is passed to processes launched
// with CreateProcess (the documentation only mentions
// _spawn() and _exec()).
std::string env = "MAKEFLAGS=" + makeflags;
_putenv(env.c_str());
#else // !_WIN32
setenv("MAKEFLAGS", makeflags.c_str(), 1);
#endif // !_WIN32

} while (0);

// Unset NINJA_JOBSERVER unconditionally so that subprocesses
// do not start additional sub-pools by mistake.
#ifdef _WIN32
_putenv("NINJA_JOBSERVER=");
#else // !_WIN32
unsetenv("NINJA_JOBSERVER");
#endif // !_WIN32

// Limit number of rebuilds, to prevent infinite loops.
const int kCycleLimit = 100;
for (int cycle = 1; cycle <= kCycleLimit; ++cycle) {
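
The guard conditions in the `do { ... } while (0)` block above, together with the environment changes that follow it, can be summarized by this Python sketch. The helper names `should_create_pool` and `child_environment` are hypothetical, `INT_MAX` is a stand-in for the C++ constant (its typical 32-bit value is assumed), and mode `"0"` stands for `kModeNone`.

```python
from typing import Dict, Mapping, Optional

INT_MAX = 2**31 - 1  # assumed value of the C++ INT_MAX that represents -j0


def should_create_pool(tool_selected: bool, parallelism: int, mode: str) -> bool:
    """Mirror the early exits that skip jobserver pool creation."""
    if tool_selected:
        return False  # no pool when a tool (-t) is used
    if parallelism == 1 or parallelism == INT_MAX:
        return False  # -j1 or -j0: nothing to coordinate
    return mode != "0"  # pool only if a mode was requested


def child_environment(
    base: Mapping[str, str], makeflags: Optional[str]
) -> Dict[str, str]:
    """Build the environment seen by sub-commands."""
    env = dict(base)
    if makeflags is not None:
        # Set when a pool was created; overrides any inherited MAKEFLAGS.
        env["MAKEFLAGS"] = makeflags
    # Always removed, so nested Ninja instances do not start their own pool.
    env.pop("NINJA_JOBSERVER", None)
    return env
```

Removing `NINJA_JOBSERVER` regardless of whether a pool was created matches what `test_server_unsets_NINJA_JOBSERVER` checks.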