WIP: Pluggable thread pool #22631
Conversation
Here I will track Node-specific benefits of a pluggable threadpool, as I become aware of them.

Benefit: Unified threadpool for V8 and libuv operations

A pluggable threadpool enables using the same threadpool for V8 operations and libuv operations (finally -- cf. #11855). In #14001 @matthewloring initially tried to use the libuv threadpool for the NodePlatform implementation (

This "only the event loop can queue work" restriction no longer holds once the threadpool is owned by Node.js. This suggests that the proposed Node.js threadpool "queue work" API should be thread-safe, and that within the NodePlatform threadpool hooks we should route calls to the Node.js threadpool. Instead of two distinct threadpools of size

Extending this benefit to the
FWIW, I don't think
Summary: This commit outlines the general API for the node threadpool. The current node threadpool is "plugged in" to libuv, but not V8.

Thoughts: I think the current API will generally suffice going forward. For example, separate I/O and CPU pools can be implemented by sub-classing TaskQueue and introducing multiple separate queues for I/O-bound and CPU-bound Tasks.

Routing logic: I plug this TP into libuv during the call to Start(). I would like to refactor out a 'LibuvExecutor' class similar to NodePlatform and have them both use similar task routing logic. I have not yet routed the v8::Platform's TP into this TP.

Tests: I introduced (passing) unit tests in test/cctest/test_threadpool.cc. In addition, this version passes much of the core test suite. There are some failures, e.g. due to the lack of uv_cancel support comparable to that of libuv's default executor in this iteration.
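To make the sub-classing idea above concrete, here is a hedged sketch of a partitioned queue. All names here (Task, TaskType, PartitionedTaskQueue) are illustrative, not the PR's actual classes; the point is that one TaskQueue interface can hide separate I/O-bound and CPU-bound queues:

```cpp
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical sketch only: Task, TaskType, and PartitionedTaskQueue
// are illustrative names, not the PR's actual classes.
enum class TaskType { kIO, kCPU };

struct Task {
  TaskType type;
  std::function<void()> run;
};

class TaskQueue {
 public:
  virtual ~TaskQueue() = default;
  virtual void Push(std::unique_ptr<Task> task) = 0;
  virtual std::unique_ptr<Task> Pop() = 0;
};

// Two internal queues behind one interface, so I/O-bound and
// CPU-bound tasks can be drawn fairly rather than strictly FIFO.
class PartitionedTaskQueue : public TaskQueue {
 public:
  void Push(std::unique_ptr<Task> task) override {
    (task->type == TaskType::kIO ? io_ : cpu_).push(std::move(task));
  }

  // Alternate between partitions so neither starves the other.
  std::unique_ptr<Task> Pop() override {
    bool take_io = (prefer_io_ && !io_.empty()) || cpu_.empty();
    prefer_io_ = !prefer_io_;
    std::queue<std::unique_ptr<Task>>& q = take_io ? io_ : cpu_;
    if (q.empty()) return nullptr;
    std::unique_ptr<Task> t = std::move(q.front());
    q.pop();
    return t;
  }

 private:
  std::queue<std::unique_ptr<Task>> io_;
  std::queue<std::unique_ptr<Task>> cpu_;
  bool prefer_io_ = true;
};
```

A worker thread would only see the TaskQueue interface; the routing policy lives entirely in the subclass, which is what makes the pools swappable.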
In bc0f42e I sketched a minimal implementation; see the commit message for details.
src/node_threadpool.cc
Outdated
#include <stdio.h>
#define LOG_0(fmt) fprintf(stderr, fmt)
#define LOG_1(fmt, a1) fprintf(stderr, fmt, a1)
#define LOG_2(fmt, a1, a2) fprintf(stderr, fmt, a1, a2)
Can you just use __VA_ARGS__ here?
Ah, didn't know about that. Thank you.
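For reference, a minimal variadic version of the LOG macros above might look like this. It uses the ##__VA_ARGS__ comma-swallowing extension (a GNU extension also understood by Clang and MSVC), not strict ISO C++:

```cpp
#include <cstdio>

// One variadic macro instead of LOG_0/LOG_1/LOG_2.
// ##__VA_ARGS__ drops the trailing comma when no extra arguments
// are given, so LOG("done\n") works as well as LOG("rc=%d\n", rc).
#define LOG(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)
```

Both the zero-argument and multi-argument call sites then collapse into one definition.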
@cjihrig I'm planning to give it a try. I'm interested in why it shouldn't, though?
Summary: In this commit I rewired NodePlatform to use the threadpool::Threadpool I introduced in the previous commit.

Approach: I touched the existing NodePlatform implementation as little as possible. Thus I wire WorkerThreadsTaskRunner to Post to a threadpool::Threadpool.

Existing problems: Not all existing behaviors supported by WorkerThreadsTaskRunner are supported by the current threadpool::Threadpool implementation. Where they do not exist I replaced them with no-ops. Node currently runs tests correctly (I tried a few) but segfaults during cleanup. I believe this is because of the lack of support for a "BlockingDrain" API. The CreatePlatform API is externalized; I do not know what this is for. This somewhat complicates my plan to have it accept a threadpool::Threadpool as an argument. Maybe I should overload this function and retain the existing n_threads API too?

Next steps:
1. Refactor out a LibuvExecutor in node_threadpool, analogous to the WorkerThreadsTaskRunner.
2. threadpool::Threadpool must support the union of the APIs needed by its various consumers (notably WorkerThreadsTaskRunner).
3. Possibly we could refactor out the WorkerThreadsTaskRunner as an optimization (?), but since it handles Delayed tasks as well as "do them right now" Tasks it is a useful front-end. I don't intend such a refactoring to be part of the eventual PR.
4. Consider overloading MultiIsolatePlatform/NodePlatform to retain the existing 'n_threads' API. When used, this should create and use a threadpool::Threadpool private to the NodePlatform.
In 7200111 I added the rewiring logic I mentioned here. Now both V8 and libuv tasks are routed to the same Node.js-land threadpool. Hooray! Of course this is A Bad Idea without multiple queues to avoid I/O-bound vs. CPU-bound conflicts; I will look into that soon.

On the notes in 7200111 about
No functional change in this commit. I added a standalone LibuvExecutor that I plug into libuv. Analogous to the NodePlatform's WorkerThreadsTaskRunner, this design decouples the duties of the Threadpool from the interface with libuv (as WorkerThreadsTaskRunner does for V8).
No functional change
Tasks can be in one of these states:
- QUEUED
- ASSIGNED
- COMPLETED
In a previous commit I was too eager in deleting threads from the WorkerThreadsTaskRunner. The DelayedTaskScheduler is the responsibility of the WorkerThreadsTaskRunner, not the internal Threadpool. Thus we should start and stop it correctly. This was the cause of the segfault I mentioned earlier.
This completes the NodePlatform rewiring begun in a previous commit. This BlockingDrain will wait on both V8 Tasks and libuv Tasks. It waits on all Tasks in the Threadpool, even though NodePlatform only cares about BlockingDrain'ing the V8 Tasks.
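A minimal sketch of what a BlockingDrain like the one described could look like. DrainCounter is a hypothetical name, not the PR's class; the idea is simply a counter of outstanding tasks plus a condition variable the draining caller waits on:

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical sketch: block the draining caller until every
// outstanding task (V8- or libuv-originated alike) has finished.
class DrainCounter {
 public:
  void TaskPosted() {
    std::lock_guard<std::mutex> lk(mu_);
    ++outstanding_;
  }

  void TaskDone() {
    std::lock_guard<std::mutex> lk(mu_);
    if (--outstanding_ == 0) cv_.notify_all();
  }

  // Returns only once all posted tasks have completed.
  void BlockingDrain() {
    std::unique_lock<std::mutex> lk(mu_);
    cv_.wait(lk, [this] { return outstanding_ == 0; });
  }

  int outstanding() {
    std::lock_guard<std::mutex> lk(mu_);
    return outstanding_;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int outstanding_ = 0;
};
```

Because the counter covers all tasks regardless of origin, a drain waits on libuv tasks too, exactly the over-broad behavior the commit message notes for NodePlatform.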
Highlights in the commits I just pushed:
Still need a Cancel for libuv. I added state tracking in 8771ad8, which is a step towards this.
This mostly matches the old behavior, except that instead of using a self-managed TP, the NodePlatform uses a private threadpool::Threadpool instance. It's not clear whether an embedder would like to plug in their own Threadpool, so play it safe for now. Hopefully I can get better insight into the desired behavior from other community members.
No functional change
1. Use RAII for Threadpool; don't have a separate Initialize phase. This was previously useful because the Threadpool knew about libuv. Now that there is a LibuvExecutor, we can use RAII.
2. Have Threadpool accept a size. When absent, try:
   - UV_THREADPOOL_SIZE (default libuv TP size)
   - # cores (default --v8-pool-size behavior)
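The fallback chain in (2) could look roughly like this. ResolvePoolSize is an illustrative name; the final default of 4 matches libuv's documented default threadpool size:

```cpp
#include <cstdlib>
#include <thread>

// Illustrative sketch of the size fallback described above:
// explicit size, then UV_THREADPOOL_SIZE, then the number of cores.
int ResolvePoolSize(int requested = 0) {
  if (requested > 0) return requested;

  // UV_THREADPOOL_SIZE: the knob libuv's default pool honors.
  if (const char* env = std::getenv("UV_THREADPOOL_SIZE")) {
    int n = std::atoi(env);
    if (n > 0) return n;
  }

  // Fall back to core count, mirroring --v8-pool-size's default.
  unsigned cores = std::thread::hardware_concurrency();
  return cores > 0 ? static_cast<int>(cores) : 4;  // 4 is libuv's default
}
```

hardware_concurrency() may return 0 on some platforms, hence the last-resort constant.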
Feature: Ability to cancel Tasks Post'ed to the Threadpool.

Need: A LibuvExecutor would like this.

Approach:
- Fact: Threadpool::Post accepts a unique_ptr.
- Fact: In principle we can easily cancel Tasks that have not yet been queued.
- Fact: But it's hard to cancel a Task if we gave away our pointer to the Threadpool.

Threadpool::Post now returns a shared_ptr to a TaskState object. You can call TaskState::Cancel() and it might work. This is the behavior offered by the default libuv threadpool as well.
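A sketch of this shared-handle approach (method names here are mine, not necessarily the PR's): Post hands back a shared_ptr<TaskState>, and Cancel succeeds only while the task is still queued, mirroring uv_cancel's best-effort semantics:

```cpp
#include <atomic>
#include <functional>
#include <memory>

// Illustrative sketch: cancellation is best-effort, as with uv_cancel.
class TaskState {
 public:
  enum State { QUEUED, ASSIGNED, COMPLETED, CANCELLED };

  // Succeeds only if the task has not yet been picked up by a worker.
  bool Cancel() {
    State expected = QUEUED;
    return state_.compare_exchange_strong(expected, CANCELLED);
  }

  // A worker calls this before running the task; false means the
  // task was cancelled first and should be dropped.
  bool TryAssign() {
    State expected = QUEUED;
    return state_.compare_exchange_strong(expected, ASSIGNED);
  }

  void MarkCompleted() { state_.store(COMPLETED); }
  State state() const { return state_.load(); }

 private:
  std::atomic<State> state_{QUEUED};
};

// The pool keeps one reference to the TaskState and returns another
// to the caller; the atomic CAS arbitrates the cancel-vs-run race.
std::shared_ptr<TaskState> Post(std::function<void()> /* task */) {
  return std::make_shared<TaskState>();  // the task would be queued here
}
```

The compare-and-swap means exactly one of Cancel() and TryAssign() can win, so the caller and a worker can race safely.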
This lets Threadpool operate at a higher level of abstraction. While I was in there, I switched to using smart pointers for the TaskQueue shared by the Workers.
This permits the use of the existing Threadpool class as a building block for more sophisticated NodeThreadpools. The default NodeThreadpool is just a pass-thru for a Threadpool.
Splitting by I/O and CPU is a form of PartitionedNodeThreadpool. A PartitionedNodeThreadpool basically follows the "threadpool handles" proposed by saghul for libuv, but implemented in the Node-land executor.
According to my traces,

If anyone out there knows about npm internals, I would appreciate some insight on this. Are my traces wrong?
This tool handles the output of PrintStats. It produces graphs of TP queue lengths sampled over time. It also prints a summary of task counts grouped by origin and type. In the plot of pool queue lengths, it includes a per-TP plot of the number of CPU tasks in the queue. When running with NODE_THREADPOOL_TYPE=SPLIT_BY_ORIGIN, this lets us visualize the extent to which the libuv TP is working on both CPU and I/O.
Problem: Apparently some uses of node go through Exit without fully cleaning up, so printing stats in the PartitionedNodeThreadpool destructor was not giving us stats. This was true, for example, for 'npm install'.

Solution: DrainAndPrintStats in both node.cc::Start and node.cc::Exit, and only keep the first one if we see it twice.

This should not be merged; it's just for performance profiling. And dumping stats in SignalExit is unsafe.
The most recently pushed changes introduce

At the moment, if you invoke my version of node with

For example, on a toy application it produces this summary:
At the moment I am looking for applications that use the threadpool for anything besides FS, since on FS-only workloads this PR won't have any effect.
If so, comment here or contact me by email!
This commit should be reverted before merging the PR. The Node.js PR should follow a separate PR bumping the libuv version to one with my libuv PR.
I've just included the latest version of my libuv PR as part of this PR in order to:
Commits like 64ce6e1 should be removed before merging this PR.
Updates the libuv PR, which I rebased on v1.x (~v1.23.1).
I've started a CI run to test this PR -- which should let me know, among other things, how my Windows changes in libuv went.
@davisjam Still working on this one?
@Trott I still haven't heard from @bnoordhuis on the libuv side of things.
#include <algorithm>
// TODO(davisjam): DO NOT MERGE. Only for debugging.
// TODO(davisjam): There must be a better way to do this.
You might want to look at debug_utils.h for some ideas?
@davisjam Do you think you could get this PR and the libuv one up to date in terms of merge conflicts, so that it might be a bit easier to test this out? As for feedback on the concepts introduced here:
@addaleax Great feedback, thank you.
Do you have any suggestions for the class(es) and knob(s) you think we should expose?
@davisjam I think it would be okay if, at least for an initial implementation, we could do something similar to what we had in libuv, i.e. try to keep a certain number/percentage of threads free from potentially long-running tasks? I think that’s already possible with what you have here, right?
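That reservation policy could be sketched like this. LongTaskBudget is a hypothetical name and the fraction is an assumed tunable knob: workers must acquire a slot before taking a potentially long-running task, so a fixed share of threads always remains free for short ones:

```cpp
#include <atomic>

// Hypothetical sketch: cap concurrent long-running tasks so that
// reserved_fraction of the pool stays available for short tasks.
class LongTaskBudget {
 public:
  LongTaskBudget(int total_threads, double reserved_fraction)
      : max_long_(total_threads -
                  static_cast<int>(total_threads * reserved_fraction)) {}

  // Called by a worker before it takes a long-running task.
  bool TryAcquire() {
    int cur = running_long_.load();
    while (cur < max_long_) {
      if (running_long_.compare_exchange_weak(cur, cur + 1)) return true;
    }
    return false;
  }

  // Called when the long-running task finishes.
  void Release() { running_long_.fetch_sub(1); }

 private:
  const int max_long_;
  std::atomic<int> running_long_{0};
};
```

When TryAcquire() fails, the worker would take a short task instead (or the long task would wait), which is what keeps part of the pool responsive.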
Ping @davisjam
ping @davisjam 👋 Any updates on this?
I'm cleaning out a few old PRs 🧹. I'm closing this due to inactivity. Please re-open if needed!
DO NOT MERGE (yet? ever?).
This is an experimental PR based on the "pluggable thread pool" concept.
This PR will eventually contain the changes necessary to use a pluggable thread pool in Node.js, as well as one or more prototype pluggable thread pools and a description of the performance impact.
This PR won't compile as presented here because on my development machine I am using my PluggableThreadPool branch of libuv. If you want to try it yourself, check out that branch and overwrite deps/uv with it.

I am opening this PR rather prematurely in an attempt to solicit feedback from the community about the general idea, as well as particular suggestions on Node.js-level threadpool designs.
Checklist
- make -j4 test (UNIX), or vcbuild test (Windows) passes