module: allow module.register from workers #53200

dygabo · 2024-05-29T13:51:18Z

this is a fix for #53182

By implementing a guard on the native side, we can ensure that the internal customization hooks thread is instantiated only once, but not necessarily from the main thread.
With this change, the https://github.com/ShogunPanda/hooks-repro example no longer hangs.

the second commit reenables the tests that were skipped because module.register() was temporarily not supported on worker threads.

One is still failing on workers. Currently analyzing.

This implements the fix based on solution number 2. from this comment

alternative to #53183

nodejs-github-bot · 2024-05-29T13:51:24Z

Review requested:

@nodejs/loaders

dygabo · 2024-05-29T13:51:55Z

@nodejs/loaders @mcollina @ShogunPanda

ShogunPanda · 2024-05-29T13:56:44Z

I don't think you need to modify the C++.
Can't you just use a single-element SharedArrayBuffer that is created on the main thread and forwarded to each worker?
This way you can use Atomics.store on it.
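For illustration, the shared-flag idea could look roughly like the sketch below. It is only a sketch under assumptions: the helper names `createHooksFlag` and `tryClaimHooksThread` are hypothetical and not part of Node.js or of this PR, and it uses `Atomics.compareExchange` rather than the `Atomics.store` mentioned above, so that checking and setting the flag happen in one atomic step.

```javascript
// Hypothetical helpers, not part of Node.js or this PR.
// One Int32 slot: 0 = no hooks thread yet, 1 = a hooks thread has been claimed.
const HOOKS_THREAD_FREE = 0;
const HOOKS_THREAD_CLAIMED = 1;

// Created once on the main thread; the underlying SharedArrayBuffer could then
// be forwarded to each worker (for example through workerData).
function createHooksFlag() {
  return new Int32Array(new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT));
}

// Returns true only for the single caller that flips the slot from 0 to 1.
function tryClaimHooksThread(flag) {
  const previous = Atomics.compareExchange(
    flag, 0, HOOKS_THREAD_FREE, HOOKS_THREAD_CLAIMED);
  return previous === HOOKS_THREAD_FREE;
}

const flag = createHooksFlag();
console.log(tryClaimHooksThread(flag)); // true
console.log(tryClaimHooksThread(flag)); // false
```

The winner of the claim would be the thread that actually spawns the hooks worker; everyone else would need some other way to reach it.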

lib/internal/modules/esm/hooks.js

lib/internal/modules/esm/loader.js

dygabo · 2024-05-29T14:55:55Z

In its current form this should fix all the regressions that I was made aware of. module.register should seamlessly work as before. We have one worker for the hooks instead of N.

ShogunPanda · 2024-05-29T15:07:09Z

@dygabo

Have you tried the following case:

hooks.mjs

const now = Date.now();

export function initialize() {
  process._rawDebug(`[${now}] Hooks initialized ${Date.now()}.`);
}

export function resolve(specifier, context, next) {
  process._rawDebug(`[${now}] Hooks resolving ${specifier}.`);
  return next(specifier);
}

t1.js

const { register } = require("node:module");
const { pathToFileURL } = require("node:url");
const { Worker } = require("worker_threads");

register("./hooks.mjs", pathToFileURL(__filename));

new Worker("./t2.js");

t2.js

const { Worker } = require("worker_threads");

new Worker("./t3.js");

t3.js

import("../lib/local.js");

Note that there is no --import anywhere.

dygabo · 2024-05-29T15:19:01Z

> @dygabo
>
> Have you tried the following case:

@ShogunPanda I tried that now. Works IMO exactly like v22.1 worked (and like 22.2.0 for that matter). What is the expectation?

ShogunPanda · 2024-05-29T15:20:25Z

Well, that the hooks are executed. Which is not the case even on 22.1. I think this should be fixed as well as part of this.

dygabo · 2024-05-29T15:23:36Z

> Well, that the hooks are executed. Which is not the case even on 22.1. I think this should be fixed as well as part of this.

The reason for that is unrelated to the single hooks thread. If that should work, I am for creating a separate issue and dealing with that on its own. What do you think?

test/es-module/test-esm-loader-threads.mjs

ShogunPanda · 2024-05-29T15:35:39Z

Looks fine to me.
Anyway, I'm working on a different solution that will work for both cases. I hope to have it ready by EOD tomorrow and we can pick the one that seems better.

GeoffreyBooth · 2024-05-29T15:45:43Z

Can this get a new test or two from https://github.com/ShogunPanda/hooks-repro, the ones that we expect to pass now thanks to this change?

In particular I want a test that includes new Worker, both with and without --import, so that we test the two worker flows in every CI pipeline, and locally when devs run make test, so that we don’t rely on that one CI pipeline that runs all the tests in a worker.

mcollina

lgtm

ShogunPanda · 2024-05-29T15:59:30Z

@dygabo As said, it looks fine to me but I don't really like the C++ change. If you can find a JS solution it would be better, but I'm not gonna block for this thing only.

ShogunPanda

LGTM!

lib/internal/modules/esm/hooks.js

lib/internal/modules/esm/loader.js

Flarna · 2024-05-29T17:50:17Z

src/node_worker.cc

@@ -46,6 +46,7 @@ namespace node {
 namespace worker {

 constexpr double kMB = 1024 * 1024;
+std::atomic_bool Worker::internalExists{false};


Please rename this and other variables/parameters that are named "internal" but actually refer to the loader hooks thread, either in this PR or in some followup.

good point, that should be done. Will check and make a separate commit for that renaming

renamed to HooksWorker instead of InternalWorker

Flarna · 2024-05-29T17:54:13Z

lib/internal/modules/esm/hooks.js

    const lock = new SharedArrayBuffer(SHARED_MEMORY_BYTE_LENGTH);
    this.#lock = new Int32Array(lock);

-    if (isMainThread) {
+    if (!hasHooksThread()) {


Is this really thread safe?
If several workers are started in parallel (maybe even from several workers already running in parallel), more than one might get false here and, as a result, more than one starts a new InternalWorker.
The atomic just ensures that the flag is set/read consistently, but the dependent code around it is not synchronized.

I know this race isn't easy to hit but I think a critical section/mutex would be needed here.

An alternative might be to change hasHooksThread() to check and set the flag atomically. But this might result in another race: the worker which set the flag and will actually start the hooks thread might be slower than others in doing so.
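A rough sketch of that alternative, assuming a three-state flag so that threads which lose the claim can wait until the winner reports the hooks thread as ready. The names `acquireOrWait` and `markReady` and the state constants are hypothetical, and this is not the code from the PR.

```javascript
// Hypothetical three-state guard stored in a shared Int32Array slot.
const FREE = 0;      // nobody has tried to start the hooks thread
const STARTING = 1;  // one thread won the claim and is starting it
const READY = 2;     // the hooks thread is up and reachable

// Returns 'owner' for the single thread that must start the hooks thread,
// 'ready' once another thread has finished starting it, or 'timed-out'.
function acquireOrWait(state, timeoutMs) {
  const previous = Atomics.compareExchange(state, 0, FREE, STARTING);
  if (previous === FREE) return 'owner';
  while (Atomics.load(state, 0) === STARTING) {
    // Blocks this thread until notified, or until the timeout elapses.
    if (Atomics.wait(state, 0, STARTING, timeoutMs) === 'timed-out') {
      return 'timed-out';
    }
  }
  return 'ready';
}

// Called by the owner once the hooks thread can accept connections.
function markReady(state) {
  Atomics.store(state, 0, READY);
  Atomics.notify(state, 0);
}

const state = new Int32Array(new SharedArrayBuffer(4));
console.log(acquireOrWait(state, 10)); // 'owner'
markReady(state);
console.log(acquireOrWait(state, 10)); // 'ready'
```

This closes the check-then-set gap, but it does not by itself tell the waiting threads how to reach the hooks thread; that still needs a port or some other channel.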

There is no race condition. There are two scenarios:

.register() is called in main, and the parent thread is inherited by other threads

.register() is called in the worker thread, and a new loader is used

That did have the potential for a race. The last commit addresses that. I left HasHooksThread in the code for now because that would be the tool to solve another use case once we agree on the behaviour. It is similar to the use case @ShogunPanda pointed out yesterday: the main thread starts without hooks and starts a Worker with hooks (using execArgv and --import). This will create the hooks thread, but that would only propagate for the cases where we have implicit propagation of the arguments. Should the main thread use the hooks for subsequent imports (if any)?

Flarna · 2024-05-29T18:11:05Z

Does this impact the use case to communicate with worker described here?

aduh95 · 2024-05-29T18:12:06Z

lib/internal/modules/esm/hooks.js

+    const alreadyKnown = ArrayPrototypeSome(ArrayPrototypeMap(['initialize', 'resolve', 'load'], (hookName) => {
+      if (this.#chains[hookName]) {
+        return ArrayPrototypeFilter(this.#chains[hookName], (el) => el.url === url).length === 1;
+      }
+    }), (el) => el);
+
+    if (alreadyKnown) {
+      return undefined;
+    }
+


I don't think we should add that, it should be the responsibility of the user to not register a loader twice

I agree.
Also, the way it's written now does not check the data, which means that if a hook is registered twice with different data, the latter won't be applied.

Considering the data could and should be added to the filtering. The reason I added it was to prevent automatic re-registration, caused not only by the user unintentionally re-registering the same hooks but also by the automatic spreading that happens in node code. Think of the main use case: node --import registerHooks.mjs app.js:

this gets passed to the hooks thread itself on initialization => registered two times
it also automatically gets passed to any new Worker() (in some conditions), which means each such worker will re-register. Combine this with a scenario where the app has short-lived ephemeral threads => a longer and longer chain.

Alternative ideas are welcome. But I think it has a usecase that is also beyond what the user app does.

Let's remove it and discuss it in a separate PR. In the mean time, it's up to the user to ensure they don't register the same loader twice.

@aduh95 I think the issue is that if you start your app with a flag, like node --import hooks.js app.js, those flags get automatically inherited by workers unless the user launches them via new Worker('./worker.js', { execArgv: [] }). So the default case of new Worker('./worker.js') would mean that hooks.js gets run again, and those hooks re-registered, for every worker thread that the user creates.

I think the naïve user using a library like node --import tsx app.js won’t have any idea that they need to pass execArgv: [] to avoid tsx getting registered a second time; nor should they need to know that. Perhaps tsx can somehow be smart enough to know that it’s already registered and therefore avoid calling register a second time; is that what you’re proposing?
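The inheritance behaviour discussed here can be observed with a small script. This is a minimal sketch: the helper name `reportWorkerExecArgv` is made up for the example, and it uses an `eval` worker only so that no extra files are needed.

```javascript
const { Worker } = require('node:worker_threads');

// Spawns a worker that reports the CLI flags it sees via process.execArgv.
function reportWorkerExecArgv(workerOptions = {}) {
  return new Promise((resolve, reject) => {
    const source =
      "require('node:worker_threads').parentPort.postMessage(process.execArgv);";
    const worker = new Worker(source, { ...workerOptions, eval: true });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Default: the worker inherits the parent's flags (e.g. --import hooks.js).
reportWorkerExecArgv().then((flags) => console.log('inherited:', flags));

// Explicit opt-out: the worker starts with no inherited flags.
reportWorkerExecArgv({ execArgv: [] })
  .then((flags) => console.log('opted out:', flags));
```

Running it as `node --import ./hooks.js script.js` should show the flag in the first worker and an empty list in the second.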

I added logic to consider different data with same url to be its own loader hook in the chain. Tests are still missing, ongoing work.

Flarna · 2024-05-29T18:15:12Z

Are hooks added by workers removed automatically if the worker is ended? If not this seems like a leak.

What happens if the worker which eventually created the hooks thread terminates? Does it end also the hooks thread?

What happens if the hooks thread exits/crashes? As far as I know, before this PR the main thread got notified and as a result the process exited. Is this still the case, or does this just end the worker which created the hooks thread?

Flarna · 2024-05-29T18:27:17Z

Is there a reason why test-esm-virtual-json is still disabled for workers?

dygabo · 2024-05-29T19:06:02Z

I will look into the comments tomorrow and add more tests. Will also check for alternative solutions to the c++ change.

mcollina

Unfortunately, this is not solving #53195, because in that use case we want explicitly to have a loader thread per worker (or at least per worker pool).

GeoffreyBooth · 2024-06-03T15:53:15Z

That would mean that we always have the hooks thread and all the operations would go through it, only through the default hook. It would be easier to implement but if not acceptable, then API extension is needed.

I think one of the requirements needs to be that the hooks thread is never created until the first register call, so that the default case of an app with no customization hooks doesn’t get any slower.

mcollina · 2024-06-03T16:13:56Z

lib/internal/modules/esm/hooks.js

+    const alreadyKnown = ArrayPrototypeSome(ArrayPrototypeMap(['initialize', 'resolve', 'load'], (hookName) => {
+      if (this.#chains[hookName]) {
+        return ArrayPrototypeFilter(
+          this.#chains[hookName], (el) => el.url === url && el.data === data).length === 1;


I'm not sure this comparison of data is correct and this might actually never be triggered. Can you add a test?

I think one good test would be (all filenames for illustration only, please use whatever names correspond to the appropriate fixtures):

Node is run via node --import=register.js app.js

register.js contains a register call that registers some hooks, and one of these hooks prints something on initialization

app.js contains new Worker('./worker.js') to create a worker thread without any specific customization (no execArgv)

Verify that the “print on initialization” doesn’t happen a second time

The point of this test is to ensure that even though new Worker without execArgv inherits the --import flag from the initial Node process, and even though register.js runs twice, the hooks don’t get registered twice. This filter check prevents double registration of the same hooks.

> I'm not sure this comparison of data is correct and this might actually never be triggered. Can you add a test?

Of course. This must be a deep equality compare. Will update. Tests need generally to be added for quite a few things. Just wanted to make sure there is agreement on the approach and featureset/constraints before.

@dygabo In my (soon to be published) PR I used isDeepStrictEqual from internal/util/comparisons, which is the one internally used from assert.
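To make the comparison issue concrete, here is a minimal sketch using the public `util.isDeepStrictEqual` (the comments above refer to the internal helper from internal/util/comparisons). The `chain` shape and the `isAlreadyRegistered` helper are simplified stand-ins, not the actual code in hooks.js.

```javascript
const { isDeepStrictEqual } = require('node:util');

// chain: array of { url, data } entries, a simplified stand-in for a hook chain.
function isAlreadyRegistered(chain, url, data) {
  return chain.some(
    (entry) => entry.url === url && isDeepStrictEqual(entry.data, data));
}

const chain = [{ url: 'file:///hooks.mjs', data: { verbose: true } }];

// Same url and structurally equal data: treated as a duplicate.
console.log(isAlreadyRegistered(chain, 'file:///hooks.mjs', { verbose: true })); // true

// Same url but different data: treated as a separate registration.
console.log(isAlreadyRegistered(chain, 'file:///hooks.mjs', { verbose: false })); // false

// Identity comparison never matches two distinct objects.
console.log(chain[0].data === { verbose: true }); // false
```

With `===`, object-valued data would only match if it were the very same object, which is why the original check might never trigger.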

GeoffreyBooth · 2024-06-03T16:38:23Z

@dygabo Please rebase this on top of the branch from #52706, as that should resolve the conflicts. When this branch is ready to land, it can just include the commits from the previous PR and we’ll get everything landed in one PR/commit.

ShogunPanda · 2024-06-04T05:40:45Z

@GeoffreyBooth Since #53183 landed, shouldn't he just rebase on top of main so we won't need a revert-revert PR before?

dygabo · 2024-06-04T05:43:06Z

I would do a rebase on tip of main that includes both. Wdyt?

`module.register` not supported when called from worker threads with this commit it only works for the main thread.

Co-authored-by: Geoffrey Booth <webadmin@geoffreybooth.com>

ShogunPanda · 2024-06-04T13:18:40Z

FTR - I have created #53332 which includes and extends this PR.
@dygabo thanks for the amazing work on this.

dygabo · 2024-06-13T16:15:34Z

Status update:

today I tried to implement a solution for the new requirements and bring it to a runnable state but it's not yet there.
In fact this seems to be more difficult now to implement and that's why I would like to start a discussion (maybe here, maybe on a separate issue on the loaders repo?).

The current state as mentioned before is:

HooksThread is only allowed once per process.
this is now implemented cleanly, a second try to start the HooksThread while one already exists will throw an ERR_HOOKS_THREAD_EXISTS exception
the issue is that if the main thread did not instantiate the hooks thread (because it does not need hooks), then this would be done somewhere down the tree
at this point we might have the case that more than one thread will try to start the HooksThread, and the second one will fail. As a failover it will need the hooksPort, but that is not available yet and there is no thread-safe way of propagating it during HooksThread creation to all workers existing at that moment. With current mechanisms the hooksPort is configured, from the point where the HooksThread was created, for all children, but we have no easy way of passing this information to unrelated arbitrary threads.

So something like this where T1 and T2 call module.register and the main thread does not have a hooks thread:

        MT
       /  \
      /    \
     T1    T2(calls register)
    / \      
   T3  HT

would not work well for T2.

The idea that I tried to prove was:

assume that the main thread does not start the HooksThread because it does not need it
instead it propagates a communication channel for each spawned thread. So there will be one hooksPortServiceProvider port available for each thread to request the registration of a hooksPort (1)
the receiving end of each of those channels would be a message handler running on the main thread. The main thread can then, in the message handler check if one HooksThread already exists.
if yes => forward the port from the worker that needs a connection to the existing HooksThread
if not => create the HooksThread and then forward the port to it as in the positive case
the action in (1) above would happen when the HooksProxy gets instantiated on the user Worker. That is instantiated only if the Worker calls module.register() which means it needs customization hooks and the workers that don't need it would not be connected to the HooksThread
each thread that needs hooks will do the same: the first one that needs them would end up triggering the thread creation, and all the others would just get a MessageChannel configured with one port in their own isolate and the second port transferred to the HooksThread. But the HooksThread is always created by the main thread (also when it itself does not need it).
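The steps above can be sketched as a small broker that would live on the main thread. This is only an illustration under assumptions: `HooksThreadBroker`, its `connect` method and the `{ type: 'connect' }` message shape are hypothetical, and the demo uses a stub in place of a real Worker so it runs on its own.

```javascript
// Hypothetical main-thread broker that lazily creates the single hooks thread.
class HooksThreadBroker {
  #createHooksThread;
  #hooksThread = null;

  // createHooksThread: factory returning an object with postMessage(),
  // for example () => new Worker('./hooks-thread.js') in a real setup.
  constructor(createHooksThread) {
    this.#createHooksThread = createHooksThread;
  }

  get hasHooksThread() {
    return this.#hooksThread !== null;
  }

  // Meant to run inside the main thread's message handler, so requests are
  // serialized by the event loop and the factory is called at most once.
  connect(workerPort) {
    if (this.#hooksThread === null) {
      this.#hooksThread = this.#createHooksThread();
    }
    // Hand the requesting worker's port over to the hooks thread.
    this.#hooksThread.postMessage(
      { type: 'connect', port: workerPort }, [workerPort]);
  }
}

// Demo with a stub hooks thread that just records what it receives.
let created = 0;
const received = [];
const broker = new HooksThreadBroker(() => {
  created += 1;
  return { postMessage: (message) => received.push(message) };
});
broker.connect({ name: 'port for T1' });
broker.connect({ name: 'port for T2' });
console.log(created, received.length); // 1 2
```

Because only the main thread ever calls the factory, there is no race on creation; the cost is that every request has to go through a message round trip to the main thread.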

          MT
         / | \
        /  |  \
       /   |   \
      T1  T2   HT
    /
   T3

I think that this might work from a design point of view, but it takes more time to implement than I would have expected and I don't want to be the blocker here. If the approach is considered valid (even though it is IMO complex but the requirements are too) and if someone wants to look into it we could further discuss it and they could either push to my branch or start a branch on top of this and add that functionality.

Having more of this on native side, implemented in C++ might be an option too but also one that exceeds my available bandwidth so I would not even go into details.

Also if there are other ideas for working solutions, let's discuss them. This one tends to be difficult to maintain because of the messaging complexity, and the synchronous Atomics.wait solution also adds implementation complexity. So let's have a discussion and try to decide the path forward.

I would also consider the current version (N workers => N customization hooks threads) an idea worth keeping because at least it seems many users are happy with it.

The single thread approach has one additional drawback: one failed hook initiated by one thread would imply that all threads that need hooks must fail. With the current approach there's a better isolation and that is IMO important.

If in the above example T1, T2 call module.register() and T3 inherits the hooks from its parent, if an action on one of them crashes the HooksThread, all of them have to exit. Main Thread would survive because it's not affected by the hooks. If T4 would be a sibling of T1 and T2 that does not use hooks, it will survive as well.
I think this dynamic is hard to follow for a user and makes the feature hard to reason about.

Also, as discussed in the last loaders meeting, there is no way of guaranteeing complete isolation if hooks from different threads and different contributing loaders run on the same thread. They might affect each other unintentionally in weird ways that would make troubleshooting very hard. That might happen even with the chain isolation implemented by @ShogunPanda, which is IMO a good solution for logical isolation but is not side-effect-free.

Let's please start the discussion and see where it goes. Current state on my side: it's complicated :)

GeoffreyBooth · 2024-06-13T16:31:48Z

@dygabo Thanks for the update. Maybe I missed something, but did the BroadcastChannel idea not work as a potential solution?

ShogunPanda · 2024-06-13T18:38:53Z

I'm typing from the phone but I'll also give you a small update here which might be a game-changer for this feature.

I have a local working POC, which I plan to translate into a PR in a few days, that will enable inter-worker (thread) communication.
The idea is that, at any given time, a thread can request a channel to another thread (no matter if main, parent or children at any level) just by using its id.

By retaining the current hooksPort architecture, this means that, once the hooks thread is created, a BroadcastChannel can be used to notify all threads and then they can use the new feature to handshake the hooksPort.

In the future this feature would also allow us to completely remove the need for the hooksPort as each thread can connect directly to the hooks thread itself.
For instance, since we already have the main thread with id 0, we could reserve id 1 for the hooks thread and then assign all other threads id from 2 and up.

Just quick thoughts tho, I need to think more deeply about the implications.
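As a rough illustration of the notification half of this idea, the sketch below uses the existing `BroadcastChannel` from `node:worker_threads`. The helper names, the channel name and the `{ type, threadId }` message shape are assumptions made for the example; the handshake that would actually deliver the hooksPort is not shown, since it depends on the feature described above.

```javascript
const { BroadcastChannel } = require('node:worker_threads');

// Announces on a named channel that the hooks thread now exists.
function announceHooksThread(channelName, hooksThreadId) {
  const channel = new BroadcastChannel(channelName);
  channel.postMessage({ type: 'hooks-thread-ready', threadId: hooksThreadId });
  channel.close();
}

// Resolves with the announced thread id once a notification arrives.
function waitForHooksThread(channelName) {
  return new Promise((resolve) => {
    const channel = new BroadcastChannel(channelName);
    channel.onmessage = (event) => {
      channel.close();
      resolve(event.data.threadId);
    };
  });
}

// Demo within one thread; in practice the listener would live in each worker.
const pending = waitForHooksThread('hooks-thread-demo');
announceHooksThread('hooks-thread-demo', 1); // id 1 is the hypothetical reserved id
pending.then((threadId) => console.log('hooks thread id:', threadId));
```

Only threads that are already listening when the announcement is posted will see it, so a thread created later would still need another way to learn about the hooks thread.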

dygabo · 2024-06-14T06:43:58Z

> did the BroadcastChannel idea not work as a potential solution?

I tried to use the BroadcastChannel idea from @ShogunPanda but kind of hit a wall, for two reasons:

                   MT
                  / | \
         ---------  |  ---------
        /           |           \
       T1          T2  [...]    TN
    /   |   \       |          /  \
 T1.1 T1.2  T1.3  T2.1 [...] TN.1 TN.2
   |                               |
  HT                            needs HT

We might have a race on creating the HooksThread. On the above example consider MT and all T1...TN don't need customization hooks. They all are different OS threads created here which results in a pthread_create call in libuv.
The mutex protected part of the thread creation process is Worker::New. On js side we are here with the Worker constructor called from here in case of the HooksThread.

Now let's consider T1.1 and TN.2 are racing for the HooksThread creation (they both call module.register() at the same time). So one will get the worker instance here and it is happy. The other one will throw and in the catch block would try to use the preset hooksPort. But there is no hooksPort yet because the Worker object is created but the thread does not run yet. Still the LOAD_SCRIPT message of the T1.1 would have to run and then it has to ensure synchronous propagation of the hooksPort to all running threads before any of them get to the catch block of the thrown ERR_HOOKS_THREAD_EXISTS.

I could not find a way to implement this in a thread-safe way by using the BroadcastChannel as it is currently implemented. Even if we could transfer objects via BroadcastChannel (which is difficult because there are N receivers of the broadcast messages, so it is hard to decide which one will get the transferList), we would have the problem that the mutex-protected part is just Worker::New and not the whole Worker creation and initialization phase. I'm not saying it is not solvable, just that it takes more effort and time than I can currently allocate for it.

> By retaining the current hooksPort architecture, this means that, once the hooks thread is created, a BroadcastChannel can be used to notify all threads and then they can use the new feature to handshake the hooksPort.

I am quite happy to hear of your idea @ShogunPanda and I hope it covers all the cases. The only part that I am a bit skeptical about is whether the handshake of the hooksPort mentioned above could be done in a thread-safe way in case of races like the one described above. On paper, having the main thread orchestrate the HooksThread creation seems to be IMO the only race-free way of achieving this with our current utilities.

I would also be very happy if the implementation of this feature would have as a beneficial side-effect a new and easy worker thread communication mechanism like you describe it.

ShogunPanda · 2024-06-17T13:19:19Z

@dygabo My PR is now live: check #53488

nodejs-github-bot added the c++, lib / src, and needs-ci labels May 29, 2024

dygabo marked this pull request as draft May 29, 2024 13:52

aduh95 reviewed May 29, 2024

lib/internal/modules/esm/hooks.js

aduh95 reviewed May 29, 2024

lib/internal/modules/esm/loader.js

dygabo marked this pull request as ready for review May 29, 2024 14:37

GeoffreyBooth reviewed May 29, 2024

test/es-module/test-esm-loader-threads.mjs

mcollina approved these changes May 29, 2024

ShogunPanda approved these changes May 29, 2024

Flarna reviewed May 29, 2024

aduh95 reviewed May 29, 2024

GeoffreyBooth mentioned this pull request May 29, 2024

Revert "module: have a single hooks thread for all workers" #53183

Merged

dygabo marked this pull request as draft May 30, 2024 00:17

mcollina requested changes May 30, 2024

mcollina reviewed Jun 3, 2024

dygabo force-pushed the allow-hooks-thread-from-worker branch from 14d3115 to f7cf9b6 Compare June 4, 2024 07:21

dygabo and others added 12 commits June 4, 2024 09:26

module: have a single hooks thread for all workers

da93a14

`module.register` not supported when called from worker threads with this commit it only works for the main thread.

module: allow module.register from workers

fd5ffdf

reenable tests on workers

f715679

fix unfinished rename

34bc398

fixup! reenable tests on workers

a33afea

fixup! module: allow module.register from workers

332a22d

remove debugging output code

c621394

lint

89b8d6b

Apply suggestions from code review

8fe957e

Co-authored-by: Geoffrey Booth <webadmin@geoffreybooth.com>

make worker instantiation thread safe + some other review findings

75dc437

ongoing work for hook ownership

f94be0a

rename to

d9169d4

dygabo force-pushed the allow-hooks-thread-from-worker branch from f7cf9b6 to d9169d4 Compare June 4, 2024 07:27

ShogunPanda mentioned this pull request Jun 4, 2024

module: allow multiple chain #53332

Closed

mhdawson mentioned this pull request Jun 10, 2024

Node.js Technical Steering Committee (TSC) Meeting 2024-06-12 nodejs/TSC#1575

Closed

GeoffreyBooth removed the tsc-agenda label Jun 10, 2024

dygabo mentioned this pull request Jun 17, 2024

worker: add connect and setConnectionsListener #53488

Closed

GeoffreyBooth mentioned this pull request Jun 17, 2024

Hooks thread direction nodejs/loaders#201

Closed
