Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

worker_threads: worker.postMessage does not get executed without exiting synchronous call stack #25630

Closed
sbalko opened this issue Jan 22, 2019 · 8 comments
Labels
doc Issues and PRs related to the documentations. worker Issues and PRs related to Worker support.

Comments

@sbalko
Copy link

sbalko commented Jan 22, 2019

  • Version: 11.7.0
  • Platform: Linux threadripper 4.15.0-43-generic Tracking / Assuring Compatibility #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Subsystem: worker_threads

I am playing with node's worker_threads, running Emscripten-generated multi-threaded code in this way. From what I can see, the worker's

parentPort.on("message", function(msg) {
...
});

will not get triggered, if the main thread calls worker.postMessage(...) without "unwinding the stack" (ie., returning control to node's event loop). Unfortunately, this is exactly the kind of code that Emscripten generates: the main thread will use Atomics.wait(...) to wait for its child threads (workers) to complete, but it will not actually return control to node.

Note that the Worker.postMessage(...) semantics is insofar different from how postMessage works in browsers, which gets executed immediately, ie., WITHOUT returning control to the browser.

This may be related to #21417.

@bnoordhuis bnoordhuis added the worker Issues and PRs related to Worker support. label Jan 22, 2019
@bnoordhuis
Copy link
Member

Can you post a minimal but complete test case? Sending from the main thread should work without returning control to the event loop.

(Receiving it in the worker is a different matter, of course; JS code is not interruptible.)

@addaleax
Copy link
Member

addaleax commented Feb 6, 2019

ping @sbalko – can you provide a test case, or steps to reproduce?

@kripken
Copy link

kripken commented May 12, 2019

Attached are two testcases (sorry they aren't minimal, but the relevant diff between a.out.js and b.out.js is very small, see below).

Both testcases do a pthread_create, which sends a message to an existing Worker to load a script file and start to run a function there. The difference is that the first, a.out.js, does a while (1) {} at the end of function _main(), after sending that message. b.out.js, on the other hand, avoids the synchronous infinite loop and does a call to _emscripten_set_main_loop which creates a callback that is run once per second, forever (using setTimeout or requestAnimationFrame). So in b.out.js we return to the main event loop on the main thread.

The relevant part of the diff of a.out.js and b.out.js is here:

 function _main() {
[..]
- (_printf(1110,$vararg_buffer7)|0);
- while(1) {
- }
- return (0)|0;
+ (_printf(1121,$vararg_buffer7)|0);
+ _emscripten_set_main_loop((6|0),1,1);
+ $3 = $retval;
+ STACKTOP = sp;return ($3|0);
 }

Running a.out.js, it will print some progress as the program starts up, then finally print mainne4 which is right before the infinite loop. It then prints no further progress - the worker never gets the message sent earlier via postMessage. Running b.out.js it prints at the end

mainne4
main iter
run! false

main iter is from the main loop, showing it is running an iteration (which means it exited the event loop before). run! false is sent from the worker (there are some run! true loggings earlier, which are from the main loop), which shows that it did receive the message that was sent. (It then fails on something else unrelated.)

I'm not sure what's going on here - one possibility is that the child doesn't receive the message in the first case. Alternatively, maybe it does but console.log doesn't print anything directly from the worker, and instead it messages the main thread to do so (which is blocked, so it's never printed)?

Note that in browsers both testcases work.

Tested on 12.2.0.

testcases.tar.gz

@devsnek
Copy link
Member

devsnek commented May 12, 2019

repro

'use strict';

const { Worker, isMainThread } = require('worker_threads');

if (isMainThread) {
  const worker = new Worker(__filename);
  while (true) {}
} else {
  console.log('alive!'); // never prints
}

@bnoordhuis
Copy link
Member

That's not unexpected, or is it? The worker doesn't print the message itself, it sends it to the main thread which prints it on the worker's behalf - but only when the main thread isn't blocked, of course.

Making the main thread interruptible isn't impossible (v8::Isolate::RequestInterrupt(isolate, callback) from worker thread to main thread, then check for pending work in callback) but only when the work is done in C++ or in a separate JS isolate.

Since console.log() and a lot of other things are implemented in JS, that's a pretty tall order. Moving it to a separate isolate isn't exactly trivial, never mind rewriting it in C++.

(Having said all that, your test case doesn't really match what the OP describes, if I read it right. That's about postMessage-ing from the main thread to the worker.)

@addaleax
Copy link
Member

I’ll take a look at the test cases here, but in @devsnek’s case I would consider this expected behaviour, yes. There’s no point at which the main thread would reasonably call the event handler.

Browsers also don’t emit the event in a single-threaded version of your reproduction, so it’s definetely a bit more complicated:

{
  const { port1, port2 } = new MessageChannel();
  port1.onmessage = (data) => console.log('message', data)
  port2.postMessage('foo')
  while (true) {}
}

@devsnek
Copy link
Member

devsnek commented May 15, 2019

@addaleax

I would consider this expected behaviour

why do we post back to the main thread to use stdout? is it to make sure logs don't get muddled together? if that's the problem, can we put console on the bg thread or something? not having console if loops happen is a serious problem, as evidenced by the messages in this issue if nothing else.

your codeblock seems like expected behaviour to me, these events aren't supposed to fire synchronously.

@bnoordhuis i had the following code:

if (isMainThread) {
  const worker = new Worker(__filename);
  worker.postMessage('hi!');
  while (true) {}
} else {
  parentPort.on('message', (data) => {
    console.log(data);
    require('fs').writeFileSync('./worker_data', data);
  });
}

which i reduced to what i posted above, since the fs.writeFileSync ran fine.

@jasnell
Copy link
Member

jasnell commented May 12, 2021

I'm going to remove the confirmed-bug label on this issue. There are definitely some aspects of this that can be documented better but it's not technically a bug.

@jasnell jasnell added doc Issues and PRs related to the documentations. and removed confirmed-bug Issues with confirmed bugs. labels May 12, 2021
jasnell added a commit to jasnell/node that referenced this issue May 12, 2021
Fixes: nodejs#25630
Signed-off-by: James M Snell <jasnell@gmail.com>
jasnell added a commit to jasnell/node that referenced this issue May 13, 2021
Fixes: nodejs#25630
Signed-off-by: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this issue May 18, 2021
Fixes: #25630
Signed-off-by: James M Snell <jasnell@gmail.com>

PR-URL: #38658
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
targos pushed a commit that referenced this issue May 30, 2021
Fixes: #25630
Signed-off-by: James M Snell <jasnell@gmail.com>

PR-URL: #38658
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
targos pushed a commit that referenced this issue Jun 5, 2021
Fixes: #25630
Signed-off-by: James M Snell <jasnell@gmail.com>

PR-URL: #38658
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
targos pushed a commit that referenced this issue Jun 5, 2021
Fixes: #25630
Signed-off-by: James M Snell <jasnell@gmail.com>

PR-URL: #38658
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
targos pushed a commit that referenced this issue Jun 11, 2021
Fixes: #25630
Signed-off-by: James M Snell <jasnell@gmail.com>

PR-URL: #38658
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Issues and PRs related to the documentations. worker Issues and PRs related to Worker support.
Projects
None yet
6 participants