-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Calling worker.terminate()
while http/2 reads on a stream crashes the whole process with FATAL ERROR / SIGABRT
#38418
Comments
It's unlikely that Node.js itself is at fault on this. What's more likely is a dependency module somewhere that is bypassing the public supported API and going directly to Node.js' internal API surface. Are you able to provide a reproducable example? |
Thanks for the quick reply. I'm still trying to make this reproducible. Is there any way to get more info out of SIGABRT? There are 0 search results on npm and I can't really find anything useful. I'm trying to make this happen again within the Electron app and getting some sort of hint were the cause lies would make stripping it down way easier. Edit: I'm still not able to trigger it again. One thing I didn't mention above is that if the SIGABRT would not have happened the worker would have exited because of an |
You can look into using a tool like llnode to get a proper stack trace in cases like this. |
I was not able to trigger this again. Unfortunately when it happened I did not gather enough information. For the future I'm now launching Electron in dev with |
So it happened again yesterday. This time during regular closing of the Electron app. Regular closing in my case means doing some clean up and then Thanks @addaleax for pointing to llnode. I was able to get something out of the core dump. I also just came across electron/electron#28468 and related issues (electron/electron#28486). Given that this is happening pre-patch (Electron 12.0.1), could you (with some confidence) say that it's the same problem looking at the following? Does this even have any value?
I now updated all dependencies (not sure if the fix even landed until 12.0.5 since there is no release note and I couldn't find the commit). So hopefully this won't happen again. Debugging SIGABRT is no fun 😅 |
@jasnell @addaleax After many hours I'm now able to consistently reproduce this with vanilla Node (v14.16.1 and v16.0.0) with http/2 and no dependencies. There is a race condition when closing worker threads while http/2 is reading on a stream. At some point I realized that gRPC was causing the SIGABRT if the Electron app was closing just when "STREAM: do read" was happening (I was able to reproduce it way more often when I cranked up the number gRPC calls). Then I was able to strip away Electron and reproduce it just by
Then I went ahead and used the vanilla http/2 server and client and now I'm getting a slightly different crash but still:
This can be reproduced with a very simple repo here, it does not require any dependencies and uses the http/2 client and server example code right from the docs: https://github.com/Prinzhorn/electron-sigabrt In this video was lucky and I got it immediately, it usually reproduces within a handful of tries: vokoscreen-2021-05-03_08-39-08.mp4 |
worker.terminate()
while http/2 reads on a stream crashes the whole process with FATAL ERROR / SIGABRT
@Prinzhorn I don’t think that would be quite the same bug, but still, thanks for reporting. I think just changing the |
So you're saying the first crash (with gRPC) is something else that just happens to be triggered by the same conditions? To me that sounds like a Node issue as well, but since it requires If you clone my repo you can go back to the gRPC version, it just makes a simple empty rpc call:
vokoscreen-2021-05-03_10-15-42.mp4 |
@Prinzhorn It might be related in some way, but it’s definitely a different bug that would require a different solution. (I don’t think I have spare time to look into this in more detail, sorry.) |
@addaleax thanks, I'll look into this some more eventually and will open an issue with |
The FATAL ERROR does seem related to worker threads, not I can reproduce this by selecting a test that terminates a worker thread:
Logs
This was with 16.0.0 locally, but the CI failure was with 16.3.0. It feels as if not all work is interrupted and a value is accessed unexpectedly. |
One more observation, the PR this started failing in switched the worker thread to use ESM. This PR passed, and is using CJS. |
See discussion in nodejs/node#38418
See discussion in nodejs/node#38418
Waiting to terminate the worker thread until we've received at least one message from it seems to help, see avajs/ava#2778. |
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: nodejs#38418
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: nodejs#38418
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: nodejs/node#38418 PR-URL: nodejs/node#42874 Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
I'm sorry for not following the issue template. This just happened locally during development of an Electron 12.0.1 app which comes with Node 14.16.0. As far as I can tell Electron does not patch stream_base.cc so I assume this was a Node thing. Maybe this has already been fixed in a more recent Node release (edit: looks like 14.16.1 was only a security release), but I couldn't find anything.
Unfortunately I haven't been able to reproduce it again. I assume I ran into a race condition. I will try to track this down, but it's hard.
Here's what I do know:
new WebSocket
to said local server, but I don't know if any of these cause the issue with the stream. But I also don't think this was a coincidence.I will try to put my race-condition helmet on and get on my Fuzzer and see if I can trigger this again. But maybe someone can point in a direction if this assertion should ever be able to fail due to a user error (e.g. a bug in
http
orfastify
)? I'm not even sure what theonread
is in this case but I think forgetting to set it in user land should not take down the whole process, should it?The text was updated successfully, but these errors were encountered: