Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs: don't end fs promises on Isolate termination #42910

Merged

Conversation

santigimeno
Copy link
Member

This is specially prevalent in the case of having in-progress FileHandle
operations in a worker thread while the Worker is exiting.

Fixes: #42829

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. labels Apr 29, 2022
@nodejs-github-bot

This comment was marked as outdated.

@aduh95 aduh95 added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. request-ci Add this label to start a Jenkins CI on a PR. labels Apr 29, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 29, 2022
@nodejs-github-bot

This comment was marked as outdated.

@nodejs-github-bot

This comment was marked as outdated.

@RaisinTen
Copy link
Contributor

Do you know which exception is getting scheduled and causing V8 to return empty Maybes? I tried to run the test without the fix and it's always coming from one of these Check() calls for me:

node/src/api/exceptions.cc

Lines 141 to 149 in 02e0c17

e->Set(context,
env->errno_string(),
Integer::New(isolate, errorno)).Check();
e->Set(context, env->code_string(), js_code).Check();
e->Set(context, env->syscall_string(), js_syscall).Check();
if (!js_path.IsEmpty())
e->Set(context, env->path_string(), js_path).Check();
if (!js_dest.IsEmpty())
e->Set(context, env->dest_string(), js_dest).Check();
.

@santigimeno
Copy link
Member Author

This is the other exception, which doesn't occur that often, in fact I could try to devise a different test which exercises that path more often:

FATAL ERROR: v8::FromJust Maybe value is Nothing.
 1: 0xb57b90 node::Abort() [node]
 2: 0xa701b3 node::FatalError(char const*, char const*) [node]
 3: 0xd4388a v8::Utils::ReportApiFailure(char const*, char const*) [node]
 4: 0xb5bfed node::fs::FileHandle::CloseReq::Resolve() [node]
 5: 0xb5c170  [node]
 6: 0x1634dbd  [node]
 7: 0x1639596  [node]
 8: 0x164bcd4  [node]
 9: 0x1639ee8 uv_run [node]
10: 0xa95e45 node::SpinEventLoop(node::Environment*) [node]
11: 0xc22472 node::worker::Worker::Run() [node]
12: 0xc22a58  [node]
13: 0x7f765d46a609  [/lib/x86_64-linux-gnu/libpthread.so.0]
14: 0x7f765d38f163 clone [/lib/x86_64-linux-gnu/libc.so.6]
Aborted

@santigimeno
Copy link
Member Author

And this is the 3rd one:

node[30432]: ../src/node_file-inl.h:160:node::fs::FSReqPromise<AliasedBufferT>::~FSReqPromise() [with AliasedBufferT = node::AliasedBufferBase<double, v8::Float64Array>]: Assertion `finished_' failed.
 1: 0xb57b90 node::Abort() [node]
 2: 0xb57c0e  [node]
 3: 0xb5288a  [node]
 4: 0xb5e74a node::fs::FSReqAfterScope::~FSReqAfterScope() [node]
 5: 0xb5f09e node::fs::AfterOpenFileHandle(uv_fs_s*) [node]
 6: 0x1634dbd  [node]
 7: 0x1639596  [node]
 8: 0x164bcd4  [node]
 9: 0x1639ee8 uv_run [node]
10: 0xa95e45 node::SpinEventLoop(node::Environment*) [node]
11: 0xc22472 node::worker::Worker::Run() [node]
12: 0xc22a58  [node]
13: 0x7fcbc059f609  [/lib/x86_64-linux-gnu/libpthread.so.0]
14: 0x7fcbc04c4163 clone [/lib/x86_64-linux-gnu/libc.so.6]
Aborted

@santigimeno
Copy link
Member Author

santigimeno commented Apr 30, 2022

@RaisinTen if you want to test it on your side, applying this patch to the current test makes the 2 other errors appear mostly all the time:

diff --git a/test/parallel/test-worker-fshandles-closed.js b/test/parallel/test-worker-fshandles-closed.js
index 730b5fc3b1..e3c62bba4a 100644
--- a/test/parallel/test-worker-fshandles-closed.js
+++ b/test/parallel/test-worker-fshandles-closed.js
@@ -49,7 +49,7 @@ if (isMainThread) {
   }
 
   open_close_ok().then(common.mustCall());
-  open_close_nok().then(common.mustCall());
+  open_close_ok().then(common.mustCall());
 
   parentPort.postMessage('terminate');
 }

@RaisinTen
Copy link
Contributor

Sure, feel free to add more tests for those paths though I don't think it's strictly necessary for this PR. The crash sites are indeed spread over multiple places. However, I was curious to know which JS exception (not C++ crash) caused this. Do worker threads throw an exception if we interact with the isolate during termination? If so, what's the source?

test/parallel/test-worker-fshandles-closed.js Outdated Show resolved Hide resolved
test/parallel/test-worker-fshandles-closed.js Outdated Show resolved Hide resolved
test/parallel/test-worker-fshandles-closed.js Outdated Show resolved Hide resolved
test/parallel/test-worker-fshandles-closed.js Outdated Show resolved Hide resolved
test/parallel/test-worker-fshandles-closed.js Outdated Show resolved Hide resolved
@RaisinTen
Copy link
Contributor

Tbh, I was curious to know why pending exceptions caused v8::Object::Set() to crash only sometimes because I noticed some unexpected behavior which I thought was a bug in V8 (reported in https://crbug.com/v8/12839) but it seems we do some exception catching magic.

src/node_file-inl.h Outdated Show resolved Hide resolved
@aduh95 aduh95 removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label May 2, 2022
@RaisinTen
Copy link
Contributor

To answer my question, this crashes because of this exception -

return Throw(ReadOnlyRoots(this).termination_exception());
. #42874 is also trying to fix a crash that happens due to the same reason.

@santigimeno santigimeno added the request-ci Add this label to start a Jenkins CI on a PR. label Jun 17, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jun 17, 2022
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot

This comment was marked as duplicate.

@nodejs-github-bot
Copy link
Collaborator

@aduh95 aduh95 added the commit-queue Add this label to land a pull request using GitHub Actions. label Jun 18, 2022
@nodejs-github-bot nodejs-github-bot added commit-queue-failed An error occurred while landing this pull request using GitHub Actions. and removed commit-queue Add this label to land a pull request using GitHub Actions. labels Jun 18, 2022
@nodejs-github-bot
Copy link
Collaborator

Commit Queue failed
- Loading data for nodejs/node/pull/42910
✔  Done loading data for nodejs/node/pull/42910
----------------------------------- PR info ------------------------------------
Title      fs: don't end fs promises on Isolate termination (#42910)
Author     Santiago Gimeno  (@santigimeno)
Branch     santigimeno:santi/some_filehandle_fixes -> nodejs:main
Labels     c++, fs, author ready, needs-ci
Commits    1
 - fs: don't end fs promises on Isolate termination
Committers 1
 - Santiago Gimeno 
PR-URL: https://github.com/nodejs/node/pull/42910
Fixes: https://github.com/nodejs/node/issues/42829
Reviewed-By: Antoine du Hamel 
Reviewed-By: Darshan Sen 
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/42910
Fixes: https://github.com/nodejs/node/issues/42829
Reviewed-By: Antoine du Hamel 
Reviewed-By: Darshan Sen 
--------------------------------------------------------------------------------
   ⚠  Commits were pushed since the last review:
   ⚠  - fs: don't end fs promises on Isolate termination
   ℹ  This PR was created on Fri, 29 Apr 2022 14:06:18 GMT
   ✔  Approvals: 2
   ✔  - Antoine du Hamel (@aduh95) (TSC): https://github.com/nodejs/node/pull/42910#pullrequestreview-958012346
   ✔  - Darshan Sen (@RaisinTen) (TSC): https://github.com/nodejs/node/pull/42910#pullrequestreview-962848317
   ✔  Last GitHub CI successful
   ℹ  Last Full PR CI on 2022-06-18T00:22:32Z: https://ci.nodejs.org/job/node-test-pull-request/44668/
- Querying data for job/node-test-pull-request/44668/
   ✔  Last Jenkins CI successful
--------------------------------------------------------------------------------
   ✔  Aborted `git node land` session in /home/runner/work/node/node/.ncu
https://github.com/nodejs/node/actions/runs/2519886079

@aduh95 aduh95 added commit-queue Add this label to land a pull request using GitHub Actions. and removed commit-queue-failed An error occurred while landing this pull request using GitHub Actions. labels Jun 18, 2022
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Jun 18, 2022
@nodejs-github-bot nodejs-github-bot merged commit 4c077b0 into nodejs:main Jun 18, 2022
@nodejs-github-bot
Copy link
Collaborator

Landed in 4c077b0

@joyeecheung
Copy link
Member

This has introduced a flake in the CI: #43499 should we revert this?

@santigimeno
Copy link
Member Author

This has introduced a flake in the CI: #43499 should we revert this?

Please, hold on a bit. I'm investigating this and post a fix shortly.

targos pushed a commit that referenced this pull request Jul 12, 2022
This is specially prevalent in the case of having in-progress FileHandle
operations in a worker thread while the Worker is exiting.

Fixes: #42829

PR-URL: #42910
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Jul 18, 2022
This is specially prevalent in the case of having in-progress FileHandle
operations in a worker thread while the Worker is exiting.

Fixes: #42829

PR-URL: #42910
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Jul 31, 2022
This is specially prevalent in the case of having in-progress FileHandle
operations in a worker thread while the Worker is exiting.

Fixes: #42829

PR-URL: #42910
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
guangwong pushed a commit to noslate-project/node that referenced this pull request Oct 10, 2022
This is specially prevalent in the case of having in-progress FileHandle
operations in a worker thread while the Worker is exiting.

Fixes: nodejs/node#42829

PR-URL: nodejs/node#42910
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. c++ Issues and PRs that require attention from people who are familiar with C++. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crash in AfterOpenFileHandle
7 participants