Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion `(env_->execution_async_id()) == (0)' failed #38155

Closed
ronag opened this issue Apr 8, 2021 · 36 comments
Closed

Assertion `(env_->execution_async_id()) == (0)' failed #38155

ronag opened this issue Apr 8, 2021 · 36 comments
Labels
async_hooks Issues and PRs related to the async hooks subsystem. net Issues and PRs related to the net subsystem. stream Issues and PRs related to the stream subsystem.

Comments

@ronag
Copy link
Member

ronag commented Apr 8, 2021

I managed to trigger an assertion.

undici$ node --version
v15.13.0
test/client-connect.js 2> /Users/ronagy/.nvm/versions/node/v15.13.0/bin/node[49832]: ../src/api/callback.cc:125:void node::InternalCallbackScope::Close(): Assertion `(env_->execution_async_id()) == (0)' failed.
test/client-connect.js 2>  1: 0x1013990c5 node::Abort() (.cold.1) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2>  2: 0x1000c6c89 node::Abort() [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2>  3: 0x1000c6af1 node::Assert(node::AssertionInfo const&) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2>  4: 0x100001cdd node::InternalCallbackScope::Close() [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2>  5: 0x10000228a node::InternalMakeCallback(node::Environment*, v8::Local<v8::Object>, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2>  6: 0x10001a7e8 node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2>  7: 0x100183821 node::StreamBase::CallJSOnreadMethod(long, v8::Local<v8::ArrayBuffer>, unsigned long, node::StreamBase::StreamBaseJSChecks) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2>  8: 0x100185787 node::EmitToJSStreamListener::OnStreamRead(long, uv_buf_t const&) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2>  9: 0x10018a61b node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2> 10: 0x100a63a8e uv__stream_io [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2> 11: 0x100a6c45c uv__io_poll [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2> 12: 0x100a58d51 uv_run [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2> 13: 0x100002e74 node::SpinEventLoop(node::Environment*) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2> 14: 0x100107a64 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2> 15: 0x1000949c4 node::Start(int, char**) [/Users/ronagy/.nvm/versions/node/v15.13.0/bin/node]
test/client-connect.js 2> 16: 0x7fff20386621 start [/usr/lib/system/libdyld.dylib]
@ronag
Copy link
Member Author

ronag commented Apr 8, 2021

@addaleax is this supposed to be possible?

@addaleax
Copy link
Member

addaleax commented Apr 9, 2021

@ronag no, hence the assertion

@ronag ronag added confirmed-bug Issues with confirmed bugs. stream Issues and PRs related to the stream subsystem. net Issues and PRs related to the net subsystem. labels Apr 9, 2021
@Flarna Flarna added the async_hooks Issues and PRs related to the async hooks subsystem. label Apr 9, 2021
@Flarna
Copy link
Member

Flarna commented Apr 9, 2021

@ronag Do you have a reproducer? I doubt this happens just by calling node --version :o)

@ronag
Copy link
Member Author

ronag commented Apr 9, 2021

Sorry. Happened while debugging another issue and haven’t happened since.

@ronag
Copy link
Member Author

ronag commented Apr 22, 2021

@Flarna @addaleax If you clone this https://github.com/nodejs/undici/tree/native-assert and run tap test/tls-session-reuse.js you should get the assertion.

@Flarna
Copy link
Member

Flarna commented Apr 22, 2021

I did a fast try but failed to reproduce the crash. Should it happen always or only every now and then?
Any dependecy to nodejs version or OS?

@ronag
Copy link
Member Author

ronag commented Apr 22, 2021

Node v15.5.1, OSX 11.2.3.

It should happen every time.

@targos
Copy link
Member

targos commented Apr 23, 2021

Stack trace on my mac:

/Users/targos/git/nodejs/node/out/Debug/node[16113]: ../../src/api/callback.cc:125:void node::InternalCallbackScope::Close(): Assertion `(env_->execution_async_id()) == (0)' failed.
 1: 0x1024e3b38 node::DumpBacktrace(__sFILE*) [/Users/targos/git/nodejs/node/out/Debug/node]
 2: 0x10262bf90 node::Abort() [/Users/targos/git/nodejs/node/out/Debug/node]
 3: 0x10262bc18 node::AppendExceptionLine(node::Environment*, v8::Local<v8::Value>, v8::Local<v8::Message>, node::ErrorHandlingMode) [/Users/targos/git/nodejs/node/out/Debug/node]
 4: 0x1024627fc node::InternalCallbackScope::Close() [/Users/targos/git/nodejs/node/out/Debug/node]
 5: 0x1024625b4 node::InternalCallbackScope::~InternalCallbackScope() [/Users/targos/git/nodejs/node/out/Debug/node]
 6: 0x102461dcc node::InternalCallbackScope::~InternalCallbackScope() [/Users/targos/git/nodejs/node/out/Debug/node]
 7: 0x102513d70 node::Environment::RunAndClearNativeImmediates(bool) [/Users/targos/git/nodejs/node/out/Debug/node]
 8: 0x10251352c node::Environment::CheckImmediate(uv_check_s*) [/Users/targos/git/nodejs/node/out/Debug/node]
 9: 0x10365256c uv__run_check [/Users/targos/git/nodejs/node/out/Debug/node]
10: 0x103647f60 uv_run [/Users/targos/git/nodejs/node/out/Debug/node]
11: 0x10246791c node::SpinEventLoop(node::Environment*) [/Users/targos/git/nodejs/node/out/Debug/node]
12: 0x1026c3594 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [/Users/targos/git/nodejs/node/out/Debug/node]
13: 0x1025a0724 node::Start(int, char**) [/Users/targos/git/nodejs/node/out/Debug/node]
14: 0x103e4f3d4 main [/Users/targos/git/nodejs/node/out/Debug/node]
15: 0x195131f34 start [/usr/lib/system/libdyld.dylib]

@targos
Copy link
Member

targos commented Apr 23, 2021

node/src/api/callback.cc

Lines 122 to 127 in e46c680

// Make sure the stack unwound properly. If there are nested MakeCallback's
// then it should return early and not reach this code.
if (env_->async_hooks()->fields()[AsyncHooks::kTotals]) {
CHECK_EQ(env_->execution_async_id(), 0);
CHECK_EQ(env_->trigger_async_id(), 0);
}

I think it's that assert?

@Flarna
Copy link
Member

Flarna commented Apr 23, 2021

Till now I failed to reproduce on windows or WSL. Found no time yet to test on linux. I have no access to a mac.

@Flarna
Copy link
Member

Flarna commented Apr 23, 2021

Tried on linux and can't reproduce there with 15.14.0. The test just fails with JS exceptions:

 FAIL  test/tls-session-reuse.js 5 failed of 5 142.07ms
 ✖ Cannot convert undefined or null to object
 ✖ should be equal
 ✖ Cannot read property 'toString' of undefined
 ✖ Cannot read property 'reuseSessions' of undefined
 ✖ connect ECONNREFUSED 127.0.0.1:33543

Either I'm doing something wrong here or it effects only MacOs.

@targos
Copy link
Member

targos commented Apr 23, 2021

Note that I can only reproduce with node, not with tap

@Flarna
Copy link
Member

Flarna commented Apr 23, 2021

using node master directly reproduces this also on windows:

c:\workspaces\GitHubForks\node\src\api\callback.cc:125: Assertion `(env_->execution_async_id()) == (0)' failed.
 1: 00007FF772C4331F node::DumpBacktrace+143 [c:\workspaces\GitHubForks\node\src\debug_utils.cc]:L308
 2: 00007FF772BA6D36 node::Abort+22 [c:\workspaces\GitHubForks\node\src\node_errors.cc]:L253
 3: 00007FF772BA7161 node::Assert+145 [c:\workspaces\GitHubForks\node\src\node_errors.cc]:L269
 4: 00007FF772C8449A node::InternalCallbackScope::Close+602 [c:\workspaces\GitHubForks\node\src\api\callback.cc]:L125
 5: 00007FF772C8409E node::InternalCallbackScope::~InternalCallbackScope+14 [c:\workspaces\GitHubForks\node\src\api\callback.cc]:L79
 6: 00007FF772C390CE node::Environment::RunAndClearNativeImmediates+942 [c:\workspaces\GitHubForks\node\src\env.cc]:L795
 7: 00007FF772C1BF89 node::Environment::CheckImmediate+89 [c:\workspaces\GitHubForks\node\src\env.cc]:L917
 8: 00007FF772CB3C78 uv_check_invoke+56 [c:\workspaces\GitHubForks\node\deps\uv\src\win\loop-watcher.c]:L121
 9: 00007FF772CC1898 uv_run+664 [c:\workspaces\GitHubForks\node\deps\uv\src\win\core.c]:L630
10: 00007FF772C838D4 node::SpinEventLoop+308 [c:\workspaces\GitHubForks\node\src\api\embed_helpers.cc]:L36
11: 00007FF772B45DC3 node::NodeMainInstance::Run+211 [c:\workspaces\GitHubForks\node\src\node_main_instance.cc]:L144
12: 00007FF772BEC747 node::Start+215 [c:\workspaces\GitHubForks\node\src\node.cc]:L1084
13: 00007FF772925BDC wmain+476 [c:\workspaces\GitHubForks\node\src\node_main.cc]:L87
14: 00007FF774199B3C __scrt_common_main_seh+268 [D:\a01\_work\26\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl]:L288
15: 00007FFF6ABB7034 BaseThreadInitThunk+20
16: 00007FFF6B4C2651 RtlUserThreadStart+33

@ronag
Copy link
Member Author

ronag commented Apr 23, 2021

Seems like a different error?

@Flarna
Copy link
Member

Flarna commented Apr 23, 2021

Looking at the offsets I guess it's a problem regarding symbol name resolution. Will try a local build.

Update: Local build shows a better stack, updated above comment

@Flarna
Copy link
Member

Flarna commented Apr 25, 2021

I did some debugging but haven't nailed down the actual issue. The error seems to happen if a promise is rejected but not handled within an emitter which in turn re-throws the error.

But meanwhile I doubt that this is a node problem.
Your setup includes async-hook-domain which does some monkey patching on fatal error handler .
If I disable this the crash goes away.

@ronag
Copy link
Member Author

ronag commented Apr 25, 2021

I did some debugging but haven't nailed down the actual issue. The error seems to happen if a promise is rejected but not handled within an emitter which in turn re-throws the error.

But meanwhile I doubt that this is a node problem.
Your setup includes async-hook-domain which does some monkey patching on fatal error handler .
If I disable this the crash goes away.

Maybe a ping to @isaacs

@Flarna
Copy link
Member

Flarna commented Apr 25, 2021

Maybe worth to mention that this happens also with node 14 if --unhandled-rejection=strict is given.

@ronag ronag removed the confirmed-bug Issues with confirmed bugs. label Apr 25, 2021
@Flarna
Copy link
Member

Flarna commented Apr 26, 2021

maybe related: #33771

@szmarczak
Copy link
Member

szmarczak commented Sep 13, 2021

Constantly reproducible in https://github.com/szmarczak/undici/tree/crash

v16.9.1

$ node test/client-pipelining.js
[...]
node[84820]: ../src/api/callback.cc:134:void node::InternalCallbackScope::Close(): Assertion `(env_->execution_async_id()) == (0)' failed.
 1: 0xafcff0 node::Abort() [node]
 2: 0xafd06e  [node]
 3: 0xa390e1 node::InternalCallbackScope::Close() [node]
 4: 0xa391a1 node::InternalCallbackScope::~InternalCallbackScope() [node]
 5: 0xb6a84d node::PerIsolatePlatformData::RunForegroundTask(std::unique_ptr<v8::Task, std::default_delete<v8::Task> >) [node]
 6: 0xb6c6a9 node::PerIsolatePlatformData::FlushForegroundTasksInternal() [node]
 7: 0x1559e06  [node]
 8: 0x156c274  [node]
 9: 0x155a738 uv_run [node]
10: 0xa39fe5 node::SpinEventLoop(node::Environment*) [node]
11: 0xb3f206 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [node]
12: 0xac0952 node::Start(int, char**) [node]
13: 0x7f84c77c0ba5 __libc_start_main [/usr/lib/haswell/libc.so.6]
14: 0xa37d8c  [node]
Aborted (core dumped)

@targos
Copy link
Member

targos commented Sep 14, 2021

With debug build:

/Users/targos/git/nodejs/node/out/Debug/node[42541]: ../../src/api/callback.cc:134:void node::InternalCallbackScope::Close(): Assertion `(env_->execution_async_id()) == (0)' failed.
 1: 0x100af1dc8 node::DumpBacktrace(__sFILE*) [/Users/targos/git/nodejs/node/out/Debug/node]
 2: 0x100c4d400 node::Abort() [/Users/targos/git/nodejs/node/out/Debug/node]
 3: 0x100c4cff4 node::AppendExceptionLine(node::Environment*, v8::Local<v8::Value>, v8::Local<v8::Message>, node::ErrorHandlingMode) [/Users/targos/git/nodejs/node/out/Debug/node]
 4: 0x100a7128c node::InternalCallbackScope::Close() [/Users/targos/git/nodejs/node/out/Debug/node]
 5: 0x100a71058 node::InternalCallbackScope::~InternalCallbackScope() [/Users/targos/git/nodejs/node/out/Debug/node]
 6: 0x100a7088c node::InternalCallbackScope::~InternalCallbackScope() [/Users/targos/git/nodejs/node/out/Debug/node]
 7: 0x100d7e654 node::PerIsolatePlatformData::RunForegroundTask(std::__1::unique_ptr<v8::Task, std::__1::default_delete<v8::Task> >) [/Users/targos/git/nodejs/node/out/Debug/node]
 8: 0x100d7c9a8 node::PerIsolatePlatformData::FlushForegroundTasksInternal() [/Users/targos/git/nodejs/node/out/Debug/node]
 9: 0x100d7c678 node::PerIsolatePlatformData::FlushTasks(uv_async_s*) [/Users/targos/git/nodejs/node/out/Debug/node]
10: 0x101ccdb44 uv__async_io [/Users/targos/git/nodejs/node/out/Debug/node]
11: 0x101cea5b4 uv__io_poll [/Users/targos/git/nodejs/node/out/Debug/node]
12: 0x101cce160 uv_run [/Users/targos/git/nodejs/node/out/Debug/node]
13: 0x100a76560 node::SpinEventLoop(node::Environment*) [/Users/targos/git/nodejs/node/out/Debug/node]
14: 0x100ce84ac node::NodeMainInstance::Run(int*, node::Environment*) [/Users/targos/git/nodejs/node/out/Debug/node]
15: 0x100ce7e58 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [/Users/targos/git/nodejs/node/out/Debug/node]
16: 0x100bb3adc node::Start(int, char**) [/Users/targos/git/nodejs/node/out/Debug/node]
17: 0x1020d6384 main [/Users/targos/git/nodejs/node/out/Debug/node]
18: 0x107a250f4
[1]    42541 abort      /Users/targos/git/nodejs/node/out/Debug/node test/client-pipelining.js

@ronag
Copy link
Member Author

ronag commented Sep 14, 2021

@addaleax @jasnell I think we have a repro now

@Flarna
Copy link
Member

Flarna commented Sep 14, 2021

It's easy to reproduce but as said above I doubt it's a nodejs issue.

Or was there any change regarding the testsetup to no longer use async-hook-domain?

@ronag
Copy link
Member Author

ronag commented Sep 14, 2021

I doubt it's a nodejs issue.

An assertion should never happen when running user code. IMHO it's a node issue.

@targos
Copy link
Member

targos commented Sep 14, 2021

@ronag That's true only for user code that doesn't touch internal APIs. For example, it's not a node issue if someone triggers an assertion by directly calling process.binding('fs').chmod().
If this assertion is triggered because async-hook-domain has a bug in its custom implementation of process._fatalException, then it's an async-hook-domain issue

@ronag
Copy link
Member Author

ronag commented Sep 14, 2021

@targos Yes. But I'm more referring to @szmarczak's repro.

@targos
Copy link
Member

targos commented Sep 14, 2021

That repro doesn't crash if I comment out this line:
https://github.com/tapjs/async-hook-domain/blob/master/index.js#L31

@ronag
Copy link
Member Author

ronag commented Sep 14, 2021

That repro doesn't crash if I comment out this line:
https://github.com/tapjs/async-hook-domain/blob/master/index.js#L31

Ah, I didn't realize that was relevant there as well.

@theoludwig
Copy link
Contributor

Does anyone have a solution to this issue? Or is there is anything we could do to fix this issue?
Is it something that https://github.com/tapjs/async-hook-domain should fix or is it on the Node.js side?

I can't reproduce the issue locally on my Ubuntu 21.10, but somehow the issue appears when using ubuntu-latest on the GitHub Actions (using tap test runner), using macos-latest instead and there is no issue.
Is this issue related to the Linux kernel or some low-level stuff?
Does anyone have an explanation of why there is no problem on my local computer with Ubuntu 21.10, but there is this issue on GitHub Actions using ubuntu-latest, so latest should be 20.04 I guess.

@margaritluch
Copy link

has there been found a solution to this issue? I'm experiencing the same problem and have changed node version several times, nothing helps :(

@fterradev
Copy link

fterradev commented Aug 11, 2022

I have a situation where this issue does occur, but it does not if I use --inspect-brk.

To be more precise, this script starts an express server, among other things, and for some reason it doesn't fail if I manage to attach the debugger before the end of some initialization steps.

So I can either start the process and quickly attach the debugger, or just use --inspect-brk, which is easier to make sure I can attach before the end of those initializations.

Unfortunately I can't share this project because it doesn't belong to me and it is not public.
But I guess this can be a useful workaround for people out there.

@delvedor
Copy link
Member

delvedor commented Nov 5, 2022

Heya, I've hit the same problem; I've created a repro here.

@Flarna
Copy link
Member

Flarna commented Nov 5, 2022

@delvedor Thanks for your repo. But again it uses tap which internally uses async-hook-domain which is likely doing something wrong by monkeypatching node internal. See above comments.

@Flarna
Copy link
Member

Flarna commented Nov 2, 2023

Seems latest versions of async-hook-domain no longer monkey path process._fatalException (see here).

Is this problem still occurring anywhere with latest version of async-hook-domain used?

@isaacs
Copy link
Contributor

isaacs commented Nov 3, 2023

@Flarna i haven't seen the issue lately, no.

@Flarna
Copy link
Member

Flarna commented Nov 6, 2023

closing as it seem problem is solved by latest version of async-hook-domain. Feel free to reopen or create a new issue if it is reproducible without 3rd party dependencies.

@Flarna Flarna closed this as completed Nov 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async_hooks Issues and PRs related to the async hooks subsystem. net Issues and PRs related to the net subsystem. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

No branches or pull requests

10 participants