Assertion (wrap->ssl_) != nullptr failed in TLSWrap::GetServername #48000

Closed
tniessen opened this issue May 14, 2023 · 20 comments
Labels
linux Issues and PRs related to the Linux platform. net Issues and PRs related to the net subsystem. tls Issues and PRs related to the tls subsystem.

Comments

@tniessen
Member

Version

HEAD

Platform

fedora-last-latest-x64

Subsystem

tls

What steps will reproduce the bug?

Start a Jenkins CI job that includes fedora-last-latest-x64.

How often does it reproduce? Is there a required condition?

It happens quite often. The build time graph shows a clear increase in error rates during the last few days:

[Image: build time trend showing significantly more failures in the last few days]

What is the expected behavior? Why is that the expected behavior?

No error.

What do you see instead?

mkdir -p out/doc
mkdir -p out/doc/api
cp -r doc/api out/doc
mkdir -p out/doc/api/assets
if [ -d doc/api/assets ]; then cp -r doc/api/assets out/doc/api; fi;
if [ -x /home/iojs/build/workspace/node-test-commit-linux/./node ] && [ -e /home/iojs/build/workspace/node-test-commit-linux/./node ]; then /home/iojs/build/workspace/node-test-commit-linux/./node  tools/doc/versions.mjs out/previous-doc-versions.json; elif [ -x `command -v node` ] && [ -e `command -v node` ] && [ `command -v node` ]; then `command -v node`  tools/doc/versions.mjs out/previous-doc-versions.json; else echo "No available node, cannot run \"node  tools/doc/versions.mjs out/previous-doc-versions.json\""; exit 1; fi;
/home/iojs/build/workspace/node-test-commit-linux/./node[4083780]: ../src/crypto/crypto_tls.cc:1233:static void node::crypto::TLSWrap::GetServername(const v8::FunctionCallbackInfo<v8::Value>&): Assertion `(wrap->ssl_) != nullptr' failed.
 1: 0xc8df40 node::Abort() [/home/iojs/build/workspace/node-test-commit-linux/./node]
 2: 0xc8dfbe  [/home/iojs/build/workspace/node-test-commit-linux/./node]
 3: 0xe53b6a node::crypto::TLSWrap::GetServername(v8::FunctionCallbackInfo<v8::Value> const&) [/home/iojs/build/workspace/node-test-commit-linux/./node]
 4: 0xf1578f v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [/home/iojs/build/workspace/node-test-commit-linux/./node]
 5: 0xf15ffd  [/home/iojs/build/workspace/node-test-commit-linux/./node]
 6: 0xf164c5 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [/home/iojs/build/workspace/node-test-commit-linux/./node]
 7: 0x191ddf6  [/home/iojs/build/workspace/node-test-commit-linux/./node]
/bin/sh: line 1: 4083780 Aborted                 (core dumped) /home/iojs/build/workspace/node-test-commit-linux/./node tools/doc/versions.mjs out/previous-doc-versions.json
make[2]: *** [Makefile:780: out/previous-doc-versions.json] Error 134
make[1]: *** [Makefile:738: doc-only] Error 2
make: *** [Makefile:579: run-ci] Error 2
Build step 'Execute shell' marked build as failure
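
For anyone triaging similar CI logs, here is a minimal sketch that pulls the source location and failed expression out of an abort line like the one above. The regular expression is inferred only from the log lines quoted in this thread, not from the Node.js source, and `parseAssertionLine` is a hypothetical helper name.

```javascript
'use strict';

// Matches lines of the shape seen in this thread:
//   <process title>[<pid>]: <file>:<line>:<function signature>: Assertion `<expr>' failed.
// Assumption: the format is inferred from the quoted logs, not from Node.js source.
const ASSERTION_LINE =
  /^(.+)\[(\d+)\]: (\S+):(\d+):(.+): Assertion `(.+)' failed\.$/;

// Hypothetical helper: returns null when the line is not an assertion abort.
function parseAssertionLine(line) {
  const match = ASSERTION_LINE.exec(line.trim());
  if (match === null) return null;
  const [, processTitle, pid, file, lineNumber, signature, expression] = match;
  return {
    processTitle,
    pid: Number(pid),
    file,
    line: Number(lineNumber),
    signature,
    expression,
  };
}
```

Grouping CI failures by `file`, `line` and `expression` would separate this abort from unrelated failures on the same machine.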

Additional information

This only started happening recently. Unless there was an infrastructure change (cc @nodejs/build), it likely is due to a recent change on main.

@tniessen tniessen added tls Issues and PRs related to the tls subsystem. linux Issues and PRs related to the Linux platform. labels May 14, 2023
@tniessen tniessen added the net Issues and PRs related to the net subsystem. label May 14, 2023
@targos
Member

targos commented May 15, 2023

I'm not aware of any recent change to fedora-last-latest-x64 machines.

@targos
Member

targos commented May 15, 2023

@targos
Member

targos commented May 15, 2023

According to the build history, the first failure was https://ci.nodejs.org/job/node-test-commit-linux/nodes=fedora-last-latest-x64/52042/

Commit: 75b0d9e

@MoLow
Member

MoLow commented May 15, 2023

The increase in errors seems to only affect test-rackspace-fedora32-x64-1:

@targos should we take it offline until this is resolved?

@targos
Member

targos commented May 15, 2023

I just took it offline.

@targos
Member

targos commented May 15, 2023

It's possible to reproduce on the machine using:

su - iojs
cd build/workspace/node-test-commit-linux
out/Release/node tools/doc/versions.mjs
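
Since the crash is reported as frequent but not deterministic, a rough sketch like the following could measure how often the command aborts. The binary and script paths come from the command above; the run count and the helper names `isAbort` and `countAborts` are hypothetical. Exit status 134 is treated as an abort because it is 128 + 6 (SIGABRT), matching the `Error 134` in the CI log.

```javascript
'use strict';
const { spawnSync } = require('node:child_process');

// A child killed by abort() reports signal 'SIGABRT' when spawned directly,
// or exit status 134 (128 + SIGABRT) when a shell relays the result.
function isAbort(result) {
  return result.signal === 'SIGABRT' || result.status === 134;
}

// Hypothetical helper: run the command `runs` times and count the aborts.
function countAborts(command, args, runs) {
  let aborts = 0;
  for (let i = 0; i < runs; i++) {
    const result = spawnSync(command, args, { stdio: 'ignore' });
    if (result.error) throw result.error; // e.g. the binary does not exist
    if (isAbort(result)) aborts++;
  }
  return aborts;
}
```

For example, `countAborts('out/Release/node', ['tools/doc/versions.mjs'], 20)` run from the workspace directory would give an abort count to compare before and after a candidate fix.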

@tniessen Do you have an idea on how we can debug it? Do you want access to the machine?

@targos
Member

targos commented May 15, 2023

Reverting 2d24b29 fixes the error.

@targos
Member

targos commented May 15, 2023

/cc @ShogunPanda

@targos
Member

targos commented May 15, 2023

After reverting 2d24b29, the request at https://github.com/nodejs/node/blob/8b3777d0c82c01229e724d84586fdc472fd4deda/tools/doc/versions.mjs#LL44C25-L44C31 fails with ERR_SOCKET_CONNECTION_TIMEOUT.
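
The Node.js documentation describes ERR_SOCKET_CONNECTION_TIMEOUT as the error raised when no address returned by DNS could be connected to within the allowed timeout under the family autoselection algorithm. The thread itself does not state the cause, so treating autoselection as the variable to toggle is an assumption. Under that assumption, a small probe like this could compare one request with the `autoSelectFamily` socket option on and off. `probe` and `describeFailure` are hypothetical helpers.

```javascript
'use strict';
const https = require('node:https');

// Error codes named in this thread, with paraphrased descriptions.
const KNOWN_CODES = new Map([
  ['ERR_SOCKET_CONNECTION_TIMEOUT',
    'no address connected in time during family autoselection'],
  ['ENETUNREACH', 'network is unreachable'],
]);

// Hypothetical helper: turn an error into a one-line summary.
function describeFailure(error) {
  const code = error ? error.code : undefined;
  return KNOWN_CODES.has(code)
    ? `${code}: ${KNOWN_CODES.get(code)}`
    : `unrecognized: ${code}`;
}

// Hypothetical probe: one GET with family autoselection on or off.
// Assumption: the option is forwarded to the underlying socket connect call.
function probe(url, autoSelectFamily) {
  return new Promise((resolve) => {
    const request = https.get(url, { autoSelectFamily }, (response) => {
      response.resume(); // discard the body, only the outcome matters
      response.on('end', () => resolve(`HTTP ${response.statusCode}`));
    });
    request.on('error', (error) => resolve(describeFailure(error)));
  });
}
```

Calling `probe(url, true)` and `probe(url, false)` against the same placeholder `url` on the affected machine would show whether the outcome depends on autoselection.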

@ShogunPanda
Contributor

@targos Can I have access to the machine so I can debug it?

@targos
Member

targos commented May 15, 2023

@ShogunPanda Sure. Can you open an access request on nodejs/build?

@ShogunPanda
Contributor

@targos Done: nodejs/build#3354

@trentm
Contributor

trentm commented May 16, 2023

If it helps, I've been running into this in my CI and can repro in an ubuntu:20.04 Docker container on my macOS laptop:

% docker run --rm -ti ubuntu:20.04 /bin/bash

root@66a97b1671c7:/# apt-get update && apt-get install -y curl vim git
...

root@66a97b1671c7:/# mkdir app
root@66a97b1671c7:/# cd app
root@66a97b1671c7:/app# curl -O https://nodejs.org/download/nightly/v21.0.0-nightly202305158b3777d0c8/node-v21.0.0-nightly202305158b3777d0c8-linux-x64.tar.gz
...
root@66a97b1671c7:/app# tar xf node-v21.0.0-nightly202305158b3777d0c8-linux-x64.tar.gz
root@66a97b1671c7:/app# export PATH=/app/node-v21.0.0-nightly202305158b3777d0c8-linux-x64/bin:$PATH

root@66a97b1671c7:/app# git clone https://github.com/elastic/apm-agent-nodejs.git
...
root@66a97b1671c7:/app# cd apm-agent-nodejs/
root@66a97b1671c7:/app/apm-agent-nodejs# npm install
npm WARN EBADENGINE Unsupported engine {
npm WARN EBADENGINE   package: '@azure/msal-node@1.14.6',
npm WARN EBADENGINE   required: { node: '10 || 12 || 14 || 16 || 18' },
npm WARN EBADENGINE   current: { node: 'v21.0.0-nightly202305158b3777d0c8', npm: '9.6.6' }
npm WARN EBADENGINE }
npm install[3410]: ../src/crypto/crypto_tls.cc:1233:static void node::crypto::TLSWrap::GetServername(const v8::FunctionCallbackInfo<v8::Value>&): Assertion `(wrap->ssl_) != nullptr' failed.
 1: 0xc8e3f0 node::Abort() [npm install]
 2: 0xc8e46e  [npm install]
 3: 0xe545aa node::crypto::TLSWrap::GetServername(v8::FunctionCallbackInfo<v8::Value> const&) [npm install]
 4: 0xf161ff v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [npm install]
 5: 0xf16a6d  [npm install]
 6: 0xf16f35 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [npm install]
 7: 0x191ddf6  [npm install]
Aborted

Something about the large set of devDependencies in the package-lock.json file results in npm install tripping this assert most of the time.

@ShogunPanda
Contributor

@trentm yup, that is helpful. I hope to have an answer soon.

@tniessen
Member Author

This is also happening in GitHub Actions; see, for example, this test-linux run.

npm ci[93013]: ../src/crypto/crypto_tls.cc:1233:static void node::crypto::TLSWrap::GetServername(const v8::FunctionCallbackInfo<v8::Value>&): Assertion `(wrap->ssl_) != nullptr' failed.
 1: 0x556d50e978d4 node::Abort() [npm ci]
 2: 0x556d50e97968  [npm ci]
 3: 0x556d5107239c node::crypto::TLSWrap::GetServername(v8::FunctionCallbackInfo<v8::Value> const&) [npm ci]
 4: 0x556d5113ecc0 v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [npm ci]
 5: 0x556d5113f5b9  [npm ci]
 6: 0x556d5113faa5 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [npm ci]
 7: 0x556d51bd5df6  [npm ci]
Aborted (core dumped)
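
The three traces quoted so far differ in addresses and process title but list the same symbols. To check that mechanically, a sketch along these lines strips the per-build details and keeps only the symbol names. The frame pattern is inferred from the traces in this thread, and `crashSignature` is a hypothetical helper name.

```javascript
'use strict';

// Frame lines look like " 3: 0xe53b6a <symbol> [<binary>]"; frames without a
// symbol have two spaces between the address and the bracket.
// Assumption: the format is inferred from the traces quoted in this thread.
const FRAME = /^\s*\d+: 0x[0-9a-f]+ (.*) \[[^\]]*\]$/;

// Hypothetical helper: reduce a trace to its ordered list of symbol names,
// ignoring addresses and the binary or process title in brackets.
function crashSignature(traceText) {
  return traceText
    .split('\n')
    .map((line) => FRAME.exec(line.trimEnd()))
    .filter((match) => match !== null)
    .map((match) => match[1].trim() || '<unnamed>');
}
```

Two reports with equal signatures are very likely the same crash, which helps when deciding whether a new report belongs in this issue.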

@silverwind
Contributor

See this for a test case that should reliably reproduce the crash on v21 or ERR_SOCKET_CONNECTION_TIMEOUT on v20.

@ShogunPanda
Contributor

Will wait for @silverwind to verify latest main fixes this and then I'll close.

@silverwind
Contributor

Will verify tomorrow once a nightly build is available.

@silverwind
Contributor

Results from v21.0.0-nightly202306015e98a74327:

  • TLSError is fixed
  • ERR_SOCKET_CONNECTION_TIMEOUT is fixed
  • I see intermittent ENETUNREACH; IIRC, this was happening in Node 20 as well

See this for stack trace etc.

@ShogunPanda
Contributor

The ENETUNREACH is unrelated, see: https://github.com/orgs/nodejs/discussions/48028#discussioncomment-6065330.
Closing this.
