Node hangs when running a stand-alone script with captured output #21960

logiclrd · 2018-07-24T16:26:37Z

Version: v8.1.3
Platform: Windows 10 version 10.0.14393 x64
Subsystem:

I am running into an issue on our build servers where Node hangs after running a stand-alone script if it is launched with its standard handles redirected to pipes. Specifically, I am using Ruby's popen3 function, which wraps OS pipe functionality to obtain pipe streams and then passes the underlying handles to the C runtime library's popen API function. I don't believe there is anything particular about the way Ruby is doing this, and I have the expectation that similar code written in C++ directly calling the underlying API functions would exhibit the same problem. I have not actually tested this, though.

I first observed this behaviour as part of an npm install. When NPM installs the node-sass module, it tries to run an install hook script, and the resulting node scripts/install.js process hangs until it is sent a break signal or terminated externally. This does not occur if npm install is simply run at the command-line, nor does it occur with simple piping done via the command interpreter (npm install | more, for instance).

In further debugging, I tried installing a local copy of node-sass in which I made alterations to the scripts/install.js script. The hang is reproducible even if the entire install.js script is a single console.log line. Furthermore, if I outright remove the scripts/install.js file, then the parent process of what would have been node scripts/install.js encounters an error (expected), complains loudly about it, and then hangs. The command-line of this parent process is:

C:\Program Files\nodejs\\node.exe "C:\NPM\node_modules\npm\bin\npm-cli.js" install "C:\Path\To\node-sass"

I'm not sure what the next step is in identifying the root cause of this issue. I suspect NPM might play some part in it, because if I try my test on a dummy script file directly (node test.js), then the hang is not observed.

The following Ruby code should demonstrate the hang if the node-sass module is configured to be installed into node_modules in the current project directory:

require 'open3'

Open3.popen3("npm install node-sass") do |stdin, stdout, stderr, thr|
  stdout_thread = Thread.new { stdout.each_line { |line| puts line } }
  stderr_thread = Thread.new { stderr.each_line { |line| puts "[STDERR] #{line}" } }

  stdout_thread.join
  stderr_thread.join
  puts "Exit code: #{thr.value.exitstatus}"
end

At the time of the hang, if I attach a debugger to the rogue node.exe process, I see the following threads:

Main thread:
=> Blocked in OS API: GetQueuedCompletionStatusEx
=> Call stack:
    node.exe!uv_poll_ex(uv_loop_s * loop, unsigned long timeout) Line 432	C
    node.exe!uv_run(uv_loop_s * loop, uv_run_mode mode) Line 513	C
    node.exe!node::Start(v8::Isolate * isolate, node::IsolateData * isolate_data, int argc, const char * const * argv, int exec_argc, const char * const * exec_argv) Line 4520	C++
    node.exe!node::Start(uv_loop_s * event_loop, int argc, const char * const * argv, int exec_argc, const char * const * exec_argv) Line 4585	C++
    node.exe!node::Start(int argc, char * * argv) Line 4640	C++
    [Inline Frame] node.exe!invoke_main() Line 79	C++

Worker threads: 8 threads are in this state
=> Blocked in OS API: WaitForSingleObjectEx
=> Call stack:
    [Inline Frame] node.exe!v8::base::Semaphore::Wait() Line 159	C++
    [Inline Frame] node.exe!v8::platform::TaskQueue::GetNext() Line 46	C++
    node.exe!v8::platform::WorkerThread::Run() Line 25	C++
    [Inline Frame] node.exe!v8::base::Thread::NotifyStartedAndRun() Line 512	C++
    node.exe!v8::base::ThreadEntry(void * arg) Line 1328	C++
    [Inline Frame] node.exe!invoke_thread_procedure(unsigned int(*)(void *) context, void * const) Line 91	C++
    node.exe!thread_start<unsigned int (__cdecl*)(void * __ptr64)>(void * const parameter) Line 115	C++

Thread pool threads: 4 threads are in this state
=> Blocked in OS API: NtWaitForWorkViaWorkerFactory
=> Call stack:
    ntdll.dll!TppWorkerThread()

Note that the process that is hanging is not the direct descendant of the calling site that is waiting (in this case the Ruby interpreter). The child process launched by popen3 has in fact exited, but the streams (stdour, stderr) aren't signalled because they were inherited by grandchildren, and one of those is the culprit that isn't exiting.

[WAITING] Ruby code: `popen "npm install"`
`- [exited] cmd.exe npm.bat
   |- [exited] node.exe npm-cli.js prefix -g
   `- [exited] node.exe npm-cli.js install
      `- [exited] cmd.exe /d /s /c node.exe scripts/install.js
         `- [STILL RUNNING/HUNG] node.exe scripts/install.js

The hung process is sitting around keeping stdout and stderr open, and since the pattern of the call site is to wait for the end of those streams as the indicator that the child process tree is complete, the call site blocks indefinitely as well.

I have experimented with altering the call site so that it instead explicitly waits for its direct child process to exit, and forces the streams closed. This does permit the calling code to proceed, but the rogue process still remains, which (since this is Windows) causes issues later on to do with file locking.

If there is further debugging I can do to try to narrow down the cause of this issue, I would appreciate any advice. :-) Thanks!

The text was updated successfully, but these errors were encountered:

Trott · 2018-11-05T04:43:20Z

Anyone have any thoughts on this? If not (or even if so), should it be transferred to the help repo or somewhere else?

Fishrock123 · 2018-11-15T04:35:46Z

I don't think that npm in particular does anything abnormal that would trigger this, the Node side of it is almost certainly something to do with child_process, which npm uses to spawn subprocesses for lifecycle scripts.

My guess would be that cmd.exe isn't handling a pipe shutdown "correct", but I could be wrong.

@logiclrd If you are able to, could you try running this in something other than cmd.exe, for example powershell or git bash?

If it still fails then there is more certainly some windows pipe shutdown handling bug in node or perhaps libuv.

Trott · 2018-11-18T17:48:53Z

@nodejs/platform-windows

jasnell · 2020-06-25T20:42:07Z

Without some kind of reproduction we can try to test and explore the issue, this is going to continue going unresolved. There's been no conversation in a long time and it's not clear that there's going to be any progress. Closing but we can reopen if more information comes in

logiclrd · 2020-06-25T21:18:14Z

@Fishrock123 asked if I could try using something other than cmd.exe, but I don't think I can. It's the standard shell/command interpreter and is being invoked implicitly because NPM is actually a batch file npm.bat. A second, nested instance is being launched by npm-cli.js in order to run node.exe scripts/install.js, and I have zero control over that from the outside.

@jasnell, you wrote that there is no reproduction, but I didn't see any suggestion that the reproduction I included in the initial report didn't actually work -- there is a reproduction!

jasnell · 2020-06-25T21:20:49Z

Sorry, I wasn't clear there, I wasn't able to reproduce it.

jasnell · 2020-06-25T21:28:14Z

Will reopen but it's likely to be a while before someone is able to take a look

woodcockjosh · 2020-12-29T11:22:50Z

I also experienced this issue of the script hanging indefinitely when running node v14.15.1 on linux with the shebang
#!/usr/bin/env NODE_NO_WARNINGS=1 node

I was able to get the script to not hang by changing the shebang to
#!/usr/bin/env node

Even though the resulting script didn't produce any warnings, the script would not complete with that flag.

With the same version of node on macos with the same script the script ran successfully. So my issue appears to be related to linux only.

Here is the script that hangs

#!/usr/bin/env NODE_NO_WARNINGS=1 node

console.log('It worked');
process.exit(0);

Here is the script that doesn't hang

#!/usr/bin/env node

console.log('It worked');
process.exit(0);

vigneras · 2022-12-16T03:18:54Z

Exactly same issue as woodcockjosh. NODE_NO_WARNINGS=1 makes node hanging on Ubuntu/Linux. I've not tried on any other system.

bnoordhuis · 2022-12-16T10:59:19Z

The issue here is with env(1), not node. You must pass -S, otherwise env ends up in a kind of exec loop that you can observe with strace -fe execve script.js. Happens with any environment variable, not just NODE_NO_WARNINGS=1.

Long story short, the fixed version is this:

#!/usr/bin/env -S NODE_NO_WARNINGS=1 node
console.log('It worked');
process.exit(0);

logiclrd · 2022-12-16T11:27:12Z

I'm sorry, what?? The original issue is that on Windows, a child process hangs around holding the stdout and stderr pipes open. This is closed as a duplicate after someone reports a problem with environment variable splitting on Linux?? It seems clear that it is a completely unrelated issue!

logiclrd · 2022-12-16T11:44:26Z

Just to point out that this issue is about redirected output streams on Windows, and has absolutely nothing to do with environment variables on Linux. woodcockjosh's report of seeing the same issue is spurious; it is not related to this bug report at all.

logiclrd · 2022-12-16T11:44:40Z

^-- @bnoordhuis

logiclrd · 2022-12-16T11:57:18Z

Well, I just ran my reproduction script with the latest Ruby version and Node v16.14.0 (just because it's what I happen to have installed), and the issue failed to reproduce. So I guess that's that. The issue was closed for entirely the wrong reason, but there's apparently no bug here any more?? In any event, my organization has pretty much migrated away from a Ruby-based buildchain anyway.

Fishrock123 added process Issues and PRs related to the process subsystem. windows Issues and PRs related to the Windows platform. labels Nov 15, 2018

jasnell closed this as completed Jun 25, 2020

jasnell reopened this Jun 25, 2020

jasnell added the help wanted Issues that need assistance from volunteers or PRs that need help to proceed. label Jun 26, 2020

bnoordhuis closed this as not planned Won't fix, can't repro, duplicate, stale Dec 16, 2022

simonas-notcat mentioned this issue Apr 10, 2023

fix(cli): NODE_NO_WARNINGS on windows decentralized-identity/veramo#1164

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Node hangs when running a stand-alone script with captured output #21960

Node hangs when running a stand-alone script with captured output #21960

logiclrd commented Jul 24, 2018

Trott commented Nov 5, 2018

Fishrock123 commented Nov 15, 2018 •

edited

Loading

Trott commented Nov 18, 2018

jasnell commented Jun 25, 2020

logiclrd commented Jun 25, 2020

jasnell commented Jun 25, 2020

jasnell commented Jun 25, 2020

woodcockjosh commented Dec 29, 2020 •

edited

Loading

vigneras commented Dec 16, 2022

bnoordhuis commented Dec 16, 2022

logiclrd commented Dec 16, 2022

logiclrd commented Dec 16, 2022

logiclrd commented Dec 16, 2022

logiclrd commented Dec 16, 2022

Node hangs when running a stand-alone script with captured output #21960

Node hangs when running a stand-alone script with captured output #21960

Comments

logiclrd commented Jul 24, 2018

Trott commented Nov 5, 2018

Fishrock123 commented Nov 15, 2018 • edited Loading

Trott commented Nov 18, 2018

jasnell commented Jun 25, 2020

logiclrd commented Jun 25, 2020

jasnell commented Jun 25, 2020

jasnell commented Jun 25, 2020

woodcockjosh commented Dec 29, 2020 • edited Loading

vigneras commented Dec 16, 2022

bnoordhuis commented Dec 16, 2022

logiclrd commented Dec 16, 2022

logiclrd commented Dec 16, 2022

logiclrd commented Dec 16, 2022

logiclrd commented Dec 16, 2022

Fishrock123 commented Nov 15, 2018 •

edited

Loading

woodcockjosh commented Dec 29, 2020 •

edited

Loading