Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node hangs when running a stand-alone script with captured output #21960

Closed
logiclrd opened this issue Jul 24, 2018 · 14 comments
Closed

Node hangs when running a stand-alone script with captured output #21960

logiclrd opened this issue Jul 24, 2018 · 14 comments
Labels
help wanted Issues that need assistance from volunteers or PRs that need help to proceed. process Issues and PRs related to the process subsystem. windows Issues and PRs related to the Windows platform.

Comments

@logiclrd
Copy link

  • Version: v8.1.3
  • Platform: Windows 10 version 10.0.14393 x64
  • Subsystem:

I am running into an issue on our build servers where Node hangs after running a stand-alone script if it is launched with its standard handles redirected to pipes. Specifically, I am using Ruby's popen3 function, which wraps OS pipe functionality to obtain pipe streams and then passes the underlying handles to the C runtime library's popen API function. I don't believe there is anything particular about the way Ruby is doing this, and I have the expectation that similar code written in C++ directly calling the underlying API functions would exhibit the same problem. I have not actually tested this, though.

I first observed this behaviour as part of an npm install. When NPM installs the node-sass module, it tries to run an install hook script, and the resulting node scripts/install.js process hangs until it is sent a break signal or terminated externally. This does not occur if npm install is simply run at the command-line, nor does it occur with simple piping done via the command interpreter (npm install | more, for instance).

In further debugging, I tried installing a local copy of node-sass in which I made alterations to the scripts/install.js script. The hang is reproducible even if the entire install.js script is a single console.log line. Furthermore, if I outright remove the scripts/install.js file, then the parent process of what would have been node scripts/install.js encounters an error (expected), complains loudly about it, and then hangs. The command-line of this parent process is:

C:\Program Files\nodejs\\node.exe "C:\NPM\node_modules\npm\bin\npm-cli.js" install "C:\Path\To\node-sass"

I'm not sure what the next step is in identifying the root cause of this issue. I suspect NPM might play some part in it, because if I try my test on a dummy script file directly (node test.js), then the hang is not observed.

The following Ruby code should demonstrate the hang if the node-sass module is configured to be installed into node_modules in the current project directory:

require 'open3'

Open3.popen3("npm install node-sass") do |stdin, stdout, stderr, thr|
  stdout_thread = Thread.new { stdout.each_line { |line| puts line } }
  stderr_thread = Thread.new { stderr.each_line { |line| puts "[STDERR] #{line}" } }

  stdout_thread.join
  stderr_thread.join
  puts "Exit code: #{thr.value.exitstatus}"
end

At the time of the hang, if I attach a debugger to the rogue node.exe process, I see the following threads:

Main thread:
=> Blocked in OS API: GetQueuedCompletionStatusEx
=> Call stack:
    node.exe!uv_poll_ex(uv_loop_s * loop, unsigned long timeout) Line 432	C
    node.exe!uv_run(uv_loop_s * loop, uv_run_mode mode) Line 513	C
    node.exe!node::Start(v8::Isolate * isolate, node::IsolateData * isolate_data, int argc, const char * const * argv, int exec_argc, const char * const * exec_argv) Line 4520	C++
    node.exe!node::Start(uv_loop_s * event_loop, int argc, const char * const * argv, int exec_argc, const char * const * exec_argv) Line 4585	C++
    node.exe!node::Start(int argc, char * * argv) Line 4640	C++
    [Inline Frame] node.exe!invoke_main() Line 79	C++

Worker threads: 8 threads are in this state
=> Blocked in OS API: WaitForSingleObjectEx
=> Call stack:
    [Inline Frame] node.exe!v8::base::Semaphore::Wait() Line 159	C++
    [Inline Frame] node.exe!v8::platform::TaskQueue::GetNext() Line 46	C++
    node.exe!v8::platform::WorkerThread::Run() Line 25	C++
    [Inline Frame] node.exe!v8::base::Thread::NotifyStartedAndRun() Line 512	C++
    node.exe!v8::base::ThreadEntry(void * arg) Line 1328	C++
    [Inline Frame] node.exe!invoke_thread_procedure(unsigned int(*)(void *) context, void * const) Line 91	C++
    node.exe!thread_start<unsigned int (__cdecl*)(void * __ptr64)>(void * const parameter) Line 115	C++

Thread pool threads: 4 threads are in this state
=> Blocked in OS API: NtWaitForWorkViaWorkerFactory
=> Call stack:
    ntdll.dll!TppWorkerThread()

Note that the process that is hanging is not the direct descendant of the calling site that is waiting (in this case the Ruby interpreter). The child process launched by popen3 has in fact exited, but the streams (stdour, stderr) aren't signalled because they were inherited by grandchildren, and one of those is the culprit that isn't exiting.

[WAITING] Ruby code: `popen "npm install"`
`- [exited] cmd.exe npm.bat
   |- [exited] node.exe npm-cli.js prefix -g
   `- [exited] node.exe npm-cli.js install
      `- [exited] cmd.exe /d /s /c node.exe scripts/install.js
         `- [STILL RUNNING/HUNG] node.exe scripts/install.js

The hung process is sitting around keeping stdout and stderr open, and since the pattern of the call site is to wait for the end of those streams as the indicator that the child process tree is complete, the call site blocks indefinitely as well.

I have experimented with altering the call site so that it instead explicitly waits for its direct child process to exit, and forces the streams closed. This does permit the calling code to proceed, but the rogue process still remains, which (since this is Windows) causes issues later on to do with file locking.

If there is further debugging I can do to try to narrow down the cause of this issue, I would appreciate any advice. :-) Thanks!

@Trott
Copy link
Member

Trott commented Nov 5, 2018

Anyone have any thoughts on this? If not (or even if so), should it be transferred to the help repo or somewhere else?

@Fishrock123 Fishrock123 added process Issues and PRs related to the process subsystem. windows Issues and PRs related to the Windows platform. labels Nov 15, 2018
@Fishrock123
Copy link
Contributor

Fishrock123 commented Nov 15, 2018

I don't think that npm in particular does anything abnormal that would trigger this, the Node side of it is almost certainly something to do with child_process, which npm uses to spawn subprocesses for lifecycle scripts.

My guess would be that cmd.exe isn't handling a pipe shutdown "correct", but I could be wrong.

@logiclrd If you are able to, could you try running this in something other than cmd.exe, for example powershell or git bash?

If it still fails then there is more certainly some windows pipe shutdown handling bug in node or perhaps libuv.

@Trott
Copy link
Member

Trott commented Nov 18, 2018

@nodejs/platform-windows

@jasnell
Copy link
Member

jasnell commented Jun 25, 2020

Without some kind of reproduction we can try to test and explore the issue, this is going to continue going unresolved. There's been no conversation in a long time and it's not clear that there's going to be any progress. Closing but we can reopen if more information comes in

@jasnell jasnell closed this as completed Jun 25, 2020
@logiclrd
Copy link
Author

@Fishrock123 asked if I could try using something other than cmd.exe, but I don't think I can. It's the standard shell/command interpreter and is being invoked implicitly because NPM is actually a batch file npm.bat. A second, nested instance is being launched by npm-cli.js in order to run node.exe scripts/install.js, and I have zero control over that from the outside.

@jasnell, you wrote that there is no reproduction, but I didn't see any suggestion that the reproduction I included in the initial report didn't actually work -- there is a reproduction!

@jasnell
Copy link
Member

jasnell commented Jun 25, 2020

Sorry, I wasn't clear there, I wasn't able to reproduce it.

@jasnell
Copy link
Member

jasnell commented Jun 25, 2020

Will reopen but it's likely to be a while before someone is able to take a look

@jasnell jasnell reopened this Jun 25, 2020
@jasnell jasnell added the help wanted Issues that need assistance from volunteers or PRs that need help to proceed. label Jun 26, 2020
@woodcockjosh
Copy link

woodcockjosh commented Dec 29, 2020

I also experienced this issue of the script hanging indefinitely when running node v14.15.1 on linux with the shebang
#!/usr/bin/env NODE_NO_WARNINGS=1 node

I was able to get the script to not hang by changing the shebang to
#!/usr/bin/env node

Even though the resulting script didn't produce any warnings, the script would not complete with that flag.

With the same version of node on macos with the same script the script ran successfully. So my issue appears to be related to linux only.

Here is the script that hangs

#!/usr/bin/env NODE_NO_WARNINGS=1 node

console.log('It worked');
process.exit(0);

Here is the script that doesn't hang

#!/usr/bin/env node

console.log('It worked');
process.exit(0);

@vigneras
Copy link

Exactly same issue as woodcockjosh. NODE_NO_WARNINGS=1 makes node hanging on Ubuntu/Linux. I've not tried on any other system.

@bnoordhuis
Copy link
Member

The issue here is with env(1), not node. You must pass -S, otherwise env ends up in a kind of exec loop that you can observe with strace -fe execve script.js. Happens with any environment variable, not just NODE_NO_WARNINGS=1.

Long story short, the fixed version is this:

#!/usr/bin/env -S NODE_NO_WARNINGS=1 node
console.log('It worked');
process.exit(0);

@bnoordhuis bnoordhuis closed this as not planned Won't fix, can't repro, duplicate, stale Dec 16, 2022
@logiclrd
Copy link
Author

I'm sorry, what?? The original issue is that on Windows, a child process hangs around holding the stdout and stderr pipes open. This is closed as a duplicate after someone reports a problem with environment variable splitting on Linux?? It seems clear that it is a completely unrelated issue!

@logiclrd
Copy link
Author

Just to point out that this issue is about redirected output streams on Windows, and has absolutely nothing to do with environment variables on Linux. woodcockjosh's report of seeing the same issue is spurious; it is not related to this bug report at all.

@logiclrd
Copy link
Author

^-- @bnoordhuis

@logiclrd
Copy link
Author

Well, I just ran my reproduction script with the latest Ruby version and Node v16.14.0 (just because it's what I happen to have installed), and the issue failed to reproduce. So I guess that's that. The issue was closed for entirely the wrong reason, but there's apparently no bug here any more?? In any event, my organization has pretty much migrated away from a Ruby-based buildchain anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Issues that need assistance from volunteers or PRs that need help to proceed. process Issues and PRs related to the process subsystem. windows Issues and PRs related to the Windows platform.
Projects
None yet
Development

No branches or pull requests

7 participants