Piping process.stdin to child.stdin leaves behind an open handle #32291

Open · ehmicky opened this issue Mar 15, 2020 · 16 comments
Labels: stream (Issues and PRs related to the stream subsystem)

Comments

@ehmicky commented Mar 15, 2020

  • Version: 13.11.0 (tried on 8.3.0 too, same result)
  • Platform: Linux (Ubuntu 19.10)
  • Subsystem: stream, child_process

What steps will reproduce the bug?

foo.c

#include <stdio.h>

int main(void) {
  char str[100];
  /* Read one line from stdin and echo it back.
     (fgets() instead of gets(), which is unsafe and was removed in C11.) */
  if (fgets(str, sizeof(str), stdin) != NULL) {
    fputs(str, stdout);
  }
  return 0;
}

Compile it:

$ gcc -o foo foo.c

index.js:

const { spawn } = require('child_process')

const child = spawn('./foo')
process.stdin.pipe(child.stdin)
child.on('exit', () => {
  console.log('done')
  process.stdin.unpipe(child.stdin)
})

Launch index.js:

$ node index.js
input
done

After entering some input, the child process exits and done is printed. However, the main process does not exit (although it should): process.stdin is left with an open handle. Only after entering another newline on stdin does the process exit.

How often does it reproduce?

Always

What is the expected behavior?

The main process should exit once the child process has exited and process.stdin has been unpiped.

What do you see instead?

The main process keeps running until another line is entered on stdin.

@mscdex (Contributor) commented Mar 15, 2020

You could replace process.stdin.unpipe(child.stdin) with process.stdin.destroy().

@addaleax (Member)

@mscdex That’s true, but it still seems to me that when we’re no longer reading from the only stream in use, the event loop should stop?

@ehmicky (Author) commented Mar 15, 2020

Thanks @mscdex, this does make the process exit properly.

But as @addaleax points out, this still indicates an underlying issue: the main process should exit without having to destroy process.stdin.

@addaleax (Member)

I think ultimately the issue here is that once we’ve told a stream to start reading, we have no way of telling it to stop until it does read some data. We have a hack specifically for process.stdin.pause() in place, but that won’t work for other streams or, apparently, .unpipe().

@ronag @nodejs/streams Any ideas? Would it be possible to figure out when a stream is no longer actively being consumed and e.g. emit an event in that case that stops an in-progress _read()?
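
For reference, the stdin-specific hack is easy to observe (a minimal sketch; exact behavior can vary between a TTY and a piped stdin, and across Node.js versions):

// pause() takes the special-cased path for process.stdin that releases
// the underlying handle, so the process exits once the timer has fired.
// There is no equivalent path for unpipe().
process.stdin.resume() // start reading; keeps the event loop alive
setTimeout(() => {
  process.stdin.pause() // stop reading; the process can now exit
}, 1000)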

@mcollina (Member)

Any ideas? Would it be possible to figure out when a stream is no longer actively being consumed and e.g. emit an event in that case that stops an in-progress _read()?

We do not expose readStop() to end users: stopping/aborting an ongoing read is something we have no agreed-upon way to do. The closest we have is the stream.destroy() mechanism.
For legacy reasons, pipe() does not destroy streams. Therefore this is not a bug; it works as expected.

The "correct" implementation is:

const { spawn } = require('child_process')
const { pipeline } = require('stream')

const child = spawn('./foo')
// pipeline calls destroy for you
pipeline(process.stdin, child.stdin, (err) => {
  console.log('done', err)
})
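
On Node.js releases newer than those discussed here, the promise form behaves the same way (a sketch, assuming stream/promises is available, i.e. Node.js 15 or later):

const { spawn } = require('child_process')
const { pipeline } = require('stream/promises')

async function main() {
  const child = spawn('./foo')
  try {
    // pipeline destroys both streams once it settles
    await pipeline(process.stdin, child.stdin)
    console.log('done')
  } catch (err) {
    console.log('done', err)
  }
}

main()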

@addaleax (Member)

@mcollina I’m sorry but I think that answer misses the point here.

We do not expose readStop() to end users: stopping/aborting an ongoing read is something we have no agreed-upon way to do. The closest we have is the stream.destroy() mechanism.

Yes. I am asking for a new mechanism to be introduced for this, essentially.

For legacy reasons, pipe() does not destroy streams. Therefore this is not a bug; it works as expected.

That may be helpful for the issue the OP is running into, but it assumes that you always want to destroy a stream when you are unpiping it, and that’s definitely not true. What if stdin were to be piped into another destination after the child process ended?

This problem also arises even without the use of pipes:

'use strict';
function handler(chunk) {
  console.log('got', chunk);
  process.stdin.removeListener('data', handler);
}

process.stdin.on('data', handler);

After the first chunk, there is nothing in the process that should keep the event loop alive, but the process does not stop trying to read from stdin.

@mcollina (Member)

I understand now.

The _handle property is managed by net.Socket (and equivalents, e.g. tty). Are you asking for a callback that is called when pause or unpipe happens, implemented in net.Socket?

@addaleax (Member)

@mcollina Yes: pause or unpipe, or when reading stops in another way, such as removing all data/readable listeners 👍

@mcollina (Member)

I think that's possible. It means adding some more intelligence to:

Readable.prototype.removeListener = function(ev, fn) {
  const res = Stream.prototype.removeListener.call(this, ev, fn);

  if (ev === 'readable') {
    // We need to check if there is someone still listening to
    // readable and reset the state. However this needs to happen
    // after readable has been emitted but before I/O (nextTick) to
    // support once('readable', fn) cycles. This means that calling
    // resume within the same tick will have no
    // effect.
    process.nextTick(updateReadableListening, this);
  }

  return res;
};
Readable.prototype.off = Readable.prototype.removeListener;

Readable.prototype.removeAllListeners = function(ev) {
  const res = Stream.prototype.removeAllListeners.apply(this, arguments);

  if (ev === 'readable' || ev === undefined) {
    // We need to check if there is someone still listening to
    // readable and reset the state. However this needs to happen
    // after readable has been emitted but before I/O (nextTick) to
    // support once('readable', fn) cycles. This means that calling
    // resume within the same tick will have no
    // effect.
    process.nextTick(updateReadableListening, this);
  }

  return res;
};

function updateReadableListening(self) {
  const state = self._readableState;
  state.readableListening = self.listenerCount('readable') > 0;

  if (state.resumeScheduled && state[kPaused] === false) {
    // Flowing needs to be set to true now, otherwise
    // the upcoming resume will not flow.
    state.flowing = true;

    // Crude way to check if we should resume
  } else if (self.listenerCount('data') > 0) {
    self.resume();
  } else if (!state.readableListening) {
    state.flowing = null;
  }
}

@jsumners (Contributor)

Original OP here: the short version is that I performed one pipe with a corresponding unpipe and expected that to correctly dereference anything attached to the stream such that the event loop would be free to exit. Basically, shouldn't there be a reference count of some sort on the internal resources that gets decremented (freed) on unpipe?
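
To illustrate that idea (purely hypothetical; no such mechanism exists in Node.js core today), a readable could count its active consumers and stop reading once the count drops to zero:

const { Readable } = require('stream');

// Hypothetical sketch of the ref-counting idea above. CountedReadable is
// not a real Node.js class; pause() stands in for the internal
// readStop()/unref step that core would have to perform.
class CountedReadable extends Readable {
  constructor(opts) {
    super(opts);
    this._consumers = 0;
  }

  pipe(dest, opts) {
    this._consumers++;
    return super.pipe(dest, opts);
  }

  unpipe(dest) {
    const res = super.unpipe(dest);
    // unpipe() with no argument detaches every destination
    this._consumers = dest === undefined ? 0 : Math.max(0, this._consumers - 1);
    if (this._consumers === 0) {
      // Last consumer is gone: stop pulling data so nothing keeps the
      // event loop alive on this stream's behalf.
      this.pause();
    }
    return res;
  }
}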

@ehmicky (Author) commented Mar 17, 2020

I would like to add that destroying the stream is not always an option: a function may not know whether the stream will be re-used (in which case it should not be destroyed) or not (in which case it should not leave a handle open, keeping the parent process running).

@ronag (Member) commented Mar 17, 2020

does not know whether the stream will be re-used

Do you maybe have a concrete example of this? Shouldn't it be up to the "root" consumer/app then to ensure that resources are released/destroyed? If resources are not properly cleaned up then it should keep running, no?

@ehmicky (Author) commented Mar 17, 2020

Yes, you’re right. What I meant was more along the following lines. In an example like:

const { spawn } = require('child_process')

const child = spawn('./foo')
process.stdin.pipe(child.stdin)
child.on('exit', () => {
  console.log('done')
  process.stdin.destroy()
})

The problem seems to be that spawning a child process that uses the parent’s stdin requires destroying stdin to clean up properly, instead of just reverting the pipe().

@ronag (Member) commented Mar 17, 2020

Hm, this is not my area of expertise, but I would expect any open handles/references to be closed on 'exit' and data just to be buffered in memory on the stdin stream (i.e. it should just have buffered data without any handle reference). What keeps the process alive in this case?
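
One way to check that empirically is the undocumented process._getActiveHandles() (its output varies across Node.js versions, so treat it purely as a debugging aid):

const { spawn } = require('child_process')

const child = spawn('./foo')
process.stdin.pipe(child.stdin)
child.on('exit', () => {
  process.stdin.unpipe(child.stdin)
  setImmediate(() => {
    // Lists the handles still keeping the event loop alive; stdin's
    // handle (e.g. a TTY ReadStream) still shows up after unpipe().
    console.log(process._getActiveHandles().map((h) => h.constructor.name))
  })
})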

@targos added the stream label on Dec 27, 2020
@rivertam

Do you maybe have a concrete example of this? Shouldn't it be up to the "root" consumer/app then to ensure that resources are released/destroyed? If resources are not properly cleaned up then it should keep running, no?

I'm writing essentially a script runner and I want to run multiple processes in a row. In this case, I'm basically converting shell scripts to Node.js:

#!/bin/bash

interactive-command-1 arg1 arg2
interactive-command-2 arg1 arg2

becomes

#!/usr/bin/env node

async function main() {
  await execute('interactive-command-1 arg1 arg2')
  await execute('interactive-command-2 arg1 arg2')
}

so I'm trying to write an execute command that works for this use case (a sketch follows at the end of this comment). This is a use case where I don't think the "root" application should always have to do something like

main().then(() => {
  process.stdin.destroy();
});

which I think is the implication here. It's part of the contract of execute that stdio gets forwarded to the subprocess during the Promise execution and gets returned after resolution, so it just feels gross that there's still more cleanup to do.
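
For concreteness, here is a minimal sketch of what that execute() helper could look like (hypothetical; using stdio: 'inherit' hands the parent's real stdio to the child, which sidesteps the pipe()/unpipe() handle problem discussed in this issue):

const { spawn } = require('child_process')

// Hypothetical execute() helper. stdio: 'inherit' forwards the parent's
// stdin/stdout/stderr directly to the child, so there is no pipe() to
// clean up afterwards. Argument splitting is naive, for illustration only.
function execute(command) {
  const [cmd, ...args] = command.split(' ')
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args, { stdio: 'inherit' })
    child.on('error', reject)
    child.on('exit', (code) => {
      if (code === 0) resolve()
      else reject(new Error(`${cmd} exited with code ${code}`))
    })
  })
}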

@robsonos

Hi all, any updates on this? I have a similar issue: after I unpipe a child stream.Transform, the parent stream becomes unresponsive.
