
'close' event for spawn() isn't emitted under unknown conditions #7159

Closed
iamdoron opened this issue Jun 5, 2016 · 21 comments
Assignees
Labels
child_process Issues and PRs related to the child_process subsystem. stream Issues and PRs related to the stream subsystem.

Comments

@iamdoron (Contributor) commented Jun 5, 2016

  • Version: v4.4.5
  • Platform: OS X 10.11.5 (15F34) [Darwin 15.5.0 Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64 x86_64]
  • Subsystem: child_process

I wasn't able to create a code example for this issue, but it happens consistently in our integration tests. We spawn a Python process (2.7.10) with binary encoding; eventually we get the stderr 'end' and process 'exit' events, but no stdout 'end' or process 'close' events. It seems to be specific to some of the tests, but not all. I tested on v4.4.4 (and some other earlier versions) with no similar issues at all, so you might be able to pinpoint the issue without a proper code example.

A sketch of the code (the original assigned the child to a variable named `process`, shadowing the global; renamed to `child` here):

const { spawn } = require('child_process');
const child = spawn(pythonPath, command, { encoding: 'binary' });
child
  .on('exit', (code) => { console.log('exit'); })
  .on('close', (code) => { console.log('close'); }); // in some cases this event never arrives
@addaleax addaleax added the child_process Issues and PRs related to the child_process subsystem. label Jun 5, 2016
@bnoordhuis (Member)

There is nothing in the diff between v4.4.4 and v4.4.5 that stands out. I don't think we'll be able to do much without a test case.

@iamdoron (Contributor, Author) commented Jun 5, 2016

@bnoordhuis I think it might be connected to gunzip; the following test case demonstrates the issue (at least on my machine):

example.sh:

#!/bin/bash
cat ./example.gz

example.gz

JavaScript:

const spawn = require('child_process').spawn;
const zlib = require('zlib');

const child = spawn('./example.sh');
child.on('exit', () => {
  console.log('exit');
}).on('close', () => {
  console.log('close');
});

const gunzip = zlib.createGunzip();
gunzip.on('end', () => {
  console.log('gunzip ended');
});

child.stdout.pipe(gunzip).on('data', function() {});

output on 4.4.5 (on my machine)

exit

on 4.4.4:

exit
close
gunzip ended

@addaleax addaleax added the zlib Issues and PRs related to the zlib subsystem. label Jun 5, 2016
@addaleax (Member) commented Jun 5, 2016

48684af is the only zlib-related change in v4.4.5, and even though it touches the zlib close() logic, it does not seem like it could have this effect. I also can’t reproduce any change in behaviour from v4.4.4 to v4.4.5 using the above example on Linux.

There’s also my 51c0808, which touches the piping logic for streams, but I wouldn’t say that that’s a likely candidate here, either.

@addaleax (Member) commented Jun 5, 2016

@iamdoron Does anything change from Node v4.1.2 to v4.2.0 for you?

@iamdoron (Contributor, Author) commented Jun 5, 2016

I was able to reproduce it on Ubuntu Linux 3.13.0-71-generic #114-Ubuntu SMP Tue Dec 1 02:34:22 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux.

Just gunzip the example.gz file and append the output to a new file until you have ~96 MB of uncompressed data (maybe you'll need more; I guess it depends on the machine), then gzip it and try running the example again.
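The steps above can be scripted roughly as follows. This is only a sketch: the file names, the seed payload, and the `TARGET` variable are assumptions, and `TARGET` defaults to a small value so the script runs quickly; set `TARGET=100000000` to match the ~96 MB from the report.

```shell
#!/bin/sh
# Sketch (assumed file names): grow the uncompressed payload until it
# reaches TARGET bytes, then gzip it back into example.gz.
set -e
TARGET=${TARGET:-4096}
# Seed a tiny example.gz if one is not already present (assumption).
[ -f example.gz ] || printf 'example payload line\n' | gzip -c > example.gz
gunzip -c example.gz > payload
: > payload.big
while [ "$(wc -c < payload.big)" -lt "$TARGET" ]; do
  cat payload >> payload.big
done
gzip -c payload.big > example.gz
echo "uncompressed: $(wc -c < payload.big) bytes"
```

Running the JavaScript example against the resulting example.gz should then show the missing 'close' event on affected versions.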

@MylesBorins (Contributor)

@addaleax I plan to take a look into this on Monday. Taking advantage of GitHub's new ability to assign multiple people!

Feel free to unassign

@addaleax (Member) commented Jun 5, 2016

Can confirm that 51c0808 is actually responsible for this… sigh.

@addaleax addaleax added the stream Issues and PRs related to the stream subsystem. label Jun 5, 2016
@addaleax (Member) commented Jun 5, 2016

/cc @nodejs/streams I guess

@MylesBorins (Contributor)

Metadata from 51c0808:

Fixes: #5820
Fixes: #5257
PR-URL: #6023

@addaleax (Member) commented Jun 5, 2016

Sufficiently large example.gz for reproduction: example.gz

addaleax added a commit to addaleax/node that referenced this issue Jun 5, 2016
Reset the `readableState.awaitDrain` counter after manual calls to
`.resume()`.

What might happen otherwise is that a slow consumer at the end of the
pipe could end up stalling the piping in the following scenario:

1. The writable stream indicates that its buffer is full.
2. This leads the readable stream to `pause()` and increase its
   `awaitDrain` counter, which will be decreased by the writable’s next
   `drain` event.
3. Something calls `.resume()` manually.
4. The readable continues to pipe to the writable, but once again
   the writable stream indicates that the buffer is full.
5. The `awaitDrain` counter is thus increased again, but since it has
   now been increased twice for a single piping destination, the next
   `drain` event will not be able to reset `awaitDrain` to zero.
6. The pipe is stalled and no data is passed along anymore.

The solution in this commit is to reset the `awaitDrain` counter to
zero when `resume()` is called.

Fixes: nodejs#7159
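The stall in the six steps above can be sketched as a minimal simulation of the `awaitDrain` bookkeeping. The helper names are hypothetical and this is not the actual stream internals; it only models a single pipe destination.

```javascript
// Minimal model of the awaitDrain counter (hypothetical names).
function makeState() {
  return { awaitDrain: 0, flowing: true };
}

// write() returning false: bump the counter and pause the source (steps 1-2, 4-5).
function onWriteReturnedFalse(state) {
  state.awaitDrain++;
  state.flowing = false;
}

// A 'drain' event decrements the counter; flow resumes only at zero.
function onDrain(state) {
  if (state.awaitDrain > 0) state.awaitDrain--;
  if (state.awaitDrain === 0) state.flowing = true;
}

// The fix: resume() resets awaitDrain so a manual resume cannot
// leave a stale count that a single 'drain' can never clear.
function resume(state) {
  state.awaitDrain = 0;
  state.flowing = true;
}

// Buggy sequence (manual resume without the reset):
const s = makeState();
onWriteReturnedFalse(s); // counter = 1
s.flowing = true;        // step 3: manual resume, counter NOT reset
onWriteReturnedFalse(s); // counter = 2
onDrain(s);              // counter = 1, still nonzero
console.log(s.flowing);  // false -- the pipe is stalled (step 6)

// Fixed sequence:
const t = makeState();
onWriteReturnedFalse(t);
resume(t);               // counter reset to 0
onWriteReturnedFalse(t); // counter = 1
onDrain(t);              // counter = 0
console.log(t.flowing);  // true -- 'drain' restores the flow
```

With the buggy manual resume, the counter is incremented twice for a single destination, so one 'drain' event can never bring it back to zero; resetting it in `resume()` makes the next 'drain' sufficient.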
@addaleax (Member) commented Jun 5, 2016

PR: #7160

@timlesallen commented Jun 8, 2016

We have had to pin to 4.4.4 because of an issue like this. In our case we are piping from sox through ffmpeg, both of which we spawn(). We get the exit event for sox but not for ffmpeg.

@addaleax (Member) commented Jun 8, 2016

@thealphanerd Is there a reasonable chance the fix will make it into v4.5.0?

@addaleax (Member) commented Jun 8, 2016

For future reference, bisecting using @petrosagg’s test says that this specific problem with child processes was introduced in 34b535f (#4071).

@addaleax addaleax removed the zlib Issues and PRs related to the zlib subsystem. label Jun 8, 2016
@MylesBorins (Contributor)

@addaleax #7160 fixes @petrosagg’s problem too?

@addaleax (Member) commented Jun 8, 2016

@thealphanerd Yes, I think we pretty much worked out that it is a duplicate of this one here.

@MylesBorins (Contributor)

so this was not introduced in v4.4.5?

@addaleax (Member) commented Jun 8, 2016

Yes and no. It was introduced in v4.2.5, but it had no chance to affect anything because it was masked by another regression that was present from v4.2.0 to v4.4.5 (the one fixed by #6023 and reported in #5820/#5257, as you mentioned above).

@MylesBorins (Contributor) commented Jun 8, 2016

OK, so the regression that #6023 fixed was masking this other regression? TL;DR: we need the new fix and everything should be "better".

@addaleax (Member) commented Jun 8, 2016

Yes, exactly. If you decide that you want to let #7160 live in v6 for a longer time first, then it's probably better to revert the #6023 fix temporarily, because I'd consider this bug more serious… I'm pretty sure that #7160 is safe to land in v4 and won't introduce any more regressions, but I get that there are reasons for the rules around this process.

@MylesBorins (Contributor) commented Jun 8, 2016

@addaleax thanks for all the insight; we'll keep you in the loop about the decision.

evanlucas pushed a commit that referenced this issue Jun 16, 2016
(Same commit message as the one above.)

Fixes: #7159
PR-URL: #7160
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>

MylesBorins pushed a commit that referenced this issue Jun 23, 2016
(Same commit message as the one above.)

MylesBorins pushed a commit that referenced this issue Jun 24, 2016
(Same commit message as the one above.)

MylesBorins pushed a commit that referenced this issue Jun 24, 2016
(Same commit message as the one above.)

5 participants