Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: ensure pipeline always destroys streams #31940

Closed

Conversation

ronag
Copy link
Member

@ronag ronag commented Feb 24, 2020

There was an edge case where an incorrect assumption was made
in regards to whether eos/finished means that the stream is
actually destroyed or not.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@ronag ronag added the stream Issues and PRs related to the stream subsystem. label Feb 24, 2020
There was an edge case where an incorrect assumption was made
in regardos whether eos/finished means that the stream is
actually destroyed or not.
@ronag ronag force-pushed the stream-pipeline-always-destroy branch from 23c5c02 to 5a55b1d Compare February 24, 2020 22:42
@nodejs-github-bot
Copy link
Collaborator

@ronag
Copy link
Member Author

ronag commented Feb 24, 2020

This is blocking backport of #31223 to node V13 which does not have autoDestroy enabled by default, thus some tests fail.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a backport it should target the relative branch (v13-staging in this case). Moreover I would expect to see a rather different change compared to what is on master.

@ronag
Copy link
Member Author

ronag commented Feb 25, 2020

If this is a backport it should target the relative branch (v13-staging in this case). Moreover I would expect to see a rather different change compared to what is on master.

This is not a backport. It fixes a problem on master for streams that do not autoDestroy. However, it should be backported as well and is required for backporting #31223 to v13. Since in master we changed default autoDestroy to true which is not (and will not be) included in v13.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@nodejs-github-bot
Copy link
Collaborator

@ronag ronag added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Feb 25, 2020
ronag added a commit that referenced this pull request Feb 26, 2020
There was an edge case where an incorrect assumption was made
in regardos whether eos/finished means that the stream is
actually destroyed or not.

PR-URL: #31940
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
@ronag
Copy link
Member Author

ronag commented Feb 26, 2020

Landed in b2be348

@ronag ronag closed this Feb 26, 2020
ronag added a commit to nxtedition/node that referenced this pull request Feb 27, 2020
There was an edge case where an incorrect assumption was made
in regardos whether eos/finished means that the stream is
actually destroyed or not.

Backport-PR-URL: nodejs#31975
PR-URL: nodejs#31940
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
codebytere pushed a commit that referenced this pull request Mar 1, 2020
There was an edge case where an incorrect assumption was made
in regardos whether eos/finished means that the stream is
actually destroyed or not.

Backport-PR-URL: #31975
PR-URL: #31940
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
@codebytere codebytere mentioned this pull request Mar 1, 2020
@arcanis
Copy link
Contributor

arcanis commented Mar 5, 2020

@ronag It seems this diff has introduced a major regression in Got:
sindresorhus/got#1107 (comment)

Note that I don't know if it's Got that relied on a buggy behavior or if it's an actual regression, but I figured I should ref the issues together 🙂

ronag referenced this pull request Mar 10, 2020
If the last stream in a pipeline is still usable/readable
don't destroy it to allow further composition.

Fixes: #32105

PR-URL: #32110
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
@targos
Copy link
Member

targos commented Apr 20, 2020

Should this land on v12.x-staging? If so, please open a backport PR.

@ronag
Copy link
Member Author

ronag commented Apr 20, 2020

Should not land on 12.x. Has indirect consequences that can cause breakage.

@mafintosh
Copy link
Member

Does this work with things like a TCP stream where destroy actually destroys the underlying fd was finish still can mean its waiting for FIN ack?

@ronag
Copy link
Member Author

ronag commented Apr 20, 2020

Does this work with things like a TCP stream where destroy actually destroys the underlying fd was finish still can mean its waiting for FIN ack?

Sorry, I didn't get that?

@mafintosh
Copy link
Member

simple example pipeline(socket, socket). This calls destroy automatically after eos now yes? Ie basically forces autoDestroy if I’m understanding that correctly.

For streams that have postfinish state this can be problematic. For example TCP (this may have changed since I looked at it last), the stream emits finish when all pending data has been sent but end() has a side effect. Its sends a FIN packet also. Now you auto destroy it meaning FIN wont get resend if that packet is lost.

My utp-native (tcp like stream over udp) module will def break because of this due to o/

@ronag
Copy link
Member Author

ronag commented Apr 20, 2020

For streams that have postfinish state this can be problematic. For example TCP (this may have changed since I looked at it last), the stream emits finish when all pending data has been sent but end() has a side effect. Its sends a FIN packet also. Now you auto destroy it meaning FIN wont get resend if that packet is lost.

I'm not familiar with this so I can't really answer. Maybe @addaleax or @jasnell?

If more work needs to be done, e.g. send a FIN packet, I think that should be handled by _destroy alternatively 'finish' should be delayed until the FIN stuff is complete (I think this might be what currently happens?).

Though, in v14 pipeline won't call destroy as it assumes that "standard" streams (such as net.Socket since #31806) will handle it themselves (#32158).

@mafintosh
Copy link
Member

follow up: wont this break all duplex streams?

net.createServer(function (socket) {
  pipeline(socket, a)
  pipeline(b, socket)
})

Wont this destroy the socket when either of those pipelines are done?

@mafintosh
Copy link
Member

I can see tcp uses autoDestroy so it that part is prob fine, re my first q

@ronag
Copy link
Member Author

ronag commented Apr 20, 2020

Wont this destroy the socket when either of those pipelines are done?

There has been follow up PRs that ensure that if:

  • The first stream is writable, it won't be destroyed
  • The last stream is readable, it won't be destroyed

The current implementation is a little bit messy, but if you look at this PR which cleans it up a bit it might be clear.

@ronag
Copy link
Member Author

ronag commented Apr 20, 2020

@mafintosh Anything still left unanswered?

@mafintosh
Copy link
Member

@ronag i think the other pr solves my above issue. Didn’t realise this was older.

@mafintosh
Copy link
Member

Have you tested this with http servers? Calling destroy on a req destroys the full socket I think

@ronag
Copy link
Member Author

ronag commented Apr 20, 2020

Have you tested this with http servers? Calling destroy on a req destroys the full socket I think

Yes, I noticed

@mafintosh
Copy link
Member

mafintosh commented Apr 20, 2020

Heads up, this breaks tar-stream also due to the same reasons as HTTP. Just verified on latest 13.

@mafintosh
Copy link
Member

Test case

echo hello > hello.txt
echo world > world.txt
tar c hello.txt world.txt > test.tar
const tar = require('tar-stream')
const fs = require('fs')
const path = require('path')
const pipeline = require('stream').pipeline

fs.createReadStream('test.tar')
  .pipe(tar.extract())
  .on('entry', function (header, stream, done) {
    console.log(header.name) // in 13 this will only unpack one file due to
                                                 // pipeline destroying the entire stream
    pipeline(stream, fs.createWriteStream(path.join('/tmp', header.name)), done)
  })

@mafintosh
Copy link
Member

Let me put that into a proper issue

@mafintosh
Copy link
Member

Thanks for the navigation help @ronag :)

ronag added a commit to nxtedition/node that referenced this pull request Apr 21, 2020
This PR logically reverts nodejs#31940 which
has caused lots of unnecessary breakage in the ecosystem.

This PR also aligns better with the actual documented behavior:

`stream.pipeline()` will call `stream.destroy(err)` on all streams except:
  * `Readable` streams which have emitted `'end'` or `'close'`.
  * `Writable` streams which have emitted `'finish'` or `'close'`.

The behavior introduced in nodejs#31940
was much more aggressive in terms of destroying streams. This was
good for avoiding potential resources leaks however breaks some
common assumputions in legacy streams.

Furthermore, it makes the code simpler and removes some hacks.

Fixes: nodejs#32954
Fixes: nodejs#32955
ronag added a commit that referenced this pull request Apr 23, 2020
This PR logically reverts #31940
which has caused lots of unnecessary breakage in the ecosystem.

This PR also aligns better with the actual documented behavior:

`stream.pipeline()` will call `stream.destroy(err)` on all streams
except:
  * `Readable` streams which have emitted `'end'` or `'close'`.
  * `Writable` streams which have emitted `'finish'` or `'close'`.

The behavior introduced in #31940
was much more aggressive in terms of destroying streams. This was
good for avoiding potential resources leaks however breaks some
common assumputions in legacy streams.

Furthermore, it makes the code simpler and removes some hacks.

Fixes: #32954
Fixes: #32955

PR-URL: #32968
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Mathias Buus <mathiasbuus@gmail.com>
ronag added a commit to nxtedition/node that referenced this pull request Apr 23, 2020
This PR logically reverts nodejs#31940 which
has caused lots of unnecessary breakage in the ecosystem.

This PR also aligns better with the actual documented behavior:

`stream.pipeline()` will call `stream.destroy(err)` on all streams except:
  * `Readable` streams which have emitted `'end'` or `'close'`.
  * `Writable` streams which have emitted `'finish'` or `'close'`.

The behavior introduced in nodejs#31940
was much more aggressive in terms of destroying streams. This was
good for avoiding potential resources leaks however breaks some
common assumputions in legacy streams.

Furthermore, it makes the code simpler and removes some hacks.

Fixes: nodejs#32954
Fixes: nodejs#32955

PR-URL: nodejs#32968
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Mathias Buus <mathiasbuus@gmail.com>
Backport-PR-URL: nodejs#32980
BethGriggs pushed a commit that referenced this pull request Apr 27, 2020
This PR logically reverts #31940
which has caused lots of unnecessary breakage in the ecosystem.

This PR also aligns better with the actual documented behavior:

`stream.pipeline()` will call `stream.destroy(err)` on all streams
except:
  * `Readable` streams which have emitted `'end'` or `'close'`.
  * `Writable` streams which have emitted `'finish'` or `'close'`.

The behavior introduced in #31940
was much more aggressive in terms of destroying streams. This was
good for avoiding potential resources leaks however breaks some
common assumputions in legacy streams.

Furthermore, it makes the code simpler and removes some hacks.

Fixes: #32954
Fixes: #32955

PR-URL: #32968
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Mathias Buus <mathiasbuus@gmail.com>
BethGriggs pushed a commit that referenced this pull request Apr 28, 2020
This PR logically reverts #31940
which has caused lots of unnecessary breakage in the ecosystem.

This PR also aligns better with the actual documented behavior:

`stream.pipeline()` will call `stream.destroy(err)` on all streams
except:
  * `Readable` streams which have emitted `'end'` or `'close'`.
  * `Writable` streams which have emitted `'finish'` or `'close'`.

The behavior introduced in #31940
was much more aggressive in terms of destroying streams. This was
good for avoiding potential resources leaks however breaks some
common assumputions in legacy streams.

Furthermore, it makes the code simpler and removes some hacks.

Fixes: #32954
Fixes: #32955

PR-URL: #32968
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Mathias Buus <mathiasbuus@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants