Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert stream base #1140

Closed
wants to merge 16 commits into from
Closed

Revert stream base #1140

wants to merge 16 commits into from

Conversation

piscisaureus
Copy link
Contributor

The introduction of stream_base is a nice idea in itself, but it has not been thoroughly tested and it causes test failures. This is also evidenced by the fact that a ton of followup fixes were necessary which didn't even resolve all the issues. In hindsight it was inappropriate to have landed in on a stable branch.

Therefore I move to revert it, and continue development on the 'NG' branch until it is rock solid.

@indutny
cc @bnoordhuis, @rvagg, @mikeal

@bnoordhuis
Copy link
Member

Reverting 563771d and 671fbd5 seems a bit unfortunate. LGTM apart from that.

@Fishrock123
Copy link
Contributor

Fwiw the paypal leak was originally happening on a release that was prior to stream_base's introduction.

@rvagg
Copy link
Member

rvagg commented Mar 13, 2015

https://jenkins-iojs.nodesource.com/job/iojs+any-pr+multi/293/

This is probably a tc-agenda item because it involves a bit of policy, we're pretty deep in to stream_base so far and reverting it could be considered a breaking change. My personal preference would be to try and fix the remaining problems with it at this stage.

  • the leak in Possible memory leak in TLS(Wrap?) #1075, which we don't know is related to this, PayPal mentioned the possibility of a leak long before stream_base changes even showed up and the most egregious parts of the leak appear to have been resolved already
  • the windows problems as documented in [PRIORITY] Persistent failures on Windows #1005 for which we don't appear to have the personnel to fix and I suspect is at the root of this PR, I'll be interested in seeing how CI reports on the 3 remaining problems, this could tip the scales because if we can't get stuff fixed on Windows then it may justify this kind of drastic action

@rvagg
Copy link
Member

rvagg commented Mar 13, 2015

@rvagg
Copy link
Member

rvagg commented Mar 13, 2015

compare to details in #1005 where we are down to 3 consistent failures on Windows, it looks like this may not address those and they go further back than stream_base.

@piscisaureus
Copy link
Contributor Author

Reverting 563771d and 671fbd5 seems a bit unfortunate. LGTM apart from that.

I will put those commits back; I had to revert them to cleanly revert earlier commits.

compare to details in #1005 where we are down to 3 consistent failures on Windows, it looks like this may not address those and they go further back than stream_base.

I spent the better part of my night debugging test failures. Although I haven't identified the exact root cause, it does seem that stream_base has introduced a serious bug where writable streams are closed before all data is fully flushed. Because this tracking is done in javascript, I suspect unix may be affected as well. See #1066 (comment).

On my local machine this revert fixes a bunch of tests. There are a few remaining failures that seem unrelated, I'm digging into them. #1137 fixes one of those.

@piscisaureus
Copy link
Contributor Author

This is probably a tc-agenda item because it involves a bit of policy, we're pretty deep in to stream_base so far and reverting it could be considered a breaking change. My personal preference would be to try and fix the remaining problems with it at this stage.

I agree, this is totally a policy issue.
We need a strategy to pull ourselves out of the morass when something like this happens.

Note that I have no objections to the stream_base idea itself, and I expect it to re-land when the kinks are worked out.

@indutny
Copy link
Member

indutny commented Mar 13, 2015

@piscisaureus we didn't get any reports about TLS leaks except #1075, which we are quite close to narrowing down. Considering this and the comments in this thread regarding the issue happening before the changes - the TLS should be fixed soon and the leak doesn't appear to be affecting anyone else.

The tests on windows, @piscisaureus which tests are failing before these changes and which are failing after reverting them? It appears to me that there is only two test failures at the moment on windows:

  • One with tls wrapping of socket
  • And one that was introduced by stream_base changes themselves.

Considering that the latter one doesn't happen on my machine at all - could you please confirm that it is not failing right before the stream_base changes?

Overall, I'm committed to resolving this issues and quite sure that they will be figured out soon.

@rvagg
Copy link
Member

rvagg commented Mar 13, 2015

also, I'm +1 on using reversion PRs to put a fire under those responsible for the changes to give them a chance to rectify problems--fix or be reverted!

@indutny
Copy link
Member

indutny commented Mar 13, 2015

Anyway, I'll be in office in a couple of hours - and will try to nail it down ;)

@indutny
Copy link
Member

indutny commented Mar 13, 2015

@piscisaureus I think tls-over-http test is a sibling of e1bf670 , going to figure it out on FreeBSD, because it is much easier to do this there :)

@piscisaureus
Copy link
Contributor Author

@indutny

I think tls-over-http test is a sibling of e1bf670 , going to figure it out on FreeBSD, because it is much easier to do this there :)

Ok, that's cool.

@indutny
Copy link
Member

indutny commented Mar 13, 2015

The key difference appears to be between this:

screen shot 2015-03-13 at 12 44 59
screen shot 2015-03-13 at 12 45 37

@indutny
Copy link
Member

indutny commented Mar 14, 2015

Ha, I fixed that test :)

@indutny
Copy link
Member

indutny commented Mar 14, 2015

Fixed the test with #1155 , still needs some investigation, but it works now.

@indutny
Copy link
Member

indutny commented Mar 14, 2015

Just a follow-up comment test-tls-over-http-tunnel is fixed in #1155, and could have been actually failing prior to StreamBase if TLSWrap wouldn't allow writev() (which I accidentally did in StreamBase patches).

So what do we have now? It seems that a leak in #1075 and a windows test that was introduced in StreamBase changes itself (and is not failing on my machine). Have I missed anything else? cc @piscisaureus

@indutny
Copy link
Member

indutny commented Mar 14, 2015

Oh, and according to @Fishrock123 the #1075 was happening prior to StreamBase (cc @jasisk, may I ask you to confirm this?).

@jasisk
Copy link

jasisk commented Mar 14, 2015

Confirmed. We observed this leak from the first version of io.js we introduced into the application mentioned in #1075 (v1.1.0).

@indutny
Copy link
Member

indutny commented Mar 14, 2015

@piscisaureus I have bisected the io.js/node.js commit log, and found that starting from eef0715 it fails when removing .cork()/.uncork().

@indutny
Copy link
Member

indutny commented Mar 14, 2015

@piscisaureus considering the change itself, it appears that there is some deep issue in net with shutdown request terminated on .destroy().

@bnoordhuis do you think it is worth fixing? Or should I figure this out internally in tls module?

@rvagg rvagg mentioned this pull request Mar 14, 2015
@indutny
Copy link
Member

indutny commented Mar 15, 2015

Can we close this issue now?

@piscisaureus
Copy link
Contributor Author

If the issue is fixed then there is no need to do the revert. But I'd like to keep it open until the next TC meeting because there are some tangential questions that need answering.

@indutny
Copy link
Member

indutny commented Mar 24, 2015

Closing?

@indutny indutny closed this Mar 24, 2015
@indutny indutny deleted the revert-stream-wrap branch March 24, 2015 02:39
@rvagg rvagg removed the tc-agenda label Apr 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants