Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP/1 request destroy behavior change on framing error #24586

Closed
dougwilson opened this issue Nov 24, 2018 · 38 comments
Closed

HTTP/1 request destroy behavior change on framing error #24586

dougwilson opened this issue Nov 24, 2018 · 38 comments
Labels
http Issues or PRs related to the http subsystem.

Comments

@dougwilson
Copy link
Member

  • Version: v10.13.0
  • Platform: Windows 10
  • Subsystem: http

I noticed a different in how the HTTP/1 request objects are left after a HTTP/1 framing error (which causes an error in the underlying http_parser). In Node.js 8 and below, the request object would be destroyed and be left in non-readable state (i.e. req.readable === false). It seems like in Node.js 10+ this is no longer the case. Is this an expected change? Express.js has underlying machinery that is looking to see if a request body should be read (since the request starts out in a pasued state since 0.10) and it doesn't attempt to read when it's no longer readable (since events like 'end' will not be emitted, leaving things hanging around forever).

Here is an example that reproduces the scenario:

var http = require('http')
var net = require('net')

var server = http.createServer(function (req, res) {
  setTimeout(function () {
    if (!req.socket.readable) {
      console.log('req already terminated')
      server.close()
      return
    }

    console.log('req read start')

    var bufs = []

    req.on('data', function (chunk) {
      console.log('request recv %d bytes', chunk.length)
      bufs.push(chunk)
    })

    req.on('end', function () {
      var data = Buffer.concat(bufs)
      console.log('request got %d bytes', data.length)
      res.end('OK')
    })

    req.on('close', function () {
      console.log('clean up here...')
      server.close()
    })
  }, 5000)
})

server.listen(0, function () {
  var port = server.address().port
  var sock = net.createConnection(port, '127.0.0.1')

  sock.on('connect', function () {
    sock.write('POST / HTTP/1.1\r\n')
    sock.write('Host: localhost\r\n')
    sock.write('Transfer-Encoding: chunked\r\n')
    sock.write('\r\n')
    sock.write('W')
    sock.on('close', function () {
      console.log('client socket closed')
    })
  })
})

setTimeout(function () {
  console.log('timeout!')
  process.exit(1)
}, 10000).unref()

In Node.js 8 and lower you get the following output:

client socket closed
req already terminated

In Node.js 10 you get the following output:

req read start
timeout!

I'm not sure if this was an expected change or not. If it is an expected change, I'm just looking for what the correct way to know if a bit of code (the part inside setTimeout) that runs at an arbitrary time after the request came in should know if it's actually ever going to get a complete read or not. The only thing I've found so far in Node.js 10 is that you have to try and read no matter what and just rely on coding in a read timeout to know when to give up instead of knowing earlier.

@addaleax addaleax added the http Issues or PRs related to the http subsystem. label Nov 24, 2018
@addaleax
Copy link
Member

@nodejs/http

@addaleax
Copy link
Member

Bisecting points to 9b7a691.

/cc @lpinca

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

I think the problem is that the client socket is paused. If you add sock.resume() after the 'connect' event is emitted, it works as expected.

@dougwilson
Copy link
Member Author

Thanks @lpinca . So there are I guess two issues with that suggestion: (1) this was just a replication all within Node.js, but originally the client was a C library, not Node.js, and (2) the question here is really about the server; I don't think the server really gets a say over what the clients are doing in the end (I mean, the client sent bad framing already :) ).

Is the suggested solution here that whatever client code out there just needs to be altered such that Node.js streams on the server code are acting like they used to? Or are you saying something else?

@dougwilson
Copy link
Member Author

Just thinking more about how to articulate my thoughts: I feel like when a framing error occurs when the client is transmitting the body of the http request, there should be some way to understand on the req that this happened and the request is in an error state or some such. There really just seems like no way to do this in Node.js 10 as far as I can tell (I opened a related issue #24585 about that). Of course in the other issue, it was an event on socket that emitted even prior to attaching a 'data' event listener when the request stream starts out in paused mode, so I was using req.readable in prior versions to infer that an error occurred.

I'm just generally at a loss of how, in Node.js 10+, you can tell that a request is in an error state any more to do able to process it differently (for example, skip even bothering to read the request body, typically just req.resume() it to dump it out without storing it).

A typical framing mistake clients will make in chunked mode is they end the request with \r\n\r\n improperly, instead of with a zero-length chunk (\r\n0\r\n\r\n).

It seems that one difference here is that in Node.js 8 and below this mistake would end up calling .destroy() on the object, which set the request object to be in the non-readable state, but now in Node.js 10+ there is some code that is instead writing a HTTP/1.1 400 Bad Request back and then I guess trying to gracefully close down the connection, which it seems that clients can then trigger this state if they are not reading off the socket on their side for whatever reason (clients with framing errors are just generally malfunctional, though).

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

Is the suggested solution here that whatever client code out there just needs to be altered such that Node.js streams on the server code are acting like they used to?

No, I don't think so, the problem in the example is that no one is closing the TCP connection because there is buffered data to consume on the client. Only when buffered data is consumed on the client the socket is closed but this is how it works in Node.js. If the C lib was closing the socket then no changes should be necessary.

The question is more about how the parser handle errors. If the socket is not explicitly destroyed on error, it is still readable as long as there is buffered data to read.

@dougwilson
Copy link
Member Author

the problem in the example is that no one is closing the TCP connection because there is buffered data to consume on the client. Only when buffered data is consumed on the client the socket is closed but this is how it works in Node.js. If the C lib was closing the socket then no changes should be necessary.

Right, but that is what I'm saying: the c client is not consuming the data, just like the code I provided above. It was working in node.js 8. How can I get the server to behave the same way in Node.js 10 given the same client behavior?

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

Make the client close the connection? One of the two peer must do that. If that particular client (not affected by the Node.js breaking change) was doing it before, why isn't it doing now? Or does it?

@dougwilson
Copy link
Member Author

And to clarify: what I mean by not consuming is just the term you were using. Specifically in the c code, it is reading the data, but it does not send back a ACK to the server's FIN, which is why Node.js thinks the other side is open still (I used Wireshark to see the exact packets being exchanged). The c client sees the HTTP 400 Bad Request response, but it's some poorly written thing by a vendor and since that doesn't have connection: close it it assuming the default http/1.1 behavior of keep alive I guess? I don't know why it does not ACK the server's FIN.

But what I can probably do in Express.js is just also check the writable state of the socket. Probably just assume when socket.writable === false, then don't bother trying to read the request since something has happened where the socket is half closed now. I don't exactly understand yet what the ramifications of such an assumption would be, though. Ideally if there was a way to know that there was a parser error that got the node.js side into this state...

@dougwilson
Copy link
Member Author

Make the client close the connection? One of the two peer must do that. If that particular client (not affected by the Node.js breaking change) was doing it before, why isn't it doing now? Or does it?

It is some vendor code, I can't just alter the client code... I just want to be able to skip reading a request that is in a broken state on the node.js side is all. Lots of clients are going to interact with a server that are not under the control of the same people who control the server, right?

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

Yeah definitely. I think one way is to use the 'clientError' event which is emitted on parse error and destroy the socket, like it was done before f2f391e.

@dougwilson
Copy link
Member Author

But that doesn't really help in my example above. The example is that there is code that executes to read a request (the body of setTimeout) that runs at a later point (after doing things like looking up and validating auth headers).

Can you share what an example would look like to out un place of the req.socket.readable in your example?

Or are you saying that the solution is that it will just work as it as long as I add a custom clientError event listener to the server object that just reinstates the old socket.destroy() behavior? That's unfortunately not possible to make a change in Express.js to fix the behavior for users since Express.js does not create the server object to attach event listeners to; it just returns the requestListener function the user needs to pass to their own http (or https) server.

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

Or are you saying that the solution is that it will just work as it as long as I add a custom clientError event listener to the server object that just reinstates the old socket.destroy() behavior?

This.

That's unfortunately not possible to make a change in Express.js to fix the behavior for users since Express.js does not create the server object to attach event listeners to; it just returns the requestListener function the user needs to pass to their own http (or https) server.

How about this hack:

function destroy(err, socket) {
  socket.destroy(err)
}

and inside the 'request' listener add

req.socket.server.on('clientError', destroy);

Edit: ofc the event listener should be added only once.

@dougwilson
Copy link
Member Author

Oh, didn't realize there was a server property on the socket. I don't see that documented anywhere, is that public API? So two questions on that, though (1) would there be any negative consequences doing that if existing servers are doing something in that event and (2) what is a good way to only add the event listener once?

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

  1. would there be any negative consequences doing that if existing servers are doing something in that event

Yes, potentially, if something is being written on the socket with multiple .write() calls, the socket could be destroyed prematurely, but in this case, if there is already a handler, it probably doesn't make sense to add a new one as it's already responsibility of the existing one to handle the event properly?

  1. what is a good way to only add the event listener once

I think, emitter.listenerCount() can be used but that is only available on Node.js >= 3.2.0

@dougwilson
Copy link
Member Author

dougwilson commented Nov 24, 2018

Ok, so circling back around to my very initial question: is this an expected change and Express.js and it's users just need to deal with the new behavior, as there us nothing Node.js will be changing in this regard, is that correct? @lpinca

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

Yes, reverting 9b7a691 or f2f391e is not an option imho. I think f2f391e was not a good idea but it got a lot of approvals and it's too late now.

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

That's just my opinion though, let's see wait for other collaborators to chime in.

@dougwilson
Copy link
Member Author

dougwilson commented Nov 24, 2018

So a general streams question: if a stream has an error while it's in a paused state, is there no way to know the stream is in error when you start reading the stream?

Right now this seems like a bad stream design as it is: the request errored out and there is apparently zero way to know this. It would of course be nice to know this before allocating buffers, decoders, etc to start reading into on the server.

But even worse is that there seems to be zero way to realize this even after you start reading in the example above; the only thing the server can do is wait for the (default 2 minute) timeout on the socket, even though the client cannot send any more data since the http_parser entered into an error state.

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

@dougwilson I think that's an issue with all streaming parsers. You don't know about the error until the bad chunk is actually processed. There is no error at TCP layer in this case.

@dougwilson
Copy link
Member Author

dougwilson commented Nov 24, 2018

Right, but I'm saying in this case there is zero way, during the read, to know about the http parse error. This was possible in previous Node.js version by listening to req.socket.on(error ... I think this is a regression in Node.js 10. How can you tell that the error occurred in this case to just stop the read? There is an error occurring here, but no way to see it in Node.js 10 like you could in previous Node.js versions.

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

I don't know, it seems no 'error' event was emitted on the net.Socket when an http parse error occurred.

'clientError' was emitted on the server and it still is but behavior changed.

@dougwilson
Copy link
Member Author

dougwilson commented Nov 24, 2018

In previous versions of Node.js there is an error emitted on the socket. See the issue #24585 for the example to run to reproduce.

@dougwilson
Copy link
Member Author

The clientError event is unusable because it happens outside of the code that would be trying to read the request body.

@lpinca
Copy link
Member

lpinca commented Nov 24, 2018

The 'error' event was emitted on the socket because destory() was called on it with an error see f2f391e#diff-feaf3339998a19f0baf3f82414762c22R468.

I didn't test but if you use the 'clientError' event and destroy the socket with the error you get along with the event, you should get the same old behavior.

@MylesBorins
Copy link
Contributor

MylesBorins commented Nov 27, 2018

/cc @nodejs/tsc

@dougwilson
Copy link
Member Author

dougwilson commented Nov 30, 2018

I'm closing this because it seems Node.js core is not interested in getting Express.js to correctly function on current Node.js version like 10+.

lpinca added a commit to lpinca/node that referenced this issue Nov 30, 2018
Destroy the socket if the `'clientError'` event is emitted and there is
no listener for it.

Fixes: nodejs#24586
@mcollina
Copy link
Member

cc @nodejs/moderation can one of you help here? This is quite unusual and I am on mobile, thanks.

@nodejs nodejs deleted a comment from dougwilson Nov 30, 2018
@nodejs nodejs deleted a comment from mcollina Nov 30, 2018
@nodejs nodejs deleted a comment from dougwilson Nov 30, 2018
@nodejs nodejs deleted a comment from mcollina Nov 30, 2018
@nodejs nodejs deleted a comment from dougwilson Nov 30, 2018
@nodejs nodejs deleted a comment from mcollina Nov 30, 2018
@nodejs nodejs deleted a comment from dougwilson Nov 30, 2018
@nodejs nodejs deleted a comment from dougwilson Nov 30, 2018
@nodejs nodejs deleted a comment from dougwilson Nov 30, 2018
@nodejs nodejs deleted a comment from mcollina Nov 30, 2018
@lpinca
Copy link
Member

lpinca commented Nov 30, 2018

I've deleted some comments as per request.

@Fishrock123
Copy link
Contributor

So the root issue here is that paused streams don’t get errors?

There is a lot to digest here and it seems that something is not right but it is unclear what that might be.

@lpinca
Copy link
Member

lpinca commented Dec 1, 2018

@Fishrock123 no, the issue is that the socket is no longer destroyed on parse error as per f2f391e. The server socket waits for the client to close the connection. However this might never happen.

@mcollina mcollina added tsc-agenda Issues and PRs to discuss during the meetings of the TSC. and removed tsc-agenda Issues and PRs to discuss during the meetings of the TSC. labels Dec 3, 2018
@Trott Trott closed this as completed in ff7d053 Dec 4, 2018
@Trott
Copy link
Member

Trott commented Dec 4, 2018

I'm under the impression that this can be closed now that #24757 has landed. I understand the TSC has things to discuss related to this issue, but I believe they're meta-issues and the specific problem is solved. If I'm mistaken, please re-open or comment to that effect, of course. Thanks! /cc @dougwilson @lpinca

@Trott
Copy link
Member

Trott commented Dec 4, 2018

(Or perhaps this should stay open until that change lands in v10.x?)

@lpinca
Copy link
Member

lpinca commented Dec 4, 2018

I think it can be closed. This behaviour change discussed in this thread has been there since Node.js v9.0.0.

@mcollina mcollina removed the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Dec 5, 2018
BridgeAR pushed a commit that referenced this issue Dec 5, 2018
Destroy the socket if the `'clientError'` event is emitted and there is
no listener for it.

Fixes: #24586

PR-URL: #24757
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
MylesBorins pushed a commit that referenced this issue Dec 5, 2018
Destroy the socket if the `'clientError'` event is emitted and there is
no listener for it.

Fixes: #24586

PR-URL: #24757
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
refack pushed a commit to refack/node that referenced this issue Jan 14, 2019
Destroy the socket if the `'clientError'` event is emitted and there is
no listener for it.

Fixes: nodejs#24586

PR-URL: nodejs#24757
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants