HTTP/1 request destroy behavior change on framing error #24586
Comments
@nodejs/http |
I think the problem is that the client socket is paused. If you add |
Thanks @lpinca . So there are I guess two issues with that suggestion: (1) this was just a replication all within Node.js, but originally the client was a C library, not Node.js, and (2) the question here is really about the server; I don't think the server really gets a say over what the clients are doing in the end (I mean, the client sent bad framing already :) ). Is the suggested solution here that whatever client code out there just needs to be altered such that Node.js streams on the server code are acting like they used to? Or are you saying something else? |
Just thinking more about how to articulate my thoughts: I feel like when a framing error occurs when the client is transmitting the body of the http request, there should be some way to understand on the I'm just generally at a loss of how, in Node.js 10+, you can tell that a request is in an error state any more to be able to process it differently (for example, skip even bothering to read the request body, typically just A typical framing mistake clients will make in chunked mode is they end the request with It seems that one difference here is that in Node.js 8 and below this mistake would end up calling |
No, I don't think so. The problem in the example is that no one is closing the TCP connection because there is buffered data to consume on the client. The socket is closed only once the buffered data is consumed on the client, but this is how it works in Node.js. If the C lib was closing the socket then no changes should be necessary. The question is more about how the parser handles errors. If the socket is not explicitly destroyed on error, it is still readable as long as there is buffered data to read. |
Right, but that is what I'm saying: the C client is not consuming the data, just like the code I provided above. It was working in Node.js 8. How can I get the server to behave the same way in Node.js 10 given the same client behavior? |
Make the client close the connection? One of the two peers must do that. If that particular client (not affected by the Node.js breaking change) was doing it before, why isn't it doing so now? Or does it? |
And to clarify: what I mean by not consuming is just the term you were using. Specifically in the C code, it is reading the data, but it does not send back an ACK to the server's FIN, which is why Node.js thinks the other side is still open (I used Wireshark to see the exact packets being exchanged). The C client sees the HTTP 400 Bad Request response, but it's some poorly written thing by a vendor, and since the response doesn't have connection: close, it is assuming the default HTTP/1.1 keep-alive behavior, I guess? I don't know why it does not ACK the server's FIN. But what I can probably do in Express.js is just also check the writable state of the socket. Probably just assume that when socket.writable === false, there is no point trying to read the request, since something has happened that left the socket half closed. I don't exactly understand yet what the ramifications of such an assumption would be, though. Ideally there would be a way to know that a parser error got the Node.js side into this state... |
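A minimal sketch of the socket.writable check floated in the comment above, for illustration only. The helper name shouldAttemptBodyRead is hypothetical, and treating a non-writable socket as "do not read" is exactly the assumption the comment says has unknown ramifications; this is not something Express.js or Node.js is confirmed to do.

```javascript
// Hypothetical helper: decide whether reading the request body is worth
// attempting. Assumes a half-closed socket (writable === false) means the
// server side already gave up on this connection, e.g. after a parse error.
function shouldAttemptBodyRead(req) {
  // The pre-existing check described in the issue: a non-readable request
  // will never emit 'end'.
  if (!req.readable) return false;

  const socket = req.socket;
  // No socket, or an already destroyed one: nothing more will arrive.
  if (!socket || socket.destroyed) return false;

  // Assumption from the comment above: treat a non-writable socket as a
  // signal that the request is in a broken state.
  return socket.writable !== false;
}
```

A caller would run this check at the point where it is about to start reading (for example inside the delayed setTimeout body from the issue description) and skip the read when it returns false.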
It is some vendor code, I can't just alter the client code... I just want to be able to skip reading a request that is in a broken state on the node.js side is all. Lots of clients are going to interact with a server that are not under the control of the same people who control the server, right? |
Yeah definitely. I think one way is to use the |
But that doesn't really help in my example above. The example is that there is code that executes to read a request (the body of setTimeout) that runs at a later point (after doing things like looking up and validating auth headers). Can you share what an example would look like to put in place of the req.socket.readable in your example? Or are you saying that the solution is that it will just work as is, as long as I add a custom clientError event listener to the server object that just reinstates the old socket.destroy() behavior? Unfortunately, it's not possible to make a change in Express.js to fix the behavior for users, since Express.js does not create the server object to attach event listeners to; it just returns the requestListener function the user needs to pass to their own http (or https) server. |
This.
How about this hack: function destroy(err, socket) {
socket.destroy(err)
} and inside the req.socket.server.on('clientError', destroy); Edit: ofc the event listener should be added only once. |
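One possible way to attach the listener from the hack above only once per server is sketched below. The helper name ensureClientErrorDestroys and the WeakSet bookkeeping are illustrative assumptions, not part of the thread, and the sketch relies on req.socket.server, which the next comment notes does not appear to be documented.

```javascript
// Servers that already have the fallback listener attached. A WeakSet is
// used so that servers can still be garbage collected.
const patchedServers = new WeakSet();

// Same behavior as the hack above: destroy the socket on a client error.
function destroyOnClientError(err, socket) {
  socket.destroy(err);
}

// Hypothetical helper, meant to be called from a request listener.
// Returns true if the listener was added by this call, false otherwise.
function ensureClientErrorDestroys(req) {
  // Assumption: req.socket.server points at the owning http.Server.
  const server = req.socket && req.socket.server;
  if (!server || patchedServers.has(server)) return false;

  patchedServers.add(server);
  server.on('clientError', destroyOnClientError);
  return true;
}
```

Note that any 'clientError' listeners the application already registered will still run alongside this one, so whether this is safe depends on what those listeners do with the socket.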
Oh, didn't realize there was a server property on the socket. I don't see that documented anywhere, is that public API? So two questions on that, though (1) would there be any negative consequences doing that if existing servers are doing something in that event and (2) what is a good way to only add the event listener once? |
Yes, potentially, if something is being written on the socket with multiple
I think, |
Ok, so circling back around to my very initial question: is this an expected change, and Express.js and its users just need to deal with the new behavior, as there is nothing Node.js will be changing in this regard? Is that correct? @lpinca |
That's just my opinion though, let's wait for other collaborators to chime in. |
So a general streams question: if a stream has an error while it's in a paused state, is there no way to know the stream is in error when you start reading the stream? Right now this seems like a bad stream design as it is: the request errored out and there is apparently zero way to know this. It would of course be nice to know this before allocating buffers, decoders, etc to start reading into on the server. But even worse is that there seems to be zero way to realize this even after you start reading in the example above; the only thing the server can do is wait for the (default 2 minute) timeout on the socket, even though the client cannot send any more data since the http_parser entered into an error state. |
@dougwilson I think that's an issue with all streaming parsers. You don't know about the error until the bad chunk is actually processed. There is no error at TCP layer in this case. |
Right, but I'm saying in this case there is zero way, during the read, to know about the http parse error. This was possible in previous Node.js versions by listening to req.socket.on(error ... I think this is a regression in Node.js 10. How can you tell that the error occurred in this case to just stop the read? There is an error occurring here, but no way to see it in Node.js 10 like you could in previous Node.js versions. |
In previous versions of Node.js there is an error emitted on the socket. See the issue #24585 for the example to run to reproduce. |
The clientError event is unusable because it happens outside of the code that would be trying to read the request body. |
The I didn't test but if you use the |
/cc @nodejs/tsc |
I'm closing this because it seems Node.js core is not interested in getting Express.js to correctly function on current Node.js version like 10+. |
Destroy the socket if the `'clientError'` event is emitted and there is no listener for it. Fixes: nodejs#24586
cc @nodejs/moderation can one of you help here? This is quite unusual and I am on mobile, thanks. |
I've deleted some comments as per request. |
So the root issue here is that paused streams don’t get errors? There is a lot to digest here and it seems that something is not right but it is unclear what that might be. |
@Fishrock123 no, the issue is that the socket is no longer destroyed on parse error as per f2f391e. The server socket waits for the client to close the connection. However this might never happen. |
I'm under the impression that this can be closed now that #24757 has landed. I understand the TSC has things to discuss related to this issue, but I believe they're meta-issues and the specific problem is solved. If I'm mistaken, please re-open or comment to that effect, of course. Thanks! /cc @dougwilson @lpinca |
(Or perhaps this should stay open until that change lands in v10.x?) |
I think it can be closed. This behaviour change discussed in this thread has been there since Node.js v9.0.0. |
Destroy the socket if the `'clientError'` event is emitted and there is no listener for it. Fixes: nodejs#24586 PR-URL: nodejs#24757 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
I noticed a difference in how the HTTP/1 request objects are left after an HTTP/1 framing error (which causes an error in the underlying http_parser). In Node.js 8 and below, the request object would be destroyed and be left in a non-readable state (i.e. req.readable === false). It seems like in Node.js 10+ this is no longer the case. Is this an expected change? Express.js has underlying machinery that checks whether a request body should be read (since the request starts out in a paused state since 0.10), and it doesn't attempt to read when the request is no longer readable (since events like 'end' will not be emitted, leaving things hanging around forever).
Here is an example that reproduces the scenario:
In Node.js 8 and lower you get the following output:
In Node.js 10 you get the following output:
I'm not sure if this was an expected change or not. If it is an expected change, I'm just looking for the correct way for a bit of code (the part inside setTimeout) that runs at an arbitrary time after the request came in to know whether it's actually ever going to get a complete read or not. The only thing I've found so far in Node.js 10 is that you have to try to read no matter what and just rely on coding in a read timeout to know when to give up, instead of knowing earlier.
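The read-timeout fallback mentioned in the last sentence could look roughly like the sketch below. The function name readBodyWithTimeout, its callback shape, and the error message are hypothetical, and this is not the reproduction code from the issue; it only illustrates giving up on a body read after a fixed delay.

```javascript
// Hypothetical helper: collect a request body, but give up after `ms`
// milliseconds if neither 'end' nor 'error' has been emitted.
function readBodyWithTimeout(req, ms, callback) {
  const chunks = [];
  let finished = false;

  // Settle exactly once, then release the timer and the listeners.
  function finish(err, body) {
    if (finished) return;
    finished = true;
    clearTimeout(timer);
    req.removeListener('data', onData);
    req.removeListener('end', onEnd);
    req.removeListener('error', onError);
    callback(err, body);
  }

  const onData = (chunk) => chunks.push(chunk);
  const onEnd = () => finish(null, Buffer.concat(chunks));
  const onError = (err) => finish(err);

  // The timeout is the only exit when the stream stays silent forever.
  const timer = setTimeout(() => {
    finish(new Error('request body read timed out'));
  }, ms);

  req.on('data', onData);
  req.on('end', onEnd);
  req.on('error', onError);
}
```

The drawback described in the issue still applies: the caller only learns that the request is broken after waiting out the timeout, rather than before starting the read.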