-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The http parser is not async, why does it hook into the async_hook machinery? #108
Comments
Can you reference some files you saw? I did a little exploration on Maybe after nodejs/node#48528 lands we can explore whether it's possible to remove it. |
This is regarding the http server implementation. |
Someone at some point decided they wanted to track http parsing as a resource as an http request would otherwise just look like a plain TCP socket. There's no new "resource" object created for an http request, it just upgrades a TCP socket to an HTTP connection. Because of how async_hooks is tied to the concept of a single id mapping to a single resource object the resource object of the TCP socket can't be reused for the HTTP socket events without breaking things. The choice was either don't differentiate HTTP connections in async_hooks or find some other object to be the "resource" of the http request. For whatever reason it was decided that making the http parser the resource was an appropriate solution. That decision then also led to discovering the http.Agent pooling problem on the client request side, but the use of the http parser as the resource object for an HTTP requests happens for both incoming and outgoing HTTP. This is one of the countless reasons I want us making more purpose-focused APIs so we can mark the entire async_hooks API as deprecated or internal. Unfortunately removing this behaviour will probably break users, which is why I'm pushing so hard on getting AsyncLocalStorage off async_hooks so we can get the biggest use case off of it and not have to care so much anymore about breaking things. |
It's all about tracking individual async "operations". For a long time async hooks was the thing to track this. By introducing The current PR from @Qard moves parts of this propagation work into v8 as there was helpful functionality added. There is work ongoing to move even more into the JS language (see https://github.com/tc39/proposal-async-context). But still it's usercode which has to somehow "mark" individual operations (HTTP Requests) on shared resources (Socket). This is currently done via There is for example |
@Qard Why? All code in http parser is synchronous, so there is literally no reason for it. The only fix that might be required is adding an
@Flarna the http parser is not async at all. It's a purely synchronous library. The fact that the http parser has this behavior implemented in C++ (vs JS) requires more back-and-forth across the C++/JS boundary, slowing everything down. Undici got a performance speed-up by dropping the native http parser and moving to the same code running inside WASM (wasm is slower than native code). |
Yes, the http parser is sync but HTTP operations itself are not. For whatever reason the parser instance was used as the resource object to identify/track an individual HTTP request at some point in time. As the HTTP Parser instance is reused between client and server and across requests it requires extra care unfortunately. One could say it was a bad choice but well that's a bit late now. I think a much better object to track an HTTP request would be the Another variant to reduce this binding would be to avoid reusing the HTTP parser instances. This was tried in nodejs/node#24330 which was replaced by nodejs/node#25094. Even nodejs/node#25094 says "do not reuse HTTPParser" in the headline it's misleading because the HTTPParser itself is still reused but on every reuse a new object is actually used as async resource. I think a comparison to unidic is not that fair here. HTTP in node has a terrible long history and any refactoring holds quite some risk to break a lot. I'm quite sure a rewrite with knowledge of now and without thinking about compatibility issues would result in a significant performance improvement in HTTP. But I guess this issue is the wrong place to discuss a HTTP rewrite :o). |
@mcollina It's not needed for async tracking. It is needed for differentiating raw TCP connections from HTTP connections. It's probably possible to replace it with an equivalent AsyncResource somewhere else. All I meant to convey was that just plain removing it will likely break users because some might be using the different resource type to identify sockets as HTTP requests so without that we might break them. |
The http parser does quite a bit of additional JS/C++ transitions by being observable via async_hooks. In fact, a wasm build is faster than C++ by avoiding that.
(this is just a hunch, but the benefits should be there).
The text was updated successfully, but these errors were encountered: