Consecutive HTTP requests cause socket hang up on node v20 #5765
Comments
It would also be interesting to find out why this only happens with Lighthouse, as none of the other CLs have this issue. Maybe it is something related to set headers / keep-alive / connection header. |
Does not seem to be related to HTTP response headers
socket hang up only happens with Lighthouse |
After digging a bit, this is related to how the socket is being handled by the two sides below the http layer. Why only Lighthouse is sending the […]
As a TL;DR: for the short term it appears that using an […] That PR, node-fetch/node-fetch#1736, is approved and waiting for merge as of Apr 12. It defaults to the node agent's keep-alive setting, which keeps the socket open for reuse without having to pass an agent to skip a conditional check that is removed. In the meantime, the workaround of passing a keep-alive agent suggested in that PR should work.
The core issue, though, is how the sockets are being recycled in node. After reading the other thread linked in "additional context" I started my spelunking in the node repo. mcolina mentioned in this comment that he agrees with benbnoordhuis that it is possible to pop an invalid socket, which may resolve part of the issue. Adjusting that may help the situation and could provide a performance boost commensurate with this comment. However, if that were the only issue, it would not explain why an error gets thrown 100% of the time, as there should be some percentage of the sockets that get shifted from the checked side of the array under load.
@DevasiaThomas mentioned just below that he is working on a fix as of Apr 21. I tend to agree with @DevasiaThomas, though, that it will take more than just updating that […] I have pinged him here to see if he has made headway and would like help collaborating on the solution. I will be happy to help, as this issue is blocking us from upgrading our project to node 20.
I dropped some hints to myself and others below as I dig. I will update the below as I go with how to resolve the issue.
Some code notes for potential resolution:
There is a note in Socket.prototype._destroy about […]
I found this line in the socket that appears to be getting called after the write side of the client socket is closed by the server and the socket is written to again.
Reference onReadableStreamEnd, which throws the error during the microtask queue with process.nextTick. |
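For reference, the keep-alive agent workaround mentioned above could look roughly like the sketch below. It uses only `node:http`; the names `keepAliveAgent` and `getStatus` are made up for illustration and are not taken from the linked PR or from Lodestar.

```typescript
import * as http from "node:http";

// One shared agent; keepAlive asks Node to keep sockets open for reuse
// instead of tearing them down after every response.
const keepAliveAgent = new http.Agent({keepAlive: true});

// Hypothetical helper: GET a URL through the given agent and resolve
// with the HTTP status code once the body has been fully consumed.
function getStatus(url: string, agent: http.Agent = keepAliveAgent): Promise<number> {
  return new Promise((resolve, reject) => {
    const req = http.get(url, {agent}, (res) => {
      res.resume(); // drain the body so the socket can go back to the pool
      res.on("end", () => resolve(res.statusCode ?? 0));
    });
    req.on("error", reject);
  });
}
```

With node-fetch the same agent can be passed through the `agent` option of the request init, e.g. `fetch(url, {agent: keepAliveAgent})`. Whether this avoids the hang up against a given server is untested here.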
node-fetch/node-fetch#1736 seems to fix the issue. If the sim tests also pass on node 20, I think we are good to go |
@matthewkeil - just saw your above post. With respect to this:
I don't know if you saw some of my musings on this in the post here node-fetch/node-fetch#1735 - but yes, I came to the same conclusion as you (from my investigations and code digging) - the issue, I believe, is that packets can still be sent to the socket before it is cleared up in a following microtask. Any packets that get sent trigger an ECONNRESET. I'm not sure if nodejs/node#47130 is really moving ahead, sadly. Perhaps it could do with another pair of eyes, but I don't have the time right now. |
Describe the bug
Consecutive HTTP requests cause socket hang up on node v20. This only happens with some servers, so far I could only reproduce this with Lighthouse. This issue causes the sim tests to fail and might cause other issues, e.g. when running Lodestar VC with a Lighthouse BN.
Expected behavior
No socket hang up / request failures
Steps to reproduce
Running the following script will cause the first request to pass, but the second one fails with a socket hang up error
Yielding to the macrotask queue between the two requests does not produce an error
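A minimal sketch of the yielding approach, assuming each request is wrapped in a function that returns a promise. The helper names `yieldToMacroQueue` and `requestSequentially` are hypothetical, and `setTimeout(..., 0)` is just one way to get onto a later macrotask.

```typescript
// Resolves on a later macrotask, so callbacks already queued on the
// microtask / nextTick queues get to run before the next request starts.
function yieldToMacroQueue(): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Hypothetical helper: run the request functions one after another,
// yielding to the macrotask queue between consecutive requests.
async function requestSequentially<T>(requests: Array<() => Promise<T>>): Promise<T[]> {
  const results: T[] = [];
  for (let i = 0; i < requests.length; i++) {
    if (i > 0) await yieldToMacroQueue();
    results.push(await requests[i]());
  }
  return results;
}
```

Usage would be along the lines of `await requestSequentially([() => fetch(url), () => fetch(url)])`, where `url` stands in for the Lighthouse endpoint being queried.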
Additional context
The issue is already reported upstream
There are also some suggested workarounds, but those can't easily be applied in our case since we are using cross-fetch and require browser compatibility.
Operating system
Linux
Lodestar version or commit hash
6e01421