UND_ERR_CONNECT_TIMEOUT errors thrown when there is CPU intensive code on the event loop #3410
Comments
I think something like this would also do the trick:
diff --git a/lib/util/timers.js b/lib/util/timers.js
index d0091cc1..fa56be8b 100644
--- a/lib/util/timers.js
+++ b/lib/util/timers.js
@@ -2,13 +2,13 @@
 const TICK_MS = 499
-let fastNow = Date.now()
+let fastNow = 0
 let fastNowTimeout
 const fastTimers = []
 function onTimeout () {
-  fastNow = Date.now()
+  fastNow += TICK_MS
   let len = fastTimers.length
   let idx = 0
Though it needs some work to update tests.
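To illustrate what the diff above changes, here is a small simulation. It is not Undici's timer code: `simulateTick`, `expiredTimers`, and the strategy names are made up for this sketch, and the 10 second timeout is taken from the default connect timeout mentioned in the issue. It compares a clock that is re-read from the wall clock on each tick with one that only advances by `TICK_MS` per tick, after the event loop was blocked.

```javascript
'use strict'

const TICK_MS = 499

// Returns the timers whose timeout has elapsed according to `fastNow`.
function expiredTimers (timers, fastNow) {
  return timers.filter((t) => fastNow - t.startedAt >= t.timeoutMs)
}

// Simulates a single tick that arrives after the event loop was blocked for
// `blockedMs`, and returns how many timers that tick would expire.
// strategy 'wall-clock': fastNow is re-read from the (simulated) wall clock.
// strategy 'tick-count': fastNow only advances by TICK_MS per tick.
function simulateTick (strategy, blockedMs) {
  let wallClock = 0
  let fastNow = 0
  // One pending timer, standing in for a 10 s connect timeout.
  const timers = [{ startedAt: fastNow, timeoutMs: 10_000 }]

  wallClock += blockedMs // CPU intensive work blocks the loop
  fastNow = strategy === 'wall-clock' ? wallClock : fastNow + TICK_MS

  return expiredTimers(timers, fastNow).length
}

console.log(simulateTick('wall-clock', 12_000)) // 1: the stall counts against the timeout
console.log(simulateTick('tick-count', 12_000)) // 0: only one tick has elapsed
```

With the tick-counting clock, time spent blocking the event loop is not charged to the timer, which appears to be the intent of the proposed change.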
@mknichel any progress or updates on this one? My Vercel/Next.js based projects are having a bad experience because of it.
I am currently working on this.
Any updates?
I am on it ;)
This bug potentially still exists. We need more investigation.
Hi, I've been getting the UND_ERR_CONNECT_TIMEOUT for a few months now. For my use case I have a default Next.js 14 app, using next-auth@beta with GitHub as provider. I go to http://localhost:3000/api/auth/signin and attempt to sign in.
Code: https://github.com/philipsolarz/my-next-app/ Any ideas or hints?
Ah, a new reproduction. Let me check it.
I don't see how I can provoke the issue. Do you have Discord? Twitter? Somewhere we could chat?
@philipsolarz
What OS are you running and what's your hardware configuration? I've been having the issue in prod for the past three-ish months, but never had it locally...
I also got the same issue, on macOS.
It is not because of IPv6 or IPv4.
Same issue here.
Has anyone figured out how to solve this issue yet?
I can sometimes reproduce this issue when I am updating Next itself or installing new dependencies.
Bug Description
Work on the event loop can interrupt the Undici lifecycle for making requests, causing errors to be thrown even when there is no problem with the underlying connection. For example, if a fetch request is started and then work on the event loop takes more than 10 seconds (default connect timeout), Undici will throw a `UND_ERR_CONNECT_TIMEOUT` error even if the connection could be established very quickly.
I believe what is happening is:
- Undici calls `setTimeout` with the value of `connectTimeoutMs` to throw an error and cancel the connection if it takes too long (https://github.com/nodejs/undici/blob/main/lib/core/connect.js). It makes a call to `GetAddrInfoReqWrap` (https://github.com/nodejs/node/blob/main/lib/dns.js#L221), but this is asynchronous and processing of the callback will be delayed until the next event loop.
- The `onConnectTimeout` timer is run because the previous task took longer than the timeout.
- `onConnectTimeout` calls `setImmediate` with a function to destroy the socket and throw the error. https://github.com/nodejs/undici/blob/main/lib/core/connect.js
- The `GetAddrInfoReq` lookup callback (`emitLookup` in `node:net`) is run. This code begins the TCP connection (`internalConnect` is called in https://github.com/nodejs/node/blob/main/lib/net.js#L1032) but that is also asynchronous, so it won't finish in this round of the event loop.
- The `setImmediate` function is run in the next phase, which destroys the socket and throws the `UND_ERR_CONNECT_TIMEOUT` error.
Internally at Vercel, we have been seeing a high number of these `UND_ERR_CONNECT_TIMEOUT` issues while pre-rendering pages in our Next.js application. I can't run this task on my local machine so it's harder to debug, but it's a CPU intensive task and moving fetch requests to a worker thread eliminated the Undici errors. We tried other suggestions (like `--dns-result-order=ipv4first`, and verified that we were not seeing any packet loss) that did not resolve the issue. Increasing the connect timeout resolves the issue in the reproduction but not the issue in our Next.js build (which I can't explain).
Reproducible By
A minimal reproduction is available at https://github.com/mknichel/undici-connect-timeout-errors.
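Independently of that repository, the ordering described in the bug description can be sketched with plain Node.js primitives. This is only an illustration under stated assumptions: it does not use Undici or any sockets, `runScenario` and `blockEventLoop` are hypothetical names, and `fs.stat` is used as a stand-in for the DNS lookup and the TCP connect because its callbacks, like those of the lookup, can only run once the event loop is free again.

```javascript
'use strict'
const fs = require('node:fs')

// Busy-wait to simulate CPU intensive work on the event loop.
function blockEventLoop (ms) {
  const end = Date.now() + ms
  while (Date.now() < end) {}
}

// Hypothetical stand-in for the connect sequence.
// Resolves with the order in which the callbacks ran.
function runScenario (connectTimeoutMs, blockMs) {
  return new Promise((resolve) => {
    const events = []
    let destroyed = false

    // Stand-in for the connect timeout: a timer that defers the actual
    // failure to setImmediate, as described in the issue.
    const timer = setTimeout(() => {
      events.push('connect timer fired')
      setImmediate(() => {
        destroyed = true
        events.push('timeout error raised')
      })
    }, connectTimeoutMs)

    // Stand-in for the DNS lookup (asynchronous, finishes off-thread).
    fs.stat('.', () => {
      events.push('lookup callback ran')
      // Stand-in for the TCP connect, which is asynchronous as well.
      fs.stat('.', () => {
        if (!destroyed) clearTimeout(timer)
        events.push(destroyed ? 'connect finished too late' : 'connected')
        resolve(events)
      })
    })

    // User code blocks the loop for longer than the connect timeout.
    blockEventLoop(blockMs)
  })
}

runScenario(50, 100).then((events) => console.log(events))
```

In this sketch the timer fires first, the lookup callback runs next, and the deferred failure runs before the simulated connect can complete, even though nothing was slow except the blocking loop. Calling `runScenario(1000, 0)` instead lets the simulated connect finish normally.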
We can reproduce the behavior on Node 18.x and 20.x, with both Undici `5.24.0` and the latest version (`6.19.2`).
Expected Behavior
The Undici request lifecycle could operate on a separate thread that does not get blocked by user code. By separating it out from the user code, this would remove impact of any user code on requests.
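A rough sketch of that idea, assuming Node 18+ where `fetch` is available as a global inside workers. This is not an Undici dispatcher and not the implementation used at Vercel: `createFetchWorker`, `fetchText`, and the message shape are hypothetical, and a `data:` URL is used so the example does not depend on the network.

```javascript
'use strict'
const { Worker } = require('node:worker_threads')

// Code that runs inside the worker. The whole fetch lifecycle lives on the
// worker's own event loop, so blocking the main thread cannot delay it.
const workerSource = `
  const { parentPort } = require('node:worker_threads')
  parentPort.on('message', async ({ id, url }) => {
    try {
      const res = await fetch(url)
      parentPort.postMessage({ id, status: res.status, body: await res.text() })
    } catch (err) {
      parentPort.postMessage({ id, error: String(err) })
    }
  })
`

// Hypothetical helper: forwards requests to a dedicated worker thread.
function createFetchWorker () {
  const worker = new Worker(workerSource, { eval: true })
  const pending = new Map()
  let nextId = 0

  // Match each reply to the promise that is waiting for it.
  worker.on('message', ({ id, error, ...result }) => {
    const { resolve, reject } = pending.get(id)
    pending.delete(id)
    if (error) reject(new Error(error))
    else resolve(result)
  })

  return {
    fetchText (url) {
      return new Promise((resolve, reject) => {
        const id = nextId++
        pending.set(id, { resolve, reject })
        worker.postMessage({ id, url })
      })
    },
    close () {
      return worker.terminate()
    }
  }
}

// Usage: start a request, then block the main thread with busy work.
const proxy = createFetchWorker()
const request = proxy.fetchText('data:text/plain,hello')
const end = Date.now() + 200
while (Date.now() < end) {} // simulated CPU intensive work
request.then((result) => {
  console.log(result)
  return proxy.close()
})
```

The request proceeds in the worker while the main thread is busy; the main thread only receives the finished result once it is free again. The trade-off is that responses have to be serialized across the thread boundary.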
To test this theory, we created a dispatcher that proxied the fetch request to a dedicated worker thread (`new Worker` from `worker_threads`). This eliminated all the Undici errors that we were seeing in our Next.js build.
Logs & Screenshots
In the minimal reproduction, the error is:
In our Next.js builds, the error is:
Environment
The reproduction repo was erroring for me on Mac OS 14.4, while internally we are seeing issues on AWS EC2 Intel machines.
Additional context
Vercel/Next.js users have reported `UND_ERR_CONNECT_TIMEOUT` issues to us: