Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Happy Eyeballs support (address IPv6 issues in Node 17) #41625

Closed
andreialecu opened this issue Jan 21, 2022 · 24 comments · Fixed by #44731
Closed

Happy Eyeballs support (address IPv6 issues in Node 17) #41625

andreialecu opened this issue Jan 21, 2022 · 24 comments · Fixed by #44731
Labels
dns Issues and PRs related to the dns subsystem. feature request Issues that request new features to be added to Node.js. libuv Issues and PRs related to the libuv dependency or the uv binding. net Issues and PRs related to the net subsystem.

Comments

@andreialecu
Copy link

andreialecu commented Jan 21, 2022

What is the problem this feature will solve?

It appears that node 17 has merged this PR: #39987 which may result in IPv6 addresses being sorted first.

As per an answer in this reddit thread from @treysis: https://www.reddit.com/r/ipv6/comments/qbr8jc/comment/hhmg7uz/?context=3 it appears that Happy Eyeballs is not implemented however.

The lack of Happy Eyeballs support can result in connections failing. In particular, there are a lot of data points related to similar support lacking from the OkHttp library which is used heavily on Android and results in connectivity issues for a relatively big number of users. Ref: square/okhttp#506 (comment)

The problem is that consumer routers sometimes glitch out on IPv6.

A router I manage in one particular location, which is a consumer grade TP-Link router, will routinely hang on IPv6 connections about once per month, until it is restarted. More on this here: square/okhttp#6954 (comment)

I have noticed that Node 17 will then pretty much hang on outgoing connections for up to one minute before eventually being able to connect. Restarting the router resolves the issue.

What is the feature you are proposing to solve the problem?

Implement the Happy Eyeballs algorithm.

What alternatives have you considered?

There is an user-land patch here: https://www.npmjs.com/package/happy-eyeballs by @zwhitchcox which might help as a workaround.

@aduh95
Copy link
Contributor

aduh95 commented Jan 21, 2022

I agree Node.js would benefit having this built-in. Is this something you'd be interested in working on @andreialecu?

@tniessen tniessen added the dns Issues and PRs related to the dns subsystem. label Jan 21, 2022
@tniessen
Copy link
Member

I am not sure what subsystem this would fit into. It seems that Happy Eyeballs not only applies to HTTP (http, http2, https), but also to other IPv4/IPv6 sockets, i.e., any TCP (net) or UDP socket (dgram).

@andreialecu
Copy link
Author

andreialecu commented Jan 21, 2022

There's a writeup on how this is implemented in OSX here by an Apple engineer:

https://mailarchive.ietf.org/arch/msg/v6ops/DYiI9v_O66RNbMJsx0NsatFkubQ

The updated implementation performs the following:

  • Query the DNS resolver for A and AAAA.
    If the DNS records are not in the cache, the requests are sent back to back on the wire, AAAA first.
  • If the first reply we get is AAAA, we send out the v6 SYN immediately
  • If the first reply we get is A and we're expecting a AAAA, we start a 25ms timer
    • If the timer fires, we send out the v4 SYN
    • If we get the AAAA during that 25ms window, we move on to address selection
  • When we have a list of IP addresses (either from the DNS cache or by receiving them close together with v4 before v6),
    we perform our own address selection algorithm to sort them. This algorithm uses historical RTT data to prefer addresses
    that have lower latency - but has a 25ms leeway: if the historical RTT of two compared address are within 25ms of each
    other, we use RFC3484 to pick the best one.
  • Once the list is sorted, we send out the SYN for the first address and start timers based on average and variance of the
    historical TCP RTT. Roughly speaking, we start the second address around the same time we send out a SYN retransmission
    for the first address.
  • The first address to reply with a SYN-ACK wins the race, we then cancel the other TCP connection attempts.

There's also an official RFC at: https://datatracker.ietf.org/doc/html/rfc8305

Since it deals with SYN packets, I'm not sure if this can be implemented in JS land or it needs to touch the C++ networking layer. I'm not familiar at all with nodejs networking internals.

@treysis
Copy link
Contributor

treysis commented Jan 21, 2022

Yes, I believe it should go into net. Doesn't the http module make use of net in the end as well?

Happy Eyeballs with UDP being connectionless is a bit more complicated.

@tniessen
Copy link
Member

Doesn't the http module make use of net in the end as well?

Mostly, yes, at least by default.

As of 2017, IPv6 brokenness is now generally regarded as a non-problem. Wikipedia

I am wondering how much complexity we can justify just to work around broken IPv6 stacks. The algorithms seem useful even beyond IPv6 compatibility, but so far I have no idea how we could implement this. We probably cannot do this for all TCP connections, at least not by default. We could maybe add an option to socket.connect() to enable this behavior. However, an implementation would probably require changes within libuv first, which is responsible for establishing TCP connections.

@treysis
Copy link
Contributor

treysis commented Jan 21, 2022

We probably cannot do this for all TCP connections, at least not by default.

Why not? If I am not wrong this is the default for all Happy Eyeballs implementations. Though it uses some kind of cache.

@andreialecu
Copy link
Author

andreialecu commented Jan 21, 2022

As of 2017, IPv6 brokenness is now generally regarded as a non-problem.

That statement is probably not factual. At least not at scale. Even if 1% of users are affected, depending on the number of users, that can add up quite a bit.

Reddit has recently disabled IPv6 due to issues they were having:
https://reddit.com/r/ipv6/comments/r047ah/redditcom_has_deleted_aaaa_records_and_is_back_to

Also their Android app does not implement Happy Eyeballs, and here's a report specifically pinpointing this to be the problem:
https://reddit.com/r/ipv6/comments/qy70hv/new_phone_failing_to_load_any_ipv6_websites_on/hlebckp/?utm_source=reddit&utm_medium=web2x&context=3

There's also a huge thread here: aws-amplify/amplify-js#5539 about an Android only issue that I was also running into, and which was specifically about Happy Eyeballs.

Additionally, there are many other threads in other JS ecosystems related to Happy Eyeballs: facebook/react-native#29608 (comment)

There are tons of reports still popping up. And it goes unreported most of the time, because it's very random and pretty much unreproducible since it's hard to actually pin it down to IPv6 being the culprit.

@tniessen
Copy link
Member

@andreialecu Don't get me wrong, I am not saying there are no issues. I frequently run into some myself. That was just me quoting Wikipedia. IPv6 is older than myself and I really wish we didn't have to worry about broken IPv6 stacks at this point 😄

Either way, as I said in my last comment, someone would probably have to implement this in libuv before we could consider adding it in Node.js.

@andreialecu
Copy link
Author

Either way, as I said in my last comment, someone would probably have to implement this in libuv before we could consider adding it in Node.js.

Perhaps this might help with that: https://github.com/shtrom/happy-eyeballs-c

@tniessen tniessen added net Issues and PRs related to the net subsystem. libuv Issues and PRs related to the libuv dependency or the uv binding. labels Jan 21, 2022
@bnoordhuis
Copy link
Member

someone would probably have to implement this in libuv before we could consider adding it in Node.js

Not necessary, just call uv_tcp_connect() on two uv_tcp_t handles and keep the winner.

@tniessen
Copy link
Member

someone would probably have to implement this in libuv before we could consider adding it in Node.js

Not necessary, just call uv_tcp_connect() on two uv_tcp_t handles and keep the winner.

My understanding was that libuv might establish multiple connections in that case, whereas the algorithm above only sends SYNs until one receives a reply. Maybe that's wrong, or maybe establishing connections isn't that bad.

@bnoordhuis
Copy link
Member

You uv_close() the loser. Do that before the ACK is received and the connection was never established.

@github-actions
Copy link
Contributor

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

@danopia
Copy link

danopia commented Aug 26, 2022

Reddit has recently disabled IPv6 due to issues they were having: https://reddit.com/r/ipv6/comments/r047ah/redditcom_has_deleted_aaaa_records_and_is_back_to

Also their Android app does not implement Happy Eyeballs, and here's a report specifically pinpointing this to be the problem: https://reddit.com/r/ipv6/comments/qy70hv/new_phone_failing_to_load_any_ipv6_websites_on/hlebckp/?utm_source=reddit&utm_medium=web2x&context=3

While Reddit is still in a staged rollout of IPv6, they've already published about implementing Happy Eyeballs on Android: https://www.reddit.com/r/RedditEng/comments/v1upr8/ipv6_support_on_android/

@mcollina
Copy link
Member

cc @nodejs/tsc this might be something we should look to implement before v18 hits LTS.

@bnb
Copy link
Contributor

bnb commented Sep 14, 2022

+1 this seems to be directly affecting undici / undici-fetch. It would be nice to not have that baked into v18 LTS 😊

@GeoffreyBooth
Copy link
Member

we should look to implement before v18 hits LTS.

My team and I have been bitten by this issue (specifically with regard to localhost). Big +1 for fixing it and backporting the fix.

@treysis
Copy link
Contributor

treysis commented Sep 14, 2022

I believe 3rd party package maintainers will realize they have to listen on ::1 as well.

@undergroundwires
Copy link

How did you solve this? GitHub runners do not support IPv6 and I cannot use fetch (native fetch i.e. nodejs/undici)

I tried the following but it did not solve it:

import dns from 'dns';
dns.setDefaultResultOrder('ipv4first');

@treysis
Copy link
Contributor

treysis commented Mar 15, 2024

@undergroundwires which NodeJS version are you using.

@undergroundwires
Copy link

Thank you @treysis for showing interest.

First I tested with latest 18 (18.19.1), then latest 20 (20.11.1), I send requests using (fetch) on GitHub runner.

It fails specifically on wikipedia.org requests.

Related issue for fetch: nodejs/undici#1531
Related issue for GitHub runners not being able to do IPv6: actions/runner-images#668, actions/runner#3138

@treysis
Copy link
Contributor

treysis commented Mar 18, 2024

@undergroundwires does your host support IPv4? Or is it running on IPv6-only AWS?

@undergroundwires
Copy link

undergroundwires commented Mar 19, 2024

I created a repository to reproduce this in as minimal way as possible:

https://github.com/undergroundwires/node-fetch-ipv6

I run fetch on using vite test runner on node and without any test runner. I get same results.

I realize that the first two requests are successful, I start getting the error after the second or third one.

See the test: test file

Second test (multiple fetch in order) fails, and all run fine locally. I run tests on default GitHub runners using both latest node 20 and 18. I get same errors:

stdout | vite-test.spec.ts > batch fetch in order
Fetching:  https://web.archive.org/web/20221029145712/https://kb.mozillazine.org/Downloads.rdf

stdout | vite-test.spec.ts > batch fetch in order
Fetching:  https://web.archive.org/web/20240120213614/https://techcommunity.microsoft.com/t5/windows-it-pro-blog/group-configuration-search-highlights-in-windows/ba-p/3263989

 ❯ vite-test.spec.ts  (2 tests | 1 failed) 24519ms
   ❯ vite-test.spec.ts > batch fetch in order
     → fetch failed

⎯⎯⎯⎯⎯⎯⎯ Failed Tests 1 ⎯⎯⎯⎯⎯⎯⎯

 FAIL  vite-test.spec.ts > batch fetch in order
TypeError: fetch failed
 ❯ vite-test.spec.ts:26:22
     24|   for (const url of urls) {
     25|     console.log('Fetching: ', url);
     26|     const response = await fetch(url, { method: 'HEAD'});
       |                      ^
     27|     await sleep(10000);
     28|     expect(response.status).to.equal(200);

Caused by: Caused by: ConnectTimeoutError: Connect Timeout Error
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
Serialized Error: { code: 'UND_ERR_CONNECT_TIMEOUT' }

Error: TypeError: fetch failed
 ❯ vite-test.spec.ts:26:22

Caused by: ConnectTimeoutError: Connect Timeout Error
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
Serialized Error: { code: 'UND_ERR_CONNECT_TIMEOUT' }

undergroundwires added a commit to undergroundwires/privacy.sexy that referenced this issue Mar 29, 2024
This commit introduces the `force-ipv4` GitHub action to address
connectivity issues caused by the lack of IPv6 support in GitHub
runners. Details:
- actions/runner#3138
- actions/runner-images#668

This change solves connection problems when Node's `fetch` API fails due
to `UND_ERR_CONNECT_TIMEOUT` errors. Details:
- actions/runner-images#9540
- actions/runner#3213

This action disables IPv6 at the system level, ensuring all outging
requests use IPv4. Resolving connectivity issues when running external
URL checks and Docker build checks.

This solution is a temporary workaround until GitHub runners support
IPv6 or Node `fetch` API has a working solution such as Happy Eyeball.
Detais:
- nodejs/node#41625
- nodejs/undici#1531
@undergroundwires
Copy link

Here's my workaround (open-source and documented) that I hope that can help you too:

After days of research and trial/error, this is how I got this working:

  1. Create a script called force-ipv4.sh, that configures system to prefer IPv4 over IPv6, call it to configure the machine. It was not easy to find a reliable cross-platform solution and I went Cloudflare WARP for DNS resolution along with some system configurations.
  2. To easily use the script in GitHub workflows, create GitHub action called force-ipv4 and call it in GitHub runners.
  3. Fixes the IPv6 request issues, and you can happily run e.g. fetch API from Node.

Related commit introducing this fix: undergroundwires/privacy.sexy@52fadcd

undergroundwires added a commit to undergroundwires/privacy.sexy that referenced this issue Mar 30, 2024
This commit upgrades Node.js version to v20.x in CI/CD environment.

Previously used Node 18.x is moving towards end-of-life, with a planned
date of 2025-04-30. In contrast, Node 20.x has been offering long-term
support (LTS) since 2023-10-24. This makes Node 20.x a stable and
recommended version for production environments.

This commit also configures `actions/setup-node` with the
`check-latest` flag to always use the latest Node 20.x version, keeping
CI/CD setup up-to-date with minimal maintenance.
Details:
- actions/setup-node#165
- actions/setup-node#160

Using Node 20.x in CI/CD environments provides better compatibility with
Electron v29.0 which moves to Node 20.x.
Details:
- electron/electron#40343

This upgrade improves network connection handling in CI/CD pipelines
(where issues occur due to GitHub runners not supporting IPv6).
Details:
- actions/runner#3138
- actions/runner-images#668
- actions/runner#3213
- actions/runner-images#9540

Node 20.x adopts the Happy Eyeballs algorithm for improved IPv6
connectivity.
- nodejs/node#40702
- nodejs/node#41625
- nodejs/node#44731

This mitigates issues like `UND_ERR_CONNECT_TIMEOUT` and localhost DNS
resolution in CI/CD environments:
Details:
- nodejs/node#40537
- actions/runner#3213
- actions/runner-images#9540

Node 20 introduces `setDefaultAutoSelectFamily`, a global function from
Node 19.4.0, enabling better IPv4 support, especially in environments
with limited or problematic IPv6 support.
Details:
- nodejs/node#45777

Node 20.x defaults to the new `autoSelectFamily`, improving network
connection reliability in GitHub runners lacking full IPv6 support.
Details:
- nodejs/node#46790
undergroundwires added a commit to undergroundwires/privacy.sexy that referenced this issue Mar 30, 2024
This commit upgrades Node.js version to v20.x in CI/CD environment.

Previously used Node 18.x is moving towards end-of-life, with a planned
date of 2025-04-30. In contrast, Node 20.x has been offering long-term
support (LTS) since 2023-10-24. This makes Node 20.x a stable and
recommended version for production environments.

This commit also configures `actions/setup-node` with the
`check-latest` flag to always use the latest Node 20.x version, keeping
CI/CD setup up-to-date with minimal maintenance.
Details:
- actions/setup-node#165
- actions/setup-node#160

Using Node 20.x in CI/CD environments provides better compatibility with
Electron v29.0 which moves to Node 20.x.
Details:
- electron/electron#40343

This upgrade improves network connection handling in CI/CD pipelines
(where issues occur due to GitHub runners not supporting IPv6).
Details:
- actions/runner#3138
- actions/runner-images#668
- actions/runner#3213
- actions/runner-images#9540

Node 20.x adopts the Happy Eyeballs algorithm for improved IPv6
connectivity.
- nodejs/node#40702
- nodejs/node#41625
- nodejs/node#44731

This mitigates issues like `UND_ERR_CONNECT_TIMEOUT` and localhost DNS
resolution in CI/CD environments:
Details:
- nodejs/node#40537
- actions/runner#3213
- actions/runner-images#9540

Node 20 introduces `setDefaultAutoSelectFamily`, a global function from
Node 19.4.0, enabling better IPv4 support, especially in environments
with limited or problematic IPv6 support.
Details:
- nodejs/node#45777

Node 20.x defaults to the new `autoSelectFamily`, improving network
connection reliability in GitHub runners lacking full IPv6 support.
Details:
- nodejs/node#46790
journey-ad added a commit to journey-ad/Bitmagnet-Next-Web that referenced this issue Jul 23, 2024
journey-ad added a commit to journey-ad/Bitmagnet-Next-Web that referenced this issue Jul 23, 2024
journey-ad added a commit to journey-ad/Bitmagnet-Next-Web that referenced this issue Jul 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dns Issues and PRs related to the dns subsystem. feature request Issues that request new features to be added to Node.js. libuv Issues and PRs related to the libuv dependency or the uv binding. net Issues and PRs related to the net subsystem.
Projects
Development

Successfully merging a pull request may close this issue.

10 participants