Consider using Happy Eyeballs or similar in SocketsHttpHandler #26177
Comments
cc: @stephentoub |
Thanks @JustArchi for your report! |
@wfurt can you please take a look? |
@karelz I checked .NET Core 2.1 rc1 on Windows and couldn't reproduce the issue, got |
I did a quick try on Ubuntu 16.04 and I also get OK. What base OS do you use, @JustArchi? |
It's Debian Testing (currently I also thought the OS could have something to do with it, but then curl handler wouldn't work either, so it's possible that it's some OS-layer incompatibility. |
It's also good to note that I couldn't reproduce it with other I checked my other dev machine running on Debian sid and I also couldn't reproduce my issue there either, I'll set up other one on testing in the meantime to ensure it's not on Debian's end, but then again, if it was some IP-related block or likewise, curl shouldn't work either. |
I tested another machine on Debian Testing and couldn't reproduce the issue either, so it has to be something with one of my machines. I wonder how I can help narrow down this issue and what the root cause of it is in the first place. Do you have any idea what I could provide or do to help? In the meantime I'll keep looking, maybe I'll find some factor that could possibly be causing this on the OS side. |
You can always try packet capture. You can also check if the name resolves to same IP address. |
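For the second suggestion, a minimal sketch of how the resolved addresses could be listed from .NET itself, so the output can be compared between machines. The host name is a placeholder, not the one from the report.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Placeholder host; substitute the name that fails on the affected machine.
const string host = "example.com";

// Ask the system resolver for every address it returns for the name.
IPAddress[] addresses = await Dns.GetHostAddressesAsync(host);

foreach (IPAddress address in addresses)
{
    // Label each entry so IPv6 (AAAA) and IPv4 (A) results are easy to tell apart.
    string family = address.AddressFamily == AddressFamily.InterNetworkV6 ? "IPv6" : "IPv4";
    Console.WriteLine($"{family}: {address}");
}
```

If one machine prints IPv6 entries and another prints only IPv4, the two are not attempting the same connection, which would explain a machine-specific failure.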
@JustArchi so it seems it is one machine problem (with a specific URL) on Debian development branch. Is that correct? |
Yes, one machine for now, and I found out the first clue. My dev machines that I tried to reproduce this issue on (and had no luck) don't have IPv6, they used IPv4 address of the Since I don't have more IPv6 machines around, could you try to reproduce this bug for me on IPv6-enabled Linux machine? I don't believe Debian has anything to do with it, so it could be any Linux machine. I wonder if curl handler also uses IPv6 by default, this could be the reason why I see behaviour change between those two, gonna keep digging and let you know if I found out something more. It could also be some IPv6-related bug in socket handler, although in this case I'm pretty sure somebody else would find it out much sooner than me, maybe it's some specific combination, I'll try to find out. |
can you do packet capture and trace the attempt? I'm wondering if IPv6 is really used or if it falls back to IPv4 after some time. |
I did the test and I indeed confirmed my initial thought, this is not a bug in socket http handler itself, but rather different behaviour compared to curl handler. Could be also called a regression if we're comparing default behaviour between .NET Core 2.0 and 2.1. In general, my machine can't access What is important is the fact that it seems that curl handler tries to fallback to IPv4 when IPv6 fails, this is why curl handler works. If I force
While
(We're getting 403 but it's irrelevant, the connection being established matters) So what is curl doing by default? Let's find out:
Like you can see, it tries IPv6 first, fails, then tries IPv4 next. Question is, if this is intended behaviour (then we can close the issue), or if perhaps we could do something to improve it (e.g. make it work like curl handler), since I strongly believe that it'd greatly benefit I'm pretty sure that you can reproduce this reliably with any IPv6-enabled machine requesting any domain with |
SocketsHttpHandler should be trying all the IP addresses (IPv6 and IPv4) that are returned from the DNS resolver API calls. On Windows, there is an API to connect to the DNS name which will automatically try IPv6 and IPv4 in parallel to speed up connection result so that it won't have to waste time failing on the IPv6 address before it gets connected on the IPv4. |
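To illustrate the "try IPv6 and IPv4 in parallel" idea, here is a simplified sketch of a staggered connection race. It is not what SocketsHttpHandler or the Windows API does internally. The names `ConnectRacingAsync` and `AttemptAsync`, the host, and the 250 ms stagger are assumptions made for illustration, and unlike RFC 8305 (Happy Eyeballs v2) the sketch does not interleave address families; it simply puts IPv6 first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical host and stagger delay, chosen for illustration only.
using Socket connected = await ConnectRacingAsync("example.com", 443, TimeSpan.FromMilliseconds(250));
Console.WriteLine($"Connected to {connected.RemoteEndPoint}");

// One connection attempt; the socket is disposed if the attempt fails or is cancelled.
async Task<Socket> AttemptAsync(IPAddress address, int port, CancellationToken ct)
{
    var socket = new Socket(address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
    try
    {
        await socket.ConnectAsync(address, port, ct);
        return socket;
    }
    catch
    {
        socket.Dispose();
        throw;
    }
}

// Starts attempts one by one, IPv6 first, without waiting for earlier ones to fail.
async Task<Socket> ConnectRacingAsync(string host, int port, TimeSpan stagger, CancellationToken ct = default)
{
    IPAddress[] resolved = await Dns.GetHostAddressesAsync(host);
    IPAddress[] ordered = resolved
        .OrderBy(a => a.AddressFamily == AddressFamily.InterNetworkV6 ? 0 : 1)
        .ToArray();

    using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
    var pending = new List<Task<Socket>>();
    var errors = new List<Exception>();
    int next = 0;

    while (next < ordered.Length || pending.Count > 0)
    {
        if (next < ordered.Length)
            pending.Add(AttemptAsync(ordered[next++], port, cts.Token));

        // Wait for any attempt to finish, or for the stagger delay if more addresses remain.
        var waitFor = new List<Task>(pending);
        if (next < ordered.Length)
            waitFor.Add(Task.Delay(stagger, cts.Token));

        Task finished = await Task.WhenAny(waitFor);
        if (finished is not Task<Socket> attempt)
            continue; // The delay elapsed: start the next address alongside the others.

        pending.Remove(attempt);
        if (attempt.Status == TaskStatus.RanToCompletion)
        {
            // First success wins: cancel the rest and dispose any late winners.
            cts.Cancel();
            foreach (Task<Socket> loser in pending)
            {
                _ = loser.ContinueWith(t =>
                {
                    if (t.Status == TaskStatus.RanToCompletion) t.Result.Dispose();
                    else _ = t.Exception; // Observe the failure so it is not reported later.
                }, TaskScheduler.Default);
            }
            return attempt.Result;
        }

        errors.Add(attempt.Exception?.GetBaseException() ?? new OperationCanceledException());
    }

    throw new AggregateException($"Could not connect to {host}:{port}", errors);
}
```

With this shape, an IPv6 address that silently drops packets costs at most one stagger interval before an IPv4 attempt starts, instead of the full connect timeout.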
I wonder how I can confirm this then or debug the issue further, since this issue doesn't happen if I manually hardcode I took a quick look if perhaps data returned by DNS resolve could have something to do with it, but it looks fine:
|
It might have something to do with the fact that the IPv4 query is not even being tried since we time out with IPv6, and we don't have enough time to try IPv4 next. I mean, it's not intended to send 2 requests through 2 different IP addresses right away, so there has to be some kind of timeout to move forward. If CURL got "stuck" on that IPv6, then it'd time out as well, eventually. But it's somehow smart and detects in a fraction of a second that the IPv6 connection fails, then moves on to IPv4 immediately, making it in time regarding the supplied timeout. Socket handler probably times out on IPv6, and doesn't even have enough time to try out IPv4 next. This is just my theory though, I really don't know socket handler internals, you're the expert here. There is definitely slightly different behaviour regarding handling this though, since switching to curl handler solves the initial issue for me, as well as hardcoding IPv4 address manually in |
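The theory above can be made concrete with a sketch of sequential fallback where each address gets its own short timeout, so a stalled IPv6 attempt cannot consume the whole request timeout. This is an illustration under stated assumptions, not the handler's actual logic: `ConnectSequentialAsync`, the host, and the 5-second per-attempt budget are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical host and per-attempt budget, chosen for illustration only.
using Socket connected = await ConnectSequentialAsync("example.com", 443, TimeSpan.FromSeconds(5));
Console.WriteLine($"Connected to {connected.RemoteEndPoint}");

// Tries each resolved address in order, giving every attempt its own timeout.
async Task<Socket> ConnectSequentialAsync(string host, int port, TimeSpan perAttemptTimeout)
{
    var errors = new List<Exception>();

    foreach (IPAddress address in await Dns.GetHostAddressesAsync(host))
    {
        var socket = new Socket(address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);

        // A fresh timer per address: one stalled attempt only costs perAttemptTimeout.
        using var timeout = new CancellationTokenSource(perAttemptTimeout);
        try
        {
            await socket.ConnectAsync(address, port, timeout.Token);
            return socket;
        }
        catch (Exception ex) when (ex is SocketException || ex is OperationCanceledException)
        {
            // Remember why this address failed and move on to the next one.
            socket.Dispose();
            errors.Add(ex);
        }
    }

    throw new AggregateException($"Could not connect to {host}:{port}", errors);
}
```

The design choice here is the per-attempt `CancellationTokenSource`: with a single overall timeout, the first unreachable address can use it up before any other address is tried.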
just for the record, I verified that it can work via IPv6 - if IPv6 works. |
Yeah the fact that I can't connect through IPv6 is definitely my machine issue and I'll solve it in one way or another, this is a technical difficulty that is on me. I'm just wondering now if there is anything to improve here in this case or we should keep it like that, since technically it's not broken, but it could be improved in a way to work like curl handler did. If people have such "broken" setups (or rather lack of IPv6 without even knowing about it) then they might see regressions on linux with .NET Core 2.1 where curl handler worked just fine with it on .NET Core 2.0, while new socket handler no longer does and times out. |
It would be nice to have SocketsHttpHandler more resilient, however, I don't think we should treat it as compatibility problem between curl handler and SocketsHttpHandler - rather as enhancement. (unless we find out lots of people hitting it) I would strongly recommend to not use the curl handler as workaround, rather use the IPv4 address. Otherwise you will be stuck in the past. Our plan is to eventually get rid of the curl handler entirely (hopefully in next major release). |
It's alright, thank you a lot for your help, I'll leave this issue open as an enhancement, but feel free to close it if you decide that it's not worth it to improve socket handler in this regard. In the meantime I'll see how I can fix IPv6 on my machine 🙂 Have a nice day! EDIT: In case somebody would have similar issue on linux, I added |
From offline discussion: There is a concern that SocketsHttpHandler does not try all entries returned by DNS. We should check it. That would qualify as something we might need to fix (potentially in servicing). |
@rmkerr can you please take a look at SocketsHttpHandler and how it deals with multiple DNS entries? |
Yep, I'll take a look! |
Based purely on code inspection this looks correct on the
There could still be a bug in |
I wonder if we have some logging in Sockets/SocketsHttpHandler that could help us confirm what is happening on @JustArchi's machine. Maybe in combo with Wireshark ... |
How long does it take to try new address @rmkerr ? There was overall 60s timeout on HTTP. Also from diagnostic @karelz
It would be really nice if this clearly stated that we were not able to connect. According to @JustArchi, IPv6 never worked on his machine, so TCP would never have been established. A cryptic message like the one above is not that helpful for troubleshooting. |
@karelz it might be useful to have logs if I can't reproduce the issue. I have not been able to repro the issue on Windows, but I will try on Linux before taking that approach. @wfurt I'm not sure of the exact time, but it is far less than 60 seconds. When running the app in a console on windows there is no visible delay. I think this is likely a Linux specific issue though, so I will try it there next. |
What APIs? |
This has been resolved via the API added here in .NET 5: #41949 |
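For readers landing here, a minimal sketch of how a custom connect step can be plugged in, assuming the API referred to is `SocketsHttpHandler.ConnectCallback` (available since .NET 5). This example only forces IPv4, which mirrors the workaround discussed earlier in the thread; it is not a Happy Eyeballs implementation, and the URL is a placeholder.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;

var handler = new SocketsHttpHandler
{
    // Replace the default connect logic with one that only uses IPv4 addresses.
    ConnectCallback = async (context, cancellationToken) =>
    {
        IPAddress[] all = await Dns.GetHostAddressesAsync(context.DnsEndPoint.Host);
        IPAddress[] v4 = Array.FindAll(all, a => a.AddressFamily == AddressFamily.InterNetwork);

        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp)
        {
            NoDelay = true
        };
        try
        {
            // Tries the IPv4 addresses in order; throws if the host has none.
            await socket.ConnectAsync(v4, context.DnsEndPoint.Port, cancellationToken);

            // The handler takes ownership of the stream (and the socket with it).
            return new NetworkStream(socket, ownsSocket: true);
        }
        catch
        {
            socket.Dispose();
            throw;
        }
    }
};

using var client = new HttpClient(handler);

// Placeholder URL for illustration.
using HttpResponseMessage response = await client.GetAsync("https://example.com/");
Console.WriteLine((int)response.StatusCode);
```

The same callback is the natural place to put a racing or per-attempt-timeout strategy instead of the IPv4-only filter.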
I know commenting to remind people "why isn't this done yet" isn't the most productive, but I just want to make the weight of this issue clear: As long as this isn't implemented, |
Sorry to shill my own blog, but for the people stumbling upon this, I made a decently robust implementation you can use: https://slugcat.systems/post/24-06-16-ipv6-is-hard-happy-eyeballs-dotnet-httpclient/#the-implementation |
Repro: HttpClientBug.zip
I reproduced this one on Linux and I didn't have much luck on Windows.
Run repro with dotnet run. After a default timeout of around 60 seconds, you'll get:
Doing the same by forcing older curl handler:
Please note that this issue is specific and not reproducible with just any https server, as majority of them work just fine. I encountered this issue when accessing https://translate.google.com, which is what I used in my repro above.
I reproduced this bug on latest master as well as .NET Core 2.1 rc1.
This bug could be some sort of regression because previously my app running master SDK worked just fine with this URL, including SocketHttpHandler that I used for a longer while. It could also be regression caused by Google's servers configuration change that triggered bug existing in the code since quite some time, which is more likely. Of course this one is not reproducible on .NET Core 2.0, since there is no SocketHttpHandler there.
Thank you in advance for looking into this.
[EDIT] Inline C# source code by @karelz