-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use AF_UNSPEC for name resolution #389
Conversation
This reverts most of 1ad97fb, but keeps tests which still have general applicability. The reason the original change was made was to try and work around a bug[1] in the eventlet library. Eventlet monkey-patches the socket.getaddrinfo function and replaces it with its own async, eventlet-aware implementation. The reason name resolution was broken in the first place is because eventlet was consulting DNS first, and then if that failed, falling back to /etc/hosts, which is just flat out incorrect behavior. It's important to note that this was *only* when running py-amqp under eventlet, and *only* for specific versions of eventlet that have long been fixed. So this workaround is not even needed anymore. With "normal" (non-eventlet) use, socket.getaddrinfo instead calls into the glibc getaddrinfo implementation, which ultimately uses libnss to resolve hostnames. However, there is an issue with the original workaround when using the default (glibc) getaddrinfo. The workaround (current) implementation explicitly forces resolution to use AF_INET (IPv4) and then only if that does not succeed, it in turn will try with AF_INET6 (IPv6). This generally works well for IPv4-only hosts, but can be unnecessarily slow for dual-stack IPv4/IPv6 hosts. Consider the following: - We want to connect to example.org - The /etc/hosts file contains an IPv6 entry: example.org f00d::1 - The /etc/nsswitch.conf file contains typical (simplified) hosts config: hosts: files dns In this case, the current code will involve nss iterating through the modules: - files (with AF_INET): fails, because there is no IPv4 address in /etc/hosts - dns (with AF_INET): may or may not succeed per-site, depending on how DNS is configured. If DNS is slow/misconfigured, this may incur a delay and block for a significant amount of time. - files (with AF_INET6): succeeds, and getaddrinfo returns f00d::1. Now in the same scenario as before, with this fix which reverts back to using AF_UNSPEC instead: - files (with AF_UNSPEC) succeeds, and getaddrinfo returns f00d::1. There is no need to involve DNS at all. Even a well-configured, quick-to-respond DNS server is going to be many orders of magnitude slower than consulting with /etc/hosts which libnss keeps cached in memory. [1] https://bugs.launchpad.net/neutron/+bug/1696094/comments/22
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the intent of the patch, however can you provide some additional unit & integration tests for the changes proposed?
I've added a unit test to ensure AF_UNSPEC is used with getaddrinfo. I'm not sure if an integration test makes sense here. |
just for sanity, can you dig it to see if integration test is viable in this case? |
Sorry, I somehow missed your reply here. No, I don't think it makes sense to have an integration test for this. What would we even be testing it against? Standing up a DNS server and making sure it replies the way we expect? I think the unit test ensuring that we are using AF_UNSPEC is enough, and to trust that the resolver is working as documented. |
It's 2022, nobody should still be using a vulnerable, unpatched glibc from 2015. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not expert for low level TCP/IP but seems OK and integration tests passed.
This reverts most of 1ad97fb, but
keeps tests which still have general applicability.
The reason the original change was made was to try and work around a
bug[1] in the eventlet library. Eventlet monkey-patches the
socket.getaddrinfo function and replaces it with its own async,
eventlet-aware implementation. The reason name resolution was broken
in the first place is because eventlet was consulting DNS first, and
then if that failed, falling back to /etc/hosts, which is just flat
out incorrect behavior.
It's important to note that this was only when running py-amqp under
eventlet, and only for specific versions of eventlet that have long
been fixed. So this workaround is not even needed anymore.
With "normal" (non-eventlet) use, socket.getaddrinfo instead calls
into the glibc getaddrinfo implementation, which ultimately uses
libnss to resolve hostnames.
However, there is an issue with the original workaround when using the
default (glibc) getaddrinfo. The workaround (current) implementation
explicitly forces resolution to use AF_INET (IPv4) and then only if
that does not succeed, it in turn will try with AF_INET6 (IPv6). This
generally works well for IPv4-only hosts, but can be unnecessarily
slow for dual-stack IPv4/IPv6 hosts.
Consider the following:
We want to connect to example.org
The /etc/hosts file contains an IPv6 entry:
example.org f00d::1
The /etc/nsswitch.conf file contains typical (simplified) hosts
config:
hosts: files dns
In this case, the current code will involve nss iterating through the
modules:
files (with AF_INET): fails, because there is no IPv4 address in
/etc/hosts
dns (with AF_INET): may or may not succeed per-site, depending on
how DNS is configured. If DNS is slow/misconfigured, this may incur
a delay and block for a significant amount of time.
files (with AF_INET6): succeeds, and getaddrinfo returns f00d::1.
Now in the same scenario as before, with this fix which reverts back
to using AF_UNSPEC instead:
There is no need to involve DNS at all. Even a well-configured,
quick-to-respond DNS server is going to be many orders of magnitude
slower than consulting with /etc/hosts which libnss keeps cached in
memory.
[1] https://bugs.launchpad.net/neutron/+bug/1696094/comments/22