Network fingerprinting on lo #5498

Closed
benagricola opened this issue Mar 31, 2019 · 5 comments · Fixed by #10404

@benagricola

benagricola commented Mar 31, 2019

Nomad version

Nomad v0.9.0-rc1 (7c00ab4f3f37cfd1e258b38fd2ad99e7bc23e4c3) and, I assume, all earlier versions

Operating system and Environment details

Linux

Issue

In a Clos network setup (an L3, unnumbered, BGP-routed fabric), it is common to assign the node address to the loopback interface (lo).

This loopback IP then appears as a connected address, and can be announced into the BGP fabric by a BGP speaker on the host.

Unfortunately, when the client -> network_interface setting in Nomad is set to lo, the unique.network.ip-address allocated to the host is always 127.0.0.1. This is because the fingerprinter always picks the first IP address returned for the interface, which for lo will always be (and must be) the actual loopback address.
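To see the order in which addresses are reported for an interface, a small Go program along these lines can help. It is only a sketch using Go's standard net package, not Nomad's actual fingerprinting code; the helper name interfaceAddrs is made up for illustration, and the interface name defaults to lo.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// interfaceAddrs (hypothetical helper) returns the addresses assigned to the
// named interface, in the order the operating system reports them.
func interfaceAddrs(name string) ([]string, error) {
	iface, err := net.InterfaceByName(name)
	if err != nil {
		return nil, err
	}
	addrs, err := iface.Addrs()
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(addrs))
	for _, a := range addrs {
		out = append(out, a.String())
	}
	return out, nil
}

func main() {
	// Default to lo; an interface name may be passed as the first argument.
	name := "lo"
	if len(os.Args) > 1 {
		name = os.Args[1]
	}
	addrs, err := interfaceAddrs(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	// Index 0 is the address a "take the first one" strategy would select.
	for i, a := range addrs {
		fmt.Printf("%d: %s\n", i, a)
	}
}
```

Running it on a host with an extra address on lo shows which entry a first-address strategy would end up with.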

This breaks service discovery, because services are announced into Consul using 127.0.0.1, which is only valid on the node currently running that service.

Am I missing some way to override the unique.network.ip-address setting?

It seems to me like setting 127.0.0.1 as unique.network.ip-address is almost always the incorrect thing to do when fingerprinting, unless running a single-node dev setup.

IMO the correct fix is to modify the fingerprinting code to prefer any other address applied to lo over 127.0.0.1/8. I also have to explicitly set the bind_addr option to the correct loopback IP, so falling back to the value of bind_addr, if set, seems more sensible than allocating 127.0.0.1.
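As a rough illustration of the proposed preference order, here is a minimal Go sketch. It is not Nomad's fingerprinting code: pickAddress is a hypothetical function, the addresses are passed in as CIDR strings rather than read from an interface, and the sample IP 10.1.2.3 is made up. It prefers the first non-loopback address, then a usable bind_addr, and only then falls back to the first address found.

```go
package main

import (
	"fmt"
	"net"
)

// pickAddress (hypothetical) chooses an address to advertise from the CIDR
// strings found on an interface. Preference order:
//  1. the first non-loopback address,
//  2. bindAddr, if it parses as a non-loopback, non-unspecified IP,
//  3. the first parseable address (127.0.0.1 on a plain lo).
// It returns "" when nothing usable is found.
func pickAddress(cidrs []string, bindAddr string) string {
	var first net.IP
	for _, c := range cidrs {
		ip, _, err := net.ParseCIDR(c)
		if err != nil {
			continue // skip entries that are not valid CIDR notation
		}
		if first == nil {
			first = ip
		}
		if !ip.IsLoopback() {
			return ip.String()
		}
	}
	if b := net.ParseIP(bindAddr); b != nil && !b.IsLoopback() && !b.IsUnspecified() {
		return b.String()
	}
	if first != nil {
		return first.String()
	}
	return ""
}

func main() {
	// lo carrying both the loopback address and a routed node address.
	fmt.Println(pickAddress([]string{"127.0.0.1/8", "10.1.2.3/32"}, "")) // prints 10.1.2.3
	// Only the loopback address present, but bind_addr is set.
	fmt.Println(pickAddress([]string{"127.0.0.1/8"}, "10.1.2.3")) // prints 10.1.2.3
	// Single-node dev setup: nothing better than 127.0.0.1 is available.
	fmt.Println(pickAddress([]string{"127.0.0.1/8"}, "")) // prints 127.0.0.1
}
```

A real fix would probably also need to decide how to treat IPv6 link-local addresses, which this sketch would accept as non-loopback.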

I think it's probably possible to work around this by creating a dummy interface with the same IP as the loopback IP and pointing Nomad at that, but it would be great to fix this in Nomad instead.

Happy to put a PR together for a fix if this is agreed as the correct approach?

@angrycub
Contributor

angrycub commented Apr 1, 2019

Have you tried using the go-sockaddr template format options to select a non-127.0.0.1 address from the lo interface? I believe that should be possible. The go-sockaddr package also includes a CLI tool that can be used to test selectors.

@benagricola
Author

@angrycub AIUI the go-sockaddr format can't just be passed into the client -> network_interface setting (at least from my reading of the fingerprinting code).

The bind address / advertise address are already set to the IP in question (which exists on lo), but this seems to have no bearing on the value of ${unique.network.ip-address} and doesn't appear to have an effect on the IP address announced for services into consul.

Leaving the client -> network_interface setting blank causes the 'default route' functionality to kick in, but it picks the IP on an interface with an inactive default route (eth0, the management interface), not the one with the default route learned over BGP.

@angrycub
Contributor

angrycub commented Apr 1, 2019

Apologies @benagricola, I read too fast and missed that we were talking about network_interface, you're right--no sockaddr there. Will sneak back off to my corner now.

@vasekboch

This issue seems similar to #3675: basically, the wrong IP is detected for the service IP registered in Consul.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 20, 2022