ServiceAddress in Consul isn't updated when Nomad's client_interface changes #8732
Comments
Thanks for the detailed writeup, @snh! I believe what's happening is that we accidentally short-circuit the evaluation of whether the service address has changed: we first check whether the hash of the service definition has changed, and that hash only covers the address_mode parameter rather than the actual address.
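The short-circuit described above can be illustrated with a small sketch. This is not Nomad's actual code: `ServiceDef`, `definition_hash`, `needs_update_buggy`, and `needs_update_fixed` are hypothetical names, and the hash inputs are an assumption chosen only to show how a definition hash that omits the resolved address hides an address change.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDef:
    """Hypothetical, simplified stand-in for a job's service definition."""
    name: str
    port_label: str
    address_mode: str  # e.g. "auto" or "host"; the resolved IP is not part of the definition


def definition_hash(svc: ServiceDef) -> str:
    # Hash only what the job file declares; the client's interface address never enters it.
    raw = f"{svc.name}|{svc.port_label}|{svc.address_mode}"
    return hashlib.sha256(raw.encode()).hexdigest()


def needs_update_buggy(svc, registered_hash, registered_addr, current_addr):
    # Short-circuit: an unchanged definition hash returns early,
    # so the registered and current addresses are never compared.
    if definition_hash(svc) == registered_hash:
        return False
    return True


def needs_update_fixed(svc, registered_hash, registered_addr, current_addr):
    # Compare the resolved address as well as the definition hash.
    return definition_hash(svc) != registered_hash or registered_addr != current_addr


svc = ServiceDef(name="web", port_label="http", address_mode="auto")
registered = definition_hash(svc)

# The interface changed (eth0 -> eth1) but the job definition did not.
print(needs_update_buggy(svc, registered, "10.0.0.10", "10.0.0.11"))  # False
print(needs_update_fixed(svc, registered, "10.0.0.10", "10.0.0.11"))  # True
```

In the sketch, the buggy variant reports no update needed because the definition hash is identical before and after the interface change, which matches the behaviour reported in this issue.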
Hey @snh, I started thinking about this one and I'm confused about the intended behavior here. If a job is already running and bound to an address on eth0, and Nomad did as you asked and updated all the services/checks to reflect the new IP of eth1, wouldn't that break your application without a restart? For example, let's say eth0 and eth1 have addresses 10.0.0.10 and 10.0.0.11 respectively. If an allocation has bound an HTTP server to 10.0.0.10:80 and Nomad re-registers the service/check with eth1's address of 10.0.0.11, Consul would have the incorrect IP/port unless the allocation restarts and rebinds to the new IP address. I'm always hesitant to make a change like this, where an in-place networking update to the client is assumed to behave the same way across various users' network configurations. The safest way to do this is to drain the node, make the change, and put the node back into service.
Hey @nickethier
> If a job is already running and bound to an addresses on eth0 and Nomad did as you asked and updated all the services/checks to reflect the new IP of eth1, wouldn't that break your application without a restart?
We use Docker host networking for all of our containers, and don't bind the services to a specific interface, so this isn't an issue in our use case. We effectively bind the services to all interfaces (*).
Even if we did bind them to a specific interface, we have observed that updating the job so that a new allocation is created (and the service restarted) still doesn't appear to update the Consul service registration. We have found that we have to stop the existing job and create a new job completely for this to update, which results in a reasonable period of downtime.
Gotcha, that makes sense, thanks for the explanation.
Hello 👋
I run Nomad in a configuration where I occasionally need to change the Nomad client_interface, without requiring any downtime for the services under Nomad's control. The purpose of changing the client_interface is to influence the address advertised for these services in Consul.
The system in question undergoes occasional changes where the network interface and address used for these service advertisements needs to be updated without an interruption to the underlying services.
I have observed that the registrations in Consul aren't updated to reflect this change unless I completely stop, and re-run the relevant jobs, leading to clients continuing to try and contact these services via the address attached to the previous client_interface, rather than the updated one.
Is this the intended behaviour? Is there any way to update these Consul service registrations without interrupting these services or registrations?
Apologies if this is a duplicate of an existing issue. I did locate #4815, which is related, and would possibly be a suitable workaround if it was available.
Happy to provide further information and background on this use-case and how to replicate this if needed! Thanks!
Nomad version
Operating system and Environment details
Debian GNU/Linux 10 (buster) AMD64 running in VirtualBox via Vagrant.
Issue
The ServiceAddress and ServiceTaggedAddresses for the Service registration in Consul, as well as the address used for associated Checks, are not updated in Consul when network_interface is updated in Nomad's client configuration.
Reproduction steps
Deploy a Consul integrated Nomad instance with two Ethernet interfaces (eth0 and eth1) with unique addresses. Start Nomad with client_interface set to eth0.
Run a new job which contains at least one service and check.
Confirm that the ServiceAddress and ServiceTaggedAddresses for the Service registration in Consul, as well as the address used for associated Check(s) in the Consul registration, reflect the IP address of eth0.
Update the Nomad configuration to eth1.
Restart Nomad.
Observe that the ServiceAddress and ServiceTaggedAddresses for the Service registration in Consul, as well as the address used for associated Check(s) in the Consul registration, continue to reflect the IP address of eth0.
Re-run the already allocated job.
Observe that these have still not updated.
Stop and re-run the job.
Observe that these have updated.
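The "confirm" and "observe" steps above could be checked with a short script against Consul's catalog HTTP API (`GET /v1/catalog/service/<name>`). The following is a minimal sketch, not part of the original report: the helper names, the default agent URL `http://127.0.0.1:8500`, and the service name used in the usage note are assumptions. It only inspects ServiceAddress and ServiceTaggedAddresses; the addresses used by Checks are not covered.

```python
import json
import urllib.parse
import urllib.request


def fetch_catalog_entries(service, consul_url="http://127.0.0.1:8500"):
    # The catalog endpoint returns a JSON list with one entry per registered instance.
    # consul_url is an assumed default for a local agent; adjust for your environment.
    url = f"{consul_url}/v1/catalog/service/{urllib.parse.quote(service)}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def registered_addresses(entries):
    """Map each ServiceID to the set of addresses Consul advertises for it."""
    result = {}
    for entry in entries:
        addrs = set()
        if entry.get("ServiceAddress"):
            addrs.add(entry["ServiceAddress"])
        # ServiceTaggedAddresses maps a tag (e.g. "lan_ipv4") to {"Address": ..., "Port": ...}.
        for tagged in (entry.get("ServiceTaggedAddresses") or {}).values():
            if tagged.get("Address"):
                addrs.add(tagged["Address"])
        result[entry["ServiceID"]] = addrs
    return result


def stale_registrations(entries, expected_ip):
    """Return the ServiceIDs whose advertised addresses are not exactly expected_ip."""
    by_id = registered_addresses(entries)
    return sorted(sid for sid, addrs in by_id.items() if addrs != {expected_ip})
```

For example, after restarting Nomad with eth1 configured, `stale_registrations(fetch_catalog_entries("web"), "10.0.0.11")` would list the instances still advertising another address, where `web` and `10.0.0.11` are placeholder values for the service name and eth1's IP.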