client: respect client_auto_join after connection loss #11585

Merged
tgross merged 1 commit into main from b-consul-client-autojoin-config
Nov 30, 2021

@tgross tgross commented Nov 29, 2021

Fixes #11404

The `consul.client_auto_join` configuration field tells the Nomad
client whether to use Consul service discovery to find Nomad
servers. By default it is set to `true`, but contrary to the
documentation it was only respected during the initial client
registration. If a client missed a heartbeat, failed a
`Node.UpdateStatus` RPC, or if there was no Nomad leader, the client
would fall back to Consul even if `client_auto_join` was set to
`false`. This changeset returns early from the client's trigger for
Consul discovery if the `client_auto_join` field is set to `false`.
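For readers who want the shape of the fix without opening the diff, here is a minimal sketch of the early return described above. It is not the code from this changeset: the `client` type, the `clientAutoJoin` field, the `newClient` helper, and the boolean return value are all hypothetical stand-ins, and treating an unset value as `true` is an assumption based on the documented default.

```go
package main

import "fmt"

// client is a hypothetical, stripped-down stand-in for the Nomad client.
// The names here are illustrative and are not taken from the Nomad source.
type client struct {
	// clientAutoJoin mirrors consul.client_auto_join. A nil pointer means
	// the operator did not set it, which this sketch treats as true.
	clientAutoJoin *bool

	// triggerDiscoveryCh signals a (hypothetical) background loop that
	// would query Consul for Nomad server addresses.
	triggerDiscoveryCh chan struct{}
}

func newClient(autoJoin *bool) *client {
	return &client{
		clientAutoJoin:     autoJoin,
		triggerDiscoveryCh: make(chan struct{}, 1),
	}
}

// autoJoinEnabled applies the assumed default of true when the field is unset.
func (c *client) autoJoinEnabled() bool {
	return c.clientAutoJoin == nil || *c.clientAutoJoin
}

// triggerDiscovery asks for a Consul lookup of Nomad servers. It returns
// false without signalling anything when client_auto_join is disabled,
// which is the early return the description talks about.
func (c *client) triggerDiscovery() bool {
	if !c.autoJoinEnabled() {
		return false
	}
	select {
	case c.triggerDiscoveryCh <- struct{}{}:
		// Discovery request queued.
	default:
		// A request is already pending; nothing more to do.
	}
	return true
}

func main() {
	off := false
	fmt.Println("default:", newClient(nil).triggerDiscovery())
	fmt.Println("client_auto_join=false:", newClient(&off).triggerDiscovery())
}
```

Putting the check inside the trigger itself, rather than at each call site, means the missed-heartbeat, failed-RPC, and no-leader paths are all covered by the same guard.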


To test this I ran a client node and a server node with Consul under Vagrant. Then I blocked access to the server's ports and Consul:

$ sudo iptables -A INPUT -p tcp --destination-port 4647 -j DROP
$ sudo iptables -A INPUT -p tcp --destination-port 8500 -j DROP

Then I waited for the client to log that it had lost its connection to the server, and removed the server rule while leaving the Consul rule in place:

$ sudo iptables --line-numbers -L INPUT

Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination
1    DROP       tcp  --  anywhere             anywhere             tcp dpt:8500
2    DROP       tcp  --  anywhere             anywhere             tcp dpt:4647

$ sudo iptables -D INPUT 2

After a few moments, the client restored its connection with the server without needing Consul, as expected. I also did the same exercise without dropping packets to Consul and watched the logs to ensure that the client made no requests to Consul to get the server address.


tgross commented Nov 29, 2021

cc @davemay99 as a heads up, as I think I recall seeing this kind of thing in a support issue as well.

@tgross tgross merged commit d38266a into main Nov 30, 2021
@tgross tgross deleted the b-consul-client-autojoin-config branch November 30, 2021 18:20
lgfa29 pushed a commit that referenced this pull request Jan 17, 2022
lgfa29 pushed a commit that referenced this pull request Jan 17, 2022

github-actions bot commented Nov 3, 2022

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 3, 2022
Successfully merging this pull request may close these issues.

Nomad agent ignores retry-join server address and uses consul discovery instead on boot.