autopilot: include only servers from the same region #15290

Merged 1 commit on Nov 17, 2022

tgross commented Nov 17, 2022

Fixes #15085

When we migrated to the updated autopilot library in Nomad 1.4.0, the interface for finding servers changed. Previously, autopilot would get the serf members and call `IsServer` on each of them, leaving it up to the implementor to filter out clients (and, in Nomad's case, members from other regions). In the "new" autopilot library, the equivalent interface is `KnownServers`, and our implementation of it did not filter by region. As a result, autopilot attempts to fetch stats from servers in other regions, which fails with TLS errors and produces a lot of log noise.

Filter the member set by region to fix the regression.
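
For readers who want the shape of the fix without opening the diff, here is a minimal sketch of filtering a member set by region. It is not the code from this PR: the `member` type, the `serversInRegion` helper, and the `role` and `region` tag keys are all hypothetical stand-ins for the serf member data that Nomad inspects.

```go
package main

import "fmt"

// member is a hypothetical stand-in for a serf member: a node name plus
// the gossip tags it advertises. It is not the real serf type.
type member struct {
	Name string
	Tags map[string]string
}

// serversInRegion keeps only members that are tagged as servers and that
// belong to the given region. The tag keys "role" and "region" are
// assumptions made for this sketch.
func serversInRegion(members []member, region string) []member {
	var out []member
	for _, m := range members {
		// Skip clients; only servers are relevant to autopilot.
		if m.Tags["role"] != "server" {
			continue
		}
		// Skip servers federated from other regions.
		if m.Tags["region"] != region {
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	members := []member{
		{Name: "s1.global", Tags: map[string]string{"role": "server", "region": "global"}},
		{Name: "s2.global", Tags: map[string]string{"role": "server", "region": "global"}},
		{Name: "s1.eu", Tags: map[string]string{"role": "server", "region": "eu"}},
		{Name: "c1.global", Tags: map[string]string{"role": "client", "region": "global"}},
	}
	for _, m := range serversInRegion(members, "global") {
		fmt.Println(m.Name)
	}
}
```

With a filter like this applied before the member set is handed to autopilot, servers from other regions never reach the stats fetcher, so no cross-region RPCs are attempted.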


Compare old caller (in Consul, which we imported directly): autopilot.go#L215
vs new caller (in the library): reconcile.go#L180

tgross added this to the 1.4.3 milestone Nov 17, 2022
tgross marked this pull request as ready for review November 17, 2022 16:29
tgross commented Nov 17, 2022

This is pretty trivial code, but I added a test just in case 😀

@github-actions

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Mar 18, 2023
Labels
backport/1.4.x (backport to 1.4.x release line), theme/autopilot, type/bug
Development

Successfully merging this pull request may close these issues.

autopilot stats_fetcher gets mTLS errors sending cross-region RPCs