Backport of agent: prevent very old servers re-joining a cluster with stale data into release/1.15.x #17354

hc-github-team-consul-core

Backport

This PR is auto-generated from #17171 to be assessed for backporting due to the inclusion of the label backport/1.15.

🚨 Warning: automatic cherry-pick of commits failed. If the first commit failed, you will see a blank no-op commit below. If at least one commit succeeded, you will see the cherry-picked commits up to, but not including, the commit where the merge conflict occurred.

The person who merged in the original PR is:
@loshz
This person should manually cherry-pick the original PR into a new backport PR,
and close this one when the manual backport PR is merged in.

merge conflict error: POST https://api.github.com/repos/hashicorp/consul/merges: 409 Merge conflict []

The below text is copied from the body of the original PR.


Description

This PR introduces the concept of Server Metadata: server-specific information written to a file in the configured data directory. During this initial phase, we only store the last seen timestamp in Unix format:

type ServerMetadata struct {
	LastSeenUnix int64 `json:"last_seen_unix"`
}
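
To make the on-disk shape concrete, here is a minimal sketch of how this struct round-trips through JSON using Go's standard encoding/json package. The encode and decode helpers are hypothetical names used only for illustration; they are not taken from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ServerMetadata mirrors the struct shown in the PR description.
type ServerMetadata struct {
	LastSeenUnix int64 `json:"last_seen_unix"`
}

// encode is a hypothetical helper that serializes the metadata to JSON.
func encode(md ServerMetadata) ([]byte, error) {
	return json.Marshal(md)
}

// decode is a hypothetical helper that parses JSON back into the struct.
func decode(b []byte) (ServerMetadata, error) {
	var md ServerMetadata
	err := json.Unmarshal(b, &md)
	return md, err
}

func main() {
	b, err := encode(ServerMetadata{LastSeenUnix: 1672531200})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // prints {"last_seen_unix":1672531200}

	md, err := decode(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(md.LastSeenUnix) // prints 1672531200
}
```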

It also adds two new agent methods that attempt to prevent old servers from rejoining an existing cluster:

  • persistServerMetadata(): Write a server's metadata to a file in the configured data directory once every hour.
  • checkServerLastSeen(): Attempt to read a server's last seen file and check the Unix timestamp against a configurable max age. If the last seen file does not exist, we treat this as an initial startup and return no error.

Example

We attempt to start a previously running server with the following last seen timestamp 1672531200 (2023-Jan-01 00:00:00).

Setting the new config option server_rejoin_age_max = "3d" prevents this server from starting. An operator who needs to force the server to rejoin must manually remove the server metadata file from the data directory.

Testing & Reproduction steps

  • Added unit tests for new Server Metadata helper funcs.
  • Added unit tests for agent Server Metadata implementation.

Links

PR Checklist

  • Write validation function for new config field.
  • updated test coverage
  • external facing docs updated
  • appropriate backport labels added
  • not a security concern

Overview of commits

@hc-github-team-consul-core hc-github-team-consul-core force-pushed the backport/loshz/NET-3772/entirely-sure-jaybird branch from dbb64d4 to ae8eecb Compare May 15, 2023 11:06
@hc-github-team-consul-core hc-github-team-consul-core force-pushed the backport/loshz/NET-3772/entirely-sure-jaybird branch from ae8eecb to dbb64d4 Compare May 15, 2023 11:06
Auto approved Consul Bot automated PR

@github-actions github-actions bot added the theme/config Relating to Consul Agent configuration, including reloading label May 15, 2023
@vercel vercel bot temporarily deployed to Preview – consul-ui-staging May 15, 2023 11:10 Inactive
@vercel vercel bot temporarily deployed to Preview – consul May 15, 2023 11:11 Inactive
@loshz
loshz commented May 15, 2023

Replaced by #17357

@loshz loshz closed this May 15, 2023
@loshz loshz deleted the backport/loshz/NET-3772/entirely-sure-jaybird branch May 15, 2023 11:51
Labels
theme/config Relating to Consul Agent configuration, including reloading