bootstrapper: stop join-client earlier #1268

Merged
daniel-weisse merged 1 commit into main from ref/bootstrapper/join-client-closing on Mar 3, 2023

Conversation

daniel-weisse
Member

@daniel-weisse daniel-weisse commented Feb 24, 2023

Proposed change(s)

This PR fixes a rare issue where the join-client in the bootstrapper of a new cluster's initial control-plane node would try to join the cluster through its own join-service:

  1. User runs constellation init
  2. The init-server of a node receives the call and initiates the bootstrapping of Kubernetes
  3. The join-client continues to run and keeps querying the metadata API for available join-services
  4. Kubernetes starts running, the init-server waits for all services to be fully installed and available
  5. The metadata API reports the control-plane node as having a join-service available
  6. The join-client of this control-plane node tries to contact the join-service (which runs on this same node) to join the cluster (which the node is already part of)
  7. The init-server returns and stops the join-client

The fix: instead of deferring the call to Clean() in the init-server, we now call it right after acquiring the node lock (the same procedure as in the join-client). This stops the join-client before Kubernetes bootstrapping begins, not after the init-server returns.
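
To illustrate the ordering change, here is a minimal, self-contained sketch. It is not the actual bootstrapper code: the types `nodeLock`, `joinClient`, `cleaner`, and `initServer`, their methods, and the `bootstrap` callback are hypothetical stand-ins chosen for this example. Only the idea (call the cleanup right after the node lock is acquired instead of deferring it) comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// nodeLock is a hypothetical stand-in for the node lock:
// it can be acquired exactly once per node.
type nodeLock struct {
	mu     sync.Mutex
	locked bool
}

// TryLockOnce reports whether the caller acquired the lock.
func (l *nodeLock) TryLockOnce() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.locked {
		return false
	}
	l.locked = true
	return true
}

// joinClient is a hypothetical stand-in for the join-client.
type joinClient struct {
	mu      sync.Mutex
	stopped bool
}

// Stop marks the client as stopped; a real client would also end its polling loop.
func (c *joinClient) Stop() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.stopped = true
}

// Stopped reports whether Stop has been called.
func (c *joinClient) Stopped() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.stopped
}

// cleaner is a hypothetical helper that stops the join-client.
type cleaner struct{ client *joinClient }

// Clean stops the registered join-client.
func (c *cleaner) Clean() { c.client.Stop() }

// initServer is a hypothetical stand-in for the init-server.
type initServer struct {
	lock    *nodeLock
	cleaner *cleaner
}

// Init acquires the node lock, stops the join-client, then runs bootstrap.
// bootstrap stands in for the long-running Kubernetes setup.
func (s *initServer) Init(bootstrap func() error) error {
	if !s.lock.TryLockOnce() {
		return errors.New("node is already being initialized or has joined")
	}
	// Previously (per the description) Clean was deferred and ran only when
	// Init returned. Calling it here stops the join-client before bootstrap.
	s.cleaner.Clean()
	return bootstrap()
}

func main() {
	client := &joinClient{}
	srv := &initServer{lock: &nodeLock{}, cleaner: &cleaner{client: client}}
	err := srv.Init(func() error {
		fmt.Println("join-client stopped during bootstrap:", client.Stopped())
		return nil
	})
	fmt.Println("init error:", err)
}
```

In this sketch the join-client is already stopped while `bootstrap` runs, so it has no chance to discover and contact a join-service that comes up on the same node during bootstrapping.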

Additional Information

The log looks similar to the following:

Feb 24 10:08:02 fedora bootstrapper[5610]: {"level":"INFO","ts":"2023-02-24T10:08:02Z","logger":"bootstrapper.initServer","caller":"initserver/initserver.go:117","msg":"Init called","peer":"35.191.16.151:58148"}
Feb 24 10:08:02 fedora bootstrapper[5610]: {"level":"INFO","ts":"2023-02-24T10:08:02Z","logger":"bootstrapper.initServer","caller":"initserver/initserver.go:199","msg":"Stopping"}
Feb 24 10:08:02 fedora bootstrapper[5610]: {"level":"INFO","ts":"2023-02-24T10:08:02Z","logger":"bootstrapper.join-client","caller":"joinclient/joinclient.go:177","msg":"Stopping"}
Feb 24 10:08:02 fedora bootstrapper[5610]: {"level":"INFO","ts":"2023-02-24T10:08:02Z","logger":"bootstrapper.join-client","caller":"joinclient/joinclient.go:161","msg":"Client stopped"}
Feb 24 10:08:02 fedora bootstrapper[5610]: {"level":"INFO","ts":"2023-02-24T10:08:02Z","logger":"bootstrapper.join-client","caller":"joinclient/joinclient.go:185","msg":"Stopped"}
Feb 24 10:08:15 fedora bootstrapper[5610]: {"level":"INFO","ts":"2023-02-24T10:08:15Z","logger":"bootstrapper.initServer","caller":"kubernetes/kubernetes.go:88","msg":"Installing Kubernetes components","version":"v1.25.6"}

@daniel-weisse daniel-weisse added the bug fix Fixing a bug label Feb 24, 2023
@daniel-weisse daniel-weisse added this to the v2.6.0 milestone Feb 24, 2023

@daniel-weisse daniel-weisse requested review from katexochen and removed request for 3u13r February 24, 2023 09:49
@daniel-weisse
Member Author

daniel-weisse commented Feb 24, 2023

e2e Test Manual

@daniel-weisse daniel-weisse marked this pull request as ready for review February 24, 2023 10:22
Member

@3u13r 3u13r left a comment

LGTM

Signed-off-by: Daniel Weiße <dw@edgeless.systems>
@daniel-weisse daniel-weisse force-pushed the ref/bootstrapper/join-client-closing branch from f0ca2f7 to a4e0d74 Compare March 3, 2023 15:38
@daniel-weisse daniel-weisse merged commit 2023eda into main Mar 3, 2023
@daniel-weisse daniel-weisse deleted the ref/bootstrapper/join-client-closing branch March 3, 2023 15:50