feat: Machine Migration #176

Merged on Mar 7, 2023 (23 commits)

Conversation

@jonathan-innis jonathan-innis commented Jan 26, 2023

Fixes #

Description

This PR is the karpenter-core switchover to managing the cloudprovider machine lifecycle through Karpenter, as described in https://github.com/aws/karpenter/blob/main/designs/node-ownership.md. At a high level, this PR does the following:

  1. Provisioning now creates a v1alpha5.Machine to represent a scheduling decision at the end of the provisioning loop
  2. Deprovisioning now uses the state.StateNode representation for Candidates to handle scheduling modeling, and uses the deprovisioning mapping to get the Machine control-plane representation of the Node
  3. Inflight checks now reconcile on the Machine and compare the Machine against its Nodes to perform validations
  4. Adds machine events to go along with Node events when the Node doesn't exist
  5. Activates the machine.Controller to reconcile Machines with requirements and launch them at the cloudprovider
  6. Activates the informer.Machine to reconcile Machines into the state.Cluster
  7. Adds indexes for node.spec.providerID and machine.Status.providerID
  8. Allows node finalizer termination to perform the Machine finalizer flow before fully terminating the node. This provides a compatibility path so that users can still run kubectl delete node -l karpenter.sh/provisioner-name while the termination flow is driven through the Machine
  9. Enables the ttlAfterNotRegistered value in karpenter-global-settings, which self-terminates a v1alpha5.Machine if the Machine doesn't have a corresponding Node within the TTL duration

How was this change tested?

  • Manual Testing of:
    • Large Scale Ups (1000 nodes)
    • Machine Linking
    • Garbage Collection
  • make e2etests
  • make presubmit

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jonathan-innis jonathan-innis force-pushed the machine-migration branch 30 times, most recently from 20ba1cd to 4ae59b6 Compare February 2, 2023 08:11
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Mar 15, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Mar 16, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 4, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 5, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 8, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 10, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 11, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 12, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 13, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 14, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 18, 2023
jonathan-innis added a commit to jonathan-innis/karpenter that referenced this pull request Apr 20, 2023
jonathan-innis added a commit that referenced this pull request Apr 21, 2023
* Revert "Revert machine migration changes (#176) (#241)"

This reverts commit 9973eac.

* Change owner reference to blockOwnerDeletion

* Remove string logging for machine launch

* Removing FailedInit since machine statusCondition captures it

* Updated eventing on machines
njtran pushed a commit to njtran/karpenter that referenced this pull request May 3, 2023
* Revert "Revert machine migration changes (kubernetes-sigs#176) (kubernetes-sigs#241)"

This reverts commit 9973eac.

* Change owner reference to blockOwnerDeletion

* Remove string logging for machine launch

* Removing FailedInit since machine statusCondition captures it

* Updated eventing on machines
njtran added a commit that referenced this pull request May 4, 2023
* chore: cleanup deprovisioning types

* add inc

* fix: add more visibility for when we can't consolidate (#292)

* feat: Machine Migration (#273)

* Revert "Revert machine migration changes (#176) (#241)"

This reverts commit 9973eac.

* Change owner reference to blockOwnerDeletion

* Remove string logging for machine launch

* Removing FailedInit since machine statusCondition captures it

* Updated eventing on machines

* action const

* testfix

* testfix2

* comments

* rename

* last

* enum

* fixcompile

* export

* fixredundancy

* fixlogic

* comment

* renameVar

---------

Co-authored-by: Todd Neal <tnealt@amazon.com>
Co-authored-by: Jonathan Innis <jonathan.innis.ji@gmail.com>