Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ implement failure domain picking #2169

Merged
merged 1 commit into from
Jan 28, 2020

Conversation

chuckha
Copy link
Contributor

@chuckha chuckha commented Jan 27, 2020

Signed-off-by: Chuck Ha chuckh@vmware.com

What this PR does / why we need it:
This PR adds failure domain picking to the KubeadmControlPlane. There is no change to logic if failure domain is not defined. If a failure domain is defined then the KubeadmControlPlane controller will evenly distribute machines across all defined failure domains.

This is a naive approach. Please see #2131 for more details.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2131
Related to #1756

/assign @detiber

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 27, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chuckha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 27, 2020
Copy link
Member

@detiber detiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chuckha since this PR is marked as Fixing #2131, do we need another issue to track setting the Failure domain for a Machine post-creation if a provider later adds failure domain support?

controlplane/kubeadm/internal/failure_domain.go Outdated Show resolved Hide resolved
@chuckha
Copy link
Contributor Author

chuckha commented Jan 28, 2020

@chuckha since this PR is marked as Fixing #2131, do we need another issue to track setting the Failure domain for a Machine post-creation if a provider later adds failure domain support?

This is really two issues.

One requires documentation for infrastructure providers saying that if they decide to add support for failure domains then the provider also needs to update the InfraMachines with their existing failure domain as the machine-controller in cluster-api would have no way to figure out which failure domain the machine lives in.

The second part is having the machine controller copy the infra machine failure domain into the machine.

I'll open these issues later on.

@detiber
Copy link
Member

detiber commented Jan 28, 2020

This is really two issues.

One requires documentation for infrastructure providers saying that if they decide to add support for failure domains then the provider also needs to update the InfraMachines with their existing failure domain as the machine-controller in cluster-api would have no way to figure out which failure domain the machine lives in.

The second part is having the machine controller copy the infra machine failure domain into the machine.

I'll open these issues later on.

sgtm, just wanted to make sure we don't drop tracking for it :)

@chuckha chuckha force-pushed the failure-domain branch 2 times, most recently from 5f05777 to 0f904d5 Compare January 28, 2020 18:05
Signed-off-by: Chuck Ha <chuckh@vmware.com>
@detiber
Copy link
Member

detiber commented Jan 28, 2020

/lgtm
/assign @vincepri

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 28, 2020
@ncdc ncdc added this to the v0.3.0 milestone Jan 28, 2020
@k8s-ci-robot k8s-ci-robot merged commit 1e4af0d into kubernetes-sigs:master Jan 28, 2020
Copy link
Member

@vincepri vincepri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@chuckha chuckha deleted the failure-domain branch January 28, 2020 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Control plane failure domain handling
5 participants