Skip to content
This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Add cloud-controller-manager support for Kubernetes cluster #1584

Merged
merged 2 commits into from
Nov 21, 2017

Conversation

karataliu
Copy link
Contributor

What this PR does / why we need it:
Kubernetes v1.8 supports cloud-controller-manager which runs cloud provider related logic, and it is planned in v1.10 that using cloud-controller-manager becomes the only way to interact with clouds.

This PR adds an option 'UseCloudControllerManager' in apimodel for Kubernetes v1.8 version. If turned on, the cluster will be using cloud controller manager mode.

Currently it uses the common image 'cloud-controller-manager-amd64:v1.8.0' which includes all cloud provider implementation. Will later switch to azure dedicated cloud-controller-manager image once ready.

One TODO item is RBAC support, which pends on upstream Kubernetes#53511 .

Please find more details here: kubernetes/kubernetes#50752

Which issue this PR fixes

Special notes for your reviewer:

Release note:

Add cloud-controller-manager support for Kubernetes cluster

@karataliu karataliu force-pushed the accm18n branch 3 times, most recently from 6d0d214 to 34651f8 Compare October 16, 2017 03:24
@karataliu
Copy link
Contributor Author

cc @anhowe @jdumars This PR is for supporting standalone Azure cloud provider from acs-engine side. Could you please take a look? Thanks!

- "--cloud-provider=azure"
- "--cloud-config=/etc/kubernetes/azure.json"
- "--leader-elect=true"
# TODO: RBAC support
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's needed or outstanding for enabling RBAC?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually meant for enabling 'use-service-account-credentials', which does not work now, pending on kubernetes#53511

sed -i "s|<kubernetesCcmImageSpec>|{{WrapAsVariable "kubernetesCcmImageSpec"}}|g; s|<masterFqdnPrefix>|{{WrapAsVariable "masterFqdnPrefix"}}|g; s|<allocateNodeCidrs>|{{WrapAsVariable "allocateNodeCidrs"}}|g; s|<kubeClusterCidr>|{{WrapAsVariable "kubeClusterCidr"}}|g; s|<kubernetesCtrlMgrRouteReconciliationPeriod>|{{WrapAsVariable "kubernetesCtrlMgrRouteReconciliationPeriod"}}|g" \
/etc/kubernetes/manifests/cloud-controller-manager.yaml

sed -i "/--\(cloud-config\|cloud-provider\|route-reconciliation-period\)=/d" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this stripping off cloud-provider and cloud-config from apiserver? Doesn't that also need to be stripped off of controller-manager too?

Copy link
Contributor Author

@karataliu karataliu Oct 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, both need to be updated. LIne323~324 is for kube-controller-manager, Line325~326 is for apiserver.

CLOUD_PROVIDER=external
{{else}}
CLOUD_PROVIDER=azure
{{end}}

- path: "/etc/systemd/system/kubelet.service"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no change needed for kubelet.service to include the --provider-id flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colemickens
Copy link
Contributor

What's the status of this @karataliu? What level of functionality does the cluster have when this is enabled? I'm guessing volumes don't work, but does networking + load balancing?

I'm trying to understand how many of the items mentioned in the upstream issue need to happen before CCM + Azure is functional for most non-volume use-cases. Thanks.

@karataliu
Copy link
Contributor Author

@colemickens Yes, volumes feature does not work. I've tried with e2e conformance tests and most cases would pass excepted for the volume ones.

The upstream issue is more about migration work, Azure provider running with cloud-controller-manager could already work on a cluster. But since the whole feature is under beta stage (kubernetes/enhancements#88), it is not recommended for production use.

@colemickens
Copy link
Contributor

colemickens commented Oct 26, 2017 via email

@karataliu
Copy link
Contributor Author

Could someone please take a look? Thanks.

@colemickens
Copy link
Contributor

(FWIW, this looks good to me. And in a few weeks hopefully, there should be alpha support for CCM under hyperkube as well, so there won't need to be a separate image/process... at least until CCM totally splits out of the main repo and is yanked back out of hyperkube, but that could be a while).

@karataliu
Copy link
Contributor Author

@colemickens Thanks, I saw the ongoing work for hyperkube incorporating CCM.
For the change here, a dedicate parameter for CCM image is still required as it could be set to an out of band CCM release. Also during CI test, a new CCM image is built and tested every time.

@jackfrancis
Copy link
Member

Thanks for your patience @karataliu, and apologies for the rebase needs. Not a few things have changed in the conflicting surface area, I'd be happy to spawn a new PR with this work if that's easier.

@karataliu
Copy link
Contributor Author

@jackfrancis Updated, please take a look. Thanks.

@jackfrancis
Copy link
Member

Thanks @karataliu ! Running E2E now.

Copy link
Member

@jackfrancis jackfrancis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. The only suggestion I'd make is to make the "enable/disable" property a *bool, so that we can more easily default to true at a later date. See this PR for an example:

https://github.com/Azure/acs-engine/pull/1763/files

If I had to do that PR again I'd make the return value here reference a boolean constant, and not a boolean literal, for easier future maintenance:

https://github.com/Azure/acs-engine/pull/1763/files#diff-ff04be9832c108c30374068a6cfe4953R792

e.g.:

if <enableThing) == nil {
    return DefaultThingEnabled

Let me know if that doesn't make sense! Thanks again!

@karataliu
Copy link
Contributor Author

In fact the default value should be dependent on k8s version, and as planned for 1.10+ release it should only have true value.

Anyway it is good to switch to ternary here, I've updated accordingly.

Copy link
Member

@jackfrancis jackfrancis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks @karataliu !

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants