Install CSI driver by default in preparation of CSI Migration #4166

Closed
Jiawei0227 opened this issue Feb 9, 2021 · 29 comments
Labels: area/dependency, area/provider/aws, area/provider/azure, area/provider/digitalocean, area/provider/gcp, area/provider/ibmcloud, area/provider/openstack, area/provider/vmware, kind/feature, lifecycle/frozen, priority/important-soon, triage/accepted

Comments

@Jiawei0227

Jiawei0227 commented Feb 9, 2021

1. Describe IN DETAIL the feature/behavior/change you would like to see.

CSI Migration is a Kubernetes feature that, when turned on, redirects in-tree plugin traffic to the corresponding CSI driver. It has been Beta in k8s since v1.17 but is not turned on by default.

Recently, we decided to push this feature forward, and according to our plan it will be turned on by default in v1.22 for a lot of plugins.

It would be good if cluster-api could prepare for this upcoming change. Specifically, cluster-api should deploy the corresponding CSI driver by default for each cloud. The driver is a prerequisite for CSI migration to work.

  • GCP - GCE PD CSI Driver
  • AWS - AWS EBS CSI Driver
  • Azure - Azuredisk/Azurefile CSI Driver
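
For illustration only, here is a minimal Python sketch of the check a provider could run before CSI Migration is enabled: given the in-tree plugins in use, which CSI drivers still need to be installed? The plugin and driver names are the upstream ones for the drivers listed above; the helper names `required_csi_drivers` and `missing_csi_drivers` are hypothetical and not part of cluster-api or Kubernetes.

```python
# In-tree volume plugin name -> CSI driver name that CSI Migration redirects to.
# Only the plugins for the clouds listed above are included.
IN_TREE_TO_CSI = {
    "kubernetes.io/gce-pd": "pd.csi.storage.gke.io",
    "kubernetes.io/aws-ebs": "ebs.csi.aws.com",
    "kubernetes.io/azure-disk": "disk.csi.azure.com",
    "kubernetes.io/azure-file": "file.csi.azure.com",
}


def required_csi_drivers(in_tree_plugins):
    """Return the sorted CSI driver names needed by the given in-tree plugins.

    Hypothetical helper: plugins without a known mapping are ignored.
    """
    return sorted(
        {IN_TREE_TO_CSI[p] for p in in_tree_plugins if p in IN_TREE_TO_CSI}
    )


def missing_csi_drivers(in_tree_plugins, installed_drivers):
    """Return required CSI drivers that are not yet installed (hypothetical helper)."""
    installed = set(installed_drivers)
    return [d for d in required_csi_drivers(in_tree_plugins) if d not in installed]


if __name__ == "__main__":
    in_use = ["kubernetes.io/aws-ebs", "kubernetes.io/azure-disk"]
    installed = ["ebs.csi.aws.com"]
    print(missing_csi_drivers(in_use, installed))
```

A non-empty result would mean volumes backed by those in-tree plugins would have no driver to serve them once migration is switched on.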

2. Background

/kind feature
/cc @msau42

@k8s-ci-robot added the kind/feature label Feb 9, 2021
@fabriziopandini added the area/dependency, area/provider/aws, area/provider/azure, area/provider/digitalocean, area/provider/gcp, area/provider/ibmcloud, area/provider/openstack, and area/provider/vmware labels Feb 10, 2021
@fabriziopandini
Member

@CecileRobertMichon @nader-ziada @randomvariable @yastij

setting milestone v0.4.0/important soon for now, but probably this should be discussed at the CAPI meeting
/milestone v0.4.0
/priority important-soon
/area

@k8s-ci-robot added this to the v0.4.0 milestone Feb 10, 2021
@k8s-ci-robot added the priority/important-soon label Feb 10, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label May 11, 2021
@vincepri
Member

What's the status of this issue?

@fabriziopandini
Member

Maybe we should raise the priority of this topic:

@CecileRobertMichon @nader-ziada @randomvariable @yastij @gab-satchi
opinions?

@vincepri
Member

vincepri commented Jun 2, 2021

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Jun 2, 2021
@randomvariable
Member

The AWS issue is kubernetes-sigs/cluster-api-provider-aws#1475, with a corresponding PR for e2e.

@sbueringer
Member

sbueringer commented Jun 10, 2021

@fabriziopandini I wonder if we have the same issue with CCM (not sure when the internal cloud provider will be removed, though)

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Sep 8, 2021
@vincepri
Member

vincepri commented Sep 8, 2021

Are there any action items to follow up here?

@neolit123
Member

neolit123 commented Sep 8, 2021

CAPI infra providers need to log issues and adapt to the change. Otherwise it can be a breaking change for users of those providers.

@vincepri
Member

vincepri commented Sep 8, 2021

/lifecycle frozen

@k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/stale label Sep 8, 2021
@vincepri
Member

vincepri commented Sep 8, 2021

  • Changes are required starting from Kubernetes 1.23(-ish)
  • How do we install (and lifecycle manage?) these types of addons? Should we look at the addons project, what about ClusterResourceSet?
  • How do we upgrade current users?

One short-term solution is to include the CRS manifests as part of each provider's infrastructure template.

@vincepri
Member

vincepri commented Sep 8, 2021

/assign @CecileRobertMichon @yastij
to collaborate on goals/non-goals/requirements

@CecileRobertMichon
Contributor

note: take a look at issue on runtime extensions

@neolit123
Member

kops tracking issue:
kubernetes/kops#10777

might be worth getting some feedback from them on a zoom call.

@fabriziopandini
Member

@CecileRobertMichon @yastij trying to review this thread, with a slightly expanded scope including CPI and any addon whose lifecycle we consider linked to the cluster lifecycle (CPI, CSI, CNI?).
Is it possible to define the requirements for the lifecycle of those addons:

  • when should they be installed, upgraded, deleted?
  • how does the configuration get passed to those addons? Are there configuration changes outside of upgrades/downgrades?
  • more?

@CecileRobertMichon
Contributor

@andyzhangx @feiskyer to help answer these questions for Azure CSI drivers and Azure CCM

@feiskyer
Member

feiskyer commented Oct 13, 2021

They are not required yet, as in-tree drivers are still supported, but we suggest enabling them when upgrading the cluster to k8s v1.22+, as both CSI and CCM have been GA for a while and new features are only landing in the out-of-tree CSI and CCM implementations.

Refer to https://github.com/kubernetes-sigs/azuredisk-csi-driver#project-status-ga and https://github.com/kubernetes-sigs/cloud-provider-azure#current-status for the CSI/CCM version matrix.

@andyzhangx
Member

Also, make sure the CSI drivers can be disabled by a switch at cluster creation; we want to install the latest versions (or even the master branch) of the CSI driver in some testing scenarios.

@vincepri
Member

/milestone v1.2

@k8s-ci-robot modified the milestones: v1.1, v1.2 Jan 31, 2022
@fabriziopandini added the triage/accepted label Jul 29, 2022
@fabriziopandini removed this from the v1.2 milestone Jul 29, 2022
@fabriziopandini removed the triage/accepted label Jul 29, 2022
@fabriziopandini
Member

/triage accepted

@CecileRobertMichon @yastij PTAL and update

@k8s-ci-robot added the triage/accepted label Oct 3, 2022
@dtzar
Contributor

dtzar commented Jan 4, 2023

This is an important one to implement. Perhaps we can discuss where this stands on the next CAPI call?

@k8s-triage-robot

This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged.
Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Deprioritize it with /priority important-longterm or /priority backlog
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot added the needs-triage label and removed the triage/accepted label Apr 4, 2023
@fabriziopandini
Member

/triage accepted

@CecileRobertMichon @yastij PTAL and update

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label Apr 14, 2023
@CecileRobertMichon
Contributor

I don't think there is anything in scope for CAPI itself here. CAPI can't "install" CSI drivers by default since those are provider specific. It's similar to CNI and CPI/CCM/CNM (Cloud Controller Manager for out-of-tree cloud providers). These should be installed just like other critical cluster addons (maybe that's where https://github.com/kubernetes-sigs/cluster-api-addon-provider-helm can help), and it is up to each provider to have docs/reference templates (e.g. https://capz.sigs.k8s.io/topics/addons.html#storage-drivers).

Not sure there's much more we can do here @fabriziopandini, not without adding a way for providers to hook into CAPI to provide provider-specific instructions (is that something runtimeExtensions could help with?)

@dtzar
Contributor

dtzar commented Apr 17, 2023

Agree it would be good to standardize the direction for the driver installation process - i.e. with CAAPH. If CAAPH doesn't meet the requirements for some reason, it would be good to know what's missing.

@fabriziopandini
Member

/close

Based on the fact that there is not much to do in core CAPI.
The CAPI office hours, where we are already giving room for updates to CAAPH and providers, could be the avenue to carry on this discussion; I will add a point to the next office hours agenda

@k8s-ci-robot
Contributor

@fabriziopandini: Closing this issue.
