
CreateVolumeRequest should provide CO related volume resource information #248

Closed
princerachit opened this issue Jun 6, 2018 · 33 comments

@princerachit

This issue is tightly coupled with k8s. Some storage provisioning mechanisms depend heavily on the CO itself: they leverage the architecture of the CO and its resources to provision volumes. Before CSI, the CO's representation of the volume (e.g. the PVC in Kubernetes) was passed to the custom provisioner, and the onus of extracting the volume-related information was offloaded to the provisioner.
The current CSI spec restricts the free flow of the CO's volume resource information (e.g. PVC/SC in K8s) to the driver. It is possible to pass the information through redundant declarations in the storage class parameters, but this becomes hard to maintain.
I propose that COs pass their volume-related claim information to the CSI driver, either as a map or as a more defined struct.

Related issue kubernetes-csi/external-provisioner#93

@e4jet

e4jet commented Jun 6, 2018

I agree. This follows the flexvolume model.
Example:

..."kubernetes.io/fsType":"ext3","kubernetes.io/pod.name":"mariadb-84546b4696-foobar","kubernetes.io/pod.namespace":"ateam","kubernetes.io/pod.uid":"d0f3af91-63bc-11e8-9f73-0242ac110006"

The plugin can then choose what to do with the CO-namespaced keys (i.e. ignore them or use them).

@jieyu
Member

jieyu commented Jun 6, 2018

I am a little worried that plugins will start to depend on those CO-specific keys. One of the goals of CSI is to allow storage vendors to write just one plugin that works for most COs. Relying on CO-specific keys works against that goal. A better approach would be to first-class the fields that we think are absolutely necessary (e.g., topology requirements, filesystem types, etc.) in the CreateVolumeRequest in a CO-agnostic way.

@princerachit
Author

princerachit commented Jun 6, 2018

I don't think that providing CO-specific keys would harm CSI's intent. It is solely the responsibility of the storage vendor to consume the spec. A storage provider may choose to develop plugins only for a particular CO, leveraging the architecture of that CO, or they may choose a CO-agnostic way. I believe that putting a constraint on CO-specific keys would harm the existing methodologies already used by some vendors. While the information provided by the CO in the CSI spec is sufficient for many vendors, it is not sufficient for all.

@jieyu
Member

jieyu commented Jun 6, 2018

@princerachit Can you provide some examples here? For instance, what information that's not opaque to the CO needs to be passed to the plugin during CreateVolumeRequest that isn't yet first-classed? And why does the plugin need that for volume provisioning purposes?

What if the provided storage is an abstraction of a CO-native storage resource?

Can you elaborate a bit more here? I don't quite follow this question.

@princerachit
Author

An example is OpenEBS (which I am working on). This issue kubernetes-csi/external-provisioner#93 talks about NFS and Gluster.

Can you provide some examples here? For instance, what information that's not opaque to CO needs to be passed to the plugin during CreateVolumeRequest that's not yet first classed?

The driver needs information such as the PVC namespace and the storage class name.

Can you elaborate a bit more here? I don't quite follow this question.

I kind of deleted that line because it did not make much sense. 😄

@jieyu
Member

jieyu commented Jun 6, 2018

I guess I am more curious about why the plugin needs that information for volume provisioning purposes. For instance, I looked at the NFS provisioner. It uses the PVC namespace and name to construct the name of the volume. However, the CSI spec already has a name field in the CreateVolumeRequest.

You also mentioned the storage class name; why does the plugin need that? Wouldn't volume_capabilities and parameters be sufficient?

@princerachit
Author

princerachit commented Jun 6, 2018

The driver wants to launch pods in the same namespace as that of the PVC.

You also mentioned storage class name, why the plugin needs that? Wouldn't volume_capabilities and parameters be sufficient?

Yes, I think the storage class parameters would be enough.

@vladimirvivien

@princerachit So if I understand correctly, you are asking to insert CO-specific keyed values in the Parameters that are passed when csi.CreateVolume(CreateVolumeRequest.Parameters[kubernetes.io/fsType:ext4]) is invoked. If so, that would be more of a CO-specific ask, no?

@princerachit
Author

@vladimirvivien Yes. But I am wondering if we could have a separate message inside the request which carries this information, e.g. instead of receiving it in the CreateVolumeRequest.Parameters map, we could have a separate map, CreateVolumeRequest.COParameters.
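
For illustration, a minimal sketch of what that could look like, assuming a hypothetical co_parameters field (the field name and number below are invented for this example, not part of the spec):

message CreateVolumeRequest {
    // <snip> existing CSI fields (name, capacity_range,
    // volume_capabilities, parameters, ...) unchanged.

    // Hypothetical: CO-specific volume resource information,
    // e.g. {"kubernetes.io/pvc.namespace": "ateam"}.
    // A plugin that aims to be CO-agnostic would ignore this map.
    map<string, string> co_parameters = 100;
}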

@jdef
Member

jdef commented Jun 12, 2018 via email

@princerachit
Author

@jdef noted. I will raise this issue with Kubernetes SIG-Storage.

@vladimirvivien

@princerachit what @jdef says. Please bring your concern to the Kubernetes CSI working group as this is a CO-specific ask.

@kfox1111

My 2 cents here... I wrote a CSI driver intended to fit into the k8s view of pods. I want a CSI volume to track the lifecycle of the pod: it gets created when the pod is created, and it goes away when the pod goes away. I plumbed it up assuming that the pod ID would be used as the volume ID. In my reading of the spec I didn't see anything that prevented this behaviour, but there is currently a lack of plumbing on the k8s side to do so. Perhaps, rather than passing CO-specific info into the CSI driver, the plumbing on the k8s side should decide what volume-ID-passing behavior to use?

@jsafrane
Contributor

While I agree that CSI should be CO-agnostic and allow storage vendors to provide CO-independent plugins, it's going to be the main extension point for anything that looks like a volume in COs.

For example, Kubernetes uses volumes (i.e. bind mounts) for passing configuration or secrets to containers, so a container gets, say, /etc/mysqld.conf from a Kubernetes API object and users don't need to bake it into the container image. There are third-party vendors who implement their own configuration and especially secret sources; such a source can read secrets from a highly protected vault. For that, it needs to know the CO-specific workload definition (pod namespace, name, UUID, UID, GID, ...).

Can we define an extension mechanism for such CO-specific volumes? A CSI plugin that uses these extensions would naturally be tightly bound to a CO. On the other hand, it would just reuse existing CSI interfaces and messages, so the CO doesn't need to implement something new or deviate from CSI in an incompatible way.

I know very little about gRPC / protocol buffers, but could we for example declare that field numbers > 100000 in every message are reserved for CO extensions?
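
For concreteness, a rough sketch of how such a reservation might read, assuming each CO would then compile its own superset of the messages (this is entirely hypothetical, not something the spec defines):

message NodePublishVolumeRequest {
    // <snip> standard CSI fields, all numbered below 100000.

    // Hypothetical convention: CSI never assigns field numbers
    // >= 100000, so a CO could ship its own revision of this
    // message with extension fields in that range, e.g.:
    //   map<string, string> k8s_pod_info = 100000;
}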

@vladimirvivien

While I don't like the idea of introducing CO-specific params in CSI calls, maybe what is needed is to extend the spec with the ability for the driver to query the CO. That is, have the CO expose gRPC service endpoints to be queried by the driver for CO-specific info (e.g. a driver registry request, a storage metadata request for a specific volumeHandle, etc.).

@jdef
Member

jdef commented Jun 13, 2018

@jsafrane, @vladimirvivien thanks, each of you, for your thoughts.

... it's going to be main extension point for anything that looks like volumes in COs...

I don't think I agree with this statement. The examples provided (configuration as a volume, secrets as a volume) are illustrations of solving Some Other Problem (e.g. how do we expose configuration objects to containers?) by overloading a pre-existing API, because of reasons... and in this case, that's a "volumes" API. IMO they're not shining examples of the primary use cases that CSI is intending to address. Does that make it impossible to use/abuse CSI for such things? No, not really. It's entirely plausible to write a CSI plugin whose backend is some secret store. And, I'd argue, it's possible to do that without adding CO-specific, first-class fields to the CSI spec. If a vendor ONLY needs/wants to support a specific CO, then that's basically up to the vendor. If the vendor is k8s, and "k8s as the vendor" only wants to support "k8s as the CO", then that could be reasonable. The CO owners have previously discussed CO-specific "shim" plugins: they might be useful in some cases, but they're NOT the norm. In general, since one of CSI's goals is to make less work for vendors, a vendor writing a CO-specific plugin sounds like someone making more work for themselves if their overall goal is to support multiple COs.

There are 3rd party vendors who implement their own configuration and especially secret sources, it can read secrets from highly protected vault...

This sounds like a CO/{config,secrets}-vendor integration problem, and not a CSI problem. How do you think this relates to CSI?

I know very little about gRPC / protocol buffers, but could we for example declare that field numbers > 100000 in every message are reserved for CO extensions?

I think this is putting the cart before the horse. The first question is "should CSI expose CO-specific extension points?". Let's answer that first. I haven't yet seen any convincing, concrete use cases to support this.

... have the CO expose gRPC service endpoints to be queried by the driver...

We've successfully avoided, so far, defining any gRPC services on the CO side of the house. I think that the "CO-specific info" you're asking for assumes that COs can align on some subset of control/data elements COs could/should provide to a plugin. I'm not very optimistic that we'll find much overlap among COs. So far, we can't even agree on how to identify a "user". I also quite like that the CSI spec is tightly focused on the storage control plane. Exposing gRPC services on the CO would begin to define a CO control plane, and that feels a bit out of scope and not very aligned with the goals of the spec.

@smarterclayton

An interesting point (from the Kubernetes point of view) is that CSI was intended to be a storage abstraction for Kubernetes (which quite wisely grew to be a storage abstraction for multiple platforms). One type of storage is remote attachable storage. Another type is local attached storage. If CSI evolved as a spec into something that is much more about remote attachable storage, maybe it's just a sign that CSI in Kubernetes needs to mean something different than what the spec has evolved to here.

@jdef
Member

jdef commented Jun 13, 2018

@smarterclayton thanks for your comments. Was there something in my response that suggested CSI is not applicable to "local storage"? That's definitely not the picture I was trying to paint. Though, upon further thought, I guess that depends on what the scope of "local storage" is defined to be. CSI doesn't attempt to define that term (and I also don't think that it should try). CSI generally tries to focus, primarily, on a control plane protocol targeting volume life cycle management and accessibility. There's some non-protocol stuff in the spec as written, but we've tried to keep it minimal and mostly as recommendations (vs. requirements).

The "configuration as a volume" and "secrets as a volume" cases presented earlier - are those considered to be part of the "local storage" arena in k8s? When I think "local storage" I think of things like LVM, whole disks, disk partitions, virtual storage masquerading as local block devices, etc. Using CSI to solve configuration and secrets integration feels like shoehorning "the tool you have on hand" because it happens to be the nearest fit. And this may be perfectly acceptable on a per-CO basis... CSI doesn't pass judgement here. That said, I have doubts about baking in first-class support for such into the spec, or recommending such as a generalized, cross-CO compatible approach.

@childsb

childsb commented Jun 13, 2018

Though it would be nice for every CSI plugin to be CO-agnostic, this won't be the case. I've had multiple CSI plugin developers want to focus specifically on a single platform, or to provide an enhanced feature set for one platform that's not possible on the others.

  • There are file-oriented extensions that fit very nicely through the CSI interface as extensions to the CO. Kube wants to do these things; other COs may want to too. These CSI/CO functional shims probably wouldn't be CO-agnostic. I can provide multiple examples of Kubernetes FLEX plugins that copy files into a container, manipulate a container as part of attaching the storage, or similar.

  • Things like 'local storage' are common across COs but may have a different feature set depending on the CO or host.

I think we should step back and not get too stuck in the details of what each CO may or may not do with the interface. This function may be achievable with the existing metadata mechanism, or with enhancements to it.

@saad-ali
Member

We discussed this at the CSI community meeting today.

The conclusion was that on the CreateVolumeRequest side, the existing opaque parameters field should be sufficient to allow passing in workload information (it would require the driver to be paired with a sidecar container, the external-provisioner, that knows how to set the field).

On the NodePublishVolumeRequest (mount) side, we agreed there should be a way to pass in workload information. We agreed on adding a new Workload struct that contains an opaque map that the CO can fill with whatever makes sense for it (e.g. for Kubernetes, pod namespace/name/uid).

@jsafrane
Contributor

There is no recording of the meeting, so from the comment above the solution is:

message NodePublishVolumeRequest {
    // <snip>

    // CO workload that requests the volume.
    // This is an OPTIONAL field.
    Workload workload = 9;
}

// Parameters of a CO workload that lead to CSI request.
message Workload {
    // CO specific parameters of the workload, such as workload name.
    map<string, string> parameters = 1;
}

Did you agree on its handling? Does the plugin need to use workload when it's specified? Does it need to understand all provided parameters?

@jieyu
Member

jieyu commented Jun 20, 2018

@jsafrane yes, this is what most of us agreed on. We might need to think a bit more about the structure of Workload, to add things like CO type information so that the SP can more easily distinguish the type of the CO (rather than relying on the keys of the map).
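
A hedged sketch of what that extension might look like (the co_type field and its values are hypothetical, pending that discussion):

message Workload {
    // Hypothetical: names the CO that issued the request,
    // e.g. "kubernetes" or "mesos", so the SP doesn't have to
    // infer the CO from the parameter key prefixes.
    string co_type = 1;

    // CO specific parameters of the workload, such as workload name.
    map<string, string> parameters = 2;
}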

@jsafrane
Contributor

Fix: #252

@cpuguy83
Contributor

This use case is feature creep at its finest.

@jdef
Member

jdef commented Jun 29, 2018 via email

@cpuguy83
Contributor

I will go out on a limb here and say I am vehemently opposed to this solution.

In the same way that SPs provide different APIs for k/v stores, block storage, file storage, etc., we should not try to cram every use case for some kind of storage into CSI.

@gnufied
Contributor

gnufied commented Jul 2, 2018

My 2 cents on this: I think any attempt to discourage users from passing custom workload-specific data to the plugin is far removed from reality. In the last 2 months or so, I have reviewed several flexvolume plugins (of which CSI is the natural successor) that use workload information for very reasonable things. For example, someone wants to mount an HDFS filesystem over FUSE for a workload-specific map-reduce job.

Either these users will abuse one of the existing parameters or simply stay away from CSI.

@jdef
Member

jdef commented Jul 3, 2018 via email

@childsb

childsb commented Jul 9, 2018

(In Kubernetes) there are multiple scenarios where the CSI driver needs to know information about the "workload" that's using the storage, to provide advanced security or other workload manipulation related to storage. One of the founding goals of CSI (though not specifically documented) is to replace the FLEX interface in Kubernetes in a way that's useful to other COs. kube-flex currently provides workload information, and there are multiple FLEX plugins that rely heavily on it.

Some specific kubernetes use cases--

  • Injection of data into the pod that the volume is attaching to
  • Injection of secrets / credentials / certificates into the running pod (Kerberos, secret vault)
  • UID / GID / permissions of the running container
  • Storage class / performance characteristics of the storage request
  • Passing of data/metadata from the PV/PVC to the volume driver (mount options, credentials, performance characteristics, quota)

I realize that formalizing some of these things is a "non-goal" in the spec, so we are really looking for enough workload information to implement this ad hoc in COs that support/need it. I would expect these CSI plugins to be CO-dependent (not agnostic), or CSI plugins to special-case CO behavior based on supplied (or absent) workload metadata.

Related kube issue: kubernetes/kubernetes#64984

And here's a small set of existing FLEX plugins which require 'workload' identifying information to work:

  • Kubernetes moving "service account secret" generation out of tree
  • Hashicorp Vault injecting a secret into a pod based on the info about the pod
  • Flexvolume for Custodia (to inject a FUSE mount into a pod that has secrets)
  • Mounting something from the container runtime into a pod
  • Flexvolume to mount an image at a directory in a pod
  • Mounting a directory that a node-local daemon watches, to abstract container runtime operations like getting a snapshot of a container as a file
  • Mounting things that might not want to have PVs but which need info from the pod (more inline pod)
  • Flexvolume S3 FUSE mount
  • ZFS flexvolume mount

There are quite a few more examples not listed... Though these aren't "pure" storage integrations, they are "storage-like" enough, and coupled or related closely enough to the storage plugin interface and attached storage, that this is necessary for the more advanced use cases and security beyond simple attach/detach/provision/delete.

I understand the reluctance to open up workload metadata, and I don't see a good generic way to express "workload information" which is CO-agnostic and on which all COs could agree... but if we don't provide the possibility for COs to express workload-identifying data through CSI, then the kube-CSI implementation will be impotent.

@cpuguy83
Contributor

cpuguy83 commented Jul 9, 2018

  1. Why are we defining an API here and at the same time planning to circumvent it before it's even released?
  2. Most of these examples seem to want to cram what Kube (or Flexvolume drivers) is currently doing into the CSI model; this cannot work.
  3. Some (all?) of these things make sense to expose as first-class entities in CSI:
    i. Storage class has been at least discussed in the past; I can't remember what the outcome was.
    ii. Injection of secrets / credentials / certificates into the running pod (Kerberos, secret vault) -- Why does this need workload/CO-dependent options to be passed to the plugin?
    iii. UID/GID -- Discussed, but IIRC it required further thought on how to properly abstract this.
    iv. Mount options, credentials, performance characteristics, quota -- these are all very different things... I do not consider them metadata.
    • Credentials should be handled in the API today; is what is there not satisfactory?
    • Mount options should be handled, but looking at NodePublish, maybe not well enough for per-publish options.

@childsb

childsb commented Jul 10, 2018

Why are we defining an API here and at the same time planning to circumvent it before it's even released?

What do you mean, what API are we defining/circumventing?

Most of these examples seem to want to cram what Kube (or Flexvolume drivers) is currently doing into the CSI model, this cannot work.

FLEX evolved through customer requirements and is pretty well field-tested. We started the CSI project to offer the same function as FLEX, but in a manner that other COs could consume and participate in. I don't see any features in FLEX that we would want to exclude from CSI -- only features and functions that haven't made it to CSI yet.

Injection of secrets / credentials / certificates into the running pod (Kerberos, secret vault) -- Why does this need workload/CO-dependent options to be passed to the plugin?

Attaching a Kerberos ticket to a running container/node, augmenting a running container with a supplemental group, injecting certificates into the container (which may not be located at the same path as the mounted volume, or we don't want to pre-populate the volume with these certificates)... Other issues may be SELinux or cgroup settings dependent on the PID, enabling fine-grained firewall rules per container, etc.

The "lessons learned" in Kube regarding storage, security, secrets, etc. is that there's no one-size-fits-all security solution... And instead of trying to pick one, or trying to add them all, it's better to allow a pluggable interface that permits multiple solutions (which may be closed-source or vendor-specific).

Some (all?) of these things make sense to expose as a first class entity in CSI

I agree with this, but I think we can all agree it will take some time to achieve and probably wouldn't be correct on the first pass. Ideally we would allow these things freely as "alpha", then codify a well-typed API once we realize what's actually useful and how to represent it generically across all COs.

@cpuguy83
Contributor

What do you mean, what API are we defining/circumventing?

If someone is passing CO-specific values, this is circumventing the API. It means there is now a contract between the CO and the plugin that extends beyond just CSI.

I don't mean to say that we shouldn't bring on the things that Flex is doing, but we should be more deliberate about how we do it and define a well-formed API around it rather than CO-specific values (e.g. a Kubernetes annotation like io.k8s.<foo>.<bar>=<baz>)... That's also not to say that the same style of API shouldn't exist, just that it ought to be defined in CSI so that the contract between the CO and the plugin is well defined here.

@saad-ali
Member

saad-ali commented Nov 5, 2018

We are addressing the use cases for Kubernetes on a case-by-case basis and adding support via sidecar containers. For example, we added pod workload info last quarter. We should avoid modifying the CSI spec to wholesale pass opaque data. Please follow up with SIG-Storage on any additional use cases you may have.

saad-ali closed this as completed Nov 5, 2018