Removed config maps and replaced with rados omaps #312
Conversation
Testing done includes:
Tech debt left for later (i.e., optimizations and cleanup that do not hamper functionality and are not going to be part of this PR):
@gman0 @Madhu-1 @dillaman PTAL, added one more commit to change the documentation and sample manifests. The pending changes now are the Helm charts (and catching up to master). There is an urgency to try and get this in, so that we can move toward a 1.0 release and also adopt the same into the Rook project's 1.0 release (in 2 weeks as of this writing). Request attention with this consideration in mind. Thanks!
@ShyamsundarR Ack -- I'll start reviewing this tonight.
Initial quick pass comments since the more I dig, the more I am confused about the secrets...
Secrets/keys/IDs discussion: To begin with, some of the questions you raise are concerns for me as well, and are (partly?) covered in this issue. I am summarizing the various comments around the topic here to provide the current answer, and possibly to discuss a (near) future one. I will refer to the 2 secrets (they are both Kubernetes secrets, yes) as the secret and the config, where the secret is the one mentioned in the StorageClass [1] and the config is the Ceph cluster config (ceph-cluster-) provided to the CSI plugin; this keeps the further discussion less ambiguous. I am also using the term credential below to refer to an ID/key pair (as in a Ceph user ID and key).
No. The secret names and namespaces that are passed down from the StorageClass need to exist in a namespace that is read by the node-level services that invoke the CSI plugin RPCs with the data from these secrets; hence, user namespaces need not have access to these secrets. Also, the config secret needs to be readable by the CSI plugin on the nodes, which hence needs appropriate access to the secret in whichever namespace it exists. In both cases, the user namespace requesting the volume provisioning or mounting service does not need access to the secrets, nor are they present in that namespace.
If we want to support fine-grained credentials per pool, i.e., credentials that have access to do CRUD operations on one pool and not across the pools in use by the CSI plugin, then we would want to use the credentials from the secrets in the StorageClass; else we can use the credentials in the config itself. For list operations, we will not get credentials from the secrets and have to rely solely on the credentials from the config. If the config credentials can be crafted to have list permissions (IOW, metadata read) then we can use them for listing; else listing has to remain unsupported.
Again, if we can segregate permissions for these credentials, with the user credential having only mount permissions (post-mount, reads and writes to the image should be possible) and the admin credential having CRUD permissions, it makes sense to have 2; else we should just fold these into a single credential (part of #270).
The order of using credentials in the various RPCs is as follows:
I think we need to continue this in #270, look at the need for 2 credentials (user/admin), and then see if we want per-pool credential segregation; if so, retain the current scheme, else stop passing the credentials in as secrets. [1] https://kubernetes-csi.github.io/docs/print.html#createdelete-volume-secret
My concern is that they both provide credentials and that is confusing. I don't think there should be two different places to store the keys and up to three places to store the user ID. The fact that the code sometimes pulls the keys from the k8sconfig store via the
Ack -- cool, thx.
Sorry, I wasn't really asking about the difference between "admin" and "user" -- just the fact that the "admin"/"user" secrets are stored twice (and the admin/user IDs are also specified multiple times).
... and then fall back to the ID overrides from the StorageClass if not specified in the config secret?
Thanks for the pointer to that issue. I agree the difference between admin/user credentials is a separate issue. I am just worried about how confusing it is w/ all the different places to store a given set of credentials, and how some CSI operations would pull credentials from the secret vs others that would pull them from the config.
Discussed this further with @dillaman and here is the gist of the same:
@ShyamsundarR I was digging through the CSI spec and noticed the [...]. The [...]
I was assuming the image expansion is auto-detected on the rbd clients, and the FS may need growfs-like operations. Thus, we really do not need credentials on NodeExpandVolume. Am I missing something here?
Re: (1) I think I was just looking at the NodeExpandVolumeRequest and didn't see the ControllerExpandVolumeRequest. Disregard. Re: (2) I know there is a PR to get "metrics", but it looks like it's only being added to the CephFS driver for NodeGetVolumeStats. If there isn't any need to know the provisioned vs actual byte counts for RBD block PVs, I guess it's not needed. Re: (3) If it's not used by k8s, no worries. I am just trying to think ahead for the worst case.
For block, I would assume we get metrics from the filesystem (or consumer) on the block device. Does it make sense to return used/free bytes from a block device?
This is useful, so thanks. I had done this exercise earlier (i.e., looking at secrets and which RPCs, etc.), but it helps with another review.
I do really like the idea of breaking the somewhat strange loop in the driver that calls back to k8s. Big thank you for working on a fix. :) I didn't see it called out up above, but one critical piece will be a migration path for those of us already using it. Is this planned?
- stores keys named using the CO generated names for volume requests
- keys are named "csi.volume."+[CO generated VolName]
- Key value contains the RBD image uuid that is created or will be created, for the CO provided name
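To make the scheme above concrete, here is a minimal illustrative sketch of maintaining such a mapping through the stock `rados` CLI from Go. The pool name (`rbd`), the directory object name (`csi.volumes.default`), and the sample values are assumptions for illustration only, not the plugin's exact identifiers; the snapshot omap described further below works the same way with `csi.snap.`-prefixed keys.

```go
package main

import (
	"fmt"
	"os/exec"
)

// runRados shells out to the stock `rados` CLI, analogous to how the driver
// execs Ceph tooling at this point in its history.
func runRados(args ...string) ([]byte, error) {
	out, err := exec.Command("rados", args...).CombinedOutput()
	if err != nil {
		return out, fmt.Errorf("rados %v: %v (%s)", args, err, out)
	}
	return out, nil
}

func main() {
	volName := "pvc-0000-1111" // CO-generated volume name (assumed)
	imageUUID := "2222-3333"   // uuid of the backing RBD image (assumed)

	// Record the CO name -> image uuid mapping as an omap k/v pair on the
	// directory object.
	if _, err := runRados("-p", "rbd", "setomapval", "csi.volumes.default",
		"csi.volume."+volName, imageUUID); err != nil {
		panic(err)
	}

	// A retried CreateVolume with the same CO name finds the reservation,
	// which is what makes the RPC idempotent without a config map.
	val, err := runRados("-p", "rbd", "getomapval", "csi.volumes.default",
		"csi.volume."+volName)
	if err != nil {
		panic(err)
	}
	fmt.Printf("existing reservation: %s\n", val)
}
```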
Whenever I see applications making raw use of omaps I get nervous. The reason for this is mostly around the de facto limit of ~100,000 k/v pairs per object. Are we confident that a cluster won't have more than 100k volumes?
Fair enough :) The initial numbers I discussed with @dillaman were 100K, to be exact.

Anyway, this information is new to me, and we added the omap to support listing (i.e., to bunch up the csi volumes under one omap to help with listing). My initial solution to the problem (as we can never answer what scale this may reach) would be to remove the csi.volumes and csi.snaps omaps, but retain the volname- and volid-based omaps and keys, as we are no longer intending to support the listing RPCs. Thoughts?

(The other knee-jerk reaction was: let's create chapters of csi.volumes and use those, but that depends on whether we do not have to maintain counters around the number of keys in an omap, and can instead work with errors returned from an omap setomapval that help understand it is "full".)
@ShyamsundarR with this approach, how many PVCs and snapshots can we create?
Currently we can scale up to 100K, which is not essentially a hard ceiling; IOW, updates to the objects may slow down, but will not fail at that exact number.

This was discussed, and a possible path for the future is noted in this comment. IOW, we decided to continue with the current scale and, if required in the future, shard the objects, which is feasible under the current scheme.
- stores keys named using the CO generated names for snapshot requests
- keys are named "csi.snap."+[CO generated SnapName]
- Key value contains the RBD snapshot uuid that is created or will be created, for the CO provided name
Same concern: a k/v pair per snapshot won't scale, I think.
examples/README.md (Outdated)
* Key is typically the output of `ceph auth get-key client.admin`, where `admin` is the client Admin ID
* This is used to substitute `[admin|user]id` and `[admin|user]key` values in the template file
I'm not seeing a particular reason why ceph-csi needs the admin key. Is it so that it can do `ceph osd lspools`?
The reason was a simple carryover of what existed prior to this PR. The exact credentials and their permissions are being worked out in issue #270, where we also need to think about what rights are required for the various operations being performed (like `ceph osd lspools`).
@dillaman Addressed the dual-credential issue with the latest commit, PTAL. The issue of still having to specify 2 credentials (user and admin), and also the ID separate from the key, will be dealt with as part of fixing #270. @gman0 @Madhu-1 Configuration requirements have changed from requiring a series of values in a secret to a config map of cluster IDs containing the monitors.
Yes, migration is not planned as yet. This has cropped up from the 0.3 version to this instance as well, in #296. Should we discuss it in that issue to arrive at a more generic solution? Also, at present I do not have any migration solutions to offer.
It would be great if a generic migration solution would fit here, assuming the v0.3 driver used a different storage class. That same solution could then be used for migrating volumes between different storage tiers (i.e. bronze, silver, gold) or between non-mirrored and mirrored.
Updated the Helm charts and squashed all commits into one. Among the open issues, only the decision about the omap scale remains, as per this comment. I am hoping we can get this in this week, and start work on the credential changes to RBD and on incorporating similar changes into the CephFS plugin. Requesting reviews, @dillaman @Madhu-1 @gman0 (also @batrick, from the omap usage perspective (among others)).
Kind of concerned... what is the migration path here? There are existing users of the chart.
On further discussions regarding the omap scale issue, the resolution would be to apply sharding (as suggested by @liewegas) to the volumes and snaps omaps when required. The ability to shard can be based on selecting a shard using the [...]. Also, when sharding is introduced, the existing omaps may need a one-time "split into shards" operation to be performed, but this will not impact the metadata exchanged with the CO environment. Added sharding to the tech-debt list as well; a sketch of one possible shard selector follows.
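A minimal sketch of deterministic shard selection, under stated assumptions: the hash function (FNV-1a), the shard count, and the `csi.volumes.shard.N` object-name format are all illustrative choices, since the exact selector referenced above is not spelled out in this thread.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardObject picks one of `shards` directory objects for a CO-generated
// name, so no single omap has to hold every key.
func shardObject(coName string, shards uint32) string {
	h := fnv.New32a()
	h.Write([]byte(coName)) // hash.Hash32's Write never returns an error
	return fmt.Sprintf("csi.volumes.shard.%d", h.Sum32()%shards)
}

func main() {
	// The same name always lands in the same shard, so lookups stay direct
	// and a later "split into shards" migration only has to re-bucket keys.
	fmt.Println(shardObject("pvc-0000-1111", 16))
	fmt.Println(shardObject("pvc-4444-5555", 16))
}
```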
lgtm -- a few nits and a question about how one updates the "ceph-csi-config"
@dillaman I am adding the DNM flag to this. As per my understanding, the following actions need to be closed before we can merge this (reviews and acks still welcome):
@ShyamsundarR Thanks!
We found an issue in the Helm chart where it is writing config maps to the wrong namespace (#342). We could fix it, but that would then require steps to move the config maps from the old location to the new location. But if we have a migration plan for getting config maps to omaps soon, we could skip fixing the config map issue and recommend just switching to omaps. How soon do we think this PR is:
Requesting reviews @humblec @Madhu-1 @gman0 @phlogistonjohn, any help is appreciated.
if req.GetVolumeId() == "" {
	return nil, status.Error(codes.InvalidArgument, "Empty volume ID in request")
}
s/Empty/empty/ in above if's..
All existing `status.Error` routines seem to capitalize the first letter of the error message. Is this not the convention?
NOTE: Retaining one comment on the capitalization of the first letter of messages in status.Error, for ease of discussion.

As pointed out by @Madhu-1, error strings need to start with a lower-case letter in golang; this is also enforced by the linter. These errors, though, are RPC error messages; whether they should follow the same convention is something to consider before fixing it.

I assume we should open a new issue against the code to fix this across the board, rather than address part of them here and the rest later. IOW, I intend to leave the added status.Errors in their current form rather than correcting them to adhere to the first letter being lower case. @humblec or @Madhu-1, if you feel strongly that this should be corrected in this PR, for messages added with this PR, let me know.
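For reference, a minimal sketch of the convention the linter enforces (not code from this PR): Go error strings start lower-case and carry no trailing punctuation, because they are routinely wrapped into longer messages.

```go
package main

import (
	"errors"
	"fmt"
)

// Lower-case, no trailing punctuation: the string will often appear
// mid-sentence once wrapped by a caller.
var errEmptyVolumeID = errors.New("empty volume ID in request")

func main() {
	// Wrapped, the lower-case form reads naturally; "Empty volume ID..."
	// would not.
	fmt.Println(fmt.Errorf("DeleteVolume failed: %w", errEmptyVolumeID))
}
```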
if err != nil {
	klog.Errorf("failed getting information for image (%s): (%s)", poolName+"/"+imageName, err)
	if strings.Contains(string(stdout), "rbd: error opening image "+imageName+
Why do we need this check here?
The check is to detect missing images and segregate them from other classes of errors. We need this to return success for successive DeleteVolume requests, in case the first one times out but the work is actually done. If we could not detect image-missing errors, they would appear to the CO system as internal or other errors.

The above is one case of the same that is needed in the code; the error itself can be used generically to detect that such an image does not exist. (See the sketch below.)
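A self-contained sketch of the classification described above, under stated assumptions: the sentinel error name, pool, and image names are hypothetical, and the real code runs inside the driver rather than a standalone program. The `rbd: error opening image` substring is the one quoted in the diff above.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// errImageNotFound is a hypothetical sentinel for this sketch; the PR's own
// typed ErrImageNotFound (discussed later in this thread) plays this role.
var errImageNotFound = errors.New("rbd image not found")

// imageInfo runs `rbd info` and maps the CLI's "error opening image" output
// to a distinct error class, so DeleteVolume can treat a missing image as
// already deleted instead of surfacing an internal error.
func imageInfo(pool, image string) error {
	out, err := exec.Command("rbd", "info", pool+"/"+image).CombinedOutput()
	if err != nil {
		if strings.Contains(string(out), "rbd: error opening image "+image) {
			return errImageNotFound
		}
		return fmt.Errorf("rbd info %s/%s: %v (%s)", pool, image, err, out)
	}
	return nil
}

func main() {
	if err := imageInfo("rbd", "csi-vol-example"); errors.Is(err, errImageNotFound) {
		// A repeated DeleteVolume after a timed-out first attempt lands here.
		fmt.Println("image already gone; returning success")
	}
}
```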
@ShyamsundarR Started the review; there is also a conflict. Can you correct the conflict and update the PR? Btw, I see there are a good number of changes which are just indentation corrections. Can you remove those changes from this PR for a better review look?
@humblec I would be updating the PR addressing any review comments and squashing the commits into one, at which point I intend to address the conflicts as well and also rebase to the tip of master. Is this fine? Or do you want me to rebase right away? I went through the PR attempting to find the indentation-only corrections, and found about 18 such lines (out of ~2200 additions in all). Some of these 18 are in YAML where there were trailing white spaces, some are for lines that were too long, and a few introduce a line break at some logical point in the code. Further, I used the diff settings in GitHub to ignore whitespace and counted line deltas; other than 5 files, no other files contain whitespace-only changes when comparing the changed lines without the said diff settings. Some structures and assignment blocks are auto-indented when adding or removing a variable whose length is different; these white space changes are out of my control (unless I use a plain text editor to revert them to their original form). I do understand the problem white-space-only changes can cause in reviews, but the quantum in this PR seems quite low (~18 lines as I saw it). Do you really want me to remove these from the PR? This calls for very careful inspection to remove just these changes. If they are hindering review, could you change the diff settings for this PR to ignore white spaces and take a look?
`monitors` | one of `monitors`, `clusterID` or `monValueFromSecret` must be set | Comma separated list of Ceph monitors (e.g. `192.168.100.1:6789,192.168.100.2:6789,192.168.100.3:6789`)
`monValueFromSecret` | one of `monitors`, `clusterID` or `monValueFromSecret` must be set | a string pointing to the key in the credential secret whose value is the mon. This is used for the case when the monitors' IPs or hostnames are changed; the secret can be updated to pick up the new monitors.
`clusterID` | one of `monitors`, `clusterID` or `monValueFromSecret` must be set | String representing a Ceph cluster, must be unique across all Ceph clusters in use for provisioning, cannot be greater than 36 bytes in length, and should remain immutable for the lifetime of the Ceph cluster in use
`clusterID` | yes | String representing a Ceph cluster, must be unique across all Ceph clusters in use for provisioning, cannot be greater than 36 bytes in length, and should remain immutable for the lifetime of the Ceph cluster in use
s/String/string
Almost all first letters of the description column in the deploy-rbd.md doc are capitalized, as seen here. Are you suggesting the first word "String" be in lower case, as in "string"?
specified in `examples/rbd/template-ceph-cluster-ID-secret.yaml` needs to be created, with the secret name matching the string value provided as the `clusterID`.

## Deployment with Kubernetes

Requires Kubernetes 1.11
Maybe 'Requires Kubernetes >= v1.11'?
This needs to be kube >= 1.13.
This is not updated as part of this patch; I am testing against v1.13 and also do not have any specific Kube-version-dependent changes that would call for a different version. Do you want me to change this as part of this PR, or should it be a separate edit?
Agreed. This can be a separate patch, or it can be updated as part of this patch.
func (cs *ControllerServer) validateVolumeReq(req *csi.CreateVolumeRequest) error {
	if req.VolumeCapabilities == nil {
		return status.Error(codes.InvalidArgument, "Volume Capabilities cannot be empty")
	}
	options := req.GetParameters()
	if value, ok := options["clusterID"]; !ok || len(value) == 0 {
		return status.Error(codes.InvalidArgument, "Missing or empty cluster ID to provision volume from")
	}
s/Missing/missing
I thought we were discussing this here. IOW, do you want me to change the first letters of the strings that I added in this patch from capitalized to lower case, OR are we dealing with this in a separate patch, as the code base needs a revision on the same? Please let me know so that I can handle it appropriately.
	err       error
}

func (e ErrImageNotFound) Error() string {
Hmm.. this does not look like a common pattern for describing errors, with a method on top.
Can you point to a common pattern to adopt? I do not know of any, hence asking.
@ShyamsundarR can we move this file outside of rbd, so that we can use the same for cephfs also? Is this possible?
Some errors move out to util when they are shared by CephFS (in the refactor patch ShyamsundarR@c987e8b); some stay (e.g., ErrImageNotFound) as they are RBD-specific. So in this PR I cannot move it to a common file. BTW, there is already another common file in util with other errors.
@Madhu-1 pointed me to this source for error definitions, https://golang.org/src/errors/errors.go, and as per that, the way the errors are defined looks right. @humblec can you clarify if this looks right?
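For context, here is a self-contained sketch of the pattern under discussion: a typed error wrapping an underlying cause. Field names beyond those visible in the quoted fragment are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrImageNotFound mirrors the quoted fragment: a concrete type satisfying
// the error interface by delegating to a wrapped cause.
type ErrImageNotFound struct {
	imageName string
	err       error
}

func (e ErrImageNotFound) Error() string {
	return e.err.Error()
}

func main() {
	var err error = ErrImageNotFound{
		imageName: "csi-vol-example",
		err:       errors.New("image not found"),
	}

	// Callers segregate this class of error with a type assertion, in line
	// with the errors-are-values model of golang.org/src/errors/errors.go.
	if _, ok := err.(ErrImageNotFound); ok {
		fmt.Println("not found:", err)
	}
}
```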
Existing config maps are now replaced with rados omaps that help store information regarding the requested volume names and the rbd image names backing the same. Further, to detect the cluster, pool, and image a volume ID refers to, changes to the volume ID encoding have been made as per the provided design specification in the stateless ceph-csi proposal.

Additional changes and updates,
- Updated documentation
- Updated manifests
- Updated Helm chart
- Addressed a few csi-test failures

Signed-off-by: ShyamsundarR <srangana@redhat.com>
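Since the commit message leans on the volume ID encoding, here is an illustrative sketch of the idea: packing the cluster, pool, and image uuid into a single ID so the driver can locate an image from the ID alone. The field layout below is an assumption for illustration; the authoritative layout is the one in the stateless ceph-csi design specification.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// composeVolID packs everything needed to find the image into the ID itself,
// which is what removes the need for a config-map lookup on every RPC.
func composeVolID(clusterID string, poolID int64, imageUUID string) string {
	buf := make([]byte, 2+len(clusterID)+8)
	binary.BigEndian.PutUint16(buf[0:2], uint16(len(clusterID)))
	copy(buf[2:2+len(clusterID)], clusterID)
	binary.BigEndian.PutUint64(buf[2+len(clusterID):], uint64(poolID))
	return hex.EncodeToString(buf) + "-" + imageUUID
}

func main() {
	fmt.Println(composeVolID("ceph-cluster-1", 2, "00000000-1111-2222-3333-444444444444"))
}
```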
It's a long-pending PR; we can take this in. If there are any minor comments, they can be addressed in the next PR.
@humblec PTAL.
Is there a migration path defined yet?
I want to address this in a few ways, hence I opened issue #378, which talks about the same, labeled as part of release 1.1.
@ShyamsundarR To summarize my thoughts: we need to do some more code improvement here, and I would like to sort out the locking stuff in another PR; I am making a design for the same. However, these can be follow-up PRs, so let's unblock this for now. I am fine with addressing the rest in subsequent work. In short: approved.