This repository has been archived by the owner on Oct 22, 2024. It is now read-only.

avoid double creation of same volume, serialize with mutex #106

Merged
merged 3 commits into intel:devel from use-published-registry on Jan 7, 2019

Conversation

okartau
Contributor

@okartau okartau commented Dec 12, 2018

Sometimes the CSI master sends a second publish request right after the first one, either because of a timeout or for some other reason. The node driver should withstand repeated requests without side effects. In ControllerPublishVolume on the Node side, if we receive a second request with the same volumeID, we attempt to create another physical volume, because we do not check whether we have already created and published one. There is a local registry where we store the published ID, so let's check there and avoid the double operation.
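For illustration, a minimal Go sketch of the idempotency check described above, using a hypothetical publish helper and a publishVolumeInfo map in the role of the local registry; the real names in controllerserver.go may differ:

    // Hypothetical sketch of the idempotency check; not the actual driver code.
    package main

    import "fmt"

    type nodeServer struct {
        // publishVolumeInfo plays the role of the "local registry" of
        // already-published volumes: volumeID -> device path.
        publishVolumeInfo map[string]string
    }

    func (ns *nodeServer) publish(volumeID string) string {
        if dev, ok := ns.publishVolumeInfo[volumeID]; ok {
            // Repeated request for the same volumeID: answer OK with the
            // existing device instead of creating another physical volume.
            return dev
        }
        dev := "/dev/pmem-" + volumeID // placeholder for the real creation work
        ns.publishVolumeInfo[volumeID] = dev
        return dev
    }

    func main() {
        ns := &nodeServer{publishVolumeInfo: map[string]string{}}
        first := ns.publish("vol-1")
        second := ns.publish("vol-1") // no second creation
        fmt.Println(first == second)  // prints: true
    }

As discussed further down in the thread, the lookup alone does not protect against two concurrent requests; that is what the mutex added later in this PR is for.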

@okartau
Contributor Author

okartau commented Dec 12, 2018

The diff may look big, but the change itself is small; the diff gets confused by indentation changes. I use the indentation changes only to reuse the OK-response return path, as we have to respond with OK in that case.

@okartau okartau force-pushed the use-published-registry branch from dffad7c to 1b72c0c Compare December 12, 2018 12:26
@avalluri
Contributor

@olev, the complete (real) fix for this issue would be to avoid repeated calls to the node controller. That is, from the master controller we are blindly passing all attach/publish calls to the node controller without checking cs.pmemVolumes[req.VolumeId].Status to see whether it is already 'Attached' or not.

Can you please check whether that is also part of this change?

@pohly
Contributor

pohly commented Dec 12, 2018

The fix must include protection against concurrent calls. As it stands now, a second call that arrives before the first one has updated cs.publishVolumeInfo would still create a new volume, wouldn't it?

The simplest solution is to lock a single mutex at the beginning of each call.
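For illustration, a minimal sketch of that suggestion with hypothetical controllerServer fields; the point is only that every state-changing handler takes the same mutex before doing anything else:

    // Hypothetical sketch of serializing handlers with one mutex; not the actual PR code.
    package main

    import (
        "fmt"
        "sync"
    )

    type controllerServer struct {
        mutex             sync.Mutex
        publishVolumeInfo map[string]string
    }

    // The lock is taken before anything else, so a second call for the same
    // volume cannot run until the first one has updated publishVolumeInfo.
    func (cs *controllerServer) publishVolume(volumeID string) string {
        cs.mutex.Lock()
        defer cs.mutex.Unlock()

        if dev, ok := cs.publishVolumeInfo[volumeID]; ok {
            return dev // already published: idempotent OK response
        }
        dev := "/dev/pmem-" + volumeID // placeholder for the real creation work
        cs.publishVolumeInfo[volumeID] = dev
        return dev
    }

    func main() {
        cs := &controllerServer{publishVolumeInfo: map[string]string{}}
        var wg sync.WaitGroup
        for i := 0; i < 2; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                fmt.Println(cs.publishVolume("vol-1")) // both goroutines get the same device
            }()
        }
        wg.Wait()
    }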

@okartau
Contributor Author

okartau commented Dec 12, 2018

OK, I will add a mutex in the Node part.
But what about the Controller part of the same code, i.e. just above, if cs.mode == Controller, where we initiate that RPC call to the Node?
Is the Controller part also able to run multiple instances in parallel?
Or is a simple check there for the already-Attached state enough?

@pohly
Contributor

pohly commented Dec 12, 2018 via email

@okartau okartau force-pushed the use-published-registry branch 2 times, most recently from 82cbc5c to 4b5f4b0 Compare December 19, 2018 13:07
@okartau
Contributor Author

okartau commented Dec 19, 2018

I added a check for a double publish request on the Controller side, as that would catch the condition already before the RPC to the Node, if the Controller had reached the point of setting the volume state to Attached. The open question here is what to return on the Controller; nil is likely not that good. But as the real return structure is not coming from the Node, should we make up a fake one locally and return it?
The use of a mutex is still to be added in more commits to come.

@pohly
Contributor

pohly commented Dec 20, 2018 via email

@okartau okartau force-pushed the use-published-registry branch from 4b5f4b0 to 11a434f Compare December 31, 2018 10:25
@okartau
Contributor Author

okartau commented Dec 31, 2018

I gave up on the plan of checking for repeated creation on the Controller side, as it has to return a valid structure and it does not feel right to synthesize a response structure from scratch. Let's pass the request to the Node, which will respond with the real response.
Also, I added the use of a mutex to protect the Add and Delete operations.
The added mutex import forced a run of "dep ensure", which introduced many changes on the vendor/ side; there is a separate commit with those.

@okartau
Contributor Author

okartau commented Dec 31, 2018

Tested in unified mode with the scripts in util/, and also in a cluster with the system under test/.

@pohly
Contributor

pohly commented Dec 31, 2018

Which version of dep did you use? Please try with 0.5; the vendor commit should be smaller then.

@avalluri
Contributor

avalluri commented Jan 2, 2019

I gave up on the plan of checking for repeated creation on the Controller side, as it has to return a valid structure and it does not feel right to synthesize a response structure from scratch. Let's pass the request to the Node, which will respond with the real response.

So you don't feel that the code below on the Controller side avoids some of the repeated calls to the Node:

                if !ok {
                        return nil, status.Error(codes.NotFound, fmt.Sprintf("No volume with id '%s' found", req.VolumeId))
                }
+               if vol.Status == Attached && vol.NodeID == req.NodeId  {
+                       return &csi.ControllerPublishVolumeResponse{
+                               PublishInfo: map[string]string{
+                                       "name": vol.Name,
+                                       "size": strconv.FormatUint(vol.Size, 10),
+                               },
+                       }, nil
+               }
                node, err := cs.rs.GetNodeController(req.NodeId)
                if err != nil {
                        return nil, status.Error(codes.NotFound, err.Error())


@okartau
Contributor Author

okartau commented Jan 2, 2019

I agree this code may avoid some repeated RPC calls, but my thinking is that the overhead of one RPC call is not significant enough to justify adding a second place that creates a PublishVolumeResponse. Creating the response on the Controller feels like a layering violation, as the response is normally created on the Node. My idea thus is to let all responses be created in one place in the code.
But this is just one proposal.
If we reach agreement that such an optimization/check is useful, I'm okay with adding such code.
@pohly, do you want to give your vote?

@okartau
Contributor Author

okartau commented Jan 2, 2019

Another point with the mutex code: is it OK to have it in the controllerserver.go file as in the current commit, or should we create a separate file as in the oim example I used? The current approach kind of limits mutex use to one file only, right?

@avalluri
Contributor

avalluri commented Jan 2, 2019

I agree this code may avoid some repeated RPC calls, but my thinking is that the overhead of one RPC call is not significant enough to justify adding a second place that creates a PublishVolumeResponse. Creating the response on the Controller feels like a layering violation, as the response is normally created on the Node.

When I was writing this code, I used Controller{Publish,Unpublish}Volume at the Node just for convenience, but more suitable calls would be {Create,Delete}Volume (at some point I think we should move to using those). So I never felt that PublishVolumeResponse was the Node's job.

@pohly
Contributor

pohly commented Jan 2, 2019

I would not try to avoid a second gRPC call. If it is the node which does the work and creates the response, then it is that code which should detect an idempotent call and return the same response as before.

Phrasing it differently, all gRPC calls should be idempotent, even if they are only used internally. Then callers can always rely on that instead of having to implement additional logic.

@pohly
Contributor

pohly commented Jan 2, 2019 via email

@okartau
Contributor Author

okartau commented Jan 2, 2019

Which version of dep did you use?

dep version:
dep:
 version     : devel

Not very helpful. By package version it is 0.3.2 as provided on Ubuntu 18.04; it has not been updated and likely will not be, so an 8-month-old mainstream distro is hopelessly out of date for development. But I guess we have to adapt to that in the Go world. It seems the whole dep tool is about to be overtaken by the next wave.
Does dep, by the way, record its own version somewhere in the structure it creates? I haven't found it yet.
If different developers use whatever tools and versions they happen to have, what about a controlled and managed development toolchain?
That leads to the question: should we document the dep version requirement in the README?

@avalluri
Contributor

avalluri commented Jan 2, 2019

      I would not try to avoid a second gRPC call. If it is the node which does the work and creates the response, then it is that code which should detect an idempotent call and return the same response as before.

Phrasing it differently, all gRPC calls should be idempotent, even if they are only used internally. Then callers can always rely on that instead of having to implement additional logic.

I am not against making the Node calls idempotent, but my point is: why can't the caller (the Controller) detect duplicates when it already has enough information to make the decision?

@pohly
Contributor

pohly commented Jan 3, 2019 via email

@pohly
Contributor

pohly commented Jan 3, 2019 via email

@okartau okartau force-pushed the use-published-registry branch from 11a434f to a4107d7 Compare January 3, 2019 07:52
@okartau
Contributor Author

okartau commented Jan 3, 2019

I replaced the 3rd commit so that "dep ensure" was applied using dep 0.5.0, and indeed the change size drops drastically (to a few thousand lines from over a million).

@okartau okartau force-pushed the use-published-registry branch 2 times, most recently from 1699ab5 to dca2872 Compare January 3, 2019 09:00
@okartau okartau changed the title from "controllerserver.go: avoid double creation of same volume" to "avoid double creation of same volume, serialize with mutex" Jan 3, 2019
@okartau
Contributor Author

okartau commented Jan 3, 2019

I added more mutex protection in the 2nd commit to also cover state-changing operations on the Node side, like mkfs, mount, umount.

@okartau okartau force-pushed the use-published-registry branch from dca2872 to 090b48d Compare January 3, 2019 09:53
@avalluri
Contributor

avalluri commented Jan 3, 2019

Amarnath Valluri notifications@github.com writes:
I am not against making the Node calls idempotent, but my point is: why can't the caller (the Controller) detect duplicates when it already has enough information to make the decision?
Can it be 100% sure that this information is up-to-date? If both controller and node have the same information, how is it kept in sync?

It should be in sync, as all changes go via Controller -> Node, and there is no way to sync back a state change, if any, from the Node side.

Whenever the Controller gets a PublishVolume call it sends that request to the Node and, on success, updates its cache so that the volume state is 'Attached'; in the same way, on an UnpublishVolume call it updates the cache to 'Detached'. So I believe it should be up to date and can be trusted.

@okartau okartau force-pushed the use-published-registry branch 2 times, most recently from db4ce7d to 7e49d03 Compare January 3, 2019 11:32
Sometimes the CSI master sends a second publish request right after the first one, either because of a timeout or for some other reason. The Node should withstand repeated requests without side effects. In ControllerPublishVolume on the Node side, if we receive a second request with the same volumeID, we attempt to create another physical volume, because we do not check whether we have already created and published one. There is a local registry where we store the published ID, so let's check there and avoid the double operation.
As gRPC requests may get served in arbitrary order and also in parallel, and the master can send repeated requests, we use a mutex with key=VolumeID to protect operations that change state, like adding/deleting volumes/devices, mkfs, mount.
Importing the mutex forced a run of "dep ensure", which created this set of changes. dep v0.5.0 was used.
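For illustration, a minimal sketch of the per-VolumeID locking described in the commit message above, with a hypothetical volumeLocks helper; the actual implementation in the PR may be structured differently:

    // Hypothetical sketch of a mutex keyed by VolumeID; not the actual PR code.
    package main

    import (
        "fmt"
        "sync"
    )

    // volumeLocks hands out one mutex per volume ID, so operations on
    // different volumes can run in parallel while repeated or concurrent
    // requests for the same volume are serialized.
    type volumeLocks struct {
        mutex sync.Mutex
        locks map[string]*sync.Mutex
    }

    func newVolumeLocks() *volumeLocks {
        return &volumeLocks{locks: map[string]*sync.Mutex{}}
    }

    func (vl *volumeLocks) get(volumeID string) *sync.Mutex {
        vl.mutex.Lock()
        defer vl.mutex.Unlock()
        if m, ok := vl.locks[volumeID]; ok {
            return m
        }
        m := &sync.Mutex{}
        vl.locks[volumeID] = m
        return m
    }

    func main() {
        vl := newVolumeLocks()
        var wg sync.WaitGroup
        for i := 0; i < 2; i++ {
            wg.Add(1)
            go func(n int) {
                defer wg.Done()
                l := vl.get("vol-1")
                l.Lock()
                defer l.Unlock()
                // State-changing work (create device, mkfs, mount) would go here.
                fmt.Println("request", n, "holds the vol-1 lock")
            }(i)
        }
        wg.Wait()
    }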
@okartau okartau force-pushed the use-published-registry branch from 7e49d03 to 6b2b9d1 Compare January 7, 2019 09:02
@okartau
Contributor Author

okartau commented Jan 7, 2019

So what is the decision now: should we merge this as is, or add another check on the Controller side? To get this moving, I propose that we merge it now, fixing the actual issue, and treat the need for the additional check in a separate thread.

@avalluri
Contributor

avalluri commented Jan 7, 2019

I am OK to merge this; let's deal with the other issue later.

@okartau okartau merged commit 0189627 into intel:devel Jan 7, 2019
@okartau okartau deleted the use-published-registry branch January 25, 2019 08:31