avoid double creation of same volume, serialize with mutex #106
Conversation
The diff may look big but the change itself is small; the diff gets confused by indentation changes. I use indentation changes only to reuse the return of the OK-response part, as we have to respond with OK in such a case. |
Force-pushed from dffad7c to 1b72c0c
@olev, the complete (real) fix to this issue would be to avoid repeated calls to the node controller, i.e., from the master controller we are blindly passing all attach/publish calls to the node controller without checking whether the volume is already attached. Can you please check if that is also part of this change? |
The fix must include protection against concurrent calls. As it stands now, a second call that arrives before the first one has updated cs.publishVolumeInfo would still create a new volume, wouldn't it? The simplest solution is to lock a single mutex at the beginning of each call. |
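A minimal sketch of that single-mutex suggestion, assuming hypothetical stand-in types; the request/response structs and the server layout are illustrative, not the actual pmem-csi code (only the publishVolumeInfo name comes from the discussion above):

```go
// Sketch only: lock one mutex at the beginning of each state-changing call,
// so a second request for the same volume cannot race past the check before
// publishVolumeInfo has been updated. All types here are placeholders.
package sketch

import "sync"

type publishRequest struct{ VolumeID string }
type publishResponse struct{ PublishInfo map[string]string }

type controllerServer struct {
	mutex             sync.Mutex                 // serializes the calls below
	publishVolumeInfo map[string]publishResponse // volumes already published
}

func (cs *controllerServer) ControllerPublishVolume(req *publishRequest) (*publishResponse, error) {
	cs.mutex.Lock()
	defer cs.mutex.Unlock()

	if resp, ok := cs.publishVolumeInfo[req.VolumeID]; ok {
		// Repeated request: do not create another volume, return the
		// previously recorded result instead.
		return &resp, nil
	}

	// First request: do the real work here (elided) and record the result
	// while still holding the lock.
	resp := publishResponse{PublishInfo: map[string]string{"volume": req.VolumeID}}
	if cs.publishVolumeInfo == nil {
		cs.publishVolumeInfo = map[string]publishResponse{}
	}
	cs.publishVolumeInfo[req.VolumeID] = resp
	return &resp, nil
}
```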
ok, I will add mutex in Node part. |
Olev Kartau <notifications@github.com> writes:
ok, I will add mutex in Node part.
But what about the Controller part of the same code, i.e. just above: if cs.mode == Controller,
where we initiate that RPC call to the node.
Is the controller part also able to run multiple instances in parallel?
Or is a simple check there enough for the already-Attached state?
Don't assume that any data can be accessed safely unless explicitly
documented as threadsafe. That means you need to check all gRPC calls
and if they use data that might be modified in parallel, add a mutex for
that data.
|
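A short sketch of the "add a mutex for that data" advice above; the registry type and its payload are invented for illustration and stand in for whatever shared state the gRPC handlers touch:

```go
// Sketch only: any data that concurrently served gRPC handlers read and
// write gets its own lock.
package sketch

import "sync"

type registry struct {
	mu      sync.RWMutex
	volumes map[string]string // VolumeID -> device path (example payload)
}

func (r *registry) get(volumeID string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	dev, ok := r.volumes[volumeID]
	return dev, ok
}

func (r *registry) set(volumeID, device string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.volumes == nil {
		r.volumes = map[string]string{}
	}
	r.volumes[volumeID] = device
}
```

Note that a data-level lock like this only makes individual reads and writes safe; a check-then-create sequence that spans a whole call still needs the wider lock shown earlier, otherwise two calls can both see "not published" and both proceed.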
Force-pushed from 82cbc5c to 4b5f4b0
I added a check for a double publish request on the Controller side, as that would catch that condition already before the RPC to the Node, if the Controller had reached the point of setting the Volume state to Attached. The open question here is what to return on the Controller; nil is likely not that good? But as the real return structure is not coming from the Node, should we make up a fake one locally and return it? |
Olev Kartau <notifications@github.com> writes:
I added check for double publish request on Controller side, as that
would catch that condition already before RPC to Node, if Controller
had reached the point setting the Volume state to Attached. The open
question here is, what to return on Controller, the _nil_ is likely
not that good? But real return structure would come from Node, should
we make up a fake one locally and return it?
I haven't looked in code, but the idea behind idempotency is that a
function returns exactly the same thing as after the initial call.
|
Force-pushed from 4b5f4b0 to 11a434f
I gave up on the plan of checking for repeated creation on the Controller side, as it has to return a valid structure and it does not feel right to synthesize a response structure from scratch. Let's pass the request to the Node, which will respond with the real response. |
Tested in unified mode with the scripts in util/, and also in a cluster with the system under test/. |
Which version of dep did you use? Please try with 0.5, the vendor commit should be smaller then. |
So you don't feel that the below code on the Controller side avoids some level of repeated calls to the Node:
|
I agree this code may avoid some repeated RPC calls, but my thinking is that the overhead of one RPC call is not so significant that it justifies adding a 2nd instance of creating PublishVolumeResponse. Creating the response on the Controller feels like a layering violation, as creating the response is normally done on the Node. My idea thus is to let all responses be created in one place in the code. |
Another point with the mutex code: is it OK to have it in the controllerserver.go file as in the current commit, or should we create a separate file as in the oim example I used? The current approach kind of limits mutex use to one file only, right? |
When I was writing this code, I used Controller{Publish,Unpublish}Volume at the Node just for convenience, but more suitable calls would be {Create,Delete}Volume (at some point I think we should move to using them). So I never felt like PublishVolumeResponse was the Node's job. |
I would not try to avoid a second gRPC call. If it is the node which does the work and creates the response, then it is that code which should detect an idempotent call and return the same response as before. Phrasing it differently, all gRPC calls should be idempotent, even if they are only used internally. Then callers can always rely on that instead of having to implement additional logic. |
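A hedged sketch of what such an idempotent Node-side handler could look like; the names (nodeServer, published, the device path) are illustrative placeholders, not the actual implementation:

```go
// Sketch only: the node remembers the response it produced for each volume
// and returns exactly the same response when the request is repeated.
package sketch

import "sync"

type publishRequest struct{ VolumeID string }
type publishResponse struct{ DevicePath string }

type nodeServer struct {
	mu        sync.Mutex
	published map[string]*publishResponse // responses already handed out
}

func (ns *nodeServer) ControllerPublishVolume(req *publishRequest) (*publishResponse, error) {
	ns.mu.Lock()
	defer ns.mu.Unlock()

	if resp, ok := ns.published[req.VolumeID]; ok {
		// Idempotent case: the volume already exists, so skip creation and
		// repeat the earlier answer.
		return resp, nil
	}

	// First call: create the device and filesystem here (elided), then
	// remember the result for any repeated request.
	resp := &publishResponse{DevicePath: "/dev/pmem-" + req.VolumeID}
	if ns.published == nil {
		ns.published = map[string]*publishResponse{}
	}
	ns.published[req.VolumeID] = resp
	return resp, nil
}
```

With that in place a caller never has to guess whether a call already happened: repeating it is always safe and always yields the same answer.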
Olev Kartau <notifications@github.com> writes:
another point with the mutex code is: is it OK to have it in
controllerserver.go file as in current commit, or should we create a
separate file as was in oim example I used. Current approach kind of
limits mutex use in one file only, right.
If it's only used in a single file, then there's no need to declare it
separately. In OIM I decided to use a separate file because the
volumeNameMutex is used by both node and controller server.
|
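For illustration, the layout being discussed could look roughly like this; the file name is hypothetical and a plain sync.Mutex stands in for whatever lock type is actually used in OIM:

```go
// mutex.go (hypothetical file): a lock needed by both controllerserver.go
// and nodeserver.go lives in its own file of the shared package, so both
// can use it without either file "owning" it.
package sketch

import "sync"

// volumeNameMutex serializes volume operations for both servers.
var volumeNameMutex sync.Mutex
```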
not very helpful. |
I am not against making Node calls idempotent, but my point is: why can't the caller (Controller) detect duplicates when it already has enough information to take the decision? |
Amarnath Valluri <notifications@github.com> writes:
I am not against making Node calls idempotent, but my point is why
can't the caller(Controller) detect duplicates when its already having
enough information to take decision.
Can it be 100% sure that this information is up-to-date? If both
controller and node have the same information, how is it kept in sync?
|
Olev Kartau <notifications@github.com> writes:
By package version it is 0.3.2, as provided on Ubuntu 18.04; it has not been updated, and likely will not be, so an 8-month-old mainstream distro is hopelessly out of date for development.
But I guess we have to adapt to that in the Go world. It seems the whole dep
tool is about to be overtaken by the next wave.
I chatted with the main dep author and it is not certain that dep will
disappear. I personally find the arguments convincing (see
https://sdboyer.io/vgo/failure-modes/) that a more elaborate algorithm
for dependency resolution than the simplistic one chosen by core Go is
needed.
Does dep BTW record its own version somewhere in the created structure? I
haven't found such yet.
I don't think it does. Compatibility between versions seems to be good
enough that this hasn't been necessary, even if the result then is
different.
If different developers use whatever tools/versions they happen to
use, what about a controlled and managed development toolchain?
They are already allowed to use different Go versions, right? My point
is, there is no perfect control over what developers use, just
recommendations.
If it has an effect on what gets checked into the repo, then this has to
be caught by code review or an automated CI check - but we don't have a
CI at the moment.
This leads to the question: should we document the dep version requirement in
the README?
Doesn't hurt, but IMHO isn't that important because not many people need
that information.
|
Force-pushed from 11a434f to a4107d7
I replaced the 3rd commit so that "dep ensure" was applied using dep 0.5.0, and indeed the change size drops drastically (to a few thousand lines from over a million). |
Force-pushed from 1699ab5 to dca2872
I added more mutex protection in the 2nd commit to also cover state-changing operations on the Node side, like mkfs, mount, umount. |
Force-pushed from dca2872 to 090b48d
It should be in sync, as all changes should go via Controller -> Node, and there is no way to sync back a state change, if any, from the Node side. Whenever the Controller gets a PublishVolume call it sends that request to the Node, and on success updates its own state. |
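A rough sketch of that Controller -> Node flow; the nodeClient interface and the struct names are hypothetical stand-ins for the gRPC client and servers, and the point is only that the controller records state after the node succeeded, never before:

```go
// Sketch only: the controller forwards the publish request to the node and
// updates its own bookkeeping only on success, so the controller's view can
// lag behind the node's but never gets ahead of it.
package sketch

import "context"

type publishRequest struct{ VolumeID string }
type publishResponse struct{ DevicePath string }

// nodeClient is a stand-in for the gRPC client used to reach the node.
type nodeClient interface {
	ControllerPublishVolume(ctx context.Context, req *publishRequest) (*publishResponse, error)
}

type masterController struct {
	node              nodeClient
	publishVolumeInfo map[string]*publishResponse
}

func (mc *masterController) ControllerPublishVolume(ctx context.Context, req *publishRequest) (*publishResponse, error) {
	resp, err := mc.node.ControllerPublishVolume(ctx, req)
	if err != nil {
		return nil, err
	}
	if mc.publishVolumeInfo == nil {
		mc.publishVolumeInfo = map[string]*publishResponse{}
	}
	mc.publishVolumeInfo[req.VolumeID] = resp
	return resp, nil
}
```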
Force-pushed from db4ce7d to 7e49d03
Sometimes, the CSI master sends a 2nd publish request just after the first one, either because there was a timeout or for some other reason. The Node should withstand repeated requests without side effects. In ControllerPublishVolume in the Node part, if we receive a second request with the same volumeID, we attempt to create the next phys. volume because we do not check whether we have already created and published one. There is a local registry where we store the published ID, so let's check there and avoid the double operation.
As gRPC requests may get served in arbitrary order and also in parallel, and the master can send repeated requests, we use a mutex with key=VolumeID to protect operations that change state, like adding/deleting volumes/devices, mkfs, mount.
Importing the mutex forced a run of "dep ensure", which created this set of changes. dep v0.5.0 was used.
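A sketch of the per-volume locking described in the second commit message above; the volumeLocks type and the createDevice/mkfs/mount helper parameters are hypothetical, standing in for the real state-changing steps:

```go
// Sketch only: one mutex per VolumeID, held across the whole sequence of
// state-changing steps so that a repeated or parallel request for the same
// volume waits instead of racing.
package sketch

import "sync"

// volumeLocks hands out one lock per volume ID.
type volumeLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func (v *volumeLocks) get(volumeID string) *sync.Mutex {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.locks == nil {
		v.locks = map[string]*sync.Mutex{}
	}
	l, ok := v.locks[volumeID]
	if !ok {
		l = &sync.Mutex{}
		v.locks[volumeID] = l
	}
	return l
}

// publish holds the per-volume lock around the create/mkfs/mount sequence.
func publish(locks *volumeLocks, volumeID string,
	createDevice, mkfs, mount func(string) error) error {
	l := locks.get(volumeID)
	l.Lock()
	defer l.Unlock()

	for _, step := range []func(string) error{createDevice, mkfs, mount} {
		if err := step(volumeID); err != nil {
			return err
		}
	}
	return nil
}
```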
Force-pushed from 7e49d03 to 6b2b9d1
So what is the decision now: should we merge this as-is, or add another check on the Controller side? To get this moving, I propose that we merge it now, fixing the actual issue, and treat the need for an additional check in a separate thread. |
I am OK to merge this, let's deal with the other issue later. |