Handling concurrent updates to configmap and IP reclamation #3724
…le weave instances Signed-off-by: mmerrill3 <michael.merrill@vonage.com>
bab158c to 5378157
I've pushed your branch to the
Thanks for the PR.
@bboreham, I agree that the sleep is not deterministic. The solution you have come up with makes sense, but I have one point. The determination that a peer is to be removed was already made within the reclaimRemovedPeers() function, which is called after the sleep. At that time, cml.getPeerList() is called, and the peer is in the annotation that holds the existing peer list. So chances are that if we call cml.getPeerList() again shortly after, the annotation will still be there.
…y multiple weave instances" This reverts commit 5378157.
Signed-off-by: mmerrill3 <michael.merrill@vonage.com>
Hi @bboreham,
Yes, that's what I meant - re-fetch from the api-server.
Here's what I think the failure at #3722 looks like - time goes from top to bottom:
The reason why I think it is more bullet-proof is we get this instead:
Can you show in a similar notation the sequence(s) where we still get an error?
Some nits
storedPeerList.remove(peerName)
if err := cml.UpdatePeerList(*storedPeerList); err != nil {
return err
if storedPeerList.contains(peerName) {
Can you think of a sequence where this is false?
If it's false, there's nothing to do from an annotation perspective, so maybe just log it and then proceed to remove the locking annotation.
But the remove will be a no-op if there's nothing to remove from the list, so the check for contains() is unnecessary unless we want to catch an unanticipated edge case where the peer really is not in the list.
I'd prefer to keep this, but log if the peer is not in the list so we have evidence that this edge case occurred.
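For illustration only, here is a minimal sketch of the "re-fetch, check contains(), log if absent" flow discussed above. It is not the code from this PR: peerList, store, memStore, and removePeer are hypothetical stand-ins, the in-memory store replaces the real ConfigMap annotation, and removal of the locking annotation is left to the caller.

```go
package main

import (
	"fmt"
	"log"
)

// peerList is a hypothetical stand-in for the peer list kept in the
// ConfigMap annotation; the real type in the PR may differ.
type peerList struct {
	peers []string
}

// contains reports whether name is present in the list.
func (pl *peerList) contains(name string) bool {
	for _, p := range pl.peers {
		if p == name {
			return true
		}
	}
	return false
}

// remove deletes every occurrence of name; it is a no-op if name is absent.
func (pl *peerList) remove(name string) {
	kept := pl.peers[:0]
	for _, p := range pl.peers {
		if p != name {
			kept = append(kept, p)
		}
	}
	pl.peers = kept
}

// store is a hypothetical interface over whatever holds the peer list.
type store interface {
	getPeerList() (*peerList, error)
	updatePeerList(pl peerList) error
}

// memStore is an in-memory fake used only to make the sketch runnable.
type memStore struct {
	current peerList
	updates int // number of writes, to show the absent case writes nothing
}

func (m *memStore) getPeerList() (*peerList, error) {
	return &peerList{peers: append([]string(nil), m.current.peers...)}, nil
}

func (m *memStore) updatePeerList(pl peerList) error {
	m.current = peerList{peers: append([]string(nil), pl.peers...)}
	m.updates++
	return nil
}

// removePeer re-fetches the stored list, removes peerName if it is present,
// and logs (without failing) if it is not. It returns whether the peer was
// found, so the caller can still go on to clear the lock either way.
func removePeer(s store, peerName string) (bool, error) {
	stored, err := s.getPeerList()
	if err != nil {
		return false, err
	}
	if !stored.contains(peerName) {
		log.Printf("peer %q not in stored peer list; nothing to remove", peerName)
		return false, nil
	}
	stored.remove(peerName)
	if err := s.updatePeerList(*stored); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	s := &memStore{current: peerList{peers: []string{"peer-a", "peer-b"}}}
	found, err := removePeer(s, "peer-b")
	fmt.Println(found, err, s.current.peers)
	// A second attempt finds nothing, logs the edge case, and writes nothing.
	found, err = removePeer(s, "peer-b")
	fmt.Println(found, err, s.current.peers)
}
```

Keeping the contains() check costs one extra branch, and the log line gives evidence if the unexpected case ever happens.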
Signed-off-by: mmerrill3 <michael.merrill@vonage.com>
Just pushed changes based on your feedback, @bboreham; thanks for that. I agree with your diagram of what happened.
Looks good, thanks!
Handling concurrent updates to configmap and IP reclamation by multiple weave instances to fix #3722
Signed-off-by: mmerrill3 <michael.merrill@vonage.com>