
client: defensive against getting stale alloc updates #5906

Merged · 1 commit · Jul 2, 2019

Conversation

@notnoop (Contributor) commented Jun 29, 2019

When fetching node alloc assignments, be defensive against a stale read before
killing the local node's allocs.

The bug occurs when both the client and the servers are restarting: when the client
requests the allocations for its node, it may get stale data because the server
hasn't finished applying all of the restored Raft transactions to its state store.

Consequently, the client would kill and destroy the alloc locally, only to fetch it
again moments later once the server store is up to date.

The bug can be reproduced quite reliably with a single-node setup (configured with
persistence). I suspect it's too much of an edge case to occur in a production
cluster with multiple servers, but we may need to examine leader failover scenarios
more closely.

In this commit, we only remove and destroy allocs if the removal index is more
recent than the alloc index. This is a cheap resiliency check, and the same one we
already use for detecting alloc updates.

A more thorough fix would be to ensure that a Nomad server only serves RPC calls
once its state store is fully restored, or is up to date in leadership transition
cases.
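
For illustration only, here is a minimal Go sketch of the check described above. The names are placeholders rather than the actual Nomad client types: a locally known alloc is dropped only when the index of the response that omits it is newer than the index at which the client last saw that alloc.

```go
package main

import "fmt"

// staleSafeRemovals keeps only those locally known allocs whose absence
// from the server response can be trusted: the response index must be
// newer than the index at which the client last saw the alloc.
// All identifiers here are illustrative, not Nomad internals.
func staleSafeRemovals(local map[string]uint64, pulled map[string]bool, respIndex uint64) []string {
	var remove []string
	for allocID, allocIndex := range local {
		if pulled[allocID] {
			continue // still assigned to this node
		}
		if respIndex > allocIndex {
			remove = append(remove, allocID)
		}
		// Otherwise the response may predate the alloc (a stale read
		// during server restore), so keep the alloc and retry later.
	}
	return remove
}

func main() {
	local := map[string]uint64{"alloc-a": 120, "alloc-b": 80}
	pulled := map[string]bool{"alloc-b": true}
	// A response at index 100 is older than alloc-a (120): nothing is removed.
	fmt.Println(staleSafeRemovals(local, pulled, 100)) // []
	// A response at index 130 is newer: alloc-a's absence is a real removal.
	fmt.Println(staleSafeRemovals(local, pulled, 130)) // [alloc-a]
}
```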

@schmichael (Member) left a comment
Great find! Since we remove allocs based on their absence there's no ModifyIndex to check for freshness. This appears to bring alloc removal correctness in line with alloc updates.

@@ -1944,6 +1947,7 @@ OUTER:
filtered: filtered,
pulled: pulledAllocs,
migrateTokens: resp.MigrateTokens,
index: resp.Index,
Member commented on the diff above:

L1942 updates req.MinQueryIndex if and only if resp.Index is greater, so I wonder if there's some reason we should use req.MinQueryIndex here instead. I'm honestly not sure L1942 is reachable. Perhaps there's a timeout that could cause a response before resp.Index is greater than MinQueryIndex?

Not a blocker as I think at worst it's an edge case of an edge case that when hit will negate the correctness improvement of this PR. It can't make the behavior worse than before the PR AFAICT.
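
As a point of reference, the guarded update being discussed looks roughly like the sketch below (placeholder types, not Nomad's actual RPC structs): resp.Index only replaces req.MinQueryIndex when it is strictly greater, so a lagging response never rolls the blocking-query index back.

```go
package main

import "fmt"

// Placeholder request/response types for the blocking-query pattern.
type queryReq struct{ MinQueryIndex uint64 }
type queryResp struct{ Index uint64 }

// advanceMinQueryIndex mirrors the guarded update: the next blocking
// query only waits past resp.Index if the index actually moved forward.
func advanceMinQueryIndex(req *queryReq, resp queryResp) {
	if resp.Index > req.MinQueryIndex {
		req.MinQueryIndex = resp.Index
	}
}

func main() {
	req := &queryReq{MinQueryIndex: 200}
	advanceMinQueryIndex(req, queryResp{Index: 150}) // lagging response: unchanged
	fmt.Println(req.MinQueryIndex)                   // 200
	advanceMinQueryIndex(req, queryResp{Index: 230}) // newer response: advance
	fmt.Println(req.MinQueryIndex)                   // 230
}
```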

@notnoop (Contributor, Author) replied:

I don't think we should be using req.MinQueryIndex here. It's simpler to reason about reconciling local state against the pulled state (at index resp.Index) without worrying about the indirection or interference of req.MinQueryIndex (i.e. if resp.Index is earlier than req.MinQueryIndex, using req.MinQueryIndex risks us believing the server state is more recent than it actually is). We expect the reconciler to work even if resp.Index unexpectedly goes back in time.

As for req.MinQueryIndex, it seems that we are protecting against server state going back in time! That feels quite odd, and I wonder if it's just defensiveness or a case we actually hit at some point.

@github-actions bot commented Feb 7, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions bot locked as resolved and limited conversation to collaborators on Feb 7, 2023