Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdserver: apply() sets consistIndex for any entry type #7856

Merged
merged 2 commits into from
May 3, 2017

Conversation

fanminshi
Copy link
Member

previously, apply() doesn't set consistIndex for EntryConfChange type.
this causes a misalignment between consistIndex and applied index
where EntryConfChange entry results setting applied index but not consistIndex.

suppose that addMember() is called and leader reflects that change.
1. applied index and consistIndex is now misaligned.
2. a new follower node joined.
3. leader sends the snapshot to follower
where the applied index is the snapshot metadata index.
4. follower node saves the snapshot and database(includes consistIndex) from leader.
5. restarting follower loads snapshot and database.
6. follower checks snapshot metadata index(same as applied index) and database consistIndex,
finds them don't match, and then panic.

FIXES #7834

@gyuho
Copy link
Contributor

gyuho commented May 2, 2017

bunch of CI failures. Is your git branch out-of-date?

@fanminshi
Copy link
Member Author

@gyuho my branch is synced today. let me see what those failures are.

@@ -1292,8 +1294,6 @@ func (s *EtcdServer) apply(es []raftpb.Entry, confState *raftpb.ConfState) (appl
func (s *EtcdServer) applyEntryNormal(e *raftpb.Entry) {
shouldApplyV3 := false
if e.Index > s.consistIndex.ConsistentIndex() {
// set the consistent index of current executing entry
s.consistIndex.setConsistentIndex(e.Index)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently applyEntryNormal sets the consistent index before it begins modifying store, now the consistent index is set after completing the modification and updating apply index. That will admit cases where applied index > consistent index, meaning apply is ahead of raft, which should never happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is consistent index used in the store and v3 applier? Initially, I thought consistent index is used to keep track of latest applied entry, and it is used to guard v3 applier so that we don't re-apply applied entries.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ConsistentIndex tracks the raft index, the AppliedIndex tracks the last index that has been completely applied to the store. The applied index is necessarily <= the raft index since raft is where the data for the apply is coming from.

@@ -1284,6 +1284,8 @@ func (s *EtcdServer) apply(es []raftpb.Entry, confState *raftpb.ConfState) (appl
atomic.StoreUint64(&s.r.term, e.Term)
appliedt = e.Term
appliedi = e.Index
// set the consistent index of current executing entry
s.consistIndex.setConsistentIndex(appliedi)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this probably belongs at the beginning of applyConfChange, similar to the setConsistentIndex/setAppliedIndex path that's already in applyEntryNormal

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. will change.

@fanminshi
Copy link
Member Author

all fixed. PTAL

}

// create EntryConfChange entry
now := time.Now()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now is not needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh nvm. it's used in NewMember

// set the consistent index of current executing entry
if e.Index > s.consistIndex.ConsistentIndex() {
s.consistIndex.setConsistentIndex(e.Index)
}
var cc raftpb.ConfChange
pbutil.MustUnmarshal(&cc, e.Data)
removedSelf, err := s.applyConfChange(cc, confState)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s.setAppliedIndex(e.Index)?

previously, apply() doesn't set consistIndex for EntryConfChange type.
this causes a misalignment between consistIndex and applied index
where EntryConfChange entry results setting applied index but not consistIndex.

suppose that addMember() is called and leader reflects that change.
1. applied index and consistIndex is now misaligned.
2. a new follower node joined.
3. leader sends the snapshot to follower
	where the applied index is the snapshot metadata index.
4. follower node saves the snapshot and database(includes consistIndex) from leader.
5. restarting follower loads snapshot and database.
6. follower checks snapshot metadata index(same as applied index) and database consistIndex,
	finds them don't match, and then panic.

FIXES etcd-io#7834
@fanminshi fanminshi force-pushed the fix_consistent_index_update branch 2 times, most recently from 2495eee to 89d7bb9 Compare May 2, 2017 23:35
@fanminshi
Copy link
Member Author

all fixed. PTAL

@fanminshi fanminshi force-pushed the fix_consistent_index_update branch from 89d7bb9 to e33b10a Compare May 2, 2017 23:51
@fanminshi
Copy link
Member Author

merging when green.

@codecov-io
Copy link

codecov-io commented May 3, 2017

Codecov Report

Merging #7856 into master will increase coverage by 0.29%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7856      +/-   ##
==========================================
+ Coverage   75.81%   76.11%   +0.29%     
==========================================
  Files         332      332              
  Lines       26183    26186       +3     
==========================================
+ Hits        19851    19931      +80     
+ Misses       4909     4842      -67     
+ Partials     1423     1413      -10
Impacted Files Coverage Δ
etcdserver/server.go 80.59% <100%> (+0.06%) ⬆️
pkg/testutil/recorder.go 66.66% <0%> (-3.71%) ⬇️
etcdserver/api/v3rpc/lease.go 96.42% <0%> (-3.58%) ⬇️
etcdserver/api/v3election/election.go 64.7% <0%> (-2.95%) ⬇️
clientv3/concurrency/election.go 80% <0%> (-2.41%) ⬇️
clientv3/watch.go 95.46% <0%> (-0.51%) ⬇️
etcdserver/v3_server.go 70.3% <0%> (+0.47%) ⬆️
pkg/schedule/schedule.go 83.09% <0%> (+1.4%) ⬆️
rafthttp/peer.go 91.6% <0%> (+1.52%) ⬆️
etcdserver/api/v3rpc/watch.go 95.33% <0%> (+2.07%) ⬆️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 72d2adc...e33b10a. Read the comment docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants