index out of range panic on MemberRemove #5482

Closed
purpleidea opened this issue May 28, 2016 · 6 comments

@purpleidea
Contributor

I've been able to trigger the following panic:

panic: runtime error: index out of range

goroutine 63 [running]:
github.com/coreos/etcd/raft.(*raft).maybeCommit(0xc820230a90, 0xc8201df9e0)
    /home/james/code/mgmt/gopath/src/github.com/coreos/etcd/raft/raft.go:385 +0x2fd
github.com/coreos/etcd/raft.(*raft).removeNode(0xc820230a90, 0x8927110dc66458af)
    /home/james/code/mgmt/gopath/src/github.com/coreos/etcd/raft/raft.go:903 +0x5d
github.com/coreos/etcd/raft.(*node).run(0xc8202446e0, 0xc820230a90)
    /home/james/code/mgmt/gopath/src/github.com/coreos/etcd/raft/node.go:330 +0xc4e
created by github.com/coreos/etcd/raft.StartNode
    /home/james/code/mgmt/gopath/src/github.com/coreos/etcd/raft/node.go:203 +0x731

What follows is a long messy trace. The code that triggers this embeds the etcd server. I'll post it once I've cleaned it up a little, but maybe the bug is obvious to someone familiar with the code base.

I think this might be due to me calling MemberRemove on myself, the last machine in the cluster. I can work around that, but I think it probably shouldn't panic.

The trace points at the mci := line in this function:

func (r *raft) maybeCommit() bool {
    // TODO(bmizerany): optimize.. Currently naive
    mis := make(uint64Slice, 0, len(r.prs))
    for id := range r.prs {
        mis = append(mis, r.prs[id].Match)
    }
    sort.Sort(sort.Reverse(mis))
    mci := mis[r.quorum()-1]
    return r.raftLog.maybeCommit(mci, r.Term)
}

/cc @bmizerany since you're named in the TODO in this code.

Cheers!

@bmizerany
Contributor

Can you provide the team with a reduced test case that can consistently reproduce what you're seeing?

@purpleidea
Contributor Author

@bmizerany I don't know the etcd test framework, but do whatever you normally do to start up a server, then call the member API as I mentioned, and it should panic in v3. If you can't reproduce it, let me know.

@purpleidea
Contributor Author

FWIW I'm using v3.0.0-beta.0

@gyuho
Contributor

gyuho commented May 28, 2016

This seems fixed in master branch by #5366. Xiang should know better.

@xiang90
Contributor

xiang90 commented May 28, 2016

@purpleidea Please try with the master branch. I believe this is fixed by #5366 already.

@xiang90 xiang90 closed this as completed May 28, 2016
@purpleidea
Contributor Author

Ah, great to hear. Sorry for the dupe. If there is a branch that is always meant to build and work, and that would be better for v3 testing, please let me know. Thanks again!


4 participants