panic: runtime error: slice bounds out of range #829

Closed
garthk opened this issue Jun 5, 2014 · 7 comments

@garthk

garthk commented Jun 5, 2014

etcd 0.4.2 is stuck in a crash-and-restart loop after a CoreOS upgrade reboot on one of my three machines:

Jun 05 04:52:39 host2 systemd[1]: Starting etcd...
Jun 05 04:52:39 host2 systemd[1]: Started etcd.
Jun 05 04:52:40 host2 etcd[8086]: [etcd] Jun  5 04:52:40.150 INFO      | host2: peer added: 'host0'
Jun 05 04:52:40 host2 etcd[8086]: [etcd] Jun  5 04:52:40.152 INFO      | host2: peer added: 'host1'
Jun 05 04:52:40 host2 etcd[8086]: panic: runtime error: slice bounds out of range
Jun 05 04:52:40 host2 etcd[8086]: goroutine 1 [running]:
Jun 05 04:52:40 host2 etcd[8086]: runtime.panic(0x7ea400, 0xef40aa)
Jun 05 04:52:40 host2 etcd[8086]: /usr/lib/go/src/pkg/runtime/panic.c:266 +0xb6
Jun 05 04:52:40 host2 etcd[8086]: github.com/coreos/etcd/third_party/github.com/goraft/raft/protobuf.(*LogEntry).Unmarshal(0xc21040db40, 0xc2101a6aa0, 0x3, 0x3, 0xc21040db40, ...)
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/protobuf/log_entry.pb.go:188 +0x50d
Jun 05 04:52:40 host2 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Jun 05 04:52:40 host2 systemd[1]: Unit etcd.service entered failed state.
Jun 05 04:52:40 host2 etcd[8086]: github.com/coreos/etcd/third_party/code.google.com/p/gogoprotobuf/proto.UnmarshalMerge(0xc2101a6aa0, 0x3, 0x3, 0x7f9f61792df8, 0xc21040db40, ...)
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/third_party/code.google.com/p/gogoprotobuf/proto/decode.go:323 +0x7c
Jun 05 04:52:40 host2 etcd[8086]: github.com/coreos/etcd/third_party/code.google.com/p/gogoprotobuf/proto.Unmarshal(0xc2101a6aa0, 0x3, 0x3, 0x7f9f61792df8, 0xc21040db40, ...)
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/third_party/code.google.com/p/gogoprotobuf/proto/decode.go:311 +0x68
Jun 05 04:52:40 host2 etcd[8086]: github.com/coreos/etcd/third_party/github.com/goraft/raft.(*LogEntry).Decode(0xc21057d7e0, 0x7f9f61786098, 0xc2104618f0, 0xb132b, 0x0, ...)
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/log_entry.go:102 +0x1fb
Jun 05 04:52:40 host2 etcd[8086]: github.com/coreos/etcd/third_party/github.com/goraft/raft.(*Log).open(0xc21004a850, 0xc21007b200, 0x11, 0x7f9f61787a40, 0xc2104606f0)
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/log.go:166 +0x523
Jun 05 04:52:40 host2 etcd[8086]: github.com/coreos/etcd/third_party/github.com/goraft/raft.(*server).Init(0xc210052000, 0x19, 0x7f9f615e63d0)
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/server.go:506 +0x714
Jun 05 04:52:40 host2 etcd[8086]: github.com/coreos/etcd/server.(*PeerServer).SetRaftServer(0xc2100a2000, 0x7f9f61787ed0, 0xc210052000, 0x1)
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/server/peer_server.go:141 +0x4cc
Jun 05 04:52:40 host2 etcd[8086]: github.com/coreos/etcd/etcd.(*Etcd).Run(0xc210058700)
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/etcd/etcd.go:215 +0x1bd3
Jun 05 04:52:40 host2 etcd[8086]: main.main()
Jun 05 04:52:40 host2 etcd[8086]: /build/amd64-usr/tmp/portage/dev-db/etcd-0.4.2/work/etcd-0.4.2/gopath/src/github.com/coreos/etcd/main.go:43 +0x2b2
Jun 05 04:52:40 host2 etcd[8086]: goroutine 4 [syscall]:
Jun 05 04:52:40 host2 etcd[8086]: os/signal.loop()
Jun 05 04:52:40 host2 etcd[8086]: /usr/lib/go/src/pkg/os/signal/signal_unix.go:21 +0x1e
Jun 05 04:52:40 host2 etcd[8086]: created by os/signal.init·1
Jun 05 04:52:40 host2 etcd[8086]: /usr/lib/go/src/pkg/os/signal/signal_unix.go:27 +0x31
Jun 05 04:52:50 host2 systemd[1]: etcd.service holdoff time over, scheduling restart.
Jun 05 04:52:50 host2 systemd[1]: Stopping etcd...
Jun 05 04:52:50 host2 systemd[1]: Starting etcd...
Jun 05 04:52:50 host2 systemd[1]: Started etcd.
Jun 05 04:52:50 host2 etcd[8093]: [etcd] Jun  5 04:52:50.397 INFO      | host2: peer added: 'host0'
Jun 05 04:52:50 host2 etcd[8093]: [etcd] Jun  5 04:52:50.400 INFO      | host2: peer added: 'host1'
Jun 05 04:52:50 host2 etcd[8093]: panic: runtime error: slice bounds out of range
@yichengq
Contributor

yichengq commented Jun 5, 2014

@garthk Could you share the etcd data dir with me? I could dig deeper with that.
You should be able to find its location from the ETCD_DATA_DIR variable in the output of systemctl cat etcd.service.

@garthk
Author

garthk commented Jun 5, 2014

I'm afraid it's full of internal IP addresses and host names.

@philips
Contributor

philips commented Jun 5, 2014

@garthk would you be OK sharing this privately? We suspect the cause is an incomplete write to the log just before the reboot, but we would really like to confirm that.

@jonboulle
Contributor

Another similar encounter - https://twitter.com/benbangert/status/474592958017056770

@douzzi

douzzi commented Jun 5, 2014

+1

I've run into something similar too, multiple times. Just haven't found a reliable repro.

[etcd] Jun  4 10:43:23.942 INFO      | machine4 starting in standby mode
[etcd] Jun  4 10:43:28.944 INFO      | join through leader http://127.0.0.1:7001
[etcd] Jun  4 10:43:31.618 INFO      | machine4: peer added: 'machine3'
[etcd] Jun  4 10:43:31.631 INFO      | machine4: peer added: 'machine1'
[etcd] Jun  4 10:43:31.635 INFO      | machine4 starting in peer mode
[etcd] Jun  4 10:43:31.635 INFO      | machine4: state changed from 'initialized' to 'follower'.
[etcd] Jun  4 10:43:31.959 INFO      | machine4: state changed from 'follower' to 'candidate'.
[etcd] Jun  4 10:43:31.960 INFO      | removed during cluster re-configuration
[etcd] Jun  4 10:43:31.960 INFO      | machine4: state changed from 'candidate' to 'follower'.
[etcd] Jun  4 10:43:31.960 INFO      | machine4: term #18446744073709551615 started.
[etcd] Jun  4 10:43:31.960 INFO      | machine4: state changed from 'follower' to 'stopped'.
[etcd] Jun  4 10:43:31.960 INFO      | machine4: state changed from 'stopped' to 'stopped'.
[etcd] Jun  4 10:43:31.961 INFO      | set cluster([http://127.0.0.1:7001 http://127.0.0.1:7002]) for standby server
[etcd] Jun  4 10:43:31.962 INFO      | machine4 starting in standby mode
[etcd] Jun  4 10:43:36.964 INFO      | Send Join Request to http://127.0.0.1:7001/join
[etcd] Jun  4 10:43:37.012 INFO      | join through leader http://127.0.0.1:7001
[etcd] Jun  4 10:43:39.725 INFO      | machine4: peer added: 'machine3'
[etcd] Jun  4 10:43:39.726 INFO      | machine4: peer added: 'machine1'
[etcd] Jun  4 10:43:39.735 INFO      | machine4 starting in peer mode
[etcd] Jun  4 10:43:39.735 INFO      | machine4: state changed from 'initialized' to 'follower'.
[etcd] Jun  4 10:43:39.762 INFO      | machine4: state changed from 'follower' to 'snapshotting'.
[etcd] Jun  4 10:43:40.972 INFO      | machine4: peer added: 'machine2'
[etcd] Jun  4 10:43:40.973 INFO      | machine4: peer added: 'machine3'
[etcd] Jun  4 10:43:40.974 INFO      | machine4: peer added: 'machine1'
panic: runtime error: slice bounds out of range

goroutine 83 [running]:
runtime.panic(0x7ea400, 0xef30aa)
        /usr/local/go/src/pkg/runtime/panic.c:266 +0xb6
github.com/coreos/etcd/third_party/github.com/goraft/raft.(*Log).compact(0xc21c88a2a0, 0x3ab3b, 0x2, 0x0, 0x0)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/log.go:583 +0x4bd
github.com/coreos/etcd/third_party/github.com/goraft/raft.(*server).processSnapshotRecoveryRequest(0xc211cfe360, 0xc21bddceb0, 0xc21bddceb0)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/server.go:1321 +0x28c
github.com/coreos/etcd/third_party/github.com/goraft/raft.(*server).snapshotLoop(0xc211cfe360)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/server.go:882 +0x222
github.com/coreos/etcd/third_party/github.com/goraft/raft.(*server).loop(0xc211cfe360)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/server.go:612 +0x3f4
github.com/coreos/etcd/third_party/github.com/goraft/raft.func·007()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/server.go:471 +0x5c
created by github.com/coreos/etcd/third_party/github.com/goraft/raft.(*server).Start
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/server.go:472 +0x3c0

goroutine 87 [select]:
github.com/coreos/etcd/server.(*PeerServer).monitorPeerActivity(0xc21009d000)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:827 +0x687
github.com/coreos/etcd/server.*PeerServer.(github.com/coreos/etcd/server.monitorPeerActivity)·fm()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:281 +0x26
github.com/coreos/etcd/server.func·002()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:725 +0x5c
created by github.com/coreos/etcd/server.(*PeerServer).startRoutine
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:726 +0xb4

goroutine 4 [syscall]:
os/signal.loop()
        /usr/local/go/src/pkg/os/signal/signal_unix.go:21 +0x1e
created by os/signal.init·1
        /usr/local/go/src/pkg/os/signal/signal_unix.go:27 +0x31

goroutine 15 [IO wait]:
net.runtime_pollWait(0x7f008bc90258, 0x72, 0x0)  
        /usr/local/go/src/pkg/runtime/netpoll.goc:116 +0x6a
net.(*pollDesc).Wait(0xc2100d9680, 0x72, 0x7f008bc8f0e8, 0xb)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:81 +0x34
net.(*pollDesc).WaitRead(0xc2100d9680, 0xb, 0x7f008bc8f0e8)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:86 +0x30
net.(*netFD).accept(0xc2100d9620, 0x951e18, 0x0, 0x7f008bc8f0e8, 0xb)
        /usr/local/go/src/pkg/net/fd_unix.go:382 +0x2c2
net.(*TCPListener).AcceptTCP(0xc210000280, 0x519ddb, 0x7f008bb15e88, 0x519ddb)
        /usr/local/go/src/pkg/net/tcpsock_posix.go:233 +0x47
net.(*TCPListener).Accept(0xc210000280, 0x7f008bc91090, 0xc21c617868, 0xc21c90ad80, 0x0)
        /usr/local/go/src/pkg/net/tcpsock_posix.go:243 +0x27
net/http.(*Server).Serve(0xc2100cf6e0, 0x7f008bc91800, 0xc210000280, 0x0, 0x0)
        /usr/local/go/src/pkg/net/http/server.go:1622 +0x91
github.com/coreos/etcd/etcd.func·001()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/etcd/etcd.go:273 +0x93
created by github.com/coreos/etcd/etcd.(*Etcd).Run
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/etcd/etcd.go:278 +0x31b0

goroutine 16 [IO wait]:
net.runtime_pollWait(0x7f008bc901b0, 0x72, 0x0)  
        /usr/local/go/src/pkg/runtime/netpoll.goc:116 +0x6a
net.(*pollDesc).Wait(0xc2100d9760, 0x72, 0x7f008bc8f0e8, 0xb)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:81 +0x34
net.(*pollDesc).WaitRead(0xc2100d9760, 0xb, 0x7f008bc8f0e8)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:86 +0x30
net.(*netFD).accept(0xc2100d9700, 0x951e18, 0x0, 0x7f008bc8f0e8, 0xb)
        /usr/local/go/src/pkg/net/fd_unix.go:382 +0x2c2
net.(*TCPListener).AcceptTCP(0xc2100002a8, 0x519ddb, 0x7f008bafde88, 0x519ddb)
        /usr/local/go/src/pkg/net/tcpsock_posix.go:233 +0x47
net.(*TCPListener).Accept(0xc2100002a8, 0x7f008bc91090, 0xc21c617dd0, 0xc2175e7800, 0x0)
        /usr/local/go/src/pkg/net/tcpsock_posix.go:243 +0x27
net/http.(*Server).Serve(0xc2100cf730, 0x7f008bc91800, 0xc2100002a8, 0x0, 0x0)
        /usr/local/go/src/pkg/net/http/server.go:1622 +0x91
github.com/coreos/etcd/etcd.func·002()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/etcd/etcd.go:282 +0x93
created by github.com/coreos/etcd/etcd.(*Etcd).Run
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/etcd/etcd.go:287 +0x3228

goroutine 79 [IO wait]:
net.runtime_pollWait(0x7f008bc90300, 0x72, 0x0)
        /usr/local/go/src/pkg/runtime/netpoll.goc:116 +0x6a
net.(*pollDesc).Wait(0xc21c87ad10, 0x72, 0x7f008bc8f0e8, 0xb)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:81 +0x34
net.(*pollDesc).WaitRead(0xc21c87ad10, 0xb, 0x7f008bc8f0e8)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:86 +0x30
net.(*netFD).Read(0xc21c87acb0, 0xc21c886000, 0x1000, 0x1000, 0x0, ...)
        /usr/local/go/src/pkg/net/fd_unix.go:204 +0x2a0
net.(*conn).Read(0xc21c617b48, 0xc21c886000, 0x1000, 0x1000, 0x30, ...)
        /usr/local/go/src/pkg/net/net.go:122 +0xc5
bufio.(*Reader).fill(0xc221ffb840)
        /usr/local/go/src/pkg/bufio/bufio.go:91 +0x110
bufio.(*Reader).Peek(0xc221ffb840, 0x1, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/pkg/bufio/bufio.go:119 +0xcb
net/http.(*persistConn).readLoop(0xc21c90af80)   
        /usr/local/go/src/pkg/net/http/transport.go:687 +0xb7
created by net/http.(*Transport).dialConn
        /usr/local/go/src/pkg/net/http/transport.go:528 +0x607

goroutine 88 [select]:
github.com/coreos/etcd/server.(*PeerServer).monitorSnapshot(0xc21009d000)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:733 +0x19e
github.com/coreos/etcd/server.*PeerServer.(github.com/coreos/etcd/server.monitorSnapshot)·fm()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:285 +0x26
github.com/coreos/etcd/server.func·002()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:725 +0x5c
created by github.com/coreos/etcd/server.(*PeerServer).startRoutine
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:726 +0xb4

goroutine 84 [select]:
github.com/coreos/etcd/server.(*PeerServer).monitorSync(0xc21009d000)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:753 +0x1c9
github.com/coreos/etcd/server.*PeerServer.(github.com/coreos/etcd/server.monitorSync)·fm()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:278 +0x26
github.com/coreos/etcd/server.func·002()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:725 +0x5c
created by github.com/coreos/etcd/server.(*PeerServer).startRoutine
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:726 +0xb4

goroutine 80 [select]:
net/http.(*persistConn).writeLoop(0xc21c90af80)  
        /usr/local/go/src/pkg/net/http/transport.go:791 +0x271
created by net/http.(*Transport).dialConn
        /usr/local/go/src/pkg/net/http/transport.go:529 +0x61e

goroutine 82 [IO wait]:
net.runtime_pollWait(0x7f008bc90108, 0x72, 0x0)
        /usr/local/go/src/pkg/runtime/netpoll.goc:116 +0x6a
net.(*pollDesc).Wait(0xc21c88af40, 0x72, 0x7f008bc8f0e8, 0xb)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:81 +0x34
net.(*pollDesc).WaitRead(0xc21c88af40, 0xb, 0x7f008bc8f0e8)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:86 +0x30
net.(*netFD).Read(0xc21c88aee0, 0xc21178d000, 0x1000, 0x1000, 0x0, ...)
        /usr/local/go/src/pkg/net/fd_unix.go:204 +0x2a0
net.(*conn).Read(0xc21c617dd0, 0xc21178d000, 0x1000, 0x1000, 0xc2100494c0, ...)
        /usr/local/go/src/pkg/net/net.go:122 +0xc5
net/http.(*liveSwitchReader).Read(0xc2175e7828, 0xc21178d000, 0x1000, 0x1000, 0x738ea0, ...)
        /usr/local/go/src/pkg/net/http/server.go:204 +0xa5
io.(*LimitedReader).Read(0xc222165820, 0xc21178d000, 0x1000, 0x1000, 0x2, ...)
        /usr/local/go/src/pkg/io/io.go:398 +0xbb
bufio.(*Reader).fill(0xc210c693c0)
        /usr/local/go/src/pkg/bufio/bufio.go:91 +0x110
bufio.(*Reader).ReadSlice(0xc210c693c0, 0x100a, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/pkg/bufio/bufio.go:274 +0x204
bufio.(*Reader).ReadLine(0xc210c693c0, 0x0, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/pkg/bufio/bufio.go:305 +0x63
net/textproto.(*Reader).readLineSlice(0xc216f82c60, 0x7f008bc86000, 0x728580, 0x7f00893bece8, 0x421a92, ...)
        /usr/local/go/src/pkg/net/textproto/reader.go:55 +0x61
net/textproto.(*Reader).ReadLine(0xc216f82c60, 0xc21bd625b0, 0x0, 0xc211781000, 0x0)
        /usr/local/go/src/pkg/net/textproto/reader.go:36 +0x27
net/http.ReadRequest(0xc210c693c0, 0xc21bd625b0, 0x0, 0x0)
        /usr/local/go/src/pkg/net/http/request.go:526 +0x88
net/http.(*conn).readRequest(0xc2175e7800, 0x0, 0x0, 0x0)
        /usr/local/go/src/pkg/net/http/server.go:575 +0x1bb
net/http.(*conn).serve(0xc2175e7800)
        /usr/local/go/src/pkg/net/http/server.go:1123 +0x3b4
created by net/http.(*Server).Serve
        /usr/local/go/src/pkg/net/http/server.go:1644 +0x28b

goroutine 24 [select]:
github.com/coreos/etcd/third_party/github.com/goraft/raft.(*server).send(0xc211cfe360, 0x7ec3c0, 0xc21bddceb0, 0x7f00893dcb88, 0xc2130e4000, ...)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/server.go:631 +0x209
github.com/coreos/etcd/third_party/github.com/goraft/raft.(*server).SnapshotRecoveryRequest(0xc211cfe360, 0xc21bddceb0, 0x7f00893dcbb8)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/goraft/raft/server.go:1295 +0x3b
github.com/coreos/etcd/server.(*PeerServer).SnapshotRecoveryHttpHandler(0xc21009d000, 0x7f008bc91940, 0xc2100d2280, 0xc211d6dc30)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server_handlers.go:128 +0x318
github.com/coreos/etcd/server.*PeerServer.SnapshotRecoveryHttpHandler·fm(0x7f008bc91940, 0xc2100d2280, 0xc211d6dc30)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:352 +0x44
net/http.HandlerFunc.ServeHTTP(0xc2100cb6c0, 0x7f008bc91940, 0xc2100d2280, 0xc211d6dc30)
        /usr/local/go/src/pkg/net/http/server.go:1220 +0x40
github.com/coreos/etcd/third_party/github.com/gorilla/mux.(*Router).ServeHTTP(0xc2100c8dc0, 0x7f008bc91940, 0xc2100d2280, 0xc211d6dc30)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/third_party/github.com/gorilla/mux/mux.go:98 +0x217
github.com/coreos/etcd/http.(*CORSHandler).ServeHTTP(0xc2100d45a0, 0x7f008bc91940, 0xc2100d2280, 0xc211d6dc30)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/http/cors.go:78 +0x1f9
github.com/coreos/etcd/etcd.(*ModeHandler).ServeHTTP(0xc2100d5420, 0x7f008bc91940, 0xc2100d2280, 0xc211d6dc30)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/etcd/etcd.go:405 +0x72
net/http.serverHandler.ServeHTTP(0xc2100cf730, 0x7f008bc91940, 0xc2100d2280, 0xc211d6dc30)
        /usr/local/go/src/pkg/net/http/server.go:1597 +0x16e
net/http.(*conn).serve(0xc210059700)
        /usr/local/go/src/pkg/net/http/server.go:1167 +0x7b7
created by net/http.(*Server).Serve
        /usr/local/go/src/pkg/net/http/server.go:1644 +0x28b

goroutine 85 [select]:
github.com/coreos/etcd/server.(*PeerServer).monitorTimeoutThreshold(0xc21009d000)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:768 +0x26c
github.com/coreos/etcd/server.*PeerServer.(github.com/coreos/etcd/server.monitorTimeoutThreshold)·fm()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:279 +0x26
github.com/coreos/etcd/server.func·002()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:725 +0x5c
created by github.com/coreos/etcd/server.(*PeerServer).startRoutine
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:726 +0xb4

goroutine 86 [select]:
github.com/coreos/etcd/server.(*PeerServer).monitorActiveSize(0xc21009d000)
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:791 +0x76b
github.com/coreos/etcd/server.*PeerServer.(github.com/coreos/etcd/server.monitorActiveSize)·fm()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:280 +0x26
github.com/coreos/etcd/server.func·002()
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:725 +0x5c
created by github.com/coreos/etcd/server.(*PeerServer).startRoutine
        /home/douzzi/code/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/server/peer_server.go:726 +0xb4

goroutine 37 [IO wait]:
net.runtime_pollWait(0x7f008bc8ff10, 0x72, 0x0)
        /usr/local/go/src/pkg/runtime/netpoll.goc:116 +0x6a
net.(*pollDesc).Wait(0xc21dcea8b0, 0x72, 0x7f008bc8f0e8, 0xb)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:81 +0x34
net.(*pollDesc).WaitRead(0xc21dcea8b0, 0xb, 0x7f008bc8f0e8)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:86 +0x30
net.(*netFD).Read(0xc21dcea850, 0xc211ca1000, 0x1000, 0x1000, 0x0, ...)
        /usr/local/go/src/pkg/net/fd_unix.go:204 +0x2a0
net.(*conn).Read(0xc2177490c8, 0xc211ca1000, 0x1000, 0x1000, 0x30, ...)
        /usr/local/go/src/pkg/net/net.go:122 +0xc5
bufio.(*Reader).fill(0xc217945f60)
        /usr/local/go/src/pkg/bufio/bufio.go:91 +0x110
bufio.(*Reader).Peek(0xc217945f60, 0x1, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/pkg/bufio/bufio.go:119 +0xcb
net/http.(*persistConn).readLoop(0xc21c90a680)
        /usr/local/go/src/pkg/net/http/transport.go:687 +0xb7
created by net/http.(*Transport).dialConn
        /usr/local/go/src/pkg/net/http/transport.go:528 +0x607

goroutine 35 [IO wait]:
net.runtime_pollWait(0x7f008bc8ffb8, 0x72, 0x0)
        /usr/local/go/src/pkg/runtime/netpoll.goc:116 +0x6a
net.(*pollDesc).Wait(0xc21dcea760, 0x72, 0x7f008bc8f0e8, 0xb)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:81 +0x34
net.(*pollDesc).WaitRead(0xc21dcea760, 0xb, 0x7f008bc8f0e8)
        /usr/local/go/src/pkg/net/fd_poll_runtime.go:86 +0x30
net.(*netFD).Read(0xc21dcea700, 0xc211cb1000, 0x1000, 0x1000, 0x0, ...)
        /usr/local/go/src/pkg/net/fd_unix.go:204 +0x2a0
net.(*conn).Read(0xc2177490a0, 0xc211cb1000, 0x1000, 0x1000, 0x30, ...)
        /usr/local/go/src/pkg/net/net.go:122 +0xc5
bufio.(*Reader).fill(0xc217945ea0)
        /usr/local/go/src/pkg/bufio/bufio.go:91 +0x110
bufio.(*Reader).Peek(0xc217945ea0, 0x1, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/pkg/bufio/bufio.go:119 +0xcb
net/http.(*persistConn).readLoop(0xc21c90a600)
        /usr/local/go/src/pkg/net/http/transport.go:687 +0xb7
created by net/http.(*Transport).dialConn
        /usr/local/go/src/pkg/net/http/transport.go:528 +0x607

goroutine 36 [select]:
net/http.(*persistConn).writeLoop(0xc21c90a600)
        /usr/local/go/src/pkg/net/http/transport.go:791 +0x271
created by net/http.(*Transport).dialConn
        /usr/local/go/src/pkg/net/http/transport.go:529 +0x61e

goroutine 38 [select]:
net/http.(*persistConn).writeLoop(0xc21c90a680)
        /usr/local/go/src/pkg/net/http/transport.go:791 +0x271
created by net/http.(*Transport).dialConn
        /usr/local/go/src/pkg/net/http/transport.go:529 +0x61e

@xiang90
Contributor

xiang90 commented Jun 6, 2014

@garthk We confirmed this is caused by #830 together with an issue in the vendored (third_party) gogoprotobuf. We need to bump gogoprotobuf to fully fix this issue.

@yichengq
Contributor

yichengq commented Jul 3, 2014

@garthk
The long-term bug fix has been merged.
To make the cluster work again, you could stop etcd, remove that node's data directory, and restart etcd. The node should be able to rejoin the cluster through the -peers or -discovery flag.
Let me know if it is still a problem.
