This repository has been archived by the owner on Oct 4, 2019. It is now read-only.

Geth 3.5.0 Constantly Crashing - concurrent map read and map write #284

Closed
ethought opened this issue Jun 30, 2017 · 4 comments · Fixed by #614

Comments

@ethought

Our Geth 3.5.0 instance seems to be crashing after just a few hours.

Running on Debian, 32 GB RAM.

go version go1.7.3 linux/amd64

I0629 15:19:04.117996 eth/handler.go:295] Peer f3675296843aec4c [eth/63]: timed out fork-check, dropping
I0629 15:19:29.425872 eth/handler.go:295] Peer e3e0cf816cfea25e [eth/63]: timed out fork-check, dropping
I0629 15:19:49.120562 core/blockchain.go:947] imported 1 block(s) (0 queued 0 ignored) including 1 txs in 5.006234ms. #4001720 [b1d7ceb2 / b1d7ceb2]
fatal error: concurrent map read and map write

goroutine 46110 [running]:
runtime.throw(0xf13fb3, 0x21)
/home/travis/.gimme/versions/go1.8.linux.amd64/src/runtime/panic.go:596 +0x95 fp=0xc44edef338 sp=0xc44edef318
runtime.mapaccess1(0xdfa680, 0xc46637e480, 0xc44edef41c, 0xc4647bac60)
/home/travis/.gimme/versions/go1.8.linux.amd64/src/runtime/hashmap.go:319 +0x23a fp=0xc44edef380 sp=0xc44edef338
github.com/ethereumproject/go-ethereum/core/state.(*StateDB).GetStateObject(0xc42781a0f0, 0x29688d6b8aa2fa70, 0xa2c96ed249e6b1a4, 0xc413a49ba3, 0xd52710)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/core/state/statedb.go:372 +0xab fp=0xc44edef598 sp=0xc44edef380
github.com/ethereumproject/go-ethereum/core/state.(*StateDB).GetNonce(0xc42781a0f0, 0x29688d6b8aa2fa70, 0xa2c96ed249e6b1a4, 0x13a49ba3, 0xc44edef600)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/core/state/statedb.go:230 +0x3f fp=0xc44edef5d0 sp=0xc44edef598
github.com/ethereumproject/go-ethereum/core/state.(*ManagedState).GetNonce(0xc46637e570, 0x29688d6b8aa2fa70, 0xa2c96ed249e6b1a4, 0xc413a49ba3, 0x0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/core/state/managed_state.go:94 +0x13f fp=0xc44edef638 sp=0xc44edef5d0
github.com/ethereumproject/go-ethereum/core.(*TxPool).checkQueue(0xc4202f90e0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/core/tx_pool.go:474 +0x1c1 fp=0xc44edef9d8 sp=0xc44edef638
github.com/ethereumproject/go-ethereum/core.(*TxPool).GetTransactions(0xc4202f90e0, 0x0, 0x0, 0x0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/core/tx_pool.go:396 +0xcc fp=0xc44edefa88 sp=0xc44edef9d8
github.com/ethereumproject/go-ethereum/eth.(*ProtocolManager).syncTransactions(0xc4202d1ba0, 0xc420634240)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/eth/sync.go:47 +0x49 fp=0xc44edefb70 sp=0xc44edefa88
github.com/ethereumproject/go-ethereum/eth.(*ProtocolManager).handle(0xc4202d1ba0, 0xc420634240, 0x0, 0x0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/eth/handler.go:278 +0x6c7 fp=0xc44edefdf0 sp=0xc44edefb70
github.com/ethereumproject/go-ethereum/eth.NewProtocolManager.func1(0xc4304c63c0, 0x16d5bc0, 0xc4663abb20, 0x0, 0x0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/eth/handler.go:134 +0x17f fp=0xc44edefee8 sp=0xc44edefdf0
github.com/ethereumproject/go-ethereum/p2p.(*Peer).startProtocols.func1(0xc4663abb20, 0xc4304c63c0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/p2p/peer.go:303 +0x5d fp=0xc44edeffd0 sp=0xc44edefee8
runtime.goexit()
/home/travis/.gimme/versions/go1.8.linux.amd64/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc44edeffd8 sp=0xc44edeffd0
created by github.com/ethereumproject/go-ethereum/p2p.(*Peer).startProtocols
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/p2p/peer.go:312 +0x238

goroutine 1 [chan receive, 14 minutes]:
github.com/ethereumproject/go-ethereum/node.(*Node).Wait(0xc4200ac500)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/node/node.go:491 +0x8b
main.geth(0xc420322780, 0x0, 0x0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/cmd/geth/main.go:257 +0x6e
reflect.Value.call(0xdd2820, 0x1003d90, 0x13, 0xeebd68, 0x4, 0xc42005dd20, 0x1, 0x1, 0xc42005dca8, 0xed3700, ...)
/home/travis/.gimme/versions/go1.8.linux.amd64/src/reflect/value.go:434 +0x91f
reflect.Value.Call(0xdd2820, 0x1003d90, 0x13, 0xc42005dd20, 0x1, 0x1, 0xc4200601e0, 0xc420022000, 0xeefa2c)
/home/travis/.gimme/versions/go1.8.linux.amd64/src/reflect/value.go:302 +0xa4
github.com/ethereumproject/go-ethereum/vendor/gopkg.in/urfave/cli%2ev1.HandleAction(0xdd2820, 0x1003d90, 0xc420322780, 0x0, 0x0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/vendor/gopkg.in/urfave/cli.v1/app.go:480 +0x198
github.com/ethereumproject/go-ethereum/vendor/gopkg.in/urfave/cli%2ev1.(*App).Run(0xc42015d800, 0xc4200100c0, 0xc, 0xc, 0x0, 0x0)
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/vendor/gopkg.in/urfave/cli.v1/app.go:241 +0x560
main.main()
/home/travis/gopath/src/github.com/ethereumproject/go-ethereum/cmd/geth/main.go:245 +0x57

goroutine 17 [syscall, 14 minutes, locked to thread]:

....

whilei added a commit to whilei/go-ethereum that referenced this issue Jun 30, 2017
solution: sync locks for getStateObject and setStateObject

This is an initial hotfix for ethereumproject#284.

--
Note: StateDB uses stateObjects and stateObjectsDirty as maps,
apparently without any sync locks for granular actions (only statedb New, Reset, copy)

This seems like a red flag to me.
ETHF also doesn't use granular locks for map r/w's.

TODO: research this more.
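
For readers unfamiliar with the failure mode: Go's built-in maps are not safe for concurrent use, and the runtime aborts with "concurrent map read and map write" when it detects a read racing a write. The commit message above describes guarding getStateObject and setStateObject with sync locks. What follows is only a minimal sketch of that idea, not the code from the actual patch. The names `guardedState`, `address`, and `stateObject` are hypothetical stand-ins for the real `StateDB` types, and the choice of `sync.RWMutex` is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// address and stateObject are hypothetical, simplified stand-ins
// for the real go-ethereum types.
type address [20]byte

type stateObject struct {
	nonce uint64
}

// guardedState is a hypothetical stand-in for StateDB: a plain map
// plus a lock that every map access must take.
type guardedState struct {
	mu           sync.RWMutex
	stateObjects map[address]*stateObject
}

func newGuardedState() *guardedState {
	return &guardedState{stateObjects: make(map[address]*stateObject)}
}

// getStateObject takes the read lock, so many readers may proceed
// at once but never while a writer holds the lock.
func (s *guardedState) getStateObject(addr address) *stateObject {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.stateObjects[addr]
}

// setStateObject takes the write lock, excluding readers and writers.
func (s *guardedState) setStateObject(addr address, obj *stateObject) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.stateObjects[addr] = obj
}

func main() {
	s := newGuardedState()
	addr := address{0x01}

	// Readers and writers run concurrently, loosely mimicking peer
	// handlers and block import touching the same state.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(2)
		go func(n uint64) {
			defer wg.Done()
			s.setStateObject(addr, &stateObject{nonce: n})
		}(uint64(i))
		go func() {
			defer wg.Done()
			_ = s.getStateObject(addr)
		}()
	}
	wg.Wait()

	fmt.Println("stored:", s.getStateObject(addr) != nil) // prints "stored: true"
}
```

Note that the lock in this sketch protects only the map itself. The returned `*stateObject` is still shared between goroutines, so mutating its fields concurrently would need separate synchronization. Running a program like this with `go run -race` is one way to check that no unguarded access remains.
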
@whilei
Contributor

whilei commented Jun 30, 2017

@ethought Thanks - I've got a PR #285 in to address your issue. I'm researching further changes along these lines, so it will be helpful to know if the current patch resolves it for you.

@whilei
Contributor

whilei commented Jul 13, 2017

Closing via merge of #285

whilei closed this as completed Jul 13, 2017
@Crypto2

Crypto2 commented Jun 2, 2018

It looks like there may be some resurgence of this bug in 5.4.0? It seems to stop syncing after a while and I see this:

2018-06-01 20:11:46 Import #5919320 ad064e11 24/ 0 txs/mgas 6/25 peers
fatal error: concurrent map read and map write

goroutine 1123604 [running]:
runtime.throw(0x1176ef4, 0x21)
/usr/local/go/src/runtime/panic.go:605 +0x95 fp=0xc43a4bd200 sp=0xc43a4bd1e0 pc=0x42ca25
runtime.mapaccess1(0x10389c0, 0xc4bed29e30, 0xc43a4bd2fc, 0xc442426f00)
/usr/local/go/src/runtime/hashmap.go:355 +0x238 fp=0xc43a4bd248 sp=0xc43a4bd200 pc=0x4091f8
github.com/ethereumproject/go-ethereum/core/state.(*StateDB).getStateObject(0xc4d55b2900, 0xf25b9458b085817b, 0xc27d884649591ff5, 0xff31f5c0, 0xc43a4bd470)
/home/circleci/go/src/github.com/ethereumproject/go-ethereum/core/state/statedb.go:352 +0xab fp=0xc43a4bd3a8 sp=0xc43a4bd248 pc=0xa3c47b
github.com/ethereumproject/go-ethereum/core/state.(*StateDB).GetCode(0xc4d55b2900, 0xf25b9458b085817b, 0xc27d884649591ff5, 0xff31f5c0, 0xc4268673e0, 0xc4d55b2900, 0xc509577ef0)
/home/circleci/go/src/github.com/ethereumproject/go-ethereum/core/state/statedb.go:223 +0x43 fp=0xc43a4bd3e8 sp=0xc43a4bd3a8 pc=0xa3b153
github.com/ethereumproject/go-ethereum/eth.(*PublicBlockChainAPI).GetCode(0xc420064f00, 0xf25b9458b085817b, 0xc27d884649591ff5, 0xff31f5c0, 0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0)
/home/circleci/go/src/github.com/ethereumproject/go-ethereum/eth/api.go:734 +0xc0 fp=0xc43a4bd440 sp=0xc43a4bd3e8 pc=0xaf3ae0
runtime.call128(0xc43cc1c960, 0xc42e35d778, 0xc49ee96ff0, 0x2800000048)
/usr/local/go/src/runtime/asm_amd64.s:511 +0x52 fp=0xc43a4bd4d0 sp=0xc43a4bd440 pc=0x459b22
reflect.Value.call(0xc424208a80, 0xc42e35d778, 0x13, 0x1157cb4, 0x4, 0xc49ee96fa0, 0x3, 0x3, 0xc42f3bdde0, 0xc49ee96fa0, ...)
/usr/local/go/src/reflect/value.go:434 +0x905 fp=0xc43a4bd7a8 sp=0xc43a4bd4d0 pc=0x4ba325
reflect.Value.Call(0xc424208a80, 0xc42e35d778, 0x13, 0xc49ee96fa0, 0x3, 0x3, 0x2, 0x2, 0x2)
/usr/local/go/src/reflect/value.go:302 +0xa4 fp=0xc43a4bd810 sp=0xc43a4bd7a8 pc=0x4b9904
github.com/ethereumproject/go-ethereum/rpc.(*Server).handle(0xc42fac1b00, 0x1c41a20, 0xc42cf72500, 0x1c47100, 0xc49ee96eb0, 0xc447c18d80, 0x3, 0xc44a22f9e4, 0x7)
/home/circleci/go/src/github.com/ethereumproject/go-ethereum/rpc/server.go:328 +0x6e4 fp=0xc43a4bd9b8 sp=0xc43a4bd810 pc=0x788a94

@whilei
Contributor

whilei commented Jun 2, 2018

@Crypto2

Thanks for noticing and saying something.

The original change was accidentally reverted in a subsequent commit.

I'll submit a PR shortly re-making the fix.

whilei added a commit that referenced this issue Jun 2, 2018
solution: use mutex lock

fixes #284, again