Skip to content
This repository has been archived by the owner on Oct 4, 2019. It is now read-only.

Stops synchronization after some time #398

Closed
splix opened this issue Nov 5, 2017 · 10 comments · Fixed by #506
Closed

Stops synchronization after some time #398

splix opened this issue Nov 5, 2017 · 10 comments · Fixed by #506
Assignees

Comments

@splix
Copy link
Contributor

splix commented Nov 5, 2017

Geth 4.1.0, Ubuntu+Docker+Kubernetes, Morden chain

Always stops chain synchronization after 1-4 hours, even for initial sync. Successfully continues download after restart.

Maybe related lack of peers on morden. It seems that geth removes a peer from the list, because of some error, and at the end gets left without any peer at all, even if most of them are back online.

@whilei
Copy link
Contributor

whilei commented Nov 7, 2017

I am unable to reproduce. Have synced Morden fully 3 times today.

How long are you waiting when sync stops? What does it look like? Do you have logs from when this happens? Can you share your docker/k8s config so I can try with that?

@tzdybal
Copy link
Contributor

tzdybal commented Nov 7, 2017

I synced Morden recently few times on different machines (both virtual and physical), with no problems.

In the past I have some pauses in synchronization - in the morning I found that nothing was logged for about 3 hours in the night. But I ignored the issue, because geth resumed automatically, I was not sure about the stability of network link.

@whilei
Copy link
Contributor

whilei commented Nov 9, 2017

It may also be possible that this is related to an ECIP1017 fork. Many nodes are not yet upgraded to include the change on Morden, eg our PR to Parity merged only 3 weeks ago - and many geth nodes I see are still running 3.5.0.

@splix
Copy link
Contributor Author

splix commented Nov 10, 2017

@whilei that was happening even for initial blocks, say before 1500000, so a potential incompatibility after block 2M should not affect that

@splix
Copy link
Contributor Author

splix commented Nov 10, 2017

Ok, let's keep it in mind, we'll get numbers as soon as we'll get better monitoring/logging/console

@whilei
Copy link
Contributor

whilei commented Nov 20, 2017

Possibly rel #346

whilei added a commit that referenced this issue Nov 23, 2017
… re: reorg

solution: modify reorg conditional in blockchain insertchain to use smaller block
number or randomized block reorg based on http://www.cs.cornell.edu/~ie53/publications/btcProcFC.pdf

rel #346
rel #398
possibly #378
whilei added a commit to whilei/go-ethereum that referenced this issue Nov 24, 2017
… re: reorg

solution: modify reorg conditional in blockchain insertchain to use smaller block
number or randomized block reorg based on http://www.cs.cornell.edu/~ie53/publications/btcProcFC.pdf

rel ethereumproject#346
rel ethereumproject#398
possibly ethereumproject#378
whilei added a commit that referenced this issue Nov 24, 2017
… re: reorg

solution: modify reorg conditional in blockchain insertchain to use smaller block
number or randomized block reorg based on http://www.cs.cornell.edu/~ie53/publications/btcProcFC.pdf

rel #346
rel #398
possibly #378
sorpaas pushed a commit that referenced this issue Dec 7, 2017
… re: reorg

solution: modify reorg conditional in blockchain insertchain to use smaller block
number or randomized block reorg based on http://www.cs.cornell.edu/~ie53/publications/btcProcFC.pdf

rel #346
rel #398
possibly #378
@splix
Copy link
Contributor Author

splix commented Jan 5, 2018

Still happening with 4.2:

2018-01-05 08:13:58 Sync      #2644587 of  #5148225 9b45654f    98/ 164/ 4 blk/txs/mgas sec 24/100 peers
E0105 08:14:01.435618 eth/downloader/downloader.go:1468] Result #2649611 [266896f2…] processing failed: Block's parent unknown d5790b34ea44f372d274da28513b06bd56e03fc49d9baf6b47252f8e2f3154a5
E0105 08:14:57.501460 eth/downloader/downloader.go:1468] Result #2657803 [17580b5c…] processing failed: Block's parent unknown a97e17e648af9a67135141658f1b49b74885ab86fab382107123000921ba66f2
2018-01-05 08:14:58 Sync      #2649610 of  #5148225 d5790b34    83/ 152/ 4 blk/txs/mgas sec 26/100 peers
2018-01-05 08:15:57 Sync      #2649610 of  #5148225 d5790b34     0/   0/ 0 blk/txs/mgas sec 29/100 peers
2018-01-05 08:16:57 Sync      #2649610 of  #5148225 d5790b34     0/   0/ 0 blk/txs/mgas sec 35/100 peers
....
2018-01-05 09:07:57 Sync      #2649610 of  #5148225 d5790b34     0/   0/ 0 blk/txs/mgas sec 50/100 peers
2018-01-05 09:08:57 Sync      #2649610 of  #5148225 d5790b34     0/   0/ 0 blk/txs/mgas sec 50/100 peers
2018-01-05 09:09:57 Sync      #2649610 of  #5148225 d5790b34     0/   0/ 0 blk/txs/mgas sec 50/100 peers
2018-01-05 09:10:57 Sync      #2649610 of  #5148225 d5790b34     0/   0/ 0 blk/txs/mgas sec 50/100 peers

Notice this eth/downloader/downloader.go:1468] Result #2649611 [266896f2…] processing failed: Block's parent unknown d5790b34ea44f372d274da28513b06bd56e03fc49d9baf6b47252f8e2f3154a5, stopped to sync right after it

@whilei
Copy link
Contributor

whilei commented Jan 8, 2018

Possibly related to #448, since this is an errInvalidChain and the peer should be dropped

@whilei
Copy link
Contributor

whilei commented Feb 5, 2018

Another example of this issue has been brought to my attention:

2018-02-02 09:49:27 Sync        #56568 of  #5315828 d1fbead3    18/  10/ 0 blk/txs/mgas sec  3/25 peers
2018-02-02 09:50:27 Sync        #58091 of  #5315828 d7c65a27    25/   4/ 0 blk/txs/mgas sec  3/25 peers
2018-02-02 09:51:27 Sync        #59572 of  #5315828 7bcc6cd9    24/   4/ 0 blk/txs/mgas sec  4/25 peers
E0202 09:51:57.241434 eth/downloader/downloader.go:1468] Result #60394 [e3bb0a5d…] processing failed: Block's parent unknown 96874a4f5d6a8271edec1c78c0668d7d889fd5                                                                           c6ecbf28aa6b0d2e665320c2a7
E0202 09:52:25.307636 eth/downloader/downloader.go:1468] Result #68586 [04f78440…] processing failed: Block's parent unknown 6675cd1da50f5c7550bb80f620f5f294682224                                                                           8b3a0861ff334226bce62a5649
2018-02-02 09:52:27 Sync        #60393 of  #5315828 96874a4f    13/   4/ 0 blk/txs/mgas sec  3/25 peers
2018-02-02 09:53:27 Sync        #60393 of  #5315828 96874a4f     0/   0/ 0 blk/txs/mgas sec  2/25 peers
2018-02-02 09:54:27 Sync        #60393 of  #5315828 96874a4f     0/   0/ 0 blk/txs/mgas sec  2/25 peers
2018-02-02 09:55:27 Sync        #60393 of  #5315828 96874a4f     0/   0/ 0 blk/txs/mgas sec  2/25 peers
2018-02-02 09:56:27 Sync        #60393 of  #5315828 96874a4f     0/   0/ 0 blk/txs/mgas sec  2/25 peers
2018-02-02 09:57:27 Sync        #60393 of  #5315828 96874a4f     0/   0/ 0 blk/txs/mgas sec  2/25 peers
2018-02-02 09:58:27 Sync        #60393 of  #5315828 96874a4f     0/   0/ 0 blk/txs/mgas sec  5/25 peers

Detailed logs from apparent time of failure:

000000000000000000000000000000000000000000000 logs=[]}
I0202 09:52:25.118010 core/blockchain.go:1510] [1517561545117984677] inserted block #60392 (1 TXs 21000 G 1 UNCs) [0xce3cf649b168ae82a11f6bde9bc2f1b7bdd61fd8984921edc7fbb75c062f4a4a]. Took 136.580085ms
I0202 09:52:25.119411 core/state_processor.go:136] receipt{med=4883f794972b3a099c574733d8a1d5e5bd50a5ee8e8aa2559a8b9572e318d579 cgas=21612 bloom=00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 logs=[]}
I0202 09:52:25.119940 core/vm/vm.go:113] running byte VM a9869c70
I0202 09:52:25.120181 core/vm/vm.go:116] byte VM a9869c70 done. time: 230.236µs instrc: 126
I0202 09:52:25.120400 core/state_processor.go:136] receipt{med=b45645c48c12dd877ae18631e237c40955e2b7ccbab3cf31285f4b27349661c8 cgas=78441 bloom=00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 logs=[]}
I0202 09:52:25.121132 core/blockchain.go:1510] [1517561545121110042] inserted block #60393 (2 TXs 78441 G 0 UNCs) [0x96874a4f5d6a8271edec1c78c0668d7d889fd5c6ecbf28aa6b0d2e665320c2a7]. Took 2.962765ms
I0202 09:52:25.121300 core/blockchain.go:1560] imported 361 block(s) (0 queued 1687 ignored) including 157 txs in 27.878542703s. #60393 [0xff707fa349da56330cd3070a9d98c94e4ea7ab0bdadf9d38794f41b2f0c1b044 / 0x96874a4f5d6a8271edec1c78c0668d7d889fd5c6ecbf28aa6b0d2e665320c2a7]
I0202 09:52:25.181230 eth/downloader/downloader.go:1426] Inserting chain with 8192 items (#68586 [04f78440…] - #76777 [a89d6a2f…])
I0202 09:52:25.189013 eth/peer.go:216] Peer 4329eb2c7e622064 [eth/63] fetching 128 block bodies first=0xc81bd035b3ff248925129eec554cf80fa40c46fad629a8103850b742311820a6
I0202 09:52:25.196067 eth/peer.go:216] Peer e1520f00ff23e82c [eth/63] fetching 128 block bodies first=0x6253da5fca83710dbc787ab86093fa6efa2010567bd8c538239c0a0d24f14284
I0202 09:52:25.268362 eth/peer.go:216] Peer 4329eb2c7e622064 [eth/63] fetching 128 block bodies first=0x04dcc3f0b2bde57bd3a58d5ec037752ade2aaee4306068f3e1713b7d6e990dd4
E0202 09:52:25.307636 eth/downloader/downloader.go:1468] Result #68586 [04f78440…] processing failed: Block's parent unknown 6675cd1da50f5c7550bb80f620f5f2946822248b3a0861ff334226bce62a5649
I0202 09:52:25.312573 eth/peer.go:216] Peer 4329eb2c7e622064 [eth/63] fetching 128 block bodies first=0x2925208f91488a9287e0b08d07a25895350c2dec4b23e8cee69465182bb67a6f
I0202 09:52:25.337366 p2p/server.go:589] Accepted conn 104.200.67.93:55230
I0202 09:52:25.354559 eth/peer.go:216] Peer 4329eb2c7e622064 [eth/63] fetching 128 block bodies first=0x894fc7558428cee0e37133c60ed8abfcd0d267bfa4efdcd013cdc8ec02041371
I0202 09:52:25.428373 eth/peer.go:216] Peer 4329eb2c7e622064 [eth/63] fetching 128 block bodies first=0x2583fa29b9afa5f5c473df0a93fa9a62eead0bb1ba5abe268760debb0996d09b
I0202 09:52:25.478044 p2p/server.go:589] Accepted conn 138.68.99.116:33410
I0202 09:52:25.486092 p2p/server.go:682] Added Peer id=9de1cabecb6526cd addr=138.68.99.116:33410 name=Parity/v1.8.6-beta-2d051e4-20180109/x86_64-linux-gnu/rustc1.22.1
I0202 09:52:25.486213 eth/handler.go:252] Peer 9de1cabecb6526cd [eth/63]: peer connected [Parity/v1.8.6-beta-2d051e4-20180109/x86_64-linux-gnu/rustc1.22.1]
W0202 09:52:25.486857 p2p/server.go:634] inbound conn ded0f7199506a956 104.200.67.93:55230 failed proto handshake: EOF
I0202 09:52:25.492640 eth/downloader/downloader.go:1577] Quality of service: rtt 2s, conf 0.322, ttl 18.644194197s
I0202 09:52:25.492717 eth/peer.go:209] Peer 9de1cabecb6526cd [eth/63] fetching 1 headers from #1920000, skipping 0 (reverse = false)
I0202 09:52:25.505408 eth/handler.go:194] Removing peer 9de1cabecb6526cd

@whilei
Copy link
Contributor

whilei commented Feb 6, 2018

I have again reproduced this. Looking like E0206 11:44:14.538993 eth/downloader/downloader.go:1468] Result #1831940 [59c27f88…] processing failed: Block's parent unknown 3285408d65fa9e81f9a5bdf40703d7a1dc3f7c96a5062074b88d9425c02f5b8a is near the cause or effect of the problem.

I0206 11:44:14.368974 core/blockchain.go:1510] [1517885054368960000] inserted block #1823745 (2 TXs 42000 G 0 UNCs) [0x035de0490b5e45c9605b1dbff7a165125300c862d9399a5a1418dec22723cf16]. Took 1.977421ms
I0206 11:44:14.376009 core/blockchain.go:1510] [1517885054375995000] inserted block #1823746 (2 TXs 64603 G 1 UNCs) [0x1598f63c5010cae55da6b3f2065ef34f48bfe76381f76fa6a6dccdb478d303b8]. Took 6.831514ms
I0206 11:44:14.377353 core/blockchain.go:1510] [1517885054377340000] inserted block #1823747 (1 TXs 21000 G 0 UNCs) [0x314f82ebc1e16c013c6bdb92b76066c8bede1473bfe582dada179f06c158eed4]. Took 1.195259ms
I0206 11:44:14.377491 core/blockchain.go:1560] imported 2048 block(s) (0 queued 0 ignored) including 9891 txs in 15.337596494s. #1823747 [0x9881fe9e75f6a53f6ddc21a75af91360754a5fe2d3f85fc806a025062e415cff / 0x314f82ebc1e16c013c6bdb92b76066c8bede1473bfe582dada179f06c158eed4]
I0206 11:44:14.405882 eth/downloader/downloader.go:1426] Inserting chain with 8192 items (#1831940 [59c27f88…] - #1840131 [920697e6…])
E0206 11:44:14.538993 eth/downloader/downloader.go:1468] Result #1831940 [59c27f88…] processing failed: Block's parent unknown 3285408d65fa9e81f9a5bdf40703d7a1dc3f7c96a5062074b88d9425c02f5b8a
W0206 11:44:15.326204 eth/downloader/downloader.go:342] sync busy
W0206 11:44:16.009376 eth/downloader/downloader.go:342] sync busy
I0206 11:44:16.045503 eth/fetcher/fetcher.go:345] [eth/62] Peer 93f07ee31c336b94: discarded announcement #5337891 [0x722d80a61d651eb2b70c5ca0145b042ca7cddcb6b71f436cfc69c169aba2d2ec], distance 3514144
I0206 11:44:16.285118 eth/fetcher/fetcher.go:345] [eth/62] Peer 10f514014316a39e: discarded announcement #5337891 [0x722d80a61d651eb2b70c5ca0145b042ca7cddcb6b71f436cfc69c169aba2d2ec], distance 3514144
I0206 11:44:16.285210 eth/fetcher/fetcher.go:345] [eth/62] Peer c82787ef94d4fed3: discarded announcement #5337891 [0x722d80a61d651eb2b70c5ca0145b042ca7cddcb6b71f436cfc69c169aba2d2ec], distance 3514144
I0206 11:44:16.396832 eth/fetcher/fetcher.go:345] [eth/62] Peer 4558a97ffba5ef40: discarded announcement #5337891 [0x722d80a61d651eb2b70c5ca0145b042ca7cddcb6b71f436cfc69c169aba2d2ec], distance 3514144
W0206 11:44:16.808207 eth/downloader/downloader.go:342] sync busy
W0206 11:44:16.815432 eth/downloader/downloader.go:342] sync busy
W0206 11:44:18.808442 eth/downloader/downloader.go:342] sync busy
W0206 11:44:19.278952 eth/downloader/downloader.go:342] sync busy
W0206 11:44:19.821848 eth/downloader/downloader.go:342] sync busy
W0206 11:44:19.863653 eth/downloader/downloader.go:342] sync busy
W0206 11:44:20.559650 eth/downloader/downloader.go:1174] Peer id=6b9f983a61256bb7 eth/63 [Parity/v1.7.8-stable-d5fcf3b-20171025/x86_64-windows-msvc/rustc1.20.0] [hs 416.97/s, bs 216.91/s, rs 0.00/s, ss 0.00/s, miss    0, rtt 2s]: stalling body delivery, dropping
W0206 11:44:21.093727 eth/downloader/downloader.go:342] sync busy
W0206 11:44:21.426940 eth/downloader/downloader.go:342] sync busy
W0206 11:44:21.889836 eth/downloader/downloader.go:342] sync busy
W0206 11:44:21.969829 eth/downloader/downloader.go:342] sync busy
W0206 11:44:24.111446 eth/downloader/downloader.go:342] sync busy
W0206 11:44:25.227256 eth/downloader/downloader.go:342] sync busy
W0206 11:44:25.492245 eth/downloader/downloader.go:342] sync busy
W0206 11:44:26.680338 eth/downloader/downloader.go:342] sync busy
W0206 11:44:26.744912 eth/downloader/downloader.go:342] sync busy
W0206 11:44:26.811892 eth/downloader/downloader.go:342] sync busy

Also, just to note this bug is not limited to Morden. Above two logs from mainnet, for example.

@whilei whilei self-assigned this Feb 6, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants