Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a workaround for waiting till node is synced with network #565

Merged
merged 3 commits into from
Jan 25, 2018

Conversation

dshulyak
Copy link
Contributor

@dshulyak dshulyak commented Jan 22, 2018

Fixes: #563

At the moment every second test will halt because of EnsureNodeSync.
It happens because first test usually downloads all the new state from the
network and for all following tests progress.HighestBlock will be 0.

It will be 0 because highest block is updated only if syncing with peer was started.

@adambabik
Copy link
Contributor

Without progress.HighestBlock > 0, the condition can be met even though the syncing is still in progress and EnsureNodeSync is supposed to return only if the blockchain is synced.

I would implement it as a prerequisite. We can run statusd and wait until it syncs and only after that start e2e tests. This way, it won't get interrupted and should sync in a few mins. The synced data should be cached for subsequent calls. The problem is that we don't have a mechanism that could easily stop statusd when it is sync. We can do some polling and use eth_syncing, though.

At the moment every second test will halt because of EnsureNodeSync.
It happens because first test usually downloads all the new state from the
network and for all following tests progress.HighestBlock will be 0.

It will be 0 because highest block is updated only if syncing with peer was started.
@dshulyak
Copy link
Contributor Author

@adambabik After deeper look at the downloader and fetcher I changed procedure. Please take a look.

So, the first condition would be a sanity check that we have peers connected to our server. If there is no peers we can assume that we don't know if syncing is required and we should continue waiting.

Second, check if sync is running according to the downloader. If it is running that we simply continue waiting if not we move to the next one.

Compare current with high block. In my understanding it is required only if sync was interrupted, e.g current best peer will go offline.

I don't see now how EnsureNodeSync can exit if sync is still in progress, or how it won't start when it is actually required. What do you think?

@themue @divan Can you also take a look at this change?

@dshulyak
Copy link
Contributor Author

public tests now fail with following error, but i am sure that it is not related to original problem

--- FAIL: TestCallContextResult (2.43s)
	Error Trace:	rpc_test.go:186
	Error:      	Should be true
	Messages:   	balance should be higher than 0
--- FAIL: TestRPCTestSuite (11.07s)

@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 23, 2018

#567

go test -timeout 20m ./e2e/whisper/... -network=4
--- FAIL: TestRequestMessageFromMailboxAsync (2.31s)
	Error Trace:	whisper_mailbox_test.go:91
	Error:      	Not equal: 
	            	expected: 0
	            	received: 1
CRIT [01-23|11:45:53] Failed to store last header's hash       err="leveldb: closed"
FAIL	github.com/status-im/status-go/e2e/whisper	24.689s

Copy link
Contributor

@adambabik adambabik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written in one comment, I recommend syncing from scratch and checking through the JSON-RPC method how it advances. I only remember that with LES it's a bit different and not intuitive.

@@ -298,6 +298,10 @@ func (m *NodeManager) addPeer(url string) error {
return nil
}

func (m *NodeManager) PeerCount() int {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m.node and m.node.Server() may be nil. In such cases, I would return 0.

continue
}
progress := downloader.Progress()
if progress.CurrentBlock >= progress.HighestBlock {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend running statusd -les -http and checking how the sync progress looks like using https://github.com/ethereum/wiki/wiki/JSON-RPC#eth_syncing. As far as I remember, with LES it's a bit strange and that's why progress.HighestBlock > 0 was added.

Maybe these checks are good, though but it would be great to confirm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It uses the same source of info as we are https://github.com/ethereum/go-ethereum/blob/master/internal/ethapi/api.go#L79-L94, so I just printed the progress:

INFO [01-23|16:38:58] Imported new block headers               count=384 elapsed=112.830ms number=1226112 hash=f197f6…495d1f ignored=0
INFO [01-23|16:38:58] Imported new block headers               count=192 elapsed=57.979ms  number=1226304 hash=ce5203…750e53 ignored=0
INFO [01-23|16:38:58] synchronization is not finished yet: current block 1226304 < highest block 1643240 geth=StatusIM
INFO [01-23|16:38:58] Imported new block headers               count=576 elapsed=165.479ms number=1226880 hash=60bd3c…9ba1e4 ignored=0
INFO [01-23|16:38:58] Imported new block headers               count=192 elapsed=55.092ms  number=1227072 hash=fcbc33…ef23af ignored=0
INFO [01-23|16:38:59] Imported new block headers               count=960 elapsed=283.560ms number=1228032 hash=cf13d1…68d1ba ignored=0
INFO [01-23|16:38:59] Imported new block headers               count=384 elapsed=116.356ms number=1228416 hash=db9630…6e83b3 ignored=0
INFO [01-23|16:38:59] Imported new block headers               count=768 elapsed=212.664ms number=1229184 hash=87f2da…b5ad22 ignored=0
INFO [01-23|16:38:59] synchronization is not finished yet: current block 1229184 < highest block 1643240 geth=StatusIM
INFO [01-23|16:38:59] Imported new block headers               count=768 elapsed=215.851ms number=1229952 hash=27dfbb…2ea2b3 ignored=0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Let's stay with this then.

Copy link
Contributor

@divan divan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is fine with me, but does anyone understand completely the logic around syncing and Current/HighestBlock variables? I'm not sure I completely get it. Or, better say, CurrentBlock >= HighestBlock does make sense now, but I don't understand why HighestBlock > 0 was working before?

@adambabik
Copy link
Contributor

CurrentBlock >= HighestBlock does make sense now, but I don't understand why HighestBlock > 0 was working before?

I also can't recall why HighestBlock > 0. However, as it's broken now, we can simplify it and improve it later if needed.

Follow-up issue #568. This is good but it does not solve the problem which is starting e2e tests before the blockchain is synced.

@adambabik adambabik merged commit 889eeca into develop Jan 25, 2018
@adambabik adambabik deleted the workaroundsync branch January 25, 2018 07:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants