Skip to content
This repository has been archived by the owner on Oct 4, 2019. It is now read-only.

make state sync independent #506

Merged
merged 56 commits into from
Mar 23, 2018
Merged

make state sync independent #506

merged 56 commits into from
Mar 23, 2018

Conversation

whilei
Copy link
Contributor

@whilei whilei commented Feb 16, 2018

This is Frankensteined from ethereum/go-ethereum#14460, please see there for an excellent write up and along-the-way documentation in commit messages; 9509940 is a squashed, cherry-picked commit therefrom.

Scheduling of state node downloads hogged the downloader queue lock when
new requests were scheduled. This caused timeouts for other requests.
With this change, state sync is fully independent of all other downloads
and doesn't involve the queue at all.

It is expected that these changes will resolve #398.

TODO:

--- FAIL: TestFastCriticalRestarts63 (7.56s)
        downloader_test.go:1776: failing fast sync succeeded: <nil>
--- FAIL: TestFastCriticalRestarts64 (6.52s)
        downloader_test.go:1776: failing fast sync succeeded: <nil>
FAIL
FAIL    github.com/ethereumproject/go-ethereum/eth/downloader   69.679s

@tzdybal
Copy link
Contributor

tzdybal commented Feb 19, 2018

I'm running this changeset on my node (both mainnet and morden) and it seems, that it really helps a lot to keep client in sync (unfortunately my mainnet client has other issues).

@tzdybal
Copy link
Contributor

tzdybal commented Feb 19, 2018

As I said before, I'm also debugging the failing tests.

whilei added 11 commits March 2, 2018 15:36
solution: add debugging logs for investigation
solution: implement alternative node iterators based on EF
solution:
- copy common/hexutil from EF
- use non-exported stateObject (this will be reverted)
- modify assorted usages to enable compilation
solution: chase down code integration issues related to state db
refactoring
solution: progress similar to previous, but now focusing on tests

problem still:
- getting now a runtime panic from trie/nodeiterator/leaf
solution: fix implementation.

problem still:
- now, invalid nil pointer at statedb via lru
solution: (WIP) continue patching updates for revised pattern
@@ -125,7 +125,7 @@ func run(ctx *cli.Context) error {
glog.SetV(ctx.GlobalInt(VerbosityFlag.Name))

db, _ := ethdb.NewMemDatabase()
statedb, _ := state.New(common.Hash{}, db)
statedb, _ := state.New(common.Hash{}, state.NewDatabase(db))
Copy link
Contributor Author

@whilei whilei Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a heads up: This naming convention of statedb in this common pattern I find misleading. state.New returns a *StateDB struct, and that's why, but I think even that struct name is misleading because it's really more than just a db... I'd say it's more a StateTrieManager.

Maybe wew can come up with a better naming convention.

solution: modify fsCriticalTrials, and use dlTester-based delay attribute instead of sync fn signature arg
solution: halve the fsCriticalRestarts in trials iterations
…l to statedb.IntermediateRoot

solution: add arg
Copy link
Contributor

@tzdybal tzdybal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this test was not deterministic before, I think we can keep status quo.
Changes looks good to me. And I used them for long time in production without problems.
Lets merge this!

@whilei
Copy link
Contributor Author

whilei commented Mar 22, 2018

Alright, I'm going to add a merge commit from master now including #537, and see how the CIs look again. My only remaining concern is Windows, and I just want to see if Appveyor seems to consistently fail the test while Mac and Linux succeed.

@whilei whilei changed the title [WIP] make state sync independent make state sync independent Mar 22, 2018
@whilei
Copy link
Contributor Author

whilei commented Mar 22, 2018

Because if Windows "always" fails, we're going to have a hard time getting a windows distribution artifact...

solution: refactor, add sleep for fullsync mode in test
solution: refactor to use t.Run syntax
@whilei whilei merged commit e6b4a8e into master Mar 23, 2018
@tzdybal tzdybal deleted the fix/dl-stop-syncing branch March 27, 2018 12:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stops synchronization after some time
3 participants