Skip to content
This repository has been archived by the owner on Oct 28, 2021. It is now read-only.

Download snapshot through Parity's warp protocol #4622

Merged
merged 7 commits into from
Jan 31, 2018

Conversation

gumb0
Copy link
Member

@gumb0 gumb0 commented Oct 24, 2017

This is the solution using boost.fiber

Motivation for using boost.fiber

The approach that BlockChainSync class takes to deal with asynchronous p2p nature of the sync looks like an anti-pattern that led to it becoming utterly unmaintainable. That approach could be described as having several network message handlers in one class, saving info about current stage of the sync in a bunch of member variables in the class, branching a lot in each handler depending on the current state of member variables. This way the essence of the sync algorithm itself gets spread out over a lot of different parts of the class, it's hard to keep track of what's currently going on and what should happen next.

I am looking for a better approach that can elegantly express what we want to achieve. Fibers allow us here to have a "main algorithm" in one fiber that gets interrupted each time we need to get some data asynchronously, then gets resumed from the place we left off when we have data already. So the main algorithm looks like it's synchronous.
The data itself is sent to fiber through buffered_channel - producers/consumers-kind of data structure, where the fiber gets "blocked" if it tries to get the data not pushed yet to the channel.
To some extent I think this is similar to goroutines and channels of golang.

Also switching context between the fibers is very cheap unlike with threads, and it all happens in the single thread, therefore no need to deal with thread-safety/possible races/mutexes etc.

This is still kind of experiment to see where it goes, I myself am not fully convinced that fibers are the best solution for this.

Current status:

  • It is able to download the snapshot from from several peers at once (this is the improvement over the version in WIP - Download snapshot through Parity's warp protocol #4227)
  • It starts downloading from the first peer found that supports warp protocol (this probably should be changed to better strategy in later PRs) It's better currently to run it giving the peer address from the needed network with --peerset required:
  • Peer not responding to request should be handled correctly, but I haven't seen this in practice and still need to test if this works.

Still to be done:

  • Probably peer disconnecting should be handled better - if it happens after we reqested a chunk, I think chunk will stay marked as being downloaded forever.
  • Without --peerset we connect to Ethereum Classic nodes very often and happily download the snapshot for classic chain. To deal with this probably we need to do DAO challenge first similar to full sync.
  • Need to add boost.fiber to hunter's boost to be able to link at all (no idea yet how to do it)
  • fix macOS build failure because of unused SnapshotLog::debug - this doesn't make sense to me yet looks like I've already found a workaround doing import https://github.com/ethereum/cpp-ethereum/blob/04e8ca5afe785cafe18905d6f0f032cb1d869c7a/libethereum/SnapshotImporter.cpp#L45
  • tests

@axic
Copy link
Member

axic commented Oct 24, 2017

Is there an EIP about the warp protocol? Wouldn't it make sense creating one?

cc @arkpar @karalabe

@gumb0
Copy link
Member Author

gumb0 commented Oct 24, 2017

Now there's only parity's doc https://github.com/paritytech/parity/wiki/Warp-Sync

@gumb0 gumb0 force-pushed the snapshot-download-fibers branch 3 times, most recently from 52c6019 to 2444532 Compare October 24, 2017 12:05
eth/main.cpp Outdated
@@ -148,6 +148,7 @@ void help()
<< " --to <n> Export only to block n (inclusive); n may be a decimal, a '0x' prefixed hash, or 'latest'.\n"
<< " --only <n> Equivalent to --export-from n --export-to n.\n"
<< " --dont-check Prevent checking some block aspects. Faster importing, but to apply only when the data is known to be valid.\n\n"
<< " --download-snapshot <path> Download Parity Warp snapshot data to the specified path." << endl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent so much time removing endls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'll fix this 😆
Anyway I hope this to go away after #4597

namespace eth
{

class WarpHostCapability: public p2p::HostCapability<WarpPeerCapability>, Worker
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this inherits Worker, there should be a destructor that calls terminate() (bad design it is...).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, thanks

@gumb0 gumb0 force-pushed the snapshot-download-fibers branch 2 times, most recently from 1d35ec8 to cfdc66f Compare October 25, 2017 16:40
@gumb0 gumb0 force-pushed the snapshot-download-fibers branch 2 times, most recently from f8bc9a9 to 3216276 Compare November 23, 2017 16:21
@chfast
Copy link
Member

chfast commented Dec 5, 2017

Using Fiber sounds like interesting experiment. They seems to work nice together with boost.asio.
This video have an idea how to manage lifetime of connection handlers: https://www.youtube.com/watch?v=rwOv_tw2eA4.

@codecov-io
Copy link

codecov-io commented Dec 13, 2017

Codecov Report

Merging #4622 into develop will decrease coverage by 0.77%.
The diff coverage is 29.8%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4622      +/-   ##
===========================================
- Coverage    60.63%   59.86%   -0.78%     
===========================================
  Files          348      348              
  Lines        27425    27615     +190     
  Branches      2857     2887      +30     
===========================================
- Hits         16630    16531      -99     
- Misses        9800    10113     +313     
+ Partials       995      971      -24

@gumb0
Copy link
Member Author

gumb0 commented Dec 20, 2017

Another relevant talk: https://www.youtube.com/watch?v=mCD6VLVS_y4
(he starts to talk about fibers only on minute 44 though)

@gumb0 gumb0 force-pushed the snapshot-download-fibers branch 4 times, most recently from 02ff5ef to 769e75f Compare January 9, 2018 16:41
@gumb0 gumb0 force-pushed the snapshot-download-fibers branch 6 times, most recently from 5284451 to 8778947 Compare January 18, 2018 15:40
@gumb0
Copy link
Member Author

gumb0 commented Jan 19, 2018

Looks like boost.fiber from hunter now fails to build on GCC 4.8 (including Travis) but builds on GCC 5

@gumb0
Copy link
Member Author

gumb0 commented Jan 19, 2018

Might be related: boostorg/fiber#121 but should be fixed in boost 1.66 used here

@gumb0
Copy link
Member Author

gumb0 commented Jan 22, 2018

Looks like boost.fiber starting with boost 1.66 requires std::regex support and GCC 4.8 doesn't support it
Related: boostorg/fiber#147

@gumb0
Copy link
Member Author

gumb0 commented Jan 23, 2018

I pinned boost version to 1.65.1 for now, it seems to work fine on GCC 4.8
It might be resolved in boost.fiber, see boostorg/fiber#152
Otherwise we would need to give up GCC 4 support

@chfast
Copy link
Member

chfast commented Jan 23, 2018

I don't see problem with going with 1.65.1 for now and then skip 1.66.

@gumb0 gumb0 mentioned this pull request Jan 24, 2018
gumb0 added 6 commits January 25, 2018 11:46
Switch to using fiber::buffered_channel for free peers, because it's lock-free when pushing below capacity.
…ether it is suitable for snapshot download

- Request DAO fork block from the peer to understand whether it's on the right side of the fork
- Explicitly handle peer disconnect
@gumb0 gumb0 force-pushed the snapshot-download-fibers branch from a3326e7 to 8d3f1ad Compare January 25, 2018 10:47
@gumb0 gumb0 removed the in progress label Jan 25, 2018
@gumb0
Copy link
Member Author

gumb0 commented Jan 25, 2018

I might add some tests/minor tweaks, but overall it's ready to be reviewed I think.

@gumb0
Copy link
Member Author

gumb0 commented Jan 25, 2018

A couple of disadvantages of using fibers that I see, to be fair:

  • Data from network messages might be copied more often. That is, when all the logic of network algorithm is in network handlers (like in BlockChainSync) we often can accesss the data through RLP class only, that is using only with the view of the data without additional copying. Here we immediately copy this data (into futures or channels) to be handled by the different fiber later.
  • Debugging might be difficult, because debuggers don't have native support for the fibers, they can't show which fibers exist at the moment and at which point each fiber is suspended.
    Theoretically we should be able to see current fibers list if we build debug boost.fibers and look up the stack, but I didn't try this.

@gumb0 gumb0 requested a review from chfast January 30, 2018 11:04
eth/main.cpp Outdated
@@ -342,6 +342,7 @@ int main(int argc, char** argv)
("only", po::value<string>()->value_name("<n>"), "Equivalent to --export-from n --export-to n.")
("format", po::value<string>()->value_name("<binary/hex/human>"), "Set export format.")
("dont-check", "Prevent checking some block aspects. Faster importing, but to apply only when the data is known to be valid.")
("download-snapshot", po::value<string>()->value_name("<path>"), "Download Parity Warp Sync snapshot data to the specified path.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you provide the storage like this: po::value(&snapshotPath). No need for if (vm.count("download-snapshot")) then.

/// Trivial forwarding constructor.
EthashClient(ChainParams const& _params, int _networkID, p2p::Host* _host,
std::shared_ptr<GasPricer> _gpForAdoption,
boost::filesystem::path const& _dbPath = boost::filesystem::path(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorter: = {}.

@gumb0
Copy link
Member Author

gumb0 commented Jan 31, 2018

Issues fixed, I'll do the changes around reformatting program_options-setup code in a separate PR

@chfast
Copy link
Member

chfast commented Jan 31, 2018

Let's ship it.

@chfast
Copy link
Member

chfast commented Jan 31, 2018

:shipit:

@gumb0 gumb0 merged commit 8a0f6ad into develop Jan 31, 2018
@gumb0 gumb0 deleted the snapshot-download-fibers branch January 31, 2018 14:00
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants