
Warp Sync: Alternative snapshot formats #8565

Closed
rphmeier opened this issue May 8, 2018 · 10 comments
Labels
F7-optimisation 💊 An enhancement to provide better overall performance in terms of time-to-completion for a task. M4-core ⛓ Core client code / Rust.

Comments

@rphmeier
Contributor

rphmeier commented May 8, 2018

We want to investigate new snapshot formats that improve on the current one in the following respects:

  • Generation of (partial) snapshots by more nodes
  • Reusability of chunks between snapshots, so they are easier to find on the network
  • Trustworthiness of snapshot data

Snapshot chunks are currently divided into two categories:

  • state chunks encode the entire account state of the blockchain at some block
  • security chunks provide corroboration for whether that block is valid without syncing the whole chain

With respect to different consensus engines, the "security" chunks will look completely different but will usually contain reusable data. For example, validator-set-based consensus systems prove security with a series of bootstrapped handoffs (as long as we relax the weak-subjectivity model to assume that old validator sets remain uncorrupted). All finalized handoffs can be reused, although their proofs are usually small enough that all the handoffs fit in a single chunk. Depending on state churn and snapshot distance, we may also be able to reuse some state chunks.
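A minimal sketch of how such a handoff chain might be checked, assuming hypothetical ValidatorId and Signature types and a has_supermajority helper (none of these names are from the actual codebase):

struct Handoff {
    // Validator set that takes over at this handoff.
    new_validators: Vec<ValidatorId>,
    // Finality proof: signatures from the previous validator set.
    signatures: Vec<Signature>,
}

fn verify_handoffs(genesis_validators: &[ValidatorId], handoffs: &[Handoff]) -> bool {
    let mut current = genesis_validators.to_vec();
    for handoff in handoffs {
        // Each handoff must be finalized by the set it hands off *from*;
        // relaxed weak subjectivity assumes that old set stays uncorrupted.
        if !has_supermajority(&current, &handoff.signatures) {
            return false;
        }
        current = handoff.new_validators.clone();
    }
    true
}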

One possibility is a keystone/delta model: intermittent "full" state snapshots every N*K blocks, with the snapshots between them (every K blocks) storing only deltas over the last keystone.
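A minimal sketch of that schedule, with N and K as placeholder parameters (concrete values are not fixed in this issue):

const K: u64 = 5_000; // blocks between consecutive snapshots (placeholder)
const N: u64 = 10;    // every N-th snapshot is a full "keystone" (placeholder)

enum SnapshotKind {
    // Full state snapshot.
    Keystone,
    // Only the state changes since the most recent keystone.
    Delta { keystone_block: u64 },
}

fn snapshot_kind(block: u64) -> Option<SnapshotKind> {
    if block % K != 0 {
        None // no snapshot at this height
    } else if block % (N * K) == 0 {
        Some(SnapshotKind::Keystone)
    } else {
        // Deltas reference the keystone at the previous N*K boundary.
        Some(SnapshotKind::Delta { keystone_block: (block / (N * K)) * (N * K) })
    }
}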

One major problem with the current snapshot system is that producing a snapshot is too heavy for most nodes to finish before they prune the state it encodes from their database. State chunks are currently packed so tightly that it is impossible to determine which account a chunk starts at, or the exact data of that account entry, without having produced all the chunks before it. One possibility is to design a predictable scheme for chunk boundaries that would allow nodes to produce some of the state chunks but not all; a sketch of one such scheme follows.
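As an illustration of what "predictable" could mean here (an assumption, not a settled design), chunk groups could be keyed by a fixed-length prefix of the hashed account address, so any node can tell which group an account falls into without producing earlier chunks:

// With 256 groups, the first byte of the hashed address selects the group.
fn chunk_group(address_hash: &[u8; 32]) -> u8 {
    address_hash[0]
}

// A node that has chosen to cover only some groups can check cheaply
// whether a given account is its responsibility.
fn covers(account_hash: &[u8; 32], my_groups: &[u8]) -> bool {
    my_groups.contains(&chunk_group(account_hash))
}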

We can augment this scheme with random sampling: nodes which don't produce full snapshots will randomly sample some accounts and produce account entries for them, which they will keep on disk. They will refuse to propagate any snapshot where their random sample doesn't match the data in the snapshot. Assuming all repropagating nodes have their own random sample and a sufficiently large network, this makes it very unlikely for bad snapshot data to make its way through to unsynced nodes.
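A hedged sketch of that check; the SampledAccount type and the Snapshot trait with its entry_for lookup are hypothetical stand-ins for whatever the snapshot format actually exposes:

trait Snapshot {
    // Look up the RLP-encoded account entry for a hashed address.
    fn entry_for(&self, address_hash: &[u8; 32]) -> Option<Vec<u8>>;
}

struct SampledAccount {
    address_hash: [u8; 32],
    entry_rlp: Vec<u8>, // the account entry recorded when the sample was taken
}

// Returns true only if every sampled account matches the snapshot's data;
// a node refuses to propagate any snapshot for which this is false.
fn sample_matches(samples: &[SampledAccount], snapshot: &impl Snapshot) -> bool {
    samples.iter().all(|sample| {
        snapshot
            .entry_for(&sample.address_hash)
            .map_or(false, |entry| entry == sample.entry_rlp)
    })
}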

cc @ngotchac, @Vurich, @ordian

@rphmeier rphmeier added F7-footprint 🐾 An enhancement to provide a smaller (system load, memory, network or disk) footprint. F7-optimisation 💊 An enhancement to provide better overall performance in terms of time-to-completion for a task. M4-core ⛓ Core client code / Rust. and removed F7-footprint 🐾 An enhancement to provide a smaller (system load, memory, network or disk) footprint. labels May 8, 2018
@Tbaut Tbaut added this to the 1.12 milestone May 8, 2018
@rphmeier
Contributor Author

rphmeier commented May 8, 2018

@ngotchac and I had a conversation about more predictable state chunk starting points and came to this proposal:

We get rid of duplicate-code optimizations where contract code shared between accounts is only present once (this may not affect size significantly for a few reasons).

We follow the same chunk-splitting algorithm as normal, but also enforce that groups of chunks only cover certain ranges of addresses. For example, if we want to have sets of state chunks each covering 1/256th of accounts, we would have a chunk starting with the first account after 0x00000... and a chunk ending with the last account before 0x01000.... We will decide how large the ranges should be based on the number of accounts in the state and how many accounts a node can reasonably fetch data for. Since it's not fast to determine how many accounts there are, we use the following heuristic:

const N_SAMPLES: u64 = 1024;

let mut seed = snapshot_block.hash;
let mut total_depth = 0u64;
for _ in 0..N_SAMPLES {
    seed = keccak256(seed);
    // Take a random path down the account trie using `seed` until a leaf
    // is reached (exact walk TBD; `random_walk_depth` is a placeholder),
    // and count the depth of the terminus. The root has depth 0.
    let d = random_walk_depth(&account_trie, &seed);
    total_depth += d;
}

let avg_depth = total_depth / N_SAMPLES;

// Treat the trie as a fully packed 16-ary tree of the average depth.
return 16u64.pow(avg_depth as u32);

Then, assuming each node can process K accounts before its pruning history runs out, we can assign the range size based on the number of accounts and our assumption of K. Nodes will produce as many K-account "sub-snapshots" as possible, randomly selecting ranges, before the state of the snapshot block is pruned.
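For instance, the range count might be derived like this (a sketch; the rounding policy is an assumption):

// Choose how many address ranges to use so each holds roughly K accounts.
fn range_count(estimated_accounts: u64, k: u64) -> u64 {
    let needed = (estimated_accounts + k - 1) / k; // ceiling division
    // Round up to a power of two so ranges align with address-bit prefixes.
    needed.next_power_of_two()
}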

Notes on the account-counting heuristic:

  • Taking a random path down the trie is possible by taking 4 bits from the seed every time we encounter a branch; but if the branch isn't fully packed (very possible), we will need to decide how to interpret those bits (see the sketches after this list).
  • To avoid consistently overestimating to the next whole power of 16 accounts, we should find some way to deterministically compute non-integer powers of 16 from the avg_depth.
  • When counting the number of accounts, what we really care about is the branching factor.
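Possible sketches for the first two notes (hypothetical helpers, not a settled design):

// Note 1: at a partially packed branch, map the 4 seed bits onto the
// children that actually exist. `present` lists the occupied slot indices.
fn pick_child(seed_nibble: u8, present: &[usize]) -> usize {
    present[seed_nibble as usize % present.len()]
}

// Note 2: keep the average depth fractional and take a floating-point
// power, instead of rounding up to the next whole power of 16.
fn estimate_accounts(total_depth: u64, n_samples: u64) -> u64 {
    let avg_depth = total_depth as f64 / n_samples as f64;
    16f64.powf(avg_depth) as u64
}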

@tomusdrw
Collaborator

tomusdrw commented May 8, 2018

We will decide how large the ranges should be based on the number of accounts in the state and how many accounts a node can reasonably fetch data for.

That will only work given a uniform distribution of accounts; it seems it could be easily attacked by creating a bunch of addresses within one particular chunk range.
That said, if we can handle over-sized chunks without any significant performance implications, we should be fine.

@rphmeier
Contributor Author

rphmeier commented May 8, 2018

Another algorithm for account-counting: do N_SAMPLES random walks down the account trie and build branching_factors: [usize; 32], where the i'th entry represents the total number of children observed at branches with depth i.
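A sketch of accumulating those counts during the walks, with walk and Node::Branch as hypothetical stand-ins for the trie traversal:

let mut branching_factors = [0usize; 32];
let mut seed = snapshot_block.hash;
for _ in 0..N_SAMPLES {
    seed = keccak256(seed);
    for (depth, node) in walk(&account_trie, &seed).enumerate() {
        if let Node::Branch(children) = node {
            // Record how many children this branch has at this depth.
            branching_factors[depth] += children.len();
        }
    }
}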

Calculate the approximate number of accounts with:

branching_factors.iter().fold(1u64, |acc, &total_children| {
    // Average number of children observed at this depth.
    let avg_children = total_children as u64 / N_SAMPLES;
    // Depths beyond the deepest sampled branch average to zero;
    // treat them as a factor of 1 so they don't zero the product.
    acc * avg_children.max(1)
})

@tomusdrw

It's definitely an attack vector to some degree, but it won't cause oversized chunks, just more chunks covering that range (and perhaps the nodes attempting to produce that range will not finish in time). But if the number of accounts per range is an underestimate, it's very likely that some nodes will manage to complete the chunks covering that range regardless, so it's still an improvement over the current system. At the very least, they can still propagate the chunks they did manage to produce, and it will be just the ones at the very end of the range that are harder to find on the network.

@trustfarm-dev

trustfarm-dev commented Jul 6, 2018

@rphmeier @Tbaut For fast search of state_root, how about adding an index table mapping every 1,000th or 10,000th block height to its state_root? It would help offset-based block and state search.

I've modified a simple offset-based export function. It gives a small time decrease, but if an index were available for search, it would be much faster and would distribute the work. reference commit

It could also be applied to building the snapshot DB.
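A minimal sketch of the suggested index, assuming a simple in-memory map (a real client would persist this in its database):

use std::collections::BTreeMap;

const INDEX_INTERVAL: u64 = 10_000;

// Maps every INDEX_INTERVAL-th block height to its state root, so a
// lookup can seek near the target height instead of scanning headers.
type StateRootIndex = BTreeMap<u64, [u8; 32]>;

fn maybe_index(index: &mut StateRootIndex, height: u64, state_root: [u8; 32]) {
    if height % INDEX_INTERVAL == 0 {
        index.insert(height, state_root);
    }
}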

@5chdn 5chdn modified the milestones: 2.0, 2.1 Jul 17, 2018
@5chdn 5chdn modified the milestones: 2.1, 2.2 Sep 11, 2018
@5chdn
Contributor

5chdn commented Sep 25, 2018

Let's move this to the forum.

@5chdn 5chdn closed this as completed Sep 25, 2018
@gumb0

gumb0 commented Sep 26, 2018

@5chdn Where's the forum about this?

@5chdn
Contributor

5chdn commented Sep 26, 2018

forum.parity.io, are you interested in following this topic?

@gumb0

gumb0 commented Sep 26, 2018

I might be, but I can't sign up with any of my emails. Is it a closed forum?

@trustfarm-dev

@5chdn there's an error signing up at forum.parity.io: a primary-email error with Google accounts. After filling everything in, I get incorrect username/password/email errors. Please check the signup flow.

@5chdn
Contributor

5chdn commented Sep 27, 2018

Let me discuss this.
