
Warp Sync: Alternative snapshot formats #8565

Closed
rphmeier opened this issue May 8, 2018 · 10 comments
Labels
F7-optimisation 💊 An enhancement to provide better overall performance in terms of time-to-completion for a task. M4-core ⛓ Core client code / Rust.

Comments

@rphmeier
Contributor

rphmeier commented May 8, 2018

We want to investigate new snapshot formats that improve on the current one in the following respects:

  • Generation of (partial) snapshots by more nodes
  • Reusability of chunks between snapshots, so they are easier to find on the network
  • Trustworthiness of snapshot data

Snapshot chunks are currently divided into two categories:

  • state chunks encode the entire account state of the blockchain at some block
  • security chunks provide corroboration for whether that block is valid without syncing the whole chain

With respect to different consensus engines, the "security" chunks will look completely different but will usually contain reusable data. For example, validator-set-based consensus systems prove security with a series of bootstrapped handoffs (as long as we relax the weak-subjectivity model to assume that old validator sets remain uncorrupted). All finalized handoffs can be reused, although their proofs are usually small enough that all the handoffs fit in a single chunk. Depending on state churn and snapshot distance, we may also be able to reuse some state chunks.
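A minimal sketch of how such a handoff chain might be checked, assuming hypothetical ValidatorId and Signature types and a has_supermajority helper (none of these names are from the actual codebase):

struct Handoff {
    // Validator set that takes over at this handoff.
    new_validators: Vec<ValidatorId>,
    // Finality proof: signatures from the previous validator set.
    signatures: Vec<Signature>,
}

fn verify_handoffs(genesis_validators: &[ValidatorId], handoffs: &[Handoff]) -> bool {
    let mut current = genesis_validators.to_vec();
    for handoff in handoffs {
        // Each handoff must be finalized by the set it hands off *from*;
        // relaxed weak subjectivity assumes that old set stays uncorrupted.
        if !has_supermajority(&current, &handoff.signatures) {
            return false;
        }
        current = handoff.new_validators.clone();
    }
    true
}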

One possibility is a keystone/delta model: intermittent "full" state snapshots every N*K blocks, with the snapshots between them (every K blocks) storing only deltas over the last keystone.
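A minimal sketch of that schedule, with N and K as placeholder parameters (concrete values are not fixed in this issue):

const K: u64 = 5_000; // blocks between consecutive snapshots (placeholder)
const N: u64 = 10;    // every N-th snapshot is a full "keystone" (placeholder)

enum SnapshotKind {
    // Full state snapshot.
    Keystone,
    // Only the state changes since the most recent keystone.
    Delta { keystone_block: u64 },
}

fn snapshot_kind(block: u64) -> Option<SnapshotKind> {
    if block % K != 0 {
        None // no snapshot at this height
    } else if block % (N * K) == 0 {
        Some(SnapshotKind::Keystone)
    } else {
        // Deltas reference the keystone at the previous N*K boundary.
        Some(SnapshotKind::Delta { keystone_block: (block / (N * K)) * (N * K) })
    }
}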

One major problem with the current snapshot system is that producing a snapshot is too heavy for most nodes to finish before they prune the state it encodes from their database. State chunks are currently packed so tightly that it is impossible to determine which account a chunk starts at, or the exact data of that account entry, without having produced all the chunks before it. One possibility is to design a predictable scheme for chunk boundaries that would allow nodes to produce some of the state chunks but not all; a sketch of one such scheme follows.
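As an illustration of what "predictable" could mean here (an assumption, not a settled design), chunk groups could be keyed by a fixed-length prefix of the hashed account address, so any node can tell which group an account falls into without producing earlier chunks:

// With 256 groups, the first byte of the hashed address selects the group.
fn chunk_group(address_hash: &[u8; 32]) -> u8 {
    address_hash[0]
}

// A node that has chosen to cover only some groups can check cheaply
// whether a given account is its responsibility.
fn covers(account_hash: &[u8; 32], my_groups: &[u8]) -> bool {
    my_groups.contains(&chunk_group(account_hash))
}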

We can augment this scheme with random sampling: nodes which don't produce full snapshots will randomly sample some accounts and produce account entries for them, which they will keep on disk. They will refuse to propagate any snapshot where their random sample doesn't match the data in the snapshot. Assuming all repropagating nodes have their own random sample and a sufficiently large network, this makes it very unlikely for bad snapshot data to make its way through to unsynced nodes.
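A hedged sketch of that check; the SampledAccount type and the Snapshot trait with its entry_for lookup are hypothetical stand-ins for whatever the snapshot format actually exposes:

trait Snapshot {
    // Look up the RLP-encoded account entry for a hashed address.
    fn entry_for(&self, address_hash: &[u8; 32]) -> Option<Vec<u8>>;
}

struct SampledAccount {
    address_hash: [u8; 32],
    entry_rlp: Vec<u8>, // the account entry recorded when the sample was taken
}

// Returns true only if every sampled account matches the snapshot's data;
// a node refuses to propagate any snapshot for which this is false.
fn sample_matches(samples: &[SampledAccount], snapshot: &impl Snapshot) -> bool {
    samples.iter().all(|sample| {
        snapshot
            .entry_for(&sample.address_hash)
            .map_or(false, |entry| entry == sample.entry_rlp)
    })
}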

cc @ngotchac, @Vurich, @ordian

@rphmeier rphmeier added F7-footprint 🐾 An enhancement to provide a smaller (system load, memory, network or disk) footprint. F7-optimisation 💊 An enhancement to provide better overall performance in terms of time-to-completion for a task. M4-core ⛓ Core client code / Rust. and removed F7-footprint 🐾 An enhancement to provide a smaller (system load, memory, network or disk) footprint. labels May 8, 2018
@Tbaut Tbaut added this to the 1.12 milestone May 8, 2018
@rphmeier
Contributor Author

rphmeier commented May 8, 2018

@ngotchac and I had a conversation about more predictable state chunk starting points and came to this proposal:

We get rid of duplicate-code optimizations where contract code shared between accounts is only present once (this may not affect size significantly for a few reasons).

We follow the same chunk-splitting algorithm as normal, but also enforce that groups of chunks only cover certain ranges of addresses. For example, if we want to have sets of state chunks each covering 1/256th of accounts, we would have a chunk starting with the first account after 0x00000... and a chunk ending with the last account before 0x01000.... We will decide how large the ranges should be based on the number of accounts in the state and how many accounts a node can reasonably fetch data for. Since it's not fast to determine how many accounts there are, we use the following heuristic:

const N_SAMPLES: u64 = 1024;

let mut seed = snapshot_block.hash;
let mut total_depth = 0u64;
for _ in 0..N_SAMPLES {
    seed = keccak256(seed);
    // Take a random path down the account trie using `seed` until a leaf
    // is reached (exact walk TBD; `random_walk_depth` is a placeholder),
    // and count the depth of the terminus. The root has depth 0.
    let d = random_walk_depth(&account_trie, &seed);
    total_depth += d;
}

let avg_depth = total_depth / N_SAMPLES;

// Treat the trie as a fully packed 16-ary tree of the average depth.
return 16u64.pow(avg_depth as u32);

Then, assuming each node can process K accounts before its pruning history runs out, we can assign the range size based on the number of accounts and our assumption of K. Nodes will produce as many K-account "sub-snapshots" as possible, randomly selecting ranges, before the state of the snapshot block is pruned.
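For instance, the range count might be derived like this (a sketch; the rounding policy is an assumption):

// Choose how many address ranges to use so each holds roughly K accounts.
fn range_count(estimated_accounts: u64, k: u64) -> u64 {
    let needed = (estimated_accounts + k - 1) / k; // ceiling division
    // Round up to a power of two so ranges align with address-bit prefixes.
    needed.next_power_of_two()
}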

Notes on the account-counting heuristic:

  • Taking a random path down the trie is possible by taking 4 bits from the seed every time we encounter a branch; but if the branch isn't fully packed (very possible), we will need to decide how to interpret those bits (see the sketches after this list).
  • To avoid consistently overestimating to the next whole power of 16 accounts, we should find some way to deterministically compute non-integer powers of 16 from the avg_depth.
  • When counting the number of accounts, what we really care about is the branching factor.
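Possible sketches for the first two notes (hypothetical helpers, not a settled design):

// Note 1: at a partially packed branch, map the 4 seed bits onto the
// children that actually exist. `present` lists the occupied slot indices.
fn pick_child(seed_nibble: u8, present: &[usize]) -> usize {
    present[seed_nibble as usize % present.len()]
}

// Note 2: keep the average depth fractional and take a floating-point
// power, instead of rounding up to the next whole power of 16.
fn estimate_accounts(total_depth: u64, n_samples: u64) -> u64 {
    let avg_depth = total_depth as f64 / n_samples as f64;
    16f64.powf(avg_depth) as u64
}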

@tomusdrw
Collaborator

tomusdrw commented May 8, 2018

We will decide how large the ranges should be based on the number of accounts in the state and how many accounts a node can reasonably fetch data for.

That will only work given a uniform distribution of accounts; it seems it could be easily attacked by creating a bunch of addresses within one particular chunk range.
That said, if we can handle over-sized chunks without any significant performance implications, we should be fine.

@rphmeier
Contributor Author

rphmeier commented May 8, 2018

Another algorithm for account-counting: do N_SAMPLES random walks down the account trie and build branching_factors: [usize; 32], where the i'th entry represents the total number of children observed at branches with depth i.
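A sketch of accumulating those counts during the walks, with walk and Node::Branch as hypothetical stand-ins for the trie traversal:

let mut branching_factors = [0usize; 32];
let mut seed = snapshot_block.hash;
for _ in 0..N_SAMPLES {
    seed = keccak256(seed);
    for (depth, node) in walk(&account_trie, &seed).enumerate() {
        if let Node::Branch(children) = node {
            // Record how many children this branch has at this depth.
            branching_factors[depth] += children.len();
        }
    }
}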

Calculate the approximate number of accounts with:

branching_factors.iter().fold(1u64, |acc, &total_children| {
    // Average number of children observed at this depth.
    let avg_children = total_children as u64 / N_SAMPLES;
    // Depths beyond the deepest sampled branch average to zero;
    // treat them as a factor of 1 so they don't zero the product.
    acc * avg_children.max(1)
})

@tomusdrw

It's definitely an attack vector to some degree, but it won't cause oversized chunks, just more chunks covering that range (and perhaps the nodes attempting to produce that range will not finish in time). But if the number of accounts per range is an underestimate, it's very likely that some nodes will manage to complete the chunks covering that range regardless, so it's still an improvement over the current system. At the very least, they can still propagate the chunks they did manage to produce, and it will be just the ones at the very end of the range that are harder to find on the network.

@trustfarm-dev

trustfarm-dev commented Jul 6, 2018

@rphmeier @Tbaut For fast search of state_root, how about adding an index table mapping every 1,000th or 10,000th block height to its state_root? It would help offset-based block and state search.

I've modified a simple offset-based export function. It gives a small time decrease, but if an index were available for search, it would be much faster and would distribute the work. reference commit

It could also be applied to building the snapshot DB.
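A minimal sketch of the suggested index, assuming a simple in-memory map (a real client would persist this in its database):

use std::collections::BTreeMap;

const INDEX_INTERVAL: u64 = 10_000;

// Maps every INDEX_INTERVAL-th block height to its state root, so a
// lookup can seek near the target height instead of scanning headers.
type StateRootIndex = BTreeMap<u64, [u8; 32]>;

fn maybe_index(index: &mut StateRootIndex, height: u64, state_root: [u8; 32]) {
    if height % INDEX_INTERVAL == 0 {
        index.insert(height, state_root);
    }
}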

@5chdn 5chdn modified the milestones: 2.0, 2.1 Jul 17, 2018
@5chdn 5chdn modified the milestones: 2.1, 2.2 Sep 11, 2018
@5chdn
Contributor

5chdn commented Sep 25, 2018

Let's move this to the forum.

@5chdn 5chdn closed this as completed Sep 25, 2018
@gumb0

gumb0 commented Sep 26, 2018

@5chdn Where's the forum about this?

@5chdn
Contributor

5chdn commented Sep 26, 2018

forum.parity.io, are you interested in following this topic?

@gumb0

gumb0 commented Sep 26, 2018

I might be, but I can't sign up with any of my emails. Is it a closed forum?

@trustfarm-dev

@5chdn there's an error signing up at forum.parity.io: a primary-email error with Google accounts. After filling everything in, I get incorrect username/password/email errors. Please check the signup flow.

@5chdn
Contributor

5chdn commented Sep 27, 2018

Let me discuss this.
