Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vote Graph and GHOST-hunting #1

Merged
merged 7 commits into from
Aug 14, 2018
Merged

Vote Graph and GHOST-hunting #1

merged 7 commits into from
Aug 14, 2018

Conversation

rphmeier
Copy link
Contributor

@rphmeier rphmeier commented Aug 8, 2018

This PR contains the logic for imposing a further DAG on top of the blockchain, where nodes are blocks which have been voted on directly.

Cumulative votes (flowing back to the "base" -- in practice, the last finalized block) are kept updated, and a "GHOST" calculation can be done on this DAG to find the highest block referenced in a node or edge of the DAG that fulfills some vote-predicate. This block may fall between some vote-nodes.

The GHOST logic, as it stands right now, operates under the assumption that the predicate will evaluate to true for at most one branch of any fork. Full GHOST is more expensive to implement as we have to compare the cumulative scores of all children and can't early-exit upon finding the first child which satisfies the predicate.

Copy link
Contributor

@gnunicorn gnunicorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor changes, then feel free to merge it.

src/lib.rs Outdated
// Copyright 2018 Parity Technologies (UK) Ltd.
// This file is part of finality-afg.

// Polkadot is free software: you can redistribute it and/or modify
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be finality-afg rather than Polkadot?

src/lib.rs Outdated
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.

// Polkadot is distributed in the hope that it will be useful,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here, too.

match self.find_containing_nodes(hash.clone(), number) {
Some(containing) => if containing.is_empty() {
self.append(hash.clone(), number, chain)?;
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be indented one level more rather than be on the level as the match-arms?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so but open for discussion

// given a key, node pair (which must correspond), assuming this node fulfills the condition,
// this function will find the highest point at which its descendents merge, which may be the
// node itself.
fn ghost_find_merge_point<'a, F>(&'a self, mut node_key: H, mut active_node: &'a Entry<H, V>, condition: F) -> (H, usize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please wrap.

Ok(idx) => {
descendent_blocks[idx].1 += d_node.cumulative_vote.clone();
if condition(&descendent_blocks[idx].1) {
new_best = Some(d_block.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uui.... 7 indentation levels... I can still follow, but it is a bit against the style guide ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, it's not ideal but I couldn't find a way to rewrite.

@andresilva
Copy link
Contributor

I'm still reviewing this, please don't merge it yet.

Copy link
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions and comments to help me understand the code.

/// Maintains a DAG of blocks in the chain which have votes attached to them,
/// and vote data which is accumulated along edges.
pub struct VoteGraph<H: Hash + Eq, V> {
entries: HashMap<H, Entry<H, V>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We assume that the same hash won't be proposed with different block numbers. I think it's safe to assume this since the block hash will usually depend on the block number, but just wanted to confirm my assumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true. I think it's fair to treat hashes as unique

maybe_entry
});

if let Some(new_entry) = produced_entry {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If descendents is not empty, then this must be Some.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed. but what am I supposed to do, panic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think it's fine as is, it was just a comment regarding my understanding.


let mut active_node = get_node(&node_key);

if !condition(&active_node.cumulative_vote) { return None }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We take the parent vote node of the current block descendants. If this block isn't finalized shouldn't we keep searching backwards for another GHOST block, potentially until we reach the base?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is defined to always return with a descendent of the supplied node. this question cuts to the crux of the issue mentioned in the PR description and comment on this function: that find_ghost isn't a proper GHOST but a weird "majority-GHOST" thing where you either have one descendent of a node that meets the condition or none. So we find the best block that meets the condition or none and we don't need to backtrack because of the majority condition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to note why the property of returning a descendent is useful: this function is needed for the vote-casting logic, where we want to cast votes either after some time has passed or as soon as GHOST(S) == GHOST(S + Y) where Y could be any new valid votes to add to the vote set.

And as the document shows, the GHOST block never backtracks as we include more votes. So by bookkeeping the current GHOST(S) node we can easily check GHOST(S + Y) with another call to the same function. Although it can still be optimized a bit it should still be pretty fast.

}
}

// active_node and node_key now correspond to the head node which has enough cumulative votes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correspond to the highest vote node with enough cumulative votes?

for offset in 1usize.. {
let mut new_best = None;
for d_node in descendent_nodes.iter() {
if let Some(d_block) = d_node.ancestor_block(base_number + offset) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We return None if the block is past the parent vote-node of d_node. It's ok to ignore it since the parent vote-node of d_node should eventually be reached (if needed) as the offset grows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the parent vote-node of d_node will always be the GHOST vote-node. ancestor_block will return None when we have reached the block corresponding to d_node, at which point we will want to prune anyway.


descendent_blocks.clear();
descendent_nodes.retain(
|n| n.in_direct_ancestry(&best_hash, best_number).unwrap_or(false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be unwrap_or(true)? We're removing nodes that are too high compared to the current best GHOST node, i.e. there are other vote nodes in the middle. It doesn't necessarily mean that they're not part of this branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get None back, then we've exhausted all the blocks in the "edge" between our GHOST vote-node and the descendent.

Since we're looking for an ancestor-block of the descendent vote-node's block which has enough cumulative votes, once we've reached this point we know that it's safe to remove the descendent node from our frontier, as we've checked all ancestors of the descendent vote-node already.

The other theoretical case (which cannot occur in this part of the code) where in_direct_ancestry returns None is where the block is earlier than the best GHOST vote-node. But we don't care about that case anyway since we're only looking for descendent blocks of that node.

@andresilva
Copy link
Contributor

Really clean code 👍 It's possible I misunderstood some parts of how this works or what invariants hold.

/// The GHOST (hash, number) returned will be the block with highest number for which the
/// cumulative votes of descendents and itself causes the closure to evaluate to true.
///
/// This assumes that the evaluation closure is one which returns true for at most a single
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This holds assuming an honest majority will follow the heaviest chain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it holds as long as nobody double-votes (or in the case of supermajority, less than 1/3 double-vote). But I intend to filter out all double-votes in a higher level that instantiates the graph in a follow-up PR, so the property should always hold in our use-case.

I would like more unit tests to ensure that my logic is sound here

Copy link
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the detailed explanations regarding my questions. I'll review the document again to further my understanding. 👍

@rphmeier rphmeier merged commit 39a73f4 into master Aug 14, 2018
@rphmeier rphmeier deleted the rh-vote-graph branch August 14, 2018 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants