This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Approval Distribution Subsystem #1951

Merged 19 commits on Nov 25, 2020. Changes shown from 16 commits.
2 changes: 1 addition & 1 deletion roadmap/implementers-guide/src/SUMMARY.md
@@ -46,7 +46,7 @@
- [Bitfield Signing](node/availability/bitfield-signing.md)
- [Approval Subsystems](node/approval/README.md)
- [Approval Voting](node/approval/approval-voting.md)
- - [Approval Networking](node/approval/approval-networking.md)
+ - [Approval Distribution](node/approval/approval-distribution.md)
- [Dispute Participation](node/approval/dispute-participation.md)
- [Utility Subsystems](node/utility/README.md)
- [Availability Store](node/utility/availability-store.md)
2 changes: 1 addition & 1 deletion roadmap/implementers-guide/src/node/approval/README.md
@@ -2,6 +2,6 @@

The approval subsystems implement the node-side of the [Approval Protocol](../../protocol-approval.md).

- We make a divide between the [assignment/voting logic](approval-voting.md) and the [networking](approval-networking.md) that distributes assignment certifications and approval votes. The logic in the assignment and voting also informs the GRANDPA voting rule on how to vote.
+ We make a divide between the [assignment/voting logic](approval-voting.md) and the [distribution logic](approval-distribution.md) that distributes assignment certifications and approval votes. The logic in the assignment and voting also informs the GRANDPA voting rule on how to vote.

This category of subsystems also contains a module for [participating in live disputes](dispute-participation.md) and tracks all observed votes (backing or approval) by all validators on all candidates.
193 changes: 193 additions & 0 deletions roadmap/implementers-guide/src/node/approval/approval-distribution.md
@@ -0,0 +1,193 @@
# Approval Distribution

A subsystem for the distribution of assignments and approvals for approval checks on candidates over the network.

The [Approval Voting](approval-voting.md) subsystem is responsible for active participation in a protocol designed to select a sufficient number of validators to check each and every candidate which appears in the relay chain. Statements of participation in this checking process are divided into two kinds:
- **Assignments** indicate that validators have been selected to do checking
- **Approvals** indicate that validators have checked and found the candidate satisfactory.

The [Approval Voting](approval-voting.md) subsystem handles all the issuing and tallying of this protocol, but this subsystem is responsible for the disbursal of statements among the validator-set.

The inclusion pipeline of candidates concludes after availability, and only after inclusion do candidates actually get pushed into the approval checking pipeline. As such, this protocol deals with the candidates _made available by_ particular blocks, as opposed to the candidates which actually appear within those blocks, which are the candidates _backed by_ those blocks. Unless stated otherwise, whenever we reference a candidate partially by block hash, we are referring to the set of candidates _made available by_ those blocks.

We implement this protocol as a gossip protocol, and like other parachain-related gossip protocols our primary concerns are about ensuring fast message propagation while maintaining an upper bound on the number of messages any given node must store at any time.

Approval messages should always follow assignments, so we need to be able to answer two questions based on our [View](../../types/network.md#universal-types):
1. Is a particular assignment relevant under a given `View`?
2. Is a particular approval relevant to any assignment in a set?

These two queries need not be perfect, but they must never yield false positives. For our own local view, they must not yield false negatives. When applied to our peers' views, it is acceptable for them to yield false negatives. The reason for that is that our peers' views may be beyond ours, and we are not capable of fully evaluating them. Once we have caught up, we can check again for false negatives to continue distributing.
> **Contributor:** suggested wording: "it is acceptable for our conclusions to yield false negatives"

For assignments, what we need to be checking is whether we are aware of the (block, candidate) pair that the assignment references. For approvals, we need to be aware of an assignment by the same validator which references the candidate being approved.

However, awareness on its own of a (block, candidate) pair would imply that even ancient candidates all the way back to the genesis are relevant. We are actually not interested in anything before finality.
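These relevance rules amount to set-membership checks. A minimal sketch, with simplified stand-in types (hashes as `u64`) rather than the real ones:

```rust
use std::collections::HashSet;

// Simplified stand-ins for the real types; hashes are plain integers here.
type Hash = u64;
type CandidateHash = u64;
type ValidatorIndex = u32;

/// An assignment is relevant only if we know the (block, candidate) pair it references.
fn assignment_is_relevant(
    known_pairs: &HashSet<(Hash, CandidateHash)>,
    block: Hash,
    candidate: CandidateHash,
) -> bool {
    known_pairs.contains(&(block, candidate))
}

/// An approval is relevant only if we have seen an assignment by the same
/// validator which references the candidate being approved.
fn approval_is_relevant(
    known_assignments: &HashSet<(Hash, CandidateHash, ValidatorIndex)>,
    block: Hash,
    candidate: CandidateHash,
    validator: ValidatorIndex,
) -> bool {
    known_assignments.contains(&(block, candidate, validator))
}
```

Both checks may yield false negatives when applied to a peer whose view is ahead of ours, which is exactly the tolerated behavior described above.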


## Protocol

## Functionality

```rust
type BlockScopedCandidate = (Hash, CandidateHash);

/// The `State` struct is responsible for tracking the overall state of the subsystem.
///
/// It tracks metadata about our view of the chain, which assignments and approvals we have seen, and our peers' views.
struct State {
    blocks_by_number: BTreeMap<BlockNumber, Vec<Hash>>,
    blocks: HashMap<Hash, BlockEntry>,
    peer_views: HashMap<PeerId, View>,
    finalized_number: BlockNumber,
}

enum MessageFingerprint {
    // The `u32` in each variant is the claimed candidate index within the block.
    Assignment(Hash, u32, ValidatorIndex),
    Approval(Hash, u32, ValidatorIndex),
}

struct Knowledge {
    known_messages: HashSet<MessageFingerprint>,
}

/// Information about blocks in our current view as well as whether peers know of them.
struct BlockEntry {
    // Peers who we know are aware of this block and thus, the candidates within it. This maps to their knowledge of messages.
    known_by: HashMap<PeerId, Knowledge>,
    // The number of the block.
    number: BlockNumber,
    // The parent hash of the block.
    parent_hash: Hash,
    // Our knowledge of messages.
    knowledge: Knowledge,
    // A votes entry for each candidate.
    candidates: IndexMap<CandidateHash, CandidateEntry>,
}

enum ApprovalState {
    Assigned(AssignmentCert),
    Approved(AssignmentCert, ApprovalSignature),
}

/// Information about candidates in the context of a particular block they are included in.
/// In other words, multiple `CandidateEntry`s may exist for the same candidate if it is
/// included by multiple blocks - this is likely the case when there are forks.
struct CandidateEntry {
    approvals: HashMap<ValidatorIndex, ApprovalState>,
}
```

### Network updates

#### `NetworkBridgeEvent::PeerConnected`

Add a blank view to the `peer_views` state.

#### `NetworkBridgeEvent::PeerDisconnected`

Remove the view under the associated `PeerId` from `State::peer_views`.

Iterate over every `BlockEntry` and remove `PeerId` from it.

#### `NetworkBridgeEvent::PeerViewChange`

Invoke `unify_with_peer(peer, view)` to catch them up to messages we have.

We also need to use the `view.finalized_number` to remove the `PeerId` from any blocks that it will no longer want information about. Note that we have to guard against peers abusing this, e.g. by jumping their `finalized_number` forward by trillions of blocks to trap us in a long-running loop.

One of the safeguards we can implement is to reject view updates from peers where the new `finalized_number` is less than the previous.

We augment that by defining `constrain(x)` to output `x` bounded by the first and last numbers in `state.blocks_by_number`.

From there, we can loop backwards from `constrain(view.finalized_number)` until `constrain(last_view.finalized_number)` is reached, removing the `PeerId` from all `BlockEntry`s referenced at that height. We can break the loop early if we ever exit the bound supplied by the first block in `state.blocks_by_number`.
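A minimal sketch of the `constrain` helper, assuming the simplified `blocks_by_number` map from the `State` struct above:

```rust
use std::collections::BTreeMap;

type BlockNumber = u32;
type Hash = u64; // simplified stand-in

/// Bound `x` by the first and last keys of `blocks_by_number`.
/// Returns `None` if we are tracking no blocks at all.
fn constrain(blocks_by_number: &BTreeMap<BlockNumber, Vec<Hash>>, x: BlockNumber) -> Option<BlockNumber> {
    let first = *blocks_by_number.keys().next()?;
    let last = *blocks_by_number.keys().next_back()?;
    Some(x.clamp(first, last))
}
```

Because both loop bounds pass through `constrain`, a peer claiming an absurd `finalized_number` cannot force us to iterate outside the heights we actually track.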

#### `NetworkBridgeEvent::OurViewChange`

Prune all lists from `blocks_by_number` with number less than or equal to `view.finalized_number`. Prune all the `BlockEntry`s referenced by those lists.
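This pruning maps naturally onto `BTreeMap::split_off`. A sketch, with the block entry type elided to `()` for brevity:

```rust
use std::collections::{BTreeMap, HashMap};

type BlockNumber = u32;
type Hash = u64; // simplified stand-in

/// Prune every list at or below the new finalized number, dropping the
/// block entries those lists reference as well.
fn prune_finalized(
    blocks_by_number: &mut BTreeMap<BlockNumber, Vec<Hash>>,
    blocks: &mut HashMap<Hash, ()>,
    finalized: BlockNumber,
) {
    // `split_off` returns everything at key `finalized + 1` and above,
    // leaving the pruned (finalized) portion behind in `blocks_by_number`.
    let retained = blocks_by_number.split_off(&(finalized + 1));
    for (_, hashes) in std::mem::replace(blocks_by_number, retained) {
        for h in hashes {
            blocks.remove(&h);
        }
    }
}
```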

#### `NetworkBridgeEvent::PeerMessage`

If the message is of type `ApprovalDistributionV1Message::Assignment(assignment_cert, claimed_index)`, then call `import_and_circulate_assignment(MessageSource::Peer(sender), assignment_cert, claimed_index)`.

If the message is of type `ApprovalDistributionV1Message::Approval(approval_vote)`, then call `import_and_circulate_approval(MessageSource::Peer(sender), approval_vote)`.

### Subsystem Updates

#### `ApprovalDistributionMessage::NewBlocks`

Create `BlockEntry` and `CandidateEntries` for all blocks.

For all peers:
* Compute `view_intersection` as the intersection of the peer's view blocks with the hashes of the new blocks.
* Invoke `unify_with_peer(peer, view_intersection)`.
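The per-peer intersection step can be sketched as follows, with views simplified to plain hash sets (the real `View` type carries more data):

```rust
use std::collections::{HashMap, HashSet};

type Hash = u64; // simplified stand-in
type PeerId = u32; // simplified stand-in

/// For each peer, the subset of its view overlapping the newly imported
/// blocks; each intersection is what gets passed to `unify_with_peer`.
fn view_intersections(
    peer_views: &HashMap<PeerId, HashSet<Hash>>,
    new_blocks: &HashSet<Hash>,
) -> HashMap<PeerId, HashSet<Hash>> {
    peer_views
        .iter()
        .map(|(peer, view)| (*peer, view.intersection(new_blocks).copied().collect()))
        .collect()
}
```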

#### `ApprovalDistributionMessage::DistributeAssignment`

Load the corresponding `BlockEntry`. Distribute to all peers in `known_by`. Add to the corresponding `CandidateEntry`.

#### `ApprovalDistributionMessage::DistributeApproval`

Load the corresponding `BlockEntry`. Distribute to all peers in `known_by`. Add to the corresponding `CandidateEntry`.
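Both `Distribute*` handlers share the same shape. A sketch with hypothetical simplified types (votes entries reduced to strings, peers to integers):

```rust
use std::collections::{HashMap, HashSet};

type Hash = u64; // simplified stand-in
type PeerId = u32; // simplified stand-in
type CandidateHash = u64; // simplified stand-in

struct BlockEntry {
    known_by: HashSet<PeerId>,
    candidates: HashMap<CandidateHash, Vec<String>>, // simplified votes entries
}

/// Load the block entry, queue the message for every peer that knows the
/// block, and record it under the corresponding candidate entry.
/// Returns the peers the message would be sent to, in sorted order.
fn distribute(
    blocks: &mut HashMap<Hash, BlockEntry>,
    block: Hash,
    candidate: CandidateHash,
    message: &str,
) -> Vec<PeerId> {
    let entry = match blocks.get_mut(&block) {
        Some(e) => e,
        None => return Vec::new(),
    };
    entry
        .candidates
        .entry(candidate)
        .or_default()
        .push(message.to_string());
    let mut peers: Vec<PeerId> = entry.known_by.iter().copied().collect();
    peers.sort();
    peers
}
```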

### Utility

```rust
enum MessageSource {
Peer(PeerId),
Local,
}
```

#### `import_and_circulate_assignment(source: MessageSource, assignment: IndirectAssignmentCert, claimed_candidate_index: u32)`

Imports an assignment cert referenced by block hash and candidate index. As a postcondition, if the cert is valid, it will have distributed the cert to all peers who have the block in their view, with the exclusion of the peer referenced by the `MessageSource`.

* Load the `BlockEntry` using `assignment.block_hash`. If it does not exist, report the source if it is `MessageSource::Peer` and return.
* Compute a fingerprint for the `assignment` using `claimed_candidate_index`.
* If the source is `MessageSource::Peer(sender)`:
* check if `sender` appears under `known_by` and whether the fingerprint is in the `known_messages` of the peer. If the peer does not know the block, report it for providing data out-of-view and proceed. If the peer does know the block and its knowledge contains the fingerprint, report it for providing duplicate data and return.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost and return. Note that we must do this after checking for out-of-view to avoid being spammed. If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* Dispatch `ApprovalVotingMessage::CheckAndImportAssignment(assignment)` and wait for the response.
* If the result is `AssignmentCheckResult::Accepted(candidate_index)` or `AssignmentCheckResult::AcceptedDuplicate(candidate_index)`
* If `candidate_index` does not match `claimed_candidate_index`, punish the peer's reputation, recompute the fingerprint, and re-do our knowledge checks. The goal here is to accept messages which peers send us that are labeled wrongly, but punish them for it as they've made us do extra work.
> **Contributor:** "labeled wrongly"? Are these indexes the vacated core numbers? If so, we could always include the index inside the assignment, even when doing a modulo VRF criteria for tranche zero, so it could be read before proceeding, which makes the assignment outright invalid if the index is wrong. Is that helpful?
>
> **Contributor:** Actually I'm unsure where `claimed_candidate_index` originates. I suppose the peer sends us data it believes important, but why does it need to tell us why the data matters? We either agree or not, yes?
>
> **Author:** Yeah, I think including that data in the VRF would make this point redundant. At the moment, the vacated core index isn't part of the VRF, and we have peers send us the vacated core index alongside the assignment, which makes it easier for us to do some politeness checks. So the conundrum this is solving is what we should do if a peer sends us something labeled with the wrong core index but that is still a valid assignment for some other core. And my answer was to re-label it, punish the peer, and distribute. Your general advice so far has been to keep as much data as possible out of the VRF, so it didn't cross my mind that was an option.
>
> **Contributor (@burdges, Nov 22, 2020):** Our VRF signatures have an input that corresponds to the output, which yes we minimize, but the same VRF also signs an extra auxiliary message, which influences only its proof, not its output. We can pack anything we like into this extra auxiliary message, like whatever claims simplify the code, or even the entire gossip message containing the VRF (`IndirectAssignmentCert`?). In fact, including the whole gossip message saves a signature that costs at least 45ms and 64 bytes per assignment message, so I slanted the draft code I wrote in this direction.
>
> **Author (@rphmeier, Nov 25, 2020):** I see. So it is possible to include the core index under the proof, even if that core index was unknown prior to the output being generated? That does indeed save us a ton of trouble. What's the schnorrkel API for that?

* Otherwise, if the vote was accepted but not duplicate, give the peer a positive reputation boost
* add the fingerprint to both our and the peer's knowledge in the `BlockEntry`. Note that we only do this after making sure we have the right fingerprint.
* If the result is `AssignmentCheckResult::TooFarInFuture`, mildly punish the peer and return.
* If the result is `AssignmentCheckResult::Bad`, punish the peer and return.
* If the source is `MessageSource::Local`:
* check if the fingerprint appears under the `BlockEntry's` knowledge. If not, add it.
* Load the candidate entry for the given candidate index. It should exist unless there is a logic error in the approval voting subsystem.
* Set the approval state for the validator index to `ApprovalState::Assigned` unless the approval state is set already. This should not happen as long as the approval voting subsystem instructs us to ignore duplicate assignments.
* Dispatch an `ApprovalDistributionV1Message::Assignment(assignment, candidate_index)` to all peers in the `BlockEntry`'s `known_by` set, excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add the fingerprint of the assignment to the knowledge of each peer.
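The politeness checks on peer-sourced messages, which apply identically to assignments and approvals, can be sketched as follows; the `PeerCheck` enum and function name are illustrative, not from the real code:

```rust
use std::collections::HashSet;

// Simplified fingerprint: (block hash, candidate index, validator index).
type Fingerprint = (u64, u32, u32);

/// Outcome of the peer-knowledge politeness checks described above.
#[derive(Debug, PartialEq)]
enum PeerCheck {
    /// Peer sent data for a block outside its view: report it, but keep processing.
    OutOfView,
    /// Peer re-sent something it already sent us: report it and stop.
    Duplicate,
    /// Message is new for this peer: continue to the global-knowledge check.
    Fresh,
}

fn check_peer_knowledge(
    peer_known: Option<&HashSet<Fingerprint>>,
    fingerprint: &Fingerprint,
) -> PeerCheck {
    match peer_known {
        None => PeerCheck::OutOfView,
        Some(known) if known.contains(fingerprint) => PeerCheck::Duplicate,
        Some(_) => PeerCheck::Fresh,
    }
}
```

Only after these per-peer checks do we consult the `BlockEntry`'s own knowledge, which is what prevents a peer from farming reputation by repeatedly sending out-of-view data.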


#### `import_and_circulate_approval(source: MessageSource, approval: IndirectSignedApprovalVote)`

Imports an approval signature referenced by block hash and candidate index.

* Load the `BlockEntry` using `approval.block_hash` and the candidate entry using `approval.candidate_index`. If either does not exist, report the source if it is `MessageSource::Peer` and return.
* Compute a fingerprint for the approval.
* Compute a fingerprint for the corresponding assignment. If the `BlockEntry`'s knowledge does not contain that fingerprint, then report the source if it is `MessageSource::Peer` and return. All references to a fingerprint after this refer to the approval's, not the assignment's.
* If the source is `MessageSource::Peer(sender)`:
* check if `sender` appears under `known_by` and whether the fingerprint is in the `known_messages` of the peer. If the peer does not know the block, report it for providing data out-of-view and proceed. If the peer does know the block and its knowledge contains the fingerprint, report it for providing duplicate data and return.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost and return. Note that we must do this after checking for out-of-view to avoid being spammed. If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* Dispatch `ApprovalVotingMessage::CheckAndImportApproval(approval)` and wait for the response.
* If the result is `VoteCheckResult::Accepted(())`:
* Give the peer a positive reputation boost and add the fingerprint to both our and the peer's knowledge.
* If the result is `VoteCheckResult::Bad`:
* Report the peer and return.
* Load the candidate entry for the given candidate index. It should exist unless there is a logic error in the approval voting subsystem.
* Set the approval state for the validator index to `ApprovalState::Approved`. It should already be in the `Assigned` state as our `BlockEntry` knowledge contains a fingerprint for the assignment.
* Dispatch an `ApprovalDistributionV1Message::Approval(approval)` to all peers in the `BlockEntry`'s `known_by` set, excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add the fingerprint of the approval to the knowledge of each peer. Note that this obeys the politeness conditions:
* We guarantee elsewhere that all peers within `known_by` are aware of all assignments relative to the block.
* We've checked that this specific approval has a corresponding assignment within the `BlockEntry`.
* Thus, all peers are aware of the assignment or have an in-flight message which will make them aware.
> **Contributor:** All this sounds fine, but it's quadratic, so another reason candidates should be rejected long before "escalation by DoS". It breaks gossip assumptions if one compares fingerprints between peers using publicly keyed bloom filters, but if each pair of peers computes their own shared-secret bloom filter key, then they could compare using bloom filters. Yet another optimization, I guess.



#### `unify_with_peer(peer: PeerId, view)`:

For each block in the view:
1. Initialize `fresh_blocks = {}`
2. Load the `BlockEntry` for the block. If the block is unknown, or the number is less than the view's finalized number, go to step 6.
3. Inspect the `known_by` set of the `BlockEntry`. If the peer is already present, go to step 6.
4. Add the peer to `known_by` with a cloned version of `block_entry.knowledge`, and add the hash of the block to `fresh_blocks`.
5. Return to step 2 with the ancestor of the block.
6. For each block in `fresh_blocks`, send all assignments and approvals for all candidates in those blocks to the peer.
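The ancestor walk above can be sketched as follows, with message sending elided and types simplified (peers as integers, block entries trimmed to the fields the walk needs):

```rust
use std::collections::{HashMap, HashSet};

type Hash = u64; // simplified stand-in
type BlockNumber = u32;
type PeerId = u32; // simplified stand-in

struct BlockEntry {
    number: BlockNumber,
    parent_hash: Hash,
    known_by: HashSet<PeerId>, // per-peer knowledge simplified to membership
}

/// Walk the chain backwards from `head`, marking each block as known by
/// `peer` until we hit an unknown block, a finalized height, or a block the
/// peer already knows. Returns the blocks that became fresh for the peer.
fn unify_with_peer(
    blocks: &mut HashMap<Hash, BlockEntry>,
    peer: PeerId,
    head: Hash,
    peer_finalized: BlockNumber,
) -> Vec<Hash> {
    let mut fresh_blocks = Vec::new();
    let mut cursor = head;
    loop {
        let entry = match blocks.get_mut(&cursor) {
            // Unknown block, or below the view's finalized number: stop.
            Some(e) if e.number >= peer_finalized => e,
            _ => break,
        };
        // If the peer already knows this block, it knows the ancestors too.
        if !entry.known_by.insert(peer) {
            break;
        }
        fresh_blocks.push(cursor);
        cursor = entry.parent_hash;
    }
    fresh_blocks
}
```

Afterwards, all assignments and approvals for candidates in `fresh_blocks` would be sent to the peer, per step 6.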

This file was deleted.
