
Erasure-coding chunks gossip #475

Closed
rphmeier opened this issue Oct 11, 2019 · 2 comments
Labels
J0-enhancement An additional feature request.

Comments

@rphmeier
Contributor

Polkadot's validity and availability scheme requires that validators maintain availability of an assigned piece of an erasure-coding of a parachain block's validity proof and outgoing messages. This erasure-coding is committed to via a Merkle root in the parachain block candidate, known as the erasure-root.

The i'th validator, as of the relay-chain block where the parachain header was included, is responsible for maintaining the i'th piece of the n-part erasure-coding, where n is the number of validators in total.

The full erasure-coding is created and distributed by any of the validators assigned to the parachain who have issued either a Candidate or Valid statement for the parachain block candidate.
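For orientation, here is a minimal sketch of the piece each validator keeps per candidate; the type and field names are illustrative assumptions, not the actual Polkadot types.

```rust
/// Illustrative shape of the piece the i'th validator keeps for one candidate;
/// the type and field names are assumptions, not the actual Polkadot types.
struct ErasureChunk {
    /// Raw bytes of the i'th of the n pieces (n = total number of validators).
    chunk: Vec<u8>,
    /// The index i, i.e. our position in the validator set at the including block.
    index: u32,
    /// Merkle branch proving this chunk against the candidate's erasure-root.
    proof: Vec<Vec<u8>>,
}
```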

The availability-store holds the following pieces of information (a sketch of this state follows the list):

  • Which candidates were included in each relay-chain block.
  • Which erasure-roots were included in each relay-chain block.
  • Which chunks and branch-proofs we have locally for each erasure-root, by position i. This should probably be keyed by (relay_parent_hash, erasure-root), to make pruning easier in the case of erasure-roots being the same at different block heights, at the cost of some duplication.
  • What the local validator position i is at any relay_parent_hash, if any.
  • The "awaited frontier" of all (relay_parent, erasure-root) pairs where we don't have the ith message. Cached in memory and pruned on or shortly after finality.
  • (OPTIONAL initially) A mapping that lets us determine which validators attested to which parachain blocks / erasure-roots. This lets us know who to ask and will be important when we have point-to-point communication.
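A rough sketch of that state, using illustrative names and a plain in-memory layout standing in for the real database-backed store:

```rust
use std::collections::{HashMap, HashSet};

// Illustrative in-memory view of the availability store described above; the
// real store is database-backed and the names here are assumptions.
type Hash = [u8; 32];
type ErasureRoot = [u8; 32];

struct AvailabilityStore {
    /// Candidate hashes included in each relay-chain block.
    candidates_by_block: HashMap<Hash, Vec<Hash>>,
    /// Erasure-roots included in each relay-chain block.
    erasure_roots_by_block: HashMap<Hash, Vec<ErasureRoot>>,
    /// Chunks and branch-proofs we hold locally, keyed by (relay_parent, erasure-root)
    /// and then by position i. The value is (chunk bytes, Merkle branch).
    local_chunks: HashMap<(Hash, ErasureRoot), HashMap<u32, (Vec<u8>, Vec<Vec<u8>>)>>,
    /// Our local validator position i at each relay_parent; entries exist only
    /// where we are in the validator set.
    local_index: HashMap<Hash, u32>,
    /// The "awaited frontier" of (relay_parent, erasure-root, i) we still need,
    /// cached in memory and pruned on (or shortly after) finality.
    awaited_frontier: HashSet<(Hash, ErasureRoot, u32)>,
    // (OPTIONAL initially) attestations: which validators attested to which
    // parachain blocks / erasure-roots, to know whom to ask point-to-point.
}
```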

Any of the mappings here could alternatively be by relay-chain parent, if that is easier to implement. It may be, because then we can pre-emptively store chunks and erasure-roots during the validation process (when the block hash is not known) and fill in the gaps later on.

Whenever we import a block (and we should do this in a BlockImport implementation, because import notifications may lag behind or be missed), we do the following, as sketched after the steps:

  1. Extract our local position i from the validator set of the parent.
  2. Use a runtime API to extract all included erasure-roots from the imported block.
  3. Notify the background worker (below) about any needed (relay_parent, erasure-root, i) that we are missing, as well as the new (relay_parent, erasure-root) pairs and included parachain blocks.
  4. Attempt to import the block (call the wrapped BlockImport). It is important that this data is written to the availability store and sent to the network before importing the block; if that fails, we should fail the block import as well. If it succeeds but the block import fails, we rely on pruning to clean up the orphaned data and requests.
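A minimal sketch of such a wrapping BlockImport; the trait and the helper methods here are simplified, hypothetical stand-ins (the real code would use Substrate's BlockImport trait, a runtime API call, and a channel to the background worker).

```rust
type Hash = [u8; 32];

// Minimal stand-in types for the sketch.
struct Block {
    parent_hash: Hash,
    // ... header, extrinsics ...
}

#[derive(Debug)]
struct Error;

trait SimpleBlockImport {
    fn import_block(&mut self, block: Block) -> Result<(), Error>;
}

/// Wraps an inner BlockImport and records availability data first.
struct AvailabilityBlockImport<I> {
    inner: I,
}

impl<I> AvailabilityBlockImport<I> {
    // Hypothetical helpers standing in for the validator-set lookup,
    // the runtime API call, and the notification to the background worker.
    fn local_validator_index(&self, _parent: &Hash) -> Result<Option<u32>, Error> {
        Ok(None)
    }
    fn included_erasure_roots(&self, _block: &Block) -> Result<Vec<Hash>, Error> {
        Ok(Vec::new())
    }
    fn notify_worker(
        &self,
        _parent: &Hash,
        _local_index: Option<u32>,
        _roots: &[Hash],
    ) -> Result<(), Error> {
        Ok(())
    }
}

impl<I: SimpleBlockImport> SimpleBlockImport for AvailabilityBlockImport<I> {
    fn import_block(&mut self, block: Block) -> Result<(), Error> {
        // 1. Our local position i in the parent's validator set (if we are a validator).
        let local_index = self.local_validator_index(&block.parent_hash)?;

        // 2. Runtime API: all erasure-roots included in the imported block.
        let roots = self.included_erasure_roots(&block)?;

        // 3. Record the new (relay_parent, erasure-root) pairs and anything we are
        //    missing with the background worker *before* importing; if this fails,
        //    fail the block import as well.
        self.notify_worker(&block.parent_hash, local_index, &roots)?;

        // 4. Only then call the wrapped BlockImport; if it fails after the data
        //    was written, pruning cleans up the orphaned entries and requests.
        self.inner.import_block(block)
    }
}
```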

Additionally, we have a background worker which manages the availability store and waits for items on the awaited frontier.

It's best if all writing to the availability store, at least to the 'known chunks' field, is done through this background worker. That will help avoid race conditions where another piece of code wants to fetch something that is actually already available. Here are the message types it can receive:

  • ErasureRoots(relay_parent, Vec<erasure-root>).
  • ParachainBlocks(relay_parent, Vec<(CandidateReceipt, Option<PovBlock>)>)
  • ListenForChunk(relay_parent, erasure-root, i)
  • Chunks(relay_parent, erasure-root, Vec<Chunk>). (assume Chunk contains index)

ErasureRoots, ParachainBlocks, and Chunks should also carry an mpsc::Sender<Result<(), Error>> as a parameter (the error indicating a failed write), so that callers like note_local_collation and the shared_table::ValidationWork can wait on success.
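A possible shape for these messages; the concrete candidate-receipt, PoV-block, and chunk types are stand-ins, and the synchronous std channel below stands in for the futures mpsc Sender mentioned above.

```rust
use std::sync::mpsc;

// Stand-in types; the real ones come from the polkadot primitives.
// std::sync::mpsc is used only to keep the sketch dependency-free.
type Hash = [u8; 32];
type ErasureRoot = [u8; 32];

struct CandidateReceipt;
struct PovBlock;
struct Chunk {
    index: u32,
    bytes: Vec<u8>,
}

#[derive(Debug)]
struct Error;

/// Messages accepted by the availability background worker, mirroring the list above.
enum WorkerMsg {
    ErasureRoots {
        relay_parent: Hash,
        roots: Vec<ErasureRoot>,
        result: mpsc::Sender<Result<(), Error>>,
    },
    ParachainBlocks {
        relay_parent: Hash,
        blocks: Vec<(CandidateReceipt, Option<PovBlock>)>,
        result: mpsc::Sender<Result<(), Error>>,
    },
    ListenForChunk {
        relay_parent: Hash,
        erasure_root: ErasureRoot,
        index: u32,
    },
    Chunks {
        relay_parent: Hash,
        erasure_root: ErasureRoot,
        chunks: Vec<Chunk>, // each Chunk carries its own index
        result: mpsc::Sender<Result<(), Error>>,
    },
}
```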

Here is the operation of the background worker (a condensed sketch follows the list):

  • On startup, registers listeners (gossip streams) for all (relay_parent, erasure-root, i) in the awaited frontier.
  • When an awaited item is received, it is placed into the availability store and removed from the frontier, and the listener is de-registered.
  • When it receives a ListenForChunk message, it double-checks that we don't have that piece, and then it registers a listener.
  • When receiving a Chunks message, pass the chunks onwards to the gossip protocol via the Network trait before writing them and reporting status to the sender.
  • Other message types just write into the availability store and note success back to the sender.
  • It should handle the on-finality pruning of the availability store.
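A condensed sketch of that loop, building on the AvailabilityStore and WorkerMsg sketches above; the store methods and the GossipHandle are hypothetical stand-ins for the real store API and the Network trait.

```rust
use std::sync::mpsc::Receiver;

// Hypothetical worker loop; `store`, `gossip`, and their methods are stand-ins.
fn run_worker(mut store: AvailabilityStore, rx: Receiver<WorkerMsg>, gossip: GossipHandle) {
    // On startup: register gossip listeners for everything on the awaited frontier.
    for &(relay_parent, root, i) in store.awaited_frontier.iter() {
        gossip.listen_for_chunk(relay_parent, root, i);
    }

    while let Ok(msg) = rx.recv() {
        match msg {
            WorkerMsg::ListenForChunk { relay_parent, erasure_root, index } => {
                // Double-check we don't already have the piece before listening.
                if !store.has_chunk(&relay_parent, &erasure_root, index) {
                    gossip.listen_for_chunk(relay_parent, erasure_root, index);
                }
            }
            WorkerMsg::Chunks { relay_parent, erasure_root, chunks, result } => {
                // Pass onwards to the gossip protocol (Network trait) first,
                // then write locally and report status back to the sender.
                gossip.distribute_chunks(relay_parent, erasure_root, &chunks);
                let _ = result.send(store.add_chunks(relay_parent, erasure_root, chunks));
            }
            WorkerMsg::ErasureRoots { relay_parent, roots, result } => {
                let _ = result.send(store.add_erasure_roots(relay_parent, roots));
            }
            WorkerMsg::ParachainBlocks { relay_parent, blocks, result } => {
                let _ = result.send(store.add_parachain_blocks(relay_parent, blocks));
            }
        }
        // On finality notifications (not shown) the store and the awaited
        // frontier are pruned.
    }
}
```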

Now, on the gossip side:

We want to ensure that all of the branches of the erasure-coding in recent relay chain blocks are available on the gossip network.

We'll use topics based on (relay_parent, erasure-root). fn erasure_coding_topic(relay_parent, root) -> hash { blake2_256(b"erasure_chunks" ++ relay_parent ++ root) }
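In Rust this is roughly the following, assuming Substrate's blake2_256 hashing helper (the exact import path varies by version):

```rust
use sp_core::hashing::blake2_256; // substrate's blake2_256 helper; path may differ by version

type Hash = [u8; 32];

/// Topic for gossiping erasure chunks of a (relay_parent, erasure-root) pair,
/// per the formula above: blake2_256(b"erasure_chunks" ++ relay_parent ++ root).
fn erasure_coding_topic(relay_parent: &Hash, erasure_root: &Hash) -> Hash {
    let mut v = Vec::with_capacity(14 + 32 + 32);
    v.extend_from_slice(b"erasure_chunks");
    v.extend_from_slice(relay_parent);
    v.extend_from_slice(erasure_root);
    blake2_256(&v)
}
```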

In the gossip validator, we keep:

  • ancestry: HashMap<Hash, Vec<erasure-root>>
  • topics: HashMap<Topic, RelayParent>
Whenever we get a new chain head, determine the last N ancestors of that chain head, set ancestry to contain only those blocks' hashes and erasure-roots, and set topics to hold the topics computed by erasure_coding_topic for those entries.

We accept messages with a known topic.
We allow sending a message to a peer when its topic corresponds to a block hash in the ancestry of that neighbor's relay-chain head.
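A sketch of that validator state and the two rules, reusing the erasure_coding_topic function sketched above; the neighbor-ancestry argument is a hypothetical stand-in for whatever the gossip layer learns from neighbor packets.

```rust
use std::collections::HashMap;

type Hash = [u8; 32];
type Topic = Hash;
type ErasureRoot = Hash;

/// Hypothetical bookkeeping for the erasure-chunk gossip validator.
struct ChunkGossipValidator {
    /// Last N ancestors of the current chain head, with their erasure-roots.
    ancestry: HashMap<Hash, Vec<ErasureRoot>>,
    /// Topic -> the relay-parent it was derived from.
    topics: HashMap<Topic, Hash>,
}

impl ChunkGossipValidator {
    /// On a new chain head: keep only the last N ancestors and recompute
    /// the topic table from their erasure-roots.
    fn new_chain_head(&mut self, ancestors: Vec<(Hash, Vec<ErasureRoot>)>) {
        self.ancestry.clear();
        self.topics.clear();
        for (parent, roots) in ancestors {
            for root in &roots {
                self.topics.insert(erasure_coding_topic(&parent, root), parent);
            }
            self.ancestry.insert(parent, roots);
        }
    }

    /// Accept: only messages whose topic we know.
    fn accept(&self, topic: &Topic) -> bool {
        self.topics.contains_key(topic)
    }

    /// Allowed sending: the topic's relay-parent must be in the ancestry of the
    /// neighbor's reported relay-chain head.
    fn allowed_for_peer(&self, topic: &Topic, neighbor_ancestry: &[Hash]) -> bool {
        self.topics
            .get(topic)
            .map_or(false, |parent| neighbor_ancestry.contains(parent))
    }
}
```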

@rphmeier
Contributor Author

rphmeier commented Jan 8, 2020

cc @montekki can this be closed?

@montekki
Contributor

montekki commented Jan 9, 2020

@rphmeier At this point I think so.
