This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Erasure encoding availability #345

Merged
merged 48 commits into from
Dec 3, 2019


Contributor

@montekki montekki commented Jul 29, 2019

  1. availability-store is modified to store block erasure chunks and to reconstruct
    blocks once enough chunks are at hand.

    • The user stores info about an erasure-encoded block with make_available.
    • Later, more chunks of this block may be added to the storage with the add_erasure_chunk
      method. Among other things, it checks whether the chunk belongs to the Merkle tree of
      the block encoding.
    • Reconstruction happens lazily when block_data or extrinsic
      is called by the user, and the result is cached in storage.
    • candidates_finalized purges chunks and whole blocks (if any) from the storage.
  2. network is extended to pass erasure chunks in gossip as well as in polkadot-specific
    Messages.

    • Each gossip message contains the chunk itself, the ValidatorIndex
      of the validator that generated the message, and its signature.
    • The checked_statements stream in router provides signed statements and
      erasure chunks. If a chunk belongs to a candidate we are not yet aware of,
      it is deferred. Chunks are then imported with import_erasure_chunk in validation.
  3. validation is modified to gossip the chunks of the local collations to the
    network:

    • As soon as the validator receives a valid collation, the block is broken into
      chunks, which are stored in the validator's availability store and also
      handed over to the network's local_collation to be gossiped to the peers.
    • A new erasure chunk is imported in import_erasure_chunk.
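
To make the store behaviour described in item 1 concrete, here is a minimal in-memory sketch. It is not the code from this PR. The assumptions are: `ChunkStore` and `encode` are hypothetical names, blocks are keyed by a plain `u64`, the Merkle check is left out, and a single-parity XOR code (n data chunks plus one parity chunk, any n of which suffice) stands in for the real erasure coding.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the availability store.
/// Assumption: a single-parity XOR code replaces the real erasure coding,
/// so any `n_data` of the `n_data + 1` chunks are enough to rebuild a block.
#[derive(Default)]
pub struct ChunkStore {
    /// Per block: number of data chunks and the chunks received so far.
    chunks: HashMap<u64, (usize, HashMap<usize, Vec<u8>>)>,
    /// Blocks that have already been reconstructed.
    cache: HashMap<u64, Vec<Vec<u8>>>,
}

/// XOR equally sized chunks together.
fn xor_all<'a>(chunks: impl Iterator<Item = &'a Vec<u8>>, len: usize) -> Vec<u8> {
    let mut out = vec![0u8; len];
    for chunk in chunks {
        for (o, b) in out.iter_mut().zip(chunk) {
            *o ^= *b;
        }
    }
    out
}

/// Encode `data` into `data.len() + 1` chunks; the last one is the parity chunk.
pub fn encode(data: &[Vec<u8>]) -> Vec<Vec<u8>> {
    let len = data.first().map_or(0, Vec::len);
    let mut out = data.to_vec();
    out.push(xor_all(data.iter(), len));
    out
}

impl ChunkStore {
    /// Register a block made of `n_data` data chunks (cf. make_available).
    pub fn make_available(&mut self, block: u64, n_data: usize) {
        self.chunks.entry(block).or_insert((n_data, HashMap::new()));
    }

    /// Add one chunk (cf. add_erasure_chunk). Returns false for unknown
    /// blocks or out-of-range indices.
    pub fn add_erasure_chunk(&mut self, block: u64, index: usize, chunk: Vec<u8>) -> bool {
        if let Some((n_data, have)) = self.chunks.get_mut(&block) {
            if index <= *n_data {
                have.insert(index, chunk);
                return true;
            }
        }
        false
    }

    /// Reconstruct lazily on first access and cache the result (cf. block_data).
    pub fn block_data(&mut self, block: u64) -> Option<Vec<Vec<u8>>> {
        if let Some(data) = self.cache.get(&block) {
            return Some(data.clone());
        }
        let (n_data, have) = self.chunks.get(&block)?;
        if have.len() < *n_data {
            return None; // not enough chunks yet
        }
        let len = have.values().next().map_or(0, Vec::len);
        let data: Vec<Vec<u8>> = (0..*n_data)
            .map(|i| match have.get(&i) {
                Some(chunk) => chunk.clone(),
                // At most one data chunk is missing; it is the XOR of all the others.
                None => xor_all(have.values(), len),
            })
            .collect();
        self.cache.insert(block, data.clone());
        Some(data)
    }

    /// Purge chunks and any cached block (cf. candidates_finalized).
    pub fn candidates_finalized(&mut self, block: u64) {
        self.chunks.remove(&block);
        self.cache.remove(&block);
    }
}
```

With three data chunks, `block_data` returns `None` while only two of the four encoded chunks are present and succeeds once a third arrives, whichever three they are.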

 * Modifications to availability store to keep chunks as well as
   reconstructed blocks and extrinsics.
 * Gossip messages containing signed erasure chunks.
 * Requesting erasure chunks with polkadot-specific messages.
 * Validation of erasure chunk messages.
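
The check that a chunk "belongs to the Merkle tree of the block encoding" can be sketched as a standard Merkle branch verification. This is an illustration only: the function names are hypothetical, the tree layout (the last node is paired with itself on odd levels) is an assumption, and `DefaultHasher` is a non-cryptographic stand-in used to keep the sketch free of dependencies. A real implementation would use a cryptographic hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in leaf hash. Assumption: `DefaultHasher` is NOT cryptographic.
fn hash_leaf(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    (0u8, data).hash(&mut h); // the tag separates leaves from inner nodes
    h.finish()
}

/// Stand-in inner-node hash.
fn hash_node(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (1u8, left, right).hash(&mut h);
    h.finish()
}

/// Build every level of the tree, leaves first. On a level with an odd
/// number of nodes, the last node is paired with itself.
pub fn build_levels(chunks: &[Vec<u8>]) -> Vec<Vec<u64>> {
    let mut level: Vec<u64> = chunks.iter().map(|c| hash_leaf(c)).collect();
    let mut levels = vec![level.clone()];
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| hash_node(pair[0], pair[pair.len() - 1]))
            .collect();
        levels.push(level.clone());
    }
    levels
}

/// Root of the tree, if there is at least one chunk.
pub fn merkle_root(levels: &[Vec<u64>]) -> Option<u64> {
    levels.last().and_then(|top| top.first().copied())
}

/// Sibling hashes on the path from leaf `index` up to the root.
pub fn branch(levels: &[Vec<u64>], mut index: usize) -> Vec<u64> {
    let mut out = Vec::new();
    for level in &levels[..levels.len().saturating_sub(1)] {
        // Clamp so that an unpaired last node uses itself as its sibling.
        let sibling = (index ^ 1).min(level.len() - 1);
        out.push(level[sibling]);
        index /= 2;
    }
    out
}

/// Check that `chunk` sits at position `index` under `root`.
pub fn verify_chunk(root: u64, chunk: &[u8], mut index: usize, branch: &[u64]) -> bool {
    let mut acc = hash_leaf(chunk);
    for sibling in branch {
        acc = if index % 2 == 0 {
            hash_node(acc, *sibling)
        } else {
            hash_node(*sibling, acc)
        };
        index /= 2;
    }
    acc == root
}
```

A receiver that knows only the erasure root can then accept or reject each gossiped chunk on its own, without holding any other chunk of the block.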
@montekki montekki added the A3-in_progress Pull request is in progress. No review needed at this stage. label Jul 29, 2019
@@ -101,6 +106,21 @@ impl GossipStatement {
}
}

/// A gossip message containing one erasure chunk of a candidate block.
/// For each chunk of block erasure encoding one of this messages is constructed.
Contributor

the grammar on this line doesn't sound correct

@montekki
Contributor Author

The PR is updated with implementation of most of the features described in #475:

  • The store holds the necessary information per relay chain block (candidates, erasure roots, validator sets, etc)
  • Blocks are imported with a BlockImport implementation
  • Most of the write operations on the storage are now performed by the background worker
  • The Store itself provides pub async API that asynchronously offloads the write operations to the background worker
  • The background worker also manages the finality notifications.
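
The split between a cheap `Store` handle and a background worker that owns all writes can be sketched as below. This is a simplified illustration, not the PR's code: it uses std threads and channels instead of async, the message and method names are hypothetical, and the "database" is an in-memory map.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Messages handled by the background worker (hypothetical names).
enum WorkerMsg {
    /// Store a chunk and acknowledge once the write has been applied.
    AddChunk { key: u64, chunk: Vec<u8>, done: Sender<()> },
    /// A block was finalized: drop its data.
    Finalized { key: u64 },
}

/// Cheap handle: reads go straight to storage, writes go through the worker.
#[derive(Clone)]
pub struct Store {
    db: Arc<RwLock<HashMap<u64, Vec<u8>>>>,
    to_worker: Sender<WorkerMsg>,
}

impl Store {
    /// Start the worker thread. It exits once every `Store` handle is dropped.
    pub fn new() -> (Store, JoinHandle<()>) {
        let db = Arc::new(RwLock::new(HashMap::new()));
        let (to_worker, from_store) = channel();
        let worker_db = Arc::clone(&db);
        let handle = thread::spawn(move || {
            // Messages are applied strictly in the order they were sent.
            for msg in from_store {
                if let Ok(mut db) = worker_db.write() {
                    match msg {
                        WorkerMsg::AddChunk { key, chunk, done } => {
                            db.insert(key, chunk);
                            let _ = done.send(()); // the caller may have stopped waiting
                        }
                        WorkerMsg::Finalized { key } => {
                            db.remove(&key);
                        }
                    }
                }
            }
        });
        (Store { db, to_worker }, handle)
    }

    /// Queue a write. The returned receiver yields once the write is applied.
    pub fn add_chunk(&self, key: u64, chunk: Vec<u8>) -> Receiver<()> {
        let (done, ack) = channel();
        let _ = self.to_worker.send(WorkerMsg::AddChunk { key, chunk, done });
        ack
    }

    /// Queue the removal of a finalized block's data.
    pub fn finalized(&self, key: u64) {
        let _ = self.to_worker.send(WorkerMsg::Finalized { key });
    }

    /// Read directly from storage.
    pub fn get(&self, key: u64) -> Option<Vec<u8>> {
        self.db.read().ok()?.get(&key).cloned()
    }
}
```

Because the worker applies messages in order, a caller that waits on the acknowledgement of its latest write also knows that every earlier message has been processed.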

There is, though, an irritating problem: the network code cannot be used in availability-store because of circular dependencies. That is why, at the moment, there is:

  1. A shim trait that proxies calls to NetworkService.
  2. Network validation just stupidly queries awaited_chunks from the store each time a new gossip message is received (which may not be so slow if the underlying storage caches that in RAM), because it is not clear how the Store could push awaited-chunks info to networking, which needs to be done as soon as that set changes.

Now, there is also this. Because of the move to async interfaces I had to introduce some calls to compatibility layers, add async blocks here and there and, on top of that, replace the Future implementation for PrimedParachainWork with a simpler async function. I tried to keep those changes to a minimum.
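
A rough sketch of the two workarounds, for illustration only. The assumptions are: `GossipShim`, `ChunkId`, `is_wanted` and `RecordingNetwork` are hypothetical names, a chunk is identified by a candidate hash plus an index, and the awaited set lives in memory. The store crate owns the trait, the network crate implements it, and gossip validation polls the awaited set on every message.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex, MutexGuard};

/// Hypothetical identifier of a chunk: (candidate hash, chunk index).
pub type ChunkId = ([u8; 32], u32);

/// Shim declared next to the store and implemented by the network crate,
/// so the store never has to depend on the network crate.
pub trait GossipShim: Send + Sync {
    fn gossip_chunk(&self, id: ChunkId, chunk: Vec<u8>);
}

/// Store side: tracks which chunks are still awaited.
pub struct Store {
    awaited: Mutex<HashSet<ChunkId>>,
    network: Arc<dyn GossipShim>,
}

impl Store {
    pub fn new(network: Arc<dyn GossipShim>) -> Self {
        Store { awaited: Mutex::new(HashSet::new()), network }
    }

    /// Lock the awaited set, recovering the data if the lock was poisoned.
    fn awaited(&self) -> MutexGuard<'_, HashSet<ChunkId>> {
        self.awaited.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
    }

    /// Mark a chunk as awaited.
    pub fn await_chunk(&self, id: ChunkId) {
        self.awaited().insert(id);
    }

    /// Snapshot of the awaited set (cf. awaited_chunks).
    pub fn awaited_chunks(&self) -> HashSet<ChunkId> {
        self.awaited().clone()
    }

    /// Record an imported chunk; returns whether it was awaited.
    pub fn import_chunk(&self, id: &ChunkId) -> bool {
        self.awaited().remove(id)
    }

    /// Hand a local chunk to the network through the shim.
    pub fn publish_chunk(&self, id: ChunkId, chunk: Vec<u8>) {
        self.network.gossip_chunk(id, chunk);
    }
}

/// Network side: polls the store for every incoming gossip message.
pub fn is_wanted(store: &Store, id: &ChunkId) -> bool {
    store.awaited_chunks().contains(id)
}

/// Toy shim implementation that only records what it was asked to gossip.
#[derive(Default)]
pub struct RecordingNetwork {
    pub sent: Mutex<Vec<ChunkId>>,
}

impl GossipShim for RecordingNetwork {
    fn gossip_chunk(&self, id: ChunkId, _chunk: Vec<u8>) {
        self.sent.lock().unwrap_or_else(|p| p.into_inner()).push(id);
    }
}
```

Polling copies the whole set for each message, which is the cost mentioned in point 2. A push-based design would instead notify the network whenever the set changes.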

@montekki montekki added A0-please_review Pull request needs code review. and removed A5-grumble labels Nov 28, 2019
/// A trait that provides a shim for the [`NetworkService`] trait.
///
/// Currently it is not possible to use the networking code in the availability store
/// core directly due to a number of loop dependencies it require:
Contributor

I'm hoping that further Substrate improvements allow us to declare a subprotocol directly in this crate.

Contributor

@rphmeier rphmeier left a comment

Reviewed the store changes so far, that's all. Looks OK, although I would like to see much more documentation, particularly of invariants.

let handle = thread::spawn(move || {
let mut sender = self.sender.clone();
let mut runtime = LocalRuntime::new().expect("Could not create local runtime");
Contributor

This seems like it could fail. Panickers should be proven not to fail or removed.

Contributor Author

This just copies the logic in attestation_service. Should we just fail with an error here before starting a thread and return that to the caller?

Contributor

If possible, that seems reasonable. Is the LocalRuntime Send? We could create and then send it if so

Contributor Author

No, only a handle to it is
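
One way to avoid the panic when the runtime itself cannot be moved across threads is to create it inside the new thread and report the outcome back over a channel, so the caller still gets an error. The sketch below is only an illustration of that pattern: `LocalResource` is a hypothetical stand-in for a runtime that is not `Send`, and the `fail` flag exists only to exercise the error path.

```rust
use std::rc::Rc;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a runtime that is not `Send` (because of the `Rc`).
struct LocalResource {
    _not_send: Rc<()>,
}

impl LocalResource {
    /// Creation may fail. `fail` only simulates that for the example.
    fn new(fail: bool) -> Result<Self, String> {
        if fail {
            Err("could not create local resource".to_string())
        } else {
            Ok(LocalResource { _not_send: Rc::new(()) })
        }
    }

    /// Process jobs until all senders are dropped; returns their sum.
    fn run(&self, jobs: Receiver<u32>) -> u32 {
        jobs.iter().sum()
    }
}

/// Spawn the worker and return only after it has reported whether its
/// thread-local resource could be created.
pub fn start_worker(fail: bool) -> Result<(Sender<u32>, JoinHandle<u32>), String> {
    let (ready_tx, ready_rx) = channel();
    let (job_tx, job_rx) = channel();
    let handle = thread::spawn(move || {
        // The resource is built on the worker thread, so it never crosses threads.
        let resource = match LocalResource::new(fail) {
            Ok(resource) => {
                let _ = ready_tx.send(Ok(()));
                resource
            }
            Err(e) => {
                let _ = ready_tx.send(Err(e));
                return 0;
            }
        };
        resource.run(job_rx)
    });
    match ready_rx.recv() {
        Ok(Ok(())) => Ok((job_tx, handle)),
        Ok(Err(e)) => Err(e),
        Err(_) => Err("worker exited before reporting its status".to_string()),
    }
}
```

The cost is that the caller blocks briefly until the worker has reported its status.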

}
}

self.inner.import_block(block, new_cache).map_err(Into::into)
Contributor

This is racing against the background worker handling the messages sent above, so import notifications might come out before the information is registered in the availability store. Not sure what to do about that

Contributor

I guess it's fine as long as we implement the network protocol in the end by getting notifications directly from the availability store.

///
/// This information is needed before the `add_candidates_in_relay_block` is called
/// since that call forms the awaited frontier of chunks.
/// In the current implementation this function is called in the `get_or_instantiate` at
Contributor

I won't block the PR on this, but in general I'd prefer to avoid these kinds of long-distance expectations of how the code should be used. It's an assumption that can easily change without noticing, unlike things like "X is called before Y".

@rphmeier rphmeier merged commit e5138ef into paritytech:master Dec 3, 2019
tomusdrw pushed a commit that referenced this pull request Mar 26, 2021
* make message-lane Event generic

* cargo fmt --all

* Update modules/message-lane/src/lib.rs

Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com>
imstar15 pushed a commit to imstar15/polkadot that referenced this pull request Aug 25, 2021
5 participants