From cfb10d484b7a51119056cc70d355acee14bb2b1e Mon Sep 17 00:00:00 2001 From: Marcin S Date: Wed, 23 Nov 2022 08:20:25 -0500 Subject: [PATCH] Add PVF module documentation (#6293) * Add PVF module documentation TODO (once the PRs land): - [ ] Document executor parametrization. - [ ] Document CPU time measurement of timeouts. * Update node/core/pvf/src/lib.rs Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com> * Clarify meaning of PVF acronym * Move PVF doc to implementer's guide * Clean up implementer's guide a bit * Add page for PVF types * pvf: Better separation between crate docs and implementer's guide * ci: Add "prevalidating" to the dictionary * ig: Remove types/chain.md The types contained therein did not exist and the file was not referenced anywhere. Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com> --- node/core/pvf/src/executor_intf.rs | 2 +- node/core/pvf/src/lib.rs | 37 +++++++++++++------ node/core/pvf/src/priority.rs | 2 +- roadmap/implementers-guide/README.md | 5 +++ roadmap/implementers-guide/src/SUMMARY.md | 1 - roadmap/implementers-guide/src/glossary.md | 1 - .../src/node/utility/candidate-validation.md | 35 ++++++++++++++++++ .../src/node/utility/pvf-prechecker.md | 6 +-- .../implementers-guide/src/types/candidate.md | 16 ++++++++ roadmap/implementers-guide/src/types/chain.md | 30 --------------- .../src/types/overseer-protocol.md | 4 +- .../src/types/pvf-prechecking.md | 2 + scripts/ci/gitlab/lingua.dic | 1 + 13 files changed, 91 insertions(+), 51 deletions(-) delete mode 100644 roadmap/implementers-guide/src/types/chain.md diff --git a/node/core/pvf/src/executor_intf.rs b/node/core/pvf/src/executor_intf.rs index 6827fb636eac..bbeb6195e1dc 100644 --- a/node/core/pvf/src/executor_intf.rs +++ b/node/core/pvf/src/executor_intf.rs @@ -96,7 +96,7 @@ pub fn prevalidate(code: &[u8]) -> Result Result, sc_executor_common::error::WasmError> { sc_executor_wasmtime::prepare_runtime_artifact(blob, &CONFIG.semantics) } diff --git a/node/core/pvf/src/lib.rs b/node/core/pvf/src/lib.rs index ef5f31889237..1aabb1100437 100644 --- a/node/core/pvf/src/lib.rs +++ b/node/core/pvf/src/lib.rs @@ -16,18 +16,27 @@ #![warn(missing_docs)] -//! A crate that implements PVF validation host. +//! A crate that implements the PVF validation host. +//! +//! For more background, refer to the Implementer's Guide: [PVF +//! Pre-checking](https://paritytech.github.io/polkadot/book/pvf-prechecking.html) and [Candidate +//! Validation](https://paritytech.github.io/polkadot/book/node/utility/candidate-validation.html#pvf-host). +//! +//! # Entrypoint //! //! This crate provides a simple API. You first [`start`] the validation host, which gives you the //! [handle][`ValidationHost`] and the future you need to poll. //! -//! Then using the handle the client can send two types of requests: +//! Then using the handle the client can send three types of requests: +//! +//! (a) PVF pre-checking. This takes the PVF [code][`Pvf`] and tries to prepare it (verify and +//! compile) in order to pre-check its validity. //! -//! (a) PVF execution. This accepts the PVF [`params`][`polkadot_parachain::primitives::ValidationParams`] +//! (b) PVF execution. This accepts the PVF [`params`][`polkadot_parachain::primitives::ValidationParams`] //! and the PVF [code][`Pvf`], prepares (verifies and compiles) the code, and then executes PVF //! with the `params`. //! -//! (b) Heads up. This request allows to signal that the given PVF may be needed soon and that it +//! (c) Heads up. This request allows to signal that the given PVF may be needed soon and that it //! should be prepared for execution. //! //! The preparation results are cached for some time after they either used or was signaled in heads up. @@ -39,7 +48,7 @@ //! PVF execution requests can specify the [priority][`Priority`] with which the given request should //! be handled. Different priority levels have different effects. This is discussed below. //! -//! Preparation started by a heads up signal always starts in with the background priority. If there +//! Preparation started by a heads up signal always starts with the background priority. If there //! is already a request for that PVF preparation under way the priority is inherited. If after heads //! up, a new PVF execution request comes in with a higher priority, then the original task's priority //! will be adjusted to match the new one if it's larger. @@ -48,6 +57,8 @@ //! //! # Under the hood //! +//! ## The flow +//! //! Under the hood, the validation host is built using a bunch of communicating processes, not //! dissimilar to actors. Each of such "processes" is a future task that contains an event loop that //! processes incoming messages, potentially delegating sub-tasks to other "processes". @@ -55,11 +66,13 @@ //! Two of these processes are queues. The first one is for preparation jobs and the second one is for //! execution. Both of the queues are backed by separate pools of workers of different kind. //! -//! Preparation workers handle preparation requests by preverifying and instrumenting PVF wasm code, +//! Preparation workers handle preparation requests by prevalidating and instrumenting PVF wasm code, //! and then passing it into the compiler, to prepare the artifact. //! -//! Artifact is a final product of preparation. If the preparation succeeded, then the artifact will -//! contain the compiled code usable for quick execution by a worker later on. +//! ## Artifacts +//! +//! An artifact is the final product of preparation. If the preparation succeeded, then the artifact +//! will contain the compiled code usable for quick execution by a worker later on. //! //! If the preparation failed, then the worker will still write the artifact with the error message. //! We save the artifact with the error so that we don't try to prepare the artifacts that are broken @@ -68,12 +81,14 @@ //! The artifact is saved on disk and is also tracked by an in memory table. This in memory table //! doesn't contain the artifact contents though, only a flag that the given artifact is compiled. //! +//! A pruning task will run at a fixed interval of time. This task will remove all artifacts that +//! weren't used or received a heads up signal for a while. +//! +//! ## Execution +//! //! The execute workers will be fed by the requests from the execution queue, which is basically a //! combination of a path to the compiled artifact and the //! [`params`][`polkadot_parachain::primitives::ValidationParams`]. -//! -//! Each fixed interval of time a pruning task will run. This task will remove all artifacts that -//! weren't used or received a heads up signal for a while. mod artifacts; mod error; diff --git a/node/core/pvf/src/priority.rs b/node/core/pvf/src/priority.rs index de169be0696b..b80c9195832a 100644 --- a/node/core/pvf/src/priority.rs +++ b/node/core/pvf/src/priority.rs @@ -24,7 +24,7 @@ pub enum Priority { Normal, /// This priority is used for requests that are required to be processed as soon as possible. /// - /// For example, backing is on critical path and require execution as soon as possible. + /// For example, backing is on a critical path and requires execution as soon as possible. Critical, } diff --git a/roadmap/implementers-guide/README.md b/roadmap/implementers-guide/README.md index 7f3f8cef7e63..996041f176bb 100644 --- a/roadmap/implementers-guide/README.md +++ b/roadmap/implementers-guide/README.md @@ -22,6 +22,11 @@ Then install and build the book: ```sh cargo install mdbook mdbook-linkcheck mdbook-graphviz mdbook-mermaid mdbook-last-changed mdbook serve roadmap/implementers-guide +``` + +and in a second terminal window run: + +```sh open http://localhost:3000 ``` diff --git a/roadmap/implementers-guide/src/SUMMARY.md b/roadmap/implementers-guide/src/SUMMARY.md index bcf87aad8a49..c504b9ac1923 100644 --- a/roadmap/implementers-guide/src/SUMMARY.md +++ b/roadmap/implementers-guide/src/SUMMARY.md @@ -75,7 +75,6 @@ - [Availability](types/availability.md) - [Overseer and Subsystem Protocol](types/overseer-protocol.md) - [Runtime](types/runtime.md) - - [Chain](types/chain.md) - [Messages](types/messages.md) - [Network](types/network.md) - [Approvals](types/approval.md) diff --git a/roadmap/implementers-guide/src/glossary.md b/roadmap/implementers-guide/src/glossary.md index 8612d8834cb8..ed6a358be6da 100644 --- a/roadmap/implementers-guide/src/glossary.md +++ b/roadmap/implementers-guide/src/glossary.md @@ -48,4 +48,3 @@ exactly one downward message queue. Also of use is the [Substrate Glossary](https://substrate.dev/docs/en/knowledgebase/getting-started/glossary). [0]: https://wiki.polkadot.network/docs/learn-consensus -[1]: #pvf diff --git a/roadmap/implementers-guide/src/node/utility/candidate-validation.md b/roadmap/implementers-guide/src/node/utility/candidate-validation.md index 5393368c5c6b..07d7c09bf2f2 100644 --- a/roadmap/implementers-guide/src/node/utility/candidate-validation.md +++ b/roadmap/implementers-guide/src/node/utility/candidate-validation.md @@ -48,4 +48,39 @@ Once we have all parameters, we can spin up a background task to perform the val If we can assume the presence of the relay-chain state (that is, during processing [`CandidateValidationMessage`][CVM]`::ValidateFromChainState`) we can run all the checks that the relay-chain would run at the inclusion time thus confirming that the candidate will be accepted. +### PVF Host + +The PVF host is responsible for handling requests to prepare and execute PVF +code blobs. + +One high-level goal is to make PVF operations as deterministic as possible, to +reduce the rate of disputes. Disputes can happen due to e.g. a job timing out on +one machine, but not another. While we do not yet have full determinism, there +are some dispute reduction mechanisms in place right now. + +#### Retrying execution requests + +If the execution request fails during **preparation**, we will retry if it is +possible that the preparation error was transient (e.g. if the error was a panic +or time out). We will only retry preparation if another request comes in after +15 minutes, to ensure any potential transient conditions had time to be +resolved. We will retry up to 5 times. + +If the actual **execution** of the artifact fails, we will retry once if it was +an ambiguous error after a 1 second delay, to allow any potential transient +conditions to clear. + +#### Preparation timeouts + +We use timeouts for both preparation and execution jobs to limit the amount of +time they can take. As the time for a job can vary depending on the machine and +load on the machine, this can potentially lead to disputes where some validators +successfuly execute a PVF and others don't. + +One mitigation we have in place is a more lenient timeout for preparation during +execution than during pre-checking. The rationale is that the PVF has already +passed pre-checking, so we know it should be valid, and we allow it to take +longer than expected, as this is likely due to an issue with the machine and not +the PVF. + [CVM]: ../../types/overseer-protocol.md#validationrequesttype diff --git a/roadmap/implementers-guide/src/node/utility/pvf-prechecker.md b/roadmap/implementers-guide/src/node/utility/pvf-prechecker.md index b0f58346da99..fd75ce9e3804 100644 --- a/roadmap/implementers-guide/src/node/utility/pvf-prechecker.md +++ b/roadmap/implementers-guide/src/node/utility/pvf-prechecker.md @@ -12,11 +12,11 @@ This subsytem does not produce any output messages either. The subsystem will, h If the node is running in a collator mode, this subsystem will be disabled. The PVF pre-checker subsystem keeps track of the PVFs that are relevant for the subsystem. -To be relevant for the subsystem, a PVF must be returned by `pvfs_require_precheck` [`pvfs_require_precheck` runtime API][PVF pre-checking runtime API] in any of the active leaves. If the PVF is not present in any of the active leaves, it ceases to be relevant. +To be relevant for the subsystem, a PVF must be returned by the [`pvfs_require_precheck` runtime API][PVF pre-checking runtime API] in any of the active leaves. If the PVF is not present in any of the active leaves, it ceases to be relevant. When a PVF just becomes relevant, the subsystem will send a message to the [Candidate Validation] subsystem asking for the pre-check. -Upon receving a message from the candidate-validation subsystem, the pre-checker will note down that the PVF has its judgement and will also sign and submit a [`PvfCheckStatement`] via the [`submit_pvf_check_statement` runtime API][PVF pre-checking runtime API]. In case, a judgement was received for a PVF that is no longer in view it is ignored. It is possible that the candidate validation was not able to check the PVF. In that case, the PVF pre-checker will abstain and won't submit any check statements. +Upon receving a message from the candidate-validation subsystem, the pre-checker will note down that the PVF has its judgement and will also sign and submit a [`PvfCheckStatement`][PvfCheckStatement] via the [`submit_pvf_check_statement` runtime API][PVF pre-checking runtime API]. In case, a judgement was received for a PVF that is no longer in view it is ignored. It is possible that the candidate validation was not able to check the PVF. In that case, the PVF pre-checker will abstain and won't submit any check statements. Since a vote only is valid during [one session][overview], the subsystem will have to resign and submit the statements for for the new session. The new session is assumed to be started if at least one of the leaves has a greater session index that was previously observed in any of the leaves. @@ -28,4 +28,4 @@ If the node is not in the active validator set, it will still perform all the ch [Runtime API]: runtime-api.md [PVF pre-checking runtime API]: ../../runtime-api/pvf-prechecking.md [Candidate Validation]: candidate-validation.md -[`PvfCheckStatement`]: ../../types/pvf-prechecking.md +[PvfCheckStatement]: ../../types/pvf-prechecking.md#pvfcheckstatement diff --git a/roadmap/implementers-guide/src/types/candidate.md b/roadmap/implementers-guide/src/types/candidate.md index b9d5900b7f17..baad5b07e6cd 100644 --- a/roadmap/implementers-guide/src/types/candidate.md +++ b/roadmap/implementers-guide/src/types/candidate.md @@ -92,6 +92,22 @@ struct CandidateDescriptor { } ``` +## `ValidationParams` + +```rust +/// Validation parameters for evaluating the parachain validity function. +pub struct ValidationParams { + /// Previous head-data. + pub parent_head: HeadData, + /// The collation body. + pub block_data: BlockData, + /// The current relay-chain block number. + pub relay_parent_number: RelayChainBlockNumber, + /// The relay-chain block's storage root. + pub relay_parent_storage_root: Hash, +} +``` + ## `PersistedValidationData` The validation data provides information about how to create the inputs for validation of a candidate. This information is derived from the chain state and will vary from para to para, although some of the fields may be the same for every para. diff --git a/roadmap/implementers-guide/src/types/chain.md b/roadmap/implementers-guide/src/types/chain.md deleted file mode 100644 index e8ec6cea8f4a..000000000000 --- a/roadmap/implementers-guide/src/types/chain.md +++ /dev/null @@ -1,30 +0,0 @@ -# Chain - -Types pertaining to the relay-chain - events, structures, etc. - -## Block Import Event - -```rust -/// Indicates that a new block has been added to the blockchain. -struct BlockImportEvent { - /// The block header-hash. - hash: Hash, - /// The header itself. - header: Header, - /// Whether this block is considered the head of the best chain according to the - /// event emitter's fork-choice rule. - new_best: bool, -} -``` - -## Block Finalization Event - -```rust -/// Indicates that a new block has been finalized. -struct BlockFinalizationEvent { - /// The block header-hash. - hash: Hash, - /// The header of the finalized block. - header: Header, -} -``` diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md index 4b9dc97c27e2..ad66d0132788 100644 --- a/roadmap/implementers-guide/src/types/overseer-protocol.md +++ b/roadmap/implementers-guide/src/types/overseer-protocol.md @@ -681,9 +681,7 @@ enum ProvisionerMessage { The Runtime API subsystem is responsible for providing an interface to the state of the chain's runtime. -This is fueled by an auxiliary type encapsulating all request types defined in the Runtime API section of the guide. - -> To do: link to the Runtime API section. Not possible currently because of https://github.com/Michael-F-Bryan/mdbook-linkcheck/issues/25. Once v0.7.1 is released it will work. +This is fueled by an auxiliary type encapsulating all request types defined in the [Runtime API section](../runtime-api) of the guide. ```rust enum RuntimeApiRequest { diff --git a/roadmap/implementers-guide/src/types/pvf-prechecking.md b/roadmap/implementers-guide/src/types/pvf-prechecking.md index 331429bd1fc5..f68f1e60feee 100644 --- a/roadmap/implementers-guide/src/types/pvf-prechecking.md +++ b/roadmap/implementers-guide/src/types/pvf-prechecking.md @@ -1,5 +1,7 @@ # PVF Pre-checking types +## `PvfCheckStatement` + > ⚠️ This type was added in v2. One of the main units of information on which PVF pre-checking voting is build is the `PvfCheckStatement`. diff --git a/scripts/ci/gitlab/lingua.dic b/scripts/ci/gitlab/lingua.dic index 3a19233a8fb9..a2f64af1cbd6 100644 --- a/scripts/ci/gitlab/lingua.dic +++ b/scripts/ci/gitlab/lingua.dic @@ -209,6 +209,7 @@ preconfigured preimage/MS preopen prepend/G +prevalidating prevalidation preverify/G programmatically