This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Runtime APIs for node-side code #1401

Merged
merged 12 commits into from
Jul 17, 2020
1 change: 1 addition & 0 deletions roadmap/implementers-guide/src/SUMMARY.md
@@ -14,6 +14,7 @@
- [InclusionInherent Module](runtime/inclusioninherent.md)
- [Validity Module](runtime/validity.md)
- [Router Module](runtime/router.md)
- [Runtime APIs](runtime-api/README.md)
- [Node Architecture](node/README.md)
- [Subsystems and Jobs](node/subsystems-and-jobs.md)
- [Overseer](node/overseer.md)
183 changes: 183 additions & 0 deletions roadmap/implementers-guide/src/runtime-api/README.md
@@ -0,0 +1,183 @@
# Runtime APIs

Runtime APIs are the means by which the node-side code extracts information from the state of the runtime.

Every block in the relay-chain contains a *state root*, which is the root hash of a state trie encapsulating all storage of runtime modules after execution of the block. This is a cryptographic commitment to a unique state. We use the terminology of accessing the *state at* a block to refer to accessing the state referred to by the state root of that block.

Although Runtime APIs are often used for simple storage access, they are actually empowered to do arbitrary computation. The implementation of the Runtime APIs lives within the Runtime as Wasm code and exposes extern functions that can be invoked with arguments and have a return value. Runtime APIs have access to a variety of host functions, which are contextual functions provided by the Wasm execution context, that allow them to carry out many different types of behaviors.

Abilities provided by host functions include:
* State Access
* Offchain-DB Access
* Submitting transactions to the transaction queue
* Optimized versions of cryptographic functions
* More

So it is clear that Runtime APIs are a versatile and powerful tool to leverage the state of the chain. In general, we will use Runtime APIs for these purposes:
* Access of a storage item
* Access of a bundle of related storage items
* Deriving a value from storage based on arguments
* Submitting misbehavior reports

More broadly, we have the goal of using Runtime APIs to write Node-side code that fulfills the requirements set by the Runtime, in particular the constraints set forth by the [Scheduler](../runtime/scheduler.md) and [Inclusion](../runtime/inclusion.md) modules. These modules are responsible for advancing paras with a two-phase protocol: validators are first chosen to validate and back a candidate, and are then required to ensure availability of referenced data. In the second phase, validators are meant to attest to those para-candidates for which they hold their availability chunk. As the Node-side code needs to generate the inputs into these two phases, the runtime API needs to transmit information from the runtime that is aware of the Availability Cores model instantiated by the Scheduler and Inclusion modules.

Node-side code is also responsible for detecting and reporting misbehavior performed by other validators, and the set of Runtime APIs needs to provide methods for observing live disputes and submitting reports as transactions.

The next sections will contain information on specific runtime APIs. The format is this:

```rust
/// Fetch the value of the runtime API at the block.
///
/// Definitionally, the `at` parameter cannot be any block that is not in the chain.
/// Thus the return value is unconditional. However, for in-practice implementations
Contributor Author

@ordian Does this address your Result question?

Member

Partially. I should have been more explicit in my question: it was more related to Substrate than to this PR, namely how low-level DB errors are handled. From what I see in https://github.com/paritytech/substrate/pull/3997/files, either the errors will be swallowed or a panic will occur. Anyway, this is handled on a different level.

/// it may be possible to provide an `at` parameter as a hash, which may not refer to a
/// valid block or one which implements the runtime API. In those cases it would be
/// best for the implementation to return an error indicating the failure mode.
fn some_runtime_api(at: Block, arg1: Type1, arg2: Type2, ...) -> ReturnValue;
```
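
The doc comment above recommends that in-practice implementations return an error when `at` does not refer to a valid block, or refers to one whose runtime lacks the API. A minimal sketch of that shape follows, assuming a toy in-memory client. `MockClient`, `RuntimeApiError`, and `validator_count` are hypothetical names introduced for illustration; they are not defined by this guide or by Substrate.

```rust
use std::collections::HashMap;

/// Hypothetical block hash type, for illustration only.
type Hash = [u8; 32];

/// Hypothetical failure modes of a runtime API call (not part of the guide).
#[derive(Debug, PartialEq)]
enum RuntimeApiError {
    /// The hash does not refer to a block known to the node.
    UnknownBlock(Hash),
    /// The block exists, but its runtime does not implement the requested API.
    ApiNotImplemented(Hash),
}

/// Toy stand-in for a node's view of the chain. Each known block hash maps to
/// the validator count its state would report, or to `None` when that block's
/// runtime does not expose the API.
struct MockClient {
    blocks: HashMap<Hash, Option<u32>>,
}

impl MockClient {
    /// Fallible counterpart of a runtime API shaped like
    /// `fn validator_count(at: Block) -> u32` (a made-up API for this sketch).
    fn validator_count(&self, at: Hash) -> Result<u32, RuntimeApiError> {
        match self.blocks.get(&at) {
            None => Err(RuntimeApiError::UnknownBlock(at)),
            Some(None) => Err(RuntimeApiError::ApiNotImplemented(at)),
            Some(Some(count)) => Ok(*count),
        }
    }
}
```

Callers can then distinguish a missing block from an outdated runtime instead of treating the return value as unconditional.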

## Validators

Yields the validator-set at the state of a given block. This validator set is always the one responsible for backing parachains in the child of the provided block.

```rust
fn validators(at: Block) -> Vec<ValidatorId>;
```

## Validator Groups

Yields the validator groups used during the current session. The validators in the groups are referred to by their index into the validator-set.

```rust
/// A helper data-type for tracking validator-group rotations.
struct GroupRotationInfo {
session_start_block: BlockNumber,
group_rotation_frequency: BlockNumber,
now: BlockNumber,
}

impl GroupRotationInfo {
    /// Returns the index of the group needed to validate the core at the given index,
    /// assuming the given number of cores/groups.
    fn group_for_core(&self, core_index: usize, cores: usize) -> usize;
Contributor

Suggested change
fn group_for_core(core_index: usize, cores: usize) -> usize;
fn group_for_core(core_index: ValidatorIndex, core: usize) -> usize;

If we have a ValidatorIndex newtype, we may as well use it.

Contributor Author

But these aren't validator indices, they're core indices. We do have a CoreIndex type, although I would like to keep it private to the scheduler module. It drifted in #1312; after #1411 we can move them back a bit.

}

/// Returns the validator groups and rotation info localized based on the block whose state
/// this is invoked on. Note that `now` in the `GroupRotationInfo` should be the successor of
/// the number of the block.
fn validator_groups(at: Block) -> (Vec<Vec<ValidatorIndex>>, GroupRotationInfo);
Member

would be nice to have a type alias for a validator group as well

type ValidatorGroup = Vec<ValidatorIndex>;

```
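
For intuition, `group_for_core` could be implemented along the following lines. This is a sketch under the assumption that groups advance by one core every `group_rotation_frequency` blocks since the session start. The Scheduler module remains the authority on the actual rotation rule, and the handling of a zero frequency or zero cores is a choice made for this example only.

```rust
type BlockNumber = u32;

/// Mirrors the helper type above, with concrete field types chosen for the sketch.
struct GroupRotationInfo {
    session_start_block: BlockNumber,
    group_rotation_frequency: BlockNumber,
    now: BlockNumber,
}

impl GroupRotationInfo {
    /// ASSUMPTION: each completed rotation period shifts every core to the next group.
    fn group_for_core(&self, core_index: usize, cores: usize) -> usize {
        if cores == 0 {
            // No cores means no meaningful group; return 0 rather than divide by zero.
            return 0;
        }
        if self.group_rotation_frequency == 0 {
            // Rotation disabled: cores keep their initial group.
            return core_index % cores;
        }
        let blocks_since_start = self.now.saturating_sub(self.session_start_block);
        let rotations = (blocks_since_start / self.group_rotation_frequency) as usize;
        // Reduce both terms first so the addition cannot overflow.
        (core_index % cores + rotations % cores) % cores
    }
}
```

With `session_start_block = 100`, `group_rotation_frequency = 10` and `now = 125`, two rotations have completed, so core 0 of 5 maps to group 2 under this rule.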

## Availability Cores

Yields information on all availability cores. Cores are either free or occupied. Free cores can have paras assigned to them. Occupied cores don't, but they can become available part-way through a block due to bitfields and then have something scheduled on them. To allow optimistic validation of candidates, the occupied cores are accompanied by information on what is upcoming. This information can be leveraged when validators perceive that there is a high likelihood of a core becoming available based on bitfields seen, and then optimistically validate something that would become scheduled based on that, although there is no guarantee on what the block producer will actually include in the block.

```rust
fn availability_cores(at: Block) -> Vec<CoreState>;
```

This is all the information that a validator needs about scheduling for the current block. It includes all information on [Scheduler](../runtime/scheduler.md) core-assignments and [Inclusion](../runtime/inclusion.md) state of blocks occupying availability cores. It includes data necessary to determine not only which paras are assigned now, but which cores are likely to become freed after processing bitfields, and exactly which bitfields would be necessary to make them so.

```rust
struct OccupiedCore {
    /// The ID of the para occupying the core.
    para: ParaId,
    /// If this core is freed by availability, this is the assignment that is next up on this
    /// core, if any. None if there is nothing queued for this core.
    next_up_on_available: Option<ScheduledCore>,
    /// The relay-chain block number this began occupying the core at.
    occupied_since: BlockNumber,
    /// The relay-chain block this will time-out at, if any.
    time_out_at: BlockNumber,
    /// If this core is freed by being timed-out, this is the assignment that is next up on this
    /// core. None if there is nothing queued for this core or there is no possibility of timing
    /// out.
    next_up_on_time_out: Option<ScheduledCore>,
Contributor

Having two next_up_on fields feels like it introduces the potential for confusion. When would we want next_up_on_time_out to differ from next_up_on_available? If they do differ, is it possible for a core both to timeout and to become available simultaneously? If so, what's the actual next scheduled core?

Contributor Author

"the runtime has all the answers".

We don't want them to differ, and yes, it is confusing, but this is purely exposing information about the system described in the scheduler module. It would be nice to have a clear "next up" in all cases, but deeper reading on the scheduler module will reveal why that is not possible.

Contributor Author

> is it possible for a core both to timeout and to become available simultaneously

No, they are mutually exclusive. These are the two different paths an Occupied core can take to the Free state. However, depending on which path is taken (which is unpredictable as it's based on the view/honesty of the block producer), the scheduling metadata can be different, leading to a different assignment onto the now-Free core.

Contributor Author

In general we should always optimize for the next_up_on_available case in Node-side code. The reason being that timeouts are only possible at a small subset of blocks, as they are only triggered within the short span of time directly following a group rotation. And even when timeouts can be triggered, unless validators are offline, they will not be reached before availability. And if validators are offline, it's fine to degrade throughput of paras somewhat.

However this runtime API is in the interest of making all information available to the node so more advanced strategies can be taken as research evolves.

    /// A bitfield with 1 bit for each validator in the set. `1` bits mean that the corresponding
    /// validator has attested to availability on-chain. A 2/3+ majority of `1` bits means that
    /// this will be available.
    availability: Bitfield,
}

struct ScheduledCore {
    /// The ID of a para scheduled.
    para: ParaId,
    /// The collator required to author the block, if any.
    collator: Option<CollatorId>,
}

enum CoreState {
    /// The core is currently occupied.
    Occupied(OccupiedCore),
    /// The core is currently free, with a para scheduled and given the opportunity
    /// to occupy.
    ///
    /// If a particular Collator is required to author this block, that is also present in this
    /// variant.
    Scheduled(ScheduledCore),
    /// The core is currently free and there is nothing scheduled. This can be the case for parathread
    /// cores when there are no parathread blocks queued. Parachain cores will never be left idle.
    Free,
}
```
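
The optimistic-validation idea described in this section can be sketched as follows. This is an illustration under stated assumptions, not the node's actual strategy: `likely_available` and `para_to_validate` are hypothetical helpers, the types are trimmed-down stand-ins for the ones above (`Vec<bool>` in place of `Bitfield`, `u32` in place of `ParaId`), and "2/3+" is read as strictly more than two thirds of validators.

```rust
/// Simplified stand-in for the guide's `ParaId(u32)` newtype.
type ParaId = u32;

struct ScheduledCore {
    para: ParaId,
}

struct OccupiedCore {
    next_up_on_available: Option<ScheduledCore>,
    /// On-chain availability bits, one per validator (stand-in for `Bitfield`).
    availability: Vec<bool>,
}

enum CoreState {
    Occupied(OccupiedCore),
    Scheduled(ScheduledCore),
    Free,
}

/// ASSUMPTION: "2/3+" means strictly more than two thirds of validators.
/// `seen` holds attestations observed off-chain; missing entries count as unset.
fn likely_available(on_chain: &[bool], seen: &[bool]) -> bool {
    let attested = (0..on_chain.len())
        .filter(|&i| on_chain[i] || seen.get(i).copied().unwrap_or(false))
        .count();
    3 * attested > 2 * on_chain.len()
}

/// Picks the para worth validating for this core, favoring the
/// `next_up_on_available` path when the core looks likely to be freed.
fn para_to_validate(core: &CoreState, seen: &[bool]) -> Option<ParaId> {
    match core {
        CoreState::Scheduled(scheduled) => Some(scheduled.para),
        CoreState::Occupied(occupied) if likely_available(&occupied.availability, seen) => {
            occupied.next_up_on_available.as_ref().map(|next| next.para)
        }
        CoreState::Occupied(_) | CoreState::Free => None,
    }
}
```

As the section notes, there is no guarantee on what the block producer will actually include, so any work started on this basis may be wasted.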

## Global Validation Schedule

Yields the [`GlobalValidationSchedule`](../types/candidate.md#globalvalidationschedule) at the state of a given block. This applies to all para candidates with the relay-parent equal to that block.

```rust
fn global_validation_schedule(at: Block) -> GlobalValidationSchedule;
```

## Local Validation Data

Yields the [`LocalValidationData`](../types/candidate.md#localvalidationdata) for the given [`ParaId`](../types/candidate.md#paraid), given an assumption to be used if the para currently occupies a core: whether the candidate occupying that core should be assumed to have been made available and included, or timed out and discarded. A third option asserts that the core was not occupied. This choice affects everything from the parent head-data to the validation code and the state of message queues. Typically, users will assume either that the core was free or that the occupying candidate was included, as timeouts are expected only in adversarial circumstances and, even then, only in a small minority of blocks directly following validator set rotations.

The documentation of [`LocalValidationData`](../types/candidate.md#localvalidationdata) has more information on this dichotomy.

```rust
/// An assumption being made about the state of an occupied core.
enum OccupiedCoreAssumption {
    /// The candidate occupying the core was made available and included to free the core.
    Included,
    /// The candidate occupying the core timed out and freed the core without advancing the para.
    TimedOut,
    /// The core was not occupied to begin with.
    Free,
}

/// Returns the local validation data for the given para and occupied core assumption.
///
/// Returns `None` if either the para is not registered or the assumption is `Free`
/// and the para already occupies a core.
fn local_validation_data(at: Block, ParaId, OccupiedCoreAssumption) -> Option<LocalValidationData>;
```
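
The typical choice of assumption described above (free core, or occupying candidate included) might be sketched like this. `default_assumption` is a hypothetical helper, and representing core occupancy as a slice of optional `u32` para IDs is a simplification made for the example.

```rust
/// Mirrors the enum above; derives added so the result can be compared in tests.
#[derive(Debug, PartialEq)]
enum OccupiedCoreAssumption {
    Included,
    TimedOut,
    Free,
}

/// `occupants[i]` is the para occupying core `i`, if any (simplified to `u32` IDs).
///
/// ASSUMPTION: timeouts are rare, so an occupied core is optimistically treated
/// as if its candidate will be included. `TimedOut` is never chosen here.
fn default_assumption(para: u32, occupants: &[Option<u32>]) -> OccupiedCoreAssumption {
    if occupants.contains(&Some(para)) {
        OccupiedCoreAssumption::Included
    } else {
        OccupiedCoreAssumption::Free
    }
}
```

Choosing `Free` for a para that does occupy a core is the case in which `local_validation_data` returns `None`, which is why the helper checks occupancy first.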

## Session Index

Get the session index that is expected at the child of a block.

In the [`Initializer`](../runtime/initializer.md) module, session changes are buffered by one block. The session index of the child of any relay block is always predictable by that block's state.

This session index can be used to derive a [`SigningContext`](../types/candidate.md#signing-context).

```rust
/// Returns the session index expected at a child of the block.
fn session_index_for_child(at: Block) -> SessionIndex;
```
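
As an illustration of deriving a signing context from this API, the sketch below pairs the session index expected at the child with the hash of the parent block. The field layout of `SigningContext` and the `signing_payload` helper are assumptions made for this example (the real encoding is not a plain concatenation); see the linked type definition for the authoritative shape.

```rust
type Hash = [u8; 32];
type SessionIndex = u32;

/// ASSUMED shape of a signing context: the session index plus the relay-parent hash.
#[derive(Debug, PartialEq)]
struct SigningContext {
    session_index: SessionIndex,
    parent_hash: Hash,
}

/// Builds the context under which statements about children of `parent_hash`
/// would be signed, using the value returned by `session_index_for_child`.
fn signing_context_for_child(
    parent_hash: Hash,
    session_index_for_child: SessionIndex,
) -> SigningContext {
    SigningContext {
        session_index: session_index_for_child,
        parent_hash,
    }
}

impl SigningContext {
    /// Hypothetical payload construction: the statement bytes followed by the
    /// context, so a signature made in one session cannot be replayed in another.
    fn signing_payload(&self, statement: &[u8]) -> Vec<u8> {
        let mut payload = Vec::with_capacity(statement.len() + 4 + 32);
        payload.extend_from_slice(statement);
        payload.extend_from_slice(&self.session_index.to_le_bytes());
        payload.extend_from_slice(&self.parent_hash);
        payload
    }
}
```

Because session changes are buffered by one block, the index passed in here is known from the parent's state before the child exists.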

## Validation Code

Fetch the validation code used by a para, making the given `OccupiedCoreAssumption`.

```rust
fn validation_code(at: Block, ParaId, OccupiedCoreAssumption) -> ValidationCode;
```

## Candidate Pending Availability

Get the receipt of a candidate pending availability. This returns `Some` for any paras assigned to occupied cores in `availability_cores` and `None` otherwise.

```rust
fn candidate_pending_availability(at: Block, ParaId) -> Option<CommittedCandidateReceipt>;
```
3 changes: 1 addition & 2 deletions roadmap/implementers-guide/src/runtime/scheduler.md
@@ -131,8 +131,7 @@ Storage layout:
ValidatorGroups: Vec<Vec<ValidatorIndex>>;
/// A queue of upcoming claims and which core they should be mapped onto.
ParathreadQueue: ParathreadQueue;
/// One entry for each availability core. Entries are `None` if the core is not currently occupied. Can be
/// temporarily `Some` if scheduled but not occupied.
/// One entry for each availability core. Entries are `None` if the core is not currently occupied.
/// The i'th parachain belongs to the i'th core, with the remaining cores all being
/// parathread-multiplexers.
AvailabilityCores: Vec<Option<CoreOccupied>>;
13 changes: 10 additions & 3 deletions roadmap/implementers-guide/src/types/candidate.md
@@ -7,6 +7,14 @@ In a way, this entire guide is about these candidates: how they are scheduled, c

This section will describe the base candidate type, its components, and variants that contain extra data.

## Para Id

A unique 32-bit identifier referring to a specific para (chain or thread). The relay-chain runtime guarantees that `ParaId`s are unique for the duration of any session, but recycling and reuse over a longer period of time is permitted.
Contributor

It doesn't matter for the purpose of this PR, but I'm curious: are the wrapped u32s hashes of some kind, or handed out sequentially, or what?

Contributor Author

We haven't described the registrar in the guide yet, but I think it gives them out sequentially. Parachain IDs start at 0 and parathread IDs start with some of the higher bits set, although I don't remember exactly what.

Reuse is fine as long as it's at least a few sessions apart, although it would be best not to reuse until having cycled completely. We could alternatively use a generation/index system like this for handing out IDs to avoid reuse over a long period of time: http://bitsquid.blogspot.com/2014/08/building-data-oriented-entity-system.html

The uniqueness property is what's important here and the details of the registrar are free to change


```rust
struct ParaId(u32);
```

## Candidate Receipt

Much info in a [`FullCandidateReceipt`](#full-candidate-receipt) is duplicated from the relay-chain state. When the corresponding relay-chain state is considered widely available, the Candidate Receipt should be favored over the `FullCandidateReceipt`.
@@ -64,7 +72,7 @@ This struct is pure description of the candidate, in a lightweight format.
/// A unique descriptor of the candidate receipt.
struct CandidateDescriptor {
    /// The ID of the para this is a candidate for.
    para_id: Id,
    para_id: ParaId,
    /// The hash of the relay-chain block this is executed in the context of.
    relay_parent: Hash,
    /// The collator's sr25519 public key.
@@ -82,8 +90,6 @@ struct CandidateDescriptor {

The global validation schedule comprises information describing the global environment for para execution, as derived from a particular relay-parent. These are parameters that will apply to all parablocks executed in the context of this relay-parent.

> TODO: message queue watermarks (first downward messages, then XCMP channels)

```rust
/// Extra data that is needed along with the other fields in a `CandidateReceipt`
/// to fully validate the candidate.
@@ -112,6 +118,7 @@ This choice can also be expressed as a choice of which parent head of the para w
Para validation happens optimistically before the block is authored, so it is not possible to predict with 100% accuracy what will happen in the earlier phase of the [`InclusionInherent`](../runtime/inclusioninherent.md) module where new availability bitfields and availability timeouts are processed. This is what will eventually define whether a candidate can be backed within a specific relay-chain block.

> TODO: determine if balance/fees are even needed here.
> TODO: message queue watermarks (first downward messages, then XCMP channels)

```rust
/// Extra data that is needed along with the other fields in a `CandidateReceipt`
44 changes: 22 additions & 22 deletions roadmap/implementers-guide/src/types/overseer-protocol.md
@@ -228,33 +228,33 @@ enum ProvisionerMessage {

The Runtime API subsystem is responsible for providing an interface to the state of the chain's runtime.

Other subsystems query this data by sending these messages.
This is fueled by an auxiliary type encapsulating all request types defined in the Runtime API section of the guide.

```rust
/// The information on validator groups, core assignments,
/// upcoming paras and availability cores.
struct SchedulerRoster {
    /// Validator-to-groups assignments.
    validator_groups: Vec<Vec<ValidatorIndex>>,
    /// All scheduled paras.
    scheduled: Vec<CoreAssignment>,
    /// Upcoming paras (chains and threads).
    upcoming: Vec<ParaId>,
    /// Occupied cores.
    availability_cores: Vec<Option<CoreOccupied>>,
}
> TODO: link to the Runtime API section. Not possible currently because of https://github.com/Michael-F-Bryan/mdbook-linkcheck/issues/25. Once v0.7.1 is released it will work.

```rust
enum RuntimeApiRequest {
    /// Get the current validator set.
    Validators(ResponseChannel<Vec<ValidatorId>>),
    /// Get the assignments of validators to cores, upcoming parachains.
    SchedulerRoster(ResponseChannel<SchedulerRoster>),
    /// Get a signing context for bitfields and statements.
    SigningContext(ResponseChannel<SigningContext>),
    /// Get the validation code for a specific para, assuming execution under given block number, and
    /// an optional block number representing an intermediate parablock executed in the context of
    /// that block.
    ValidationCode(ParaId, BlockNumber, Option<BlockNumber>, ResponseChannel<ValidationCode>),
    /// Get the validator groups and rotation info.
    ValidatorGroups(ResponseChannel<(Vec<Vec<ValidatorIndex>>, GroupRotationInfo)>),
    /// Get the session index for children of the block. This can be used to construct a signing
    /// context.
    SessionIndex(ResponseChannel<SessionIndex>),
    /// Get the validation code for a specific para, using the given occupied core assumption.
    ValidationCode(ParaId, OccupiedCoreAssumption, ResponseChannel<Option<ValidationCode>>),
    /// Get the global validation schedule at the state of a given block.
    GlobalValidationSchedule(ResponseChannel<GlobalValidationSchedule>),
    /// Get the local validation data for a specific para, with the given occupied core assumption.
    LocalValidationData(
        ParaId,
        OccupiedCoreAssumption,
        ResponseChannel<Option<LocalValidationData>>,
    ),
    /// Get information about all availability cores.
    AvailabilityCores(ResponseChannel<AvailabilityCores>),
    /// Get a committed candidate receipt for all candidates pending availability.
    CandidatePendingAvailability(ParaId, ResponseChannel<Option<CommittedCandidateReceipt>>),
}

enum RuntimeApiMessage {