If it turns out that some limits weren't respected, the validators will generate
When a shard is missing some chunks, the following chunk on that shard will receive receipts from multiple blocks. This could lead to large `source_receipt_proofs` so a mechanism is added to reduce the impact. If there are two or more missing chunks in a row,
the shard is considered fully congested and no new receipts will be sent to it (unless it's the `allowed_shard` to avoid deadlocks).

## ChunkStateWitness distribution

For chunk production, the chunk producer is required to distribute the chunk state witness to all the chunk validators. The chunk validators then validate the chunk and send the chunk endorsement to the block producer. Chunk state witness distribution is on a latency critical path.

As we saw in the section above, the maximum size of the state witness can be ~16 MiB. If the chunk producer were to send the chunk state witness to all the chunk validators it would add a massive bandwidth requirement for the chunk producer. To ease and distribute the network requirements across all the chunk producers, we have a distribution mechanism similar to what we have for chunks in the shards manager. We divide the chunk state witness into a number of parts, and let the chunk validators distribute the parts among themselves, and later reconstruct the chunk state witness.

### Distribution mechanism

A chunk producer divides the state witness into a set of `N` parts where `N` is the number of chunk validators. The parts or partial witnesses are represented as [PartialEncodedStateWitness](https://github.com/near/nearcore/blob/66d3b134343d9f35f6e0b437ebbdbef3e4aa1de3/core/primitives/src/stateless_validation.rs#L40). Each chunk validator is the owner of one part. The chunk producer uses the [PartialEncodedStateWitnessMessage](https://github.com/near/nearcore/blob/66d3b134343d9f35f6e0b437ebbdbef3e4aa1de3/chain/network/src/state_witness.rs#L11) to send each part to their respective owners. The chunk validator part owners, on receiving the `PartialEncodedStateWitnessMessage`, forward this part to all other chunk validators via the [PartialEncodedStateWitnessForwardMessage](https://github.com/near/nearcore/blob/66d3b134343d9f35f6e0b437ebbdbef3e4aa1de3/chain/network/src/state_witness.rs#L15). Each validator then uses the partial witnesses received to reconstruct the full chunk state witness.
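The part-assignment step can be sketched as follows. This is a hypothetical simplification: the real nearcore code applies Reed Solomon encoding (next section), wraps each part in `PartialEncodedStateWitness`, and signs it; here the serialized witness bytes are simply split evenly.

```rust
/// Hypothetical sketch: split the serialized witness into `num_validators`
/// roughly equal parts; part `i` is owned by chunk validator `i`, who then
/// forwards it to all the other chunk validators.
fn split_witness(witness: &[u8], num_validators: usize) -> Vec<Vec<u8>> {
    assert!(num_validators > 0);
    // Ceiling division so that every byte is covered by some part.
    let part_len = (witness.len() + num_validators - 1) / num_validators;
    witness.chunks(part_len).map(|chunk| chunk.to_vec()).collect()
}
```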

We have a separate [PartialWitnessActor](https://github.com/near/nearcore/blob/66d3b134343d9f35f6e0b437ebbdbef3e4aa1de3/chain/client/src/stateless_validation/partial_witness/partial_witness_actor.rs#L32) actor module that is responsible for dividing the state witness into parts, distributing the parts, handling both the partial encoded state witness message and the forward message, validating and storing the parts, and reconstructing the state witness from the parts and sending it to the chunk validation module.

### Building redundancy using Reed Solomon Erasure encoding

During the distribution mechanism, it's possible that some of the chunk validators are malicious, offline, or have a high network latency. Since chunk witness distribution is on the critical path for block production, we safeguard the distribution mechanism by building in redundancy using the Reed Solomon Erasure encoding.

With Reed Solomon Erasure encoding, we can divide the chunk state witness into `N` total parts with `D` data parts. We can reconstruct the whole state witness as long as we have any `D` of the `N` parts. The ratio of data parts `r = D/N` is a parameter we can tune.

While reducing `r`, i.e. the number of data parts required to reconstruct the state witness, does allow for a more robust distribution mechanism, it comes at the cost of bloating the overall size of the parts we need to distribute. If `S` is the size of the state witness, after Reed Solomon encoding the total size `S'` of all parts becomes `S' = S/r` or `S' = S * N / D`.
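The size bloat can be computed directly from this formula; a worked example with illustrative numbers (not values taken from the protocol):

```rust
/// Total size of all encoded parts: S' = S * N / D = S / r.
fn encoded_total_size(witness_size_mib: f64, data_parts: usize, total_parts: usize) -> f64 {
    witness_size_mib * total_parts as f64 / data_parts as f64
}
```

With `S = 16` MiB, `D = 12` and `N = 20` (so `r = 0.6`), the parts together weigh `16 / 0.6 ≈ 26.7` MiB.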

For the first release of stateless validation, we've kept the ratio at `0.6`, meaning that roughly 2/3 of all chunk validators need to be online for the chunk state witness distribution mechanism to work smoothly.

One thing to note here is that the redundancy and upkeep requirement of 2/3 applies to the _number_ of chunk validators and not the _stake_ of the chunk validators.

### PartialEncodedStateWitness structure

The partial encoded state witness has the following fields:
- `(epoch_id, shard_id, height_created)` : These three fields together uniquely determine the chunk associated with the partial witness. Since the chunk and chunk header distribution mechanism is independent of the partial witness, we rely on this triplet to uniquely identify which chunk a part is associated with.
- `part_ord` : The index or id of the part in the array of partial witnesses.
- `part` : The data associated with the part.
- `encoded_length` : The total length of the state witness. This is required in the Reed Solomon decoding process to reconstruct the state witness.
- `signature` : Each part is signed by the chunk producer. This way the validity of the partial witness can be verified by the chunk validators receiving the parts.
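In simplified form, the structure might look like this. It is a hypothetical mirror of the fields listed above; the actual definition in `core/primitives/src/stateless_validation.rs` uses nearcore's own types such as `EpochId`, `ShardId` and `Signature`, simplified here to std types.

```rust
/// Simplified stand-in for nearcore's `PartialEncodedStateWitness`.
struct PartialEncodedStateWitness {
    // (epoch_id, shard_id, height_created) together uniquely identify
    // the chunk this part belongs to.
    epoch_id: [u8; 32],
    shard_id: u64,
    height_created: u64,
    // Index of this part among the `N` parts.
    part_ord: usize,
    // The part's data.
    part: Vec<u8>,
    // Total length of the encoded witness, needed by Reed Solomon decoding.
    encoded_length: u64,
    // Chunk producer's signature over the part.
    signature: [u8; 64],
}
```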

The `PartialEncodedStateWitnessTracker` module is responsible for the storage and decoding of partial witnesses. This module has an LRU cache that stores all the partial witnesses with the `(epoch_id, shard_id, height_created)` triplet as the key. We reconstruct the state witness as soon as we have `D` of the `N` parts and forward the state witness to the validation module.
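A minimal sketch of that logic, assuming a plain `HashMap` and a simple part count in place of the real module's LRU cache and Reed Solomon decoding:

```rust
use std::collections::HashMap;

/// (epoch_id, shard_id, height_created), simplified to integers.
type ChunkKey = (u64, u64, u64);

struct PartTracker {
    /// `D`: the number of data parts needed to reconstruct the witness.
    data_parts_needed: usize,
    /// Parts received so far, keyed by the chunk they belong to.
    parts: HashMap<ChunkKey, Vec<Vec<u8>>>,
}

impl PartTracker {
    /// Store a part; returns true once enough parts have arrived
    /// to reconstruct the state witness for this chunk.
    fn store_part(&mut self, key: ChunkKey, part: Vec<u8>) -> bool {
        let entry = self.parts.entry(key).or_default();
        entry.push(part);
        entry.len() >= self.data_parts_needed
    }
}
```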

### Network tradeoffs

To get a sense of the network requirements for validators with and without the partial state witness distribution mechanism, we can do some quick back-of-the-envelope calculations. Let `N` be the number of chunk validators, `S` be the size of the chunk state witness, and `r` be the ratio of data parts to total parts for Reed Solomon Erasure encoding.

Without the partial state witness distribution, each chunk producer would have to send the state witness to all chunk validators, which would require a bandwidth `B` of `B = N * S`. For the worst case of ~16 validators and a ~16 MiB state witness, this can be a burst requirement of ~2 Gbps.

Partial state witness distribution takes this load off the chunk producer and distributes it evenly among all the chunk validators. However, we incur an additional factor of `1/r` of extra data transferred for redundancy. Each partial witness has a size of `P = S' / N` or `P = S / r / N`. The chunk producer and each validator need a bandwidth `B` of `B = P * N` or `B = S / r` to forward their owned part to all `N` chunk validators. For the worst case of a ~16 MiB state witness and an encoding ratio of `0.6`, this works out to ~214 Mbps, which is much more reasonable.
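The arithmetic above, restated as code (the numbers are the illustrative worst-case figures from this section, not protocol constants):

```rust
/// Naive scheme: the producer sends the full witness of size `s_mib`
/// to all `n` validators, so B = N * S.
fn naive_bandwidth_mib(n: usize, s_mib: f64) -> f64 {
    n as f64 * s_mib
}

/// Distributed scheme: each node forwards its one part (size S / r / N)
/// to all N validators, so B = (S / r / N) * N = S / r.
fn distributed_bandwidth_mib(s_mib: f64, r: f64) -> f64 {
    s_mib / r
}
```

With `N = 16` and `S = 16` MiB, the naive burst is `256` MiB per witness (~2 Gib), versus `16 / 0.6 ≈ 26.7` MiB (~214 Mib) per node with distribution.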

### Future work

In the Reed Solomon Erasure encoding section we discussed that the chunk state witness distribution mechanism relies on 2/3 of the _number_ of chunk validators being available/non-malicious, not 2/3 of the _total stake_ of the chunk validators. This creates a potential issue: more than 1/3 of the chunk validators, holding a small enough total stake, could be unavailable and cause chunk production to stall. In the future we would like to address this problem.

## Validator Role Change
Currently, there are two different types of validators and their responsibilities are as follows: