This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Integrate Prospective Parachains Subsystem into Backing: Part 2 #5618

Closed

Conversation

rphmeier
Contributor

@rphmeier rphmeier commented May 31, 2022

Follows on to #5557 (currently based upon it, but I will rebase on the feature branch once that's merged)

Closes #5055 . More details about the intended changeset can be found in this issue.

At a high level, the goals of the networking changes for asynchronous backing are as follows. We are coordinating two upgrades: a runtime API upgrade (v2 -> v3) and a network message upgrade (v1 -> v2). The idea is to do the network protocol upgrade first and make it compatible with the messages needed for both runtime v2 and v3. Peers which are both on net-v2 will send each other messages and continue operating even after the runtime API upgrade, while peers still on net-v1 will be useless for statement distribution after the runtime API upgrade. Until the runtime API upgrade, nodes running net-v2 will continue circulating statements to peers on net-v1, because the format is backwards compatible. We will do something similar in the collator-protocol as well. Nothing changes in bitfield distribution, availability distribution, availability recovery, approval-distribution, or dispute-distribution, so outdated nodes can still continue interoperating on those protocols.
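The fallback rule above amounts to "speak the highest protocol version both peers support." A minimal sketch of that negotiation, with illustrative names (`ProtocolVersion`, `choose_version`) that are not the actual Polkadot network-bridge API:

```rust
// Hypothetical sketch of the version-fallback rule: prefer vstaging,
// fall back to v1 for outdated peers. Names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ProtocolVersion {
    V1,
    VStaging, // becomes net-v2 once stabilized
}

/// Pick the highest protocol version both peers support, if any.
fn choose_version(
    ours: &[ProtocolVersion],
    theirs: &[ProtocolVersion],
) -> Option<ProtocolVersion> {
    ours.iter().copied().filter(|v| theirs.contains(v)).max()
}

fn main() {
    // A vstaging-capable node keeps v1 as a fallback for outdated peers.
    let ours = [ProtocolVersion::V1, ProtocolVersion::VStaging];
    assert_eq!(
        choose_version(&ours, &[ProtocolVersion::V1]),
        Some(ProtocolVersion::V1)
    );
    assert_eq!(
        choose_version(&ours, &[ProtocolVersion::V1, ProtocolVersion::VStaging]),
        Some(ProtocolVersion::VStaging)
    );
    // No common version: the peer is useless for this subsystem.
    assert_eq!(choose_version(&[ProtocolVersion::VStaging], &[]), None);
}
```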

The main changes that asynchronous backing makes to backing are that (1) validators can legally second more than one candidate per relay-parent, and (2) candidates can stick around for longer than the relay-parent remains a leaf in the block-tree. This has numerous implications for spam prevention, which are described in detail in #5055.

Work in this PR:

  • Introduces a new network protocol, vstaging, which is identical to v1 except for the statement distribution changes. This will become net-v2 later.
  • Enables vstaging as the default, with v1 as a fallback, under the 'network-protocol-staging' feature flag
  • Adapts the network bridge to handle vstaging network messages properly
  • Updates network subsystems to gracefully handle vstaging
  • Updates statement distribution to handle asynchronous backing, and updates the vstaging network protocol accordingly. This is the bulk of the work.
  • Tests the new statement distribution logic

@rphmeier
Contributor Author

rphmeier commented Jun 6, 2022

Posting some notes on a couple of questions I was thinking about: large statements, spam, and topology. At the end there is a proposed solution to all of the issues.

  • How do we handle large statements for asynchronous backing?
    • In the current form of statement distribution, we pile up all statements depending on a large CommittedCandidateReceipt and wait until we've fetched the candidate before processing them. Detecting dependent statements is easy because we can just check the candidate hash in the compact statement.
    • With asynchronous backing, we may have candidates that depend on other candidates. We might get a candidate and check where it might appear in the fragment tree and find nothing. This could be spam, or it could be dependent on some Seconded statement that we've yet to fetch the candidate for.
    • The best solution I can think of is to detect spam lazily, at least when it comes to Seconded statements. The rule is that we can't detect spam candidates until all of the fetches that were ongoing for the same para when the candidate was received have concluded. When we complete the last pending fetch started before the candidate was received, then we can definitively say whether the candidate is spam. For Valid statements, we can simply use the current spam detection mechanism of disallowing them unless the Seconded statement is known.
    • We don't import or forward anything until we definitively know it's not spam. Furthermore, we don't initiate fetches for unconfirmed large-statements until we know they're not spam. This may mean we'll only make one large-statement request per para at a time under certain circumstances.
    • We'll need some kind of upper-bound threshold on the number of potentially-spam candidates we're willing to field at any given time, both in general and from each peer. It should be n_validators * (max_depth + 1) per relay-parent, uniformly distributed - that is, max_depth + 1 per validator per relay-parent. This is an over-estimate, and we should also be careful to actually import only one per depth per validator per active leaf. If a peer ever sends us more than one candidate at the same depth under an active leaf, then it's spam. This also means that we have to do large-statement fetches just in order to determine whether candidates we're receiving are spam.
  • Gossip Topology for asynchronous backing statement distribution
    • With asynchronous backing, statement distribution gets a few more restrictions on which messages can be sent at any time - i.e. we can't send statements about a candidate Y until its parent X is known by the peer. The only way we learn whether a candidate is known by a peer is if they send it to us or we send it to them.
    • This is potentially incompatible with the gossip topology: the peer A who sends some target peer P the candidate X will likely not be the same as B who wants to send P the candidate Y. If X and Y have different authors and therefore appear elsewhere in the topology, this is almost certainly the case.
    • What we can do is ensure that every peer makes the other peers in its row and column aware of which candidates it has.
    • We can introduce a new message "NoteAware" which is sent to peers (in their row/column) for every candidate they're now aware of, and which is not forwarded. This will add some traffic overhead (mostly in the number of messages) which scales linearly in the number of validators and proportionally to sqrt(n_peers). Peers shouldn't send candidates which build on some other candidate until they've received a "NoteAware" from the intended recipient. This would also solve the spam prevention issue described above.
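The per-peer spam bound described above (at most max_depth + 1 unconfirmed candidates per validator per relay-parent) can be sketched as a simple counter table. All names here (`UnconfirmedTracker`, `try_note`) are hypothetical, not the actual subsystem's types:

```rust
use std::collections::HashMap;

// Illustrative sketch of the unconfirmed-candidate bound: field at most
// max_depth + 1 candidates per (relay-parent, validator) before treating
// further ones as spam. Hashes are stand-in u64s for brevity.
type RelayParent = u64;
type ValidatorIndex = u32;

struct UnconfirmedTracker {
    max_depth: usize,
    // (relay_parent, validator) -> number of unconfirmed candidates fielded.
    counts: HashMap<(RelayParent, ValidatorIndex), usize>,
}

impl UnconfirmedTracker {
    fn new(max_depth: usize) -> Self {
        Self { max_depth, counts: HashMap::new() }
    }

    /// Returns true if we can field another unconfirmed candidate from this
    /// validator under this relay-parent; false means treat it as spam.
    fn try_note(&mut self, relay_parent: RelayParent, validator: ValidatorIndex) -> bool {
        let entry = self.counts.entry((relay_parent, validator)).or_insert(0);
        if *entry >= self.max_depth + 1 {
            false
        } else {
            *entry += 1;
            true
        }
    }
}

fn main() {
    let mut tracker = UnconfirmedTracker::new(1); // max_depth = 1 -> 2 allowed
    assert!(tracker.try_note(0xAA, 7));
    assert!(tracker.try_note(0xAA, 7));
    assert!(!tracker.try_note(0xAA, 7)); // third from same (leaf, validator): spam
    assert!(tracker.try_note(0xBB, 7)); // a different relay-parent counts separately
}
```

Counts here would be decremented (not shown) once a fetch concludes and the candidate is confirmed or discarded, which is the "lazy" part of the detection.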

@slumber slumber force-pushed the rh-async-backing-integration-2 branch from 1300b20 to 5c8e048 on June 22, 2022 19:01
@burdges
Contributor

burdges commented Sep 8, 2022

Are Views also tracking nodes assigned roles? We'd presumably send initial backing statements only to other backers, and only gossip double backing statements to everybody, for example.

@rphmeier
Contributor Author

rphmeier commented Sep 9, 2022

> Are Views also tracking nodes assigned roles? We'd presumably send initial backing statements only to other backers, and only gossip double backing statements to everybody, for example.

They don't track roles directly, but we can achieve the same effect based on node network public keys plus chain state, which is what we usually do.
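A minimal sketch of that indirection, deriving a peer's role from its network identity plus chain state; the types (`View`, `is_backer`, the u64 stand-ins for peer and authority keys) are illustrative, not the real node's API:

```rust
use std::collections::HashMap;

// Hypothetical sketch: the View doesn't store roles, but a peer's network
// identity maps (via authority discovery) to an authority key, and chain
// state says which authorities are in the backing group.
type PeerId = u64;      // stand-in for a libp2p PeerId
type AuthorityId = u64; // stand-in for an authority-discovery key

struct View {
    // Learned from the network: which authority a peer identifies as.
    peer_authority: HashMap<PeerId, AuthorityId>,
    // From chain state: the authorities assigned to the backing group.
    backing_group: Vec<AuthorityId>,
}

impl View {
    /// A peer counts as a backer if its network identity maps to an
    /// authority in the current backing group.
    fn is_backer(&self, peer: PeerId) -> bool {
        self.peer_authority
            .get(&peer)
            .map_or(false, |a| self.backing_group.contains(a))
    }
}

fn main() {
    let view = View {
        peer_authority: HashMap::from([(1, 100), (2, 200)]),
        backing_group: vec![100],
    };
    assert!(view.is_backer(1));  // maps to authority 100, in the group
    assert!(!view.is_backer(2)); // known authority, but not a backer
    assert!(!view.is_backer(3)); // unknown peer
}
```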

Labels
A3-in_progress Pull request is in progress. No review needed at this stage.
3 participants