Collation fetching fairness #4880

tdimitrov · 2024-06-26T07:13:31Z

Related to #1797

The problem

When fetching collations on the validator side of the collator protocol we need to ensure that each parachain gets a fair share of core time, depending on its assignments in the claim queue. This means that the number of collations fetched per parachain should ideally be equal to (and definitely not greater than) the number of claims for that parachain in the claim queue.

Why the current implementation is not good enough

The current implementation doesn't guarantee such fairness. For each relay parent there is a waiting_queue (PerRelayParent -> Collations -> waiting_queue) which holds any unfetched collations advertised to the validator. The collations are fetched on a first-in, first-out basis, which means that if two parachains share a core and one of them is more aggressive, it might starve the other. How? At each relay parent up to max_candidate_depth candidates are accepted (enforced in fn is_seconded_limit_reached), so if one of the parachains is quick enough to fill the queue with its advertisements, the validator will never fetch anything from the rest of the parachains even though they are scheduled. This doesn't mean that the aggressive parachain will occupy all the core time (the runtime guarantees that it won't), but it will prevent the other parachains sharing the same core from having collations backed.

How to fix it

The solution I am proposing is to limit fetches and advertisements based on the state of the claim queue. At each relay parent the claim queue for the core assigned to the validator is fetched. For each parachain a fetch limit is calculated (equal to the number of its entries in the claim queue). Advertisements are not fetched for a parachain which has already used up its claims in the claim queue. This solves the problem of aggressive parachains advertising too many collations.
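A minimal sketch of the per-parachain fetch limit described above. It assumes a flat claim queue for a single core; `ParaId` here is a plain integer stand-in, and `fetch_limits` and `can_accept` are hypothetical helper names, not functions from the PR.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real `ParaId` type.
type ParaId = u32;

/// Counts how many claims each parachain holds in `claim_queue`.
/// That count is the fetch limit for the parachain.
fn fetch_limits(claim_queue: &[ParaId]) -> HashMap<ParaId, usize> {
    let mut limits = HashMap::new();
    for para in claim_queue {
        *limits.entry(*para).or_insert(0) += 1;
    }
    limits
}

/// Returns `true` if one more advertisement for `para` may be accepted,
/// given how many have already been accepted per parachain.
fn can_accept(
    claim_queue: &[ParaId],
    already_accepted: &HashMap<ParaId, usize>,
    para: ParaId,
) -> bool {
    let limit = fetch_limits(claim_queue).get(&para).copied().unwrap_or(0);
    let used = already_accepted.get(&para).copied().unwrap_or(0);
    used < limit
}
```

With a claim queue of `[1, 1, 2]`, parachain 1 is limited to two fetches and parachain 2 to one; a parachain that is not in the claim queue gets a limit of zero.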

The second part is in the collation fetching logic. The validator will keep track of which collations it has fetched so far. When a new collation needs to be fetched, instead of popping the first entry from the waiting_queue the validator examines the claim queue and looks for the earliest claim which doesn't yet have a corresponding fetch. This way the validator will always try to prioritise the most urgent entries.
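The selection rule above can be sketched as follows. This is an illustration under simplifying assumptions (one core, parachains identified by integers); `unfulfilled_entries` and `next_para_to_fetch` are hypothetical names and do not mirror the real signatures in the PR.

```rust
/// Hypothetical stand-in for the real `ParaId` type.
type ParaId = u32;

/// Returns the claim queue entries that have no matching fetch yet,
/// preserving claim queue order (earlier means more urgent).
fn unfulfilled_entries(claim_queue: &[ParaId], fetched: &[ParaId]) -> Vec<ParaId> {
    // Each fetch can satisfy at most one claim, so consume fetches as we go.
    let mut remaining: Vec<ParaId> = fetched.to_vec();
    let mut unfulfilled = Vec::new();
    for para in claim_queue {
        if let Some(pos) = remaining.iter().position(|p| p == para) {
            remaining.swap_remove(pos);
        } else {
            unfulfilled.push(*para);
        }
    }
    unfulfilled
}

/// Picks the parachain to fetch next: the earliest unfulfilled claim
/// for which an advertisement is waiting.
fn next_para_to_fetch(
    claim_queue: &[ParaId],
    fetched: &[ParaId],
    waiting: &[ParaId],
) -> Option<ParaId> {
    unfulfilled_entries(claim_queue, fetched)
        .into_iter()
        .find(|para| waiting.contains(para))
}
```

For a claim queue `[1, 2, 1]` with one fetch already made for parachain 1 and waiting advertisements `[1, 1, 2]`, the sketch picks parachain 2, even though parachain 1 advertised first.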

How is the 'fair share of coretime' for each parachain determined?

Thanks to async backing we can accept more than one candidate per relay parent (with some constraints). We also have the claim queue, which gives us a hint about which parachain will be scheduled next on each core. So thanks to the claim queue we can determine the maximum number of claims per parachain.

For example the claim queue is [A A A] at relay parent X so we know that at relay parent X we can accept three candidates for parachain A. There are two things to consider though:

If we accept more than one candidate at relay parent X we are claiming the slot of a future relay parent. So accepting two candidates for relay parent X means that we are claiming the slot at rp X+1 or rp X+2.
At the same time the slot at relay parent X could have been claimed by previous relay parent(s). This means that we need to accept fewer candidates at X, or even no candidates.

There are a few cases worth considering:

Slot claimed by previous relay parent.
CQ @ rp X: [A A A]
Advertisements at X-1 for para A: 2
Advertisements at X-2 for para A: 2
Outcome: at rp X we can accept only 1 advertisement since our other slots were already claimed.
Slot in our claim queue already claimed at future relay parent.
CQ @ rp X: [A A A]
Advertisements at X+1 for para A: 1
Advertisements at X+2 for para A: 1
Outcome: at rp X we can accept only 1 advertisement since the other slots in our claim queue were already claimed at future relay parents.
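Both cases can be reproduced with a small slot model. This is a sketch under assumptions that are not taken from the PR code: the claim queue holds a single parachain, relay parents are numbered relative to X (X = 0), and an advertisement at a relay parent claims the earliest free slot in that relay parent's window. `Slots`, `claim_at` and `free_at` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Simplified model of a claim queue that contains a single parachain.
/// Relay parents are numbered relative to X (X = 0, X-1 = -1, ...).
struct Slots {
    /// Length of the claim queue (3 for `[A A A]`).
    len: i64,
    /// Relay parent numbers whose slot is already claimed.
    claimed: BTreeSet<i64>,
}

impl Slots {
    fn new(len: i64) -> Self {
        Slots { len, claimed: BTreeSet::new() }
    }

    /// An advertisement at `rp` claims the earliest free slot in `rp..rp + len`.
    /// Returns the claimed slot, or `None` if the whole window is taken.
    fn claim_at(&mut self, rp: i64) -> Option<i64> {
        let slot = (rp..rp + self.len).find(|s| !self.claimed.contains(s))?;
        self.claimed.insert(slot);
        Some(slot)
    }

    /// Number of advertisements that can still be accepted at `rp`.
    fn free_at(&self, rp: i64) -> usize {
        (rp..rp + self.len).filter(|s| !self.claimed.contains(s)).count()
    }
}
```

In this model, two claims at X-2 followed by two claims at X-1 occupy the slots X-2, X-1, X and X+1, so `free_at(0)` is 1. One claim each at X+1 and X+2 also leaves exactly one free slot at X.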

The situation becomes more complicated with multiple leaves (forks). Imagine we have got a fork at rp X:

CQ @ rp X: [A A A]
(rp X) -> (rp X+1) -> (rp X+2)
         \-> (rp X+1')

Now when we examine the claim queue at rp X we need to consider both forks. Accepting a candidate at X requires a slot for it in BOTH leaves. If for example there are three candidates accepted at rp X+1' we can't accept any candidates at rp X because there will be no slot for it in one of the leaves.

How the claims are counted

There are two solutions for counting the claims at relay parent X:

Keep a state for the claim queue (number of claims and which of them are claimed) and look it up when accepting a collation. With this approach we need to keep the state up to date with each new advertisement and each new leaf update.
Calculate the state of the claim queue on the fly at each advertisement. This way we rebuild the state of the claim queue for every advertisement.

Solution 1 is hard to implement with forks. There are too many variants to keep track of (a different state for each leaf) and we might never need them. So I decided to go with solution 2: building the claim queue state on the fly.

To achieve this I've extended View from backing_implicit_view to keep track of the outer leaves. I've also added a method which accepts a relay parent and returns all paths from an outer leaf to it. Let's call it paths_to_relay_parent.
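To illustrate the idea, here is a sketch of such a path lookup over a block tree. It is not the `View` implementation: the tree is assumed to be given as a child-to-parent map, `Hash` is an integer stand-in, and the function signature is hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a relay chain block hash.
type Hash = u32;

/// For every leaf that descends from (or equals) `relay_parent`, returns the
/// path from `relay_parent` up to that leaf, oldest block first.
/// `parents` maps each known block to its parent and is assumed to be acyclic.
fn paths_to_relay_parent(
    parents: &HashMap<Hash, Hash>,
    leaves: &[Hash],
    relay_parent: Hash,
) -> Vec<Vec<Hash>> {
    let mut paths = Vec::new();
    for leaf in leaves {
        // Walk from the leaf towards the root until `relay_parent` is found.
        let mut path = vec![*leaf];
        let mut current = *leaf;
        while current != relay_parent {
            match parents.get(&current) {
                Some(parent) => {
                    current = *parent;
                    path.push(current);
                },
                // Ran out of known ancestors: this leaf is on another fork.
                None => {
                    path.clear();
                    break;
                },
            }
        }
        if !path.is_empty() {
            path.reverse();
            paths.push(path);
        }
    }
    paths
}
```

For the fork shown earlier (X with children X+1 and X+1', and X+2 on top of X+1), querying X yields two paths, while querying X+1 yields only the path through X+2.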

So how does the counting work for relay parent X? First we examine the number of seconded and pending advertisements (more on pending in a second) from relay parent X to relay parent X-N (inclusive), where N is the length of the claim queue. Then we use paths_to_relay_parent to obtain all paths from outer leaves to relay parent X. We calculate the claims at relay parents X+1 to X+N (inclusive) for each leaf and take the maximum value. This way we guarantee that the candidate at rp X can be included in each leaf. This is the state of the claim queue which we use to decide if we can fetch one more advertisement at rp X or not.
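A rough sketch of the final decision, assuming the two counts have already been computed. How exactly the counts are combined is an assumption made for illustration (a plain sum compared against the claim queue length); `can_accept_at` and its parameters are hypothetical names.

```rust
/// Decides whether one more advertisement can be accepted at relay parent X.
///
/// * `claim_queue_len`: N, the length of the claim queue at X.
/// * `claimed_by_ancestors`: seconded and pending advertisements counted from
///   X down to X-N that occupy slots of X's claim queue.
/// * `claimed_above_per_path`: for each path from an outer leaf to X, the
///   number of slots of X's claim queue claimed at X+1..=X+N.
fn can_accept_at(
    claim_queue_len: usize,
    claimed_by_ancestors: usize,
    claimed_above_per_path: &[usize],
) -> bool {
    // Take the most constrained fork so the candidate fits in every leaf.
    let worst_above = claimed_above_per_path.iter().copied().max().unwrap_or(0);
    claimed_by_ancestors + worst_above < claim_queue_len
}
```

Taking the maximum over the paths is what makes the check conservative: a candidate is accepted only if the fork with the most claims still has room for it.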

What is a pending advertisement

I mentioned that we count seconded and pending advertisements at relay parent X. A pending advertisement is:

An advertisement which is being fetched right now.
An advertisement pending validation at backing subsystem.
An advertisement blocked for seconding by backing because we don't know one of its parent heads.

Any of these is considered a 'pending fetch' and a slot for it is kept. All of them are already tracked in State.
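The classification above could be modelled like this. The enum and its variant names are hypothetical and only illustrate which states keep a slot reserved; the real tracking lives in State and looks different.

```rust
/// Hypothetical lifecycle states of an advertisement on the validator side.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AdvertisementStatus {
    /// Advertised but not requested yet; sits in the waiting queue.
    Waiting,
    /// Being fetched right now.
    Fetching,
    /// Fetched and handed to the backing subsystem for validation.
    PendingValidation,
    /// Blocked from seconding because one of its parent heads is unknown.
    BlockedOnParent,
    /// Successfully seconded.
    Seconded,
}

impl AdvertisementStatus {
    /// A 'pending fetch' in the sense used above.
    fn is_pending(self) -> bool {
        matches!(
            self,
            AdvertisementStatus::Fetching |
                AdvertisementStatus::PendingValidation |
                AdvertisementStatus::BlockedOnParent
        )
    }

    /// Seconded and pending advertisements both occupy a claim queue slot.
    fn holds_slot(self) -> bool {
        self.is_pending() || self == AdvertisementStatus::Seconded
    }
}

/// Counts the slots occupied by a set of advertisements.
fn claimed_slots(statuses: &[AdvertisementStatus]) -> usize {
    statuses.iter().filter(|s| s.holds_slot()).count()
}
```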

alindima

Nice job 🚀! The code is a lot more intuitive now. We now have a good foundation for what's next: removing AsyncBackingParams altogether, plus several ideas for further refactoring.

Only some small things left from my side

alindima · 2024-11-27T13:53:57Z

polkadot/node/network/collator-protocol/src/validator_side/mod.rs

+	let unfulfilled_entries = claim_queue_states
+		.iter_mut()
+		.map(|cq| cq.unclaimed_at(relay_parent))
+		.max_by(|a, b| a.len().cmp(&b.len()))


Not very clear on the purpose of this max_by. The unfulfilled claims on different forks can look quite different, right?

First, to clarify the problem I am trying to solve here.

unfulfilled_claim_queue_entries is used by get_next_collation_to_fetch and its purpose is to produce a Vec of ParaIds which can be fetched, ordered by urgency. So a result could look like [A A B], which means that we can accept collations from both paras A and B, but A is strongly preferred in this case.

As you mentioned the claims on different forks can be quite different and from all forks (all claim_queue_states) we need to produce a single Vec with ParaIds ordered by priority. Hope we are on the same page so far.

Now why max_by?
No matter how many forks we have, they should have the same claims because the claim queue should be the same for them. Let's say we have two forks, X1 and X2. They should have the same claim queue state ([A A B]). The difference might be that at X1 the first spot is already claimed while at X2 it is not. So in this case the unfulfilled entries for X1 will be [A B] while for X2 they will be [A A B].

For this reason I naively decided to go for the longest unfulfilled entries Vec since it might be the most urgent one to satisfy. Most urgent because, if it is the longest one, its first slot is PROBABLY not satisfied. Of course there are a lot of corner cases here. E.g. the second slot might have been claimed at X1, which doesn't make it less urgent than X2.

We can be smarter here and examine what's claimed at each fork and make a more informed decision but I don't think it's worth the complication.
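A minimal sketch of the "longest unfulfilled Vec wins" selection discussed in this thread, assuming the per-fork entries have already been computed. `most_unfulfilled` is a hypothetical helper and `ParaId` an integer stand-in.

```rust
/// Hypothetical stand-in for the real `ParaId` type.
type ParaId = u32;

/// From the unfulfilled claim queue entries of every fork, picks the longest
/// list. Returns an empty `Vec` when there are no forks.
fn most_unfulfilled(per_fork: Vec<Vec<ParaId>>) -> Vec<ParaId> {
    per_fork
        .into_iter()
        .max_by(|a, b| a.len().cmp(&b.len()))
        .unwrap_or_default()
}
```

Note that `Iterator::max_by` returns the last of several equally long lists, so with ties the result depends on the order in which the forks are iterated.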

Also it's worth adding this information as a comment in the code because it's not obvious.

(added a comment)

Isn't that essentially incentivising forks? We will attempt to fetch for the fork which has the most unfulfilled claims.

paritytech-workflow-stopper · 2024-11-27T14:49:55Z

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/12052661140
Failed job name: fmt

tdimitrov · 2024-11-28T08:32:02Z

polkadot/node/network/collator-protocol/src/validator_side/mod.rs

+		"Checking seconding limit",
+	);
+
+	for path in paths {


I've added a todo in #6679 to keep ClaimQueueState in State and have it ready for use when we need it instead of rebuilding it each time.

tdimitrov · 2024-11-28T08:32:21Z

polkadot/node/network/collator-protocol/src/validator_side/mod.rs

+// Returns the claim queue without fetched or pending advertisement. The resulting `Vec` keeps the
+// order in the claim queue so the earlier an element is located in the `Vec` the higher its
+// priority is.
+fn unfulfilled_claim_queue_entries(relay_parent: &Hash, state: &State) -> Result<Vec<ParaId>> {


Same here: I've added a todo in #6679 to keep ClaimQueueState in State and have it ready for use when we need it instead of rebuilding it each time.

Overkillus

Very nice and clear PR description and a much welcome refactor, GJ

I added some comments and questions, but nothing major.

Slot in our claim queue already claimed at future relay parent
CQ @ rp X: [A A A]
Advertisements at X+1 for para A: 1
Advertisements at X+2 for para A: 1
Outcome: at rp X we can accept only 1 advertisement since the slots in our relay parents were already claimed.

For my own curiosity: when would this happen? What order of collator adverts leads to this scenario?

(X->X+2 means collation anchored at rp X but claiming the slot at X+2)

Collator A builds collations for X->X, X->X+1, X->X+2
Collator A sends the collation X->X to the next Collator B in time
Collator A then crashes and does not send X->X+1, X->X+2 to Collator B
Collator A, because he crashed, also does not send his collations to validators
Collator B, since he has not seen X->X+1 or X->X+2, builds his own X+1->X+1, but is lazy so he does not make X+1->X+2
Collator B sends X+1->X+1 to Collator C and to validators
Validators fetch X+1->X+1 even though they have not seen any collations for slot X
Collator C also has not seen X->X+2 but he has seen X+1->X+1, so he builds X+2->X+2
Collator C sends X+2->X+2 to validators
Validators fetch X+2->X+2
Collator A awakens from his crash and starts sending his collations X->X, X->X+1, X->X+2 to everyone
Other collators see them but it is already too late, they built their own for X+1 and X+2
Validators finally receive X->X but discard X->X+1 and X->X+2 since they already fetched collations X+1->X+1 and X+2->X+2

Is there a simpler scenario? 🤔

Overkillus · 2024-11-29T02:29:37Z

polkadot/node/network/collator-protocol/src/validator_side/claim_queue_state.rs

+/// should be built/fetched/accepted (depending on the context) at each block.
+///
+/// Since the claim queue peeks into the future blocks there is a relation between the claim queue
+/// state between the current block and the future blocks. Let's see an example:


Suggested change

/// state between the current block and the future blocks. Let's see an example:

/// state between the current block and the future blocks.

/// Let's see an example with 2 co-scheduled parachains:

Overkillus · 2024-11-29T13:38:47Z

polkadot/node/network/collator-protocol/src/validator_side/claim_queue_state.rs

+					self.future_blocks.push_back(ClaimInfo {
+						hash: None,
+						claim: Some(*expected_claim),
+						claim_queue_len: 1,


Why do we set claim_queue_len: 1 for future blocks? How should this value be interpreted?

For instance ClaimQueueState:
block_state: A
future_blocks: B, C

I'd expect A to have CQ len 3, B 2 and C 1.

Overkillus · 2024-11-29T13:44:00Z

polkadot/node/network/collator-protocol/src/validator_side/claim_queue_state.rs

+}
+
+#[cfg(test)]
+mod test {


Very nice and clear tests!

I think there is one more test type worth adding: tests where no claim queue is provided on leaf activation, or cases where the claim queue provided is smaller than one seen before.

Overkillus · 2024-11-29T14:20:46Z

polkadot/node/subsystem-util/src/backing_implicit_view.rs

@@ -323,6 +329,205 @@ impl View {
 			.as_ref()
 			.map(|mins| mins.allowed_relay_parents_for(para_id, block_info.block_number))
 	}
+
+	/// Returns all paths from a leaf to the last block in state containing `relay_parent`. If no


Suggested change

/// Returns all paths from a leaf to the last block in state containing `relay_parent`. If no

/// Returns all paths from each leaf to the last block in state containing `relay_parent`. If no

Overkillus · 2024-11-29T14:25:28Z

polkadot/node/subsystem-util/src/backing_implicit_view.rs

+			return vec![]
+		}
+
+		// Find all paths from each outer leaf to `relay_parent`.


What is an outer leaf? Is it different than a leaf?

Overkillus · 2024-11-29T14:50:46Z

polkadot/node/network/collator-protocol/src/validator_side/mod.rs

+			}
+		}
+
+		if !cq_state.can_claim_at(relay_parent, &para_id) {


This piece essentially ensures that there is a free slot in ALL forks/paths. I think this behaviour should be documented above the function ensure_seconding_limit_is_respected.

Also is it ever possible that on one path the order of paraid claims is different than the other path? i.e.:

0       1    2
A ----> B -> B
  \ --> A -> A

If that ever happened it could cause a deadlock where we ignore all advertisements. All adverts for para B would be rejected because they cannot be claimed in the bottom fork, while all adverts for para A would be rejected because they don't fit into the upper fork.

As far as I know this should never happen but just want to make absolutely sure.

Overkillus · 2024-11-29T15:14:03Z

polkadot/node/network/collator-protocol/src/validator_side/mod.rs

-	if per_relay_parent.collations.is_seconded_limit_reached(relay_parent_mode) {
-		return Err(AdvertisementError::SecondedLimitReached)
-	}
+	ensure_seconding_limit_is_respected(&relay_parent, para_id, state)?;


ensure_seconding_limit_is_respected checks if the limit is respected AKA this will error out if it turns out we cannot second the advertised candidate (optimistically since pending are counted).

A few lines below we check can_second, which also has logic for checking whether we are able to second that candidate. I haven't fully looked into it, but aren't they effectively doing the same thing?

In what cases will ensure_seconding_limit_is_respected pass with no error while can_second results in an error?

Overkillus · 2024-11-29T16:30:39Z

polkadot/node/network/collator-protocol/src/validator_side/mod.rs

+	let unfulfilled_entries = claim_queue_states
+		.iter_mut()
+		.map(|cq| cq.unclaimed_at(relay_parent))
+		.max_by(|a, b| a.len().cmp(&b.len()))


Isn't that essentially incentivising forks? We will attempt to fetch for the fork which has the most unfulfilled claims.

Overkillus · 2024-11-29T16:34:19Z

polkadot/node/network/collator-protocol/src/validator_side/collation.rs

+//!       validator counts all advertisements within its view not just at the relay parent.
+//!    3. If the advertisement was accepted, it's queued for fetch (per relay parent).
+//!    4. Once it's requested, the collation is said to be Pending.
+//!    5. Pending collation becomes Fetched once received, we send it to backing for validation.


Don't we treat it as Pending at this step as well?

Collation fetching fairness

f4738dc

tdimitrov added the T8-polkadot This PR/Issue is related to/affects the Polkadot network. label Jun 26, 2024

Comments

c7074da

tdimitrov added 4 commits June 26, 2024 16:39

Fix tests and add some logs

73eee87

Fix per para limit calculation in is_collations_limit_reached

fa321ce

Fix default TestState initialization: claim queue len should be equal to `allowed_ancestry_len`

96392a5

clippy

0f28aa8

tdimitrov force-pushed the tsv-collator-proto-fairness branch from c7f24aa to 0f28aa8 Compare June 28, 2024 08:19

Update is_collations_limit_reached - remove seconded limit

e5ea548

tdimitrov added 2 commits July 1, 2024 13:59

Fix pending fetches and more tests

9abc898

Remove unnecessary clone

c07890b

tdimitrov added 15 commits July 1, 2024 15:20

Comments

e50440e

Better var names

42b05c7

Fix pick_a_collation_to_fetch and add more tests

2f5a466

Fix test: collation_fetching_respects_claim_queue

ff96ef9

Add collation_fetching_fallback_works test + comments

e837689

More tests

91cdd13

Fix collation limit fallback

9f2d59b

Separate claim_queue_support from ProspectiveParachainsMode

a10c86d

Fix comments and add logs

b39858a

Update test: collation_fetching_prefer_entries_earlier_in_claim_queue

b30f340

Fix pick_a_collation_to_fetch and more tests

c0f18b9

Merge branch 'master' into tsv-collator-proto-fairness

703ed6d

Fix pick_a_collation_to_fetch - iter 1

fba7ca6

Fix pick_a_collation_to_fetch - iter 2

d4f4ce2

Remove a redundant runtime version check

5f52712

tdimitrov added 3 commits November 20, 2024 14:33

new test - claims_above_are_counted_correctly

f833783

more tests

7c807e9

comment

8f33ba0

alindima reviewed Nov 21, 2024

tdimitrov added 16 commits November 22, 2024 09:15

Fix path iteration in seconded_and_pending_for_para_above

1c9db10

Merge branch 'master' into tsv-collator-proto-fairness

e6947b9

Remove a debug println

439291a

Refactor claims counting: project claim queue state on top of relay parents

3f7691a

Remove unused code

b1de6ba

Trace logs in claim_queue_state

48d3f5c

Fix path generation in ensure_seconding_limit_is_respected

2e0d142

Rework unfulfilled_claim_queue_entries

33e5c9f

paths_from_leaves_via_relay_parent

5bc63de

Fix fetch_next_collation_on_invalid_collation

1a195c1

paths_via_relay_parent uses block_info

ac3e1a1

Fix tests

31c6cdb

Comments

3794e69

File header and tests for edge cases in claim_queue_state

3e2acd4

Merge branch 'master' into tsv-collator-proto-fairness

583f469

clippy

eac7a73

alindima reviewed Nov 27, 2024

tdimitrov mentioned this pull request Nov 27, 2024

Collator protocol validator side refactoring ideas #6679

Open

3 tasks

Apply suggestions from code review

3bd85d9

Co-authored-by: Alin Dima <alin@parity.io>

tdimitrov added 2 commits November 28, 2024 09:36

comment

4b2a67d

Small refactoring at claim_queue_state

f3634e1

alindima approved these changes Nov 28, 2024

Overkillus reviewed Nov 29, 2024
