Constrain parachain block validity on a specific core #103

Merged 24 commits on Sep 9, 2024

sandreim
Contributor

Following the discussion on #92, this is a proposal to introduce the required core index commitments to make elastic scaling work securely with open collator sets.

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
@sandreim sandreim changed the title Constrain parachain block validity on a single core Constrain parachain block validity on a specific core Jul 15, 2024

@eskimor eskimor left a comment

Nice work!

Contributor

@alindima alindima left a comment

Looking good!


At present, misbehaving collator nodes, or anyone who has acquired a valid collation, can prevent a parachain from effectively using elastic scaling by providing the same collation to all backing groups assigned to the parachain. This happens before the next parachain block is authored and prevents the chain of candidates from being formed, reducing the throughput of the parachain to a single core.

The session index of candidates is important for the disputes protocol, as it is used to look up validator keys and check dispute vote signatures. By adding a `SessionIndex` to the `CandidateDescriptor`, validators no longer have to trust the `SessionIndex` provided by the validator raising a dispute. It can happen that the dispute concerns a relay chain block not yet imported by a validator. In this case, validators can safely assume the session index refers to the session the candidate appeared in; otherwise the chain would have rejected the candidate.
Contributor

Considering that we have now changed this RFC to also add the session index, I would change the title of the RFC (which only mentions the core index commitment).
Either that, or I would mention here that this change is not needed for elastic scaling, but that we are taking the opportunity to bundle these unrelated changes into one.

Contributor

Yeah, and the motivation reads a little bumpy. Just add a short introduction saying that you are trying to solve two issues.

Contributor

> dispute concerns a relay chain block not yet imported by a validator

If this is the case, doesn't this also mean that the CandidateDescriptor is unknown and thus we need to trust the validator to give us a valid descriptor? And the attack here is basically that the validator might check against the wrong set?

Contributor

Can you not just slash the validator for asking you to check an invalid session index? It signs the dispute, and if the session index is wrong, we should be able to prove this to the runtime, or?

By having the SessionIndex in the descriptor, it is either valid or it does not exist on any chain (as the chain checks on import). If it does not exist on any chain, then the dispute is just spam and won't resolve anyways, hence there is no risk.

> Can you not just slash the validator for asking you to check an invalid session index? It signs the dispute, and if the session index is wrong, we should be able to prove this to the runtime, or?

Not easily. You would need to be able to prove the session of a block on another fork.

> And the attack here is basically that the validator might check against the wrong set?

yes.

It is also just "nice" as it makes candidates more self-contained: With the SessionIndex provided in the descriptor you can validate the state transition without any knowledge of the fork it appeared in.

It is true that the SessionIndex could be made up, but so could the persisted validation data. Essentially, what a checker checks when validating is: this is valid, assuming you find a chain that would accept or has accepted the candidate. Both the persisted validation data and the SessionIndex are verified on chain, thus it is not necessary to prove their validity off-chain: "Assuming this actually exists, I can confirm it is valid."

I am mostly sanity checking myself here. Any more concerns, please bring them up. @burdges @rphmeier thoughts?


The UMP queue layout is changed to allow the relay chain to receive both the XCM messages and `UMPSignal` messages. An empty message (empty `Vec<u8>`) is used to mark the end of XCM messages and the start of `UMPSignal` messages.

This way of representing the new messages has been chosen over introducing an enum wrapper to minimize breaking changes of XCM message decoding in tools like Subscan for example.
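
For illustration, a minimal sketch of how a consumer could split a candidate's upward messages under this layout. It assumes the messages are already available as a slice of byte vectors; the function name `split_ump` is hypothetical and not taken from the RFC or polkadot-sdk.

```rust
/// Splits raw upward messages into (XCM messages, UMP signal messages).
/// The first empty message is treated as the separator; if there is none,
/// every message is treated as XCM and the signal slice is empty.
fn split_ump(messages: &[Vec<u8>]) -> (&[Vec<u8>], &[Vec<u8>]) {
    // Index of the separator, or the end of the list when it is absent.
    let sep = messages
        .iter()
        .position(|m| m.is_empty())
        .unwrap_or(messages.len());
    let xcm = &messages[..sep];
    // Everything after the separator; out of range means "no signals".
    let signals = messages.get(sep + 1..).unwrap_or(&[]);
    (xcm, signals)
}
```

A tool that only understands XCM would still need to learn to stop at the separator rather than try to decode what follows it.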
Contributor

It will still break for them. They will not be able to decode the rest.

Contributor

@alindima alindima left a comment

Nice job!

| Core 2 | **Para A** | Para B | **Para A** |
| Core 3 | Para B | **Para A** | **Para A** |

The purpose of `ClaimQueueOffset` is to select the column from the above table.
Contributor

Does Polkadot currently allow using a claim further ahead in the queue? I still don't really get why we need it. The parachain runtime should ensure that a new core is always being used; for that, using the selector is enough.
Do we currently use the relay chain block the PoV was built on to determine the claim queue and then the core? Or what are we doing right now?

Contributor Author

@sandreim sandreim Aug 20, 2024

Right now, the core index is determined by the backers who have voted. We map validator index to group index to core index.

The claim queue offset is required because of the CoreSelector % number_of_assignments operation that selects the core. Because parachains can have a varying number of cores assigned to them (on-demand) at different offsets in the claim queue, the core index you get might be different depending on which claim is used.
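
The dependency on the offset can be sketched as follows. This is a simplified model, not the runtime code: the claim queue is assumed to be a map from core index to the queue of para IDs claiming that core, and `selected_core` and the type aliases are hypothetical names.

```rust
use std::collections::BTreeMap;

type CoreIndex = u32;
type ParaId = u32;

/// Returns the core selected by `core_selector` among the cores that have a
/// claim for `para` at depth `claim_queue_offset`, or `None` if there is none.
fn selected_core(
    claim_queue: &BTreeMap<CoreIndex, Vec<ParaId>>,
    para: ParaId,
    core_selector: u8,
    claim_queue_offset: u8,
) -> Option<CoreIndex> {
    // Cores assigned to the para at the chosen offset, in ascending core order.
    let assigned: Vec<CoreIndex> = claim_queue
        .iter()
        .filter(|(_, claims)| claims.get(claim_queue_offset as usize) == Some(&para))
        .map(|(core, _)| *core)
        .collect();
    if assigned.is_empty() {
        return None;
    }
    // The modulo depends on how many cores are assigned at this offset.
    Some(assigned[core_selector as usize % assigned.len()])
}
```

Taking the two table rows visible above and assuming the columns are offsets 0, 1 and 2, Para A has one core at offset 0, a different one at offset 1, and two at offset 2, so the same `core_selector` can resolve to different cores depending on the offset.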

Contributor

I mean, I get this. However, you want to use all possible claims? So you probably don't want to waste any cores, which means you will build on all the cores. With prospective parachains you can also calculate which claim should be used next. Thus, we should not require the offset?

Contributor Author

@sandreim sandreim Aug 22, 2024

> I mean, I get this. However, you want to use all possible claims? So you probably don't want to waste any cores, which means you will build on all the cores.

Yes

> With prospective parachains you can also calculate which claim should be used next. Thus, we should not require the offset?

That would not be enough. Let's see what happens. In backing we need to determine if the candidate is valid on the core the backers are assigned to. We take the core selector, compute the core index, and assume we start filling the claims at the top of the queue, which is exactly like claim queue offset 0. Alternatively, we could have a more evolved tracking mechanism that knows which is the next free claim at a higher queue depth.

The problem is that the collator has to provide the core index in the descriptor. Keep in mind that the modulo core_selector operation gives different results depending on the number of claims, so the parachain runtime has to actually pick a claim. That is why we need this offset: so that validators know what the parachain has picked. If we don't have this parameter, collators need to make the same assumptions as validators and the runtime, otherwise they get a different core index.

Contributor

I thought about this a little bit more and I think I found a way to not require the offset. My main problem with this offset is that it is static. You also write in the RFC itself that it isn't that great.

So, to my proposal. I would propose we store the latest core_selector of each parachain in the relay chain state. Then we can use this latest_core_selector (fetched at the candidate context relay chain block) together with the claim_queue to determine if the core_index in the candidate is correct. First we would calculate the core_selector_diff = core_selector.wrapping_sub(latest_core_selector). Then we would fetch the claim_queue at the candidate context relay chain block. We would count all cores assigned to our parachain until the count reaches core_selector_diff and then we would have our core_index.

I think we can then also get rid of max_candidate_len from the async backing parameters, as the maximum length would be determined by the "claim queue lookahead".
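
One possible reading of this proposal, as a sketch only. The comment does not specify the order in which claims are counted, so this assumes claims are walked by queue depth first and core index second, and that a diff of zero selects nothing; `core_from_selector_diff` and the claim queue shape are hypothetical.

```rust
use std::collections::BTreeMap;

type CoreIndex = u32;
type ParaId = u32;

/// Walks the claim queue and returns the core holding the n-th claim of
/// `para`, where n = core_selector - latest_core_selector (wrapping).
fn core_from_selector_diff(
    claim_queue: &BTreeMap<CoreIndex, Vec<ParaId>>,
    para: ParaId,
    core_selector: u8,
    latest_core_selector: u8,
) -> Option<CoreIndex> {
    let diff = core_selector.wrapping_sub(latest_core_selector) as usize;
    if diff == 0 {
        // Assumption: the selector must advance to pick a new claim.
        return None;
    }
    let depth = claim_queue.values().map(Vec::len).max().unwrap_or(0);
    let mut count = 0usize;
    // Assumed order: shallowest claims first, then ascending core index.
    for offset in 0..depth {
        for (core, claims) in claim_queue {
            if claims.get(offset) == Some(&para) {
                count += 1;
                if count == diff {
                    return Some(*core);
                }
            }
        }
    }
    // The selector points past the claims visible in the lookahead.
    None
}
```

In this reading the claim queue lookahead bounds how far the selector can run ahead, which is the property the comment relies on to drop a separate maximum length parameter.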

Contributor Author

Keeping a claim queue snapshot at valid relay parents in the runtime allows us to compute the core index correctly.

Snapshots are fine. Possible future optimization (if necessary): accompany each entry in the claim queue with the block number at which it came into existence. Then, knowing only the block number of the relay parent, we can filter down to the relevant entries.
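
A rough sketch of that filtering idea, under the assumption that each entry carries a creation block number. The `Claim` struct, its fields, and `claims_known_at` are hypothetical and only illustrate the filter itself.

```rust
/// A claim queue entry tagged with the relay chain block number at which
/// it was added to the queue (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
struct Claim {
    para_id: u32,
    created_at: u32,
}

/// Keeps only the claims that already existed at the given relay parent.
fn claims_known_at(queue: &[Claim], relay_parent_number: u32) -> Vec<Claim> {
    queue
        .iter()
        .filter(|claim| claim.created_at <= relay_parent_number)
        .cloned()
        .collect()
}
```

Note that this sketch can only hide entries added after the relay parent; it does not restore claims that have since left the queue.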

Contributor

> One easy fix is to duplicate the claim for an order if the claim queue is empty. The parachain gets two chances if it uses cq offset 0 and has enough time to do well with cq offset 1.

And if the claim queue is not empty? They lose their claim?

Contributor

> One easy fix is to duplicate the claim for an order if the claim queue is empty. The parachain gets two chances if it uses cq offset 0 and has enough time to do well with cq offset 1.

> And if the claim queue is not empty? They lose their claim?

If the parachain is using offset 1, no. They have enough time to build the candidate.
If the parachain is using offset 0, then it can only do synchronous backing if my understanding is right.

> And if the claim queue is not empty? They lose their claim?

No. The claim would not be lost even without the duplication on empty queues, but if the queue was empty before, your claim immediately goes to position 0, which means only synchronous backing. Position 0 is therefore kind of "special".

There is no such problem if the queue was not empty. If at least one item was present before, you start at position 1, which already provides capability for full blocks.

Contributor

@bkchr bkchr left a comment

I'm fine with the general structure of the RFC. I'm only requesting some more clarifications, especially around the offset/selector. Thank you for the work!

I also want to see that the UMPSignal is optional. Maybe we will never need it, but there is also no harm in having it, especially as for single core chains there is no downside in omitting it.

the start of `UMPSignal` messages.

This way of representing the new messages has been chosen over introducing an enum wrapper to
minimize breaking changes of XCM message decoding in tools like Subscan for example.
Contributor

I don't think this is true. They will probably just skip the empty message and choke on the unknown UMPSignal. So they will need some way to handle this anyway.

Contributor Author

This doesn't say things don't break, only that the impact of breakage is smaller compared to the alternative.


## Drawbacks

The only drawback is that further additions to the descriptor are limited to the amount of
Contributor

Not really. As long as the start of the descriptor until the version field stays the same, we can implement some custom decoder.

Contributor Author

@sandreim sandreim Sep 3, 2024

Yes, this is true, but the RFC assumes all future changes we make to the descriptor are backward compatible to not break other parts of the stack.

Contributor

bkchr commented Sep 3, 2024

/rfc propose

@paritytech-rfc-bot
Contributor

Hey @bkchr, here is a link you can use to create the referendum aiming to approve this RFC number 0103.

Instructions
  1. Open the link.
  2. Switch to the Submission tab.
  3. Adjust the transaction if needed (for example, the proposal Origin).
  4. Submit the Transaction.


It is based on commit hash 01719f7b8e74839056a3285e722658823743c81f.

The proposed remark text is: RFC_APPROVE(0103,5d41d2089f62f3ceb50fd20445521f3f1fe4dfa4f2aa8c7b0882d8c36e161179).

github-actions bot commented Sep 4, 2024

Voting for this referendum is ongoing.

Vote for it here

github-actions bot commented Sep 9, 2024

PR can be merged.

Write the following command to trigger the bot

/rfc process 0xdcbb1a70e58737edfbfdb0b866cf977bebafcea08479808340ae03e492922b3e

@sandreim
Contributor Author

sandreim commented Sep 9, 2024

/rfc process 0xdcbb1a70e58737edfbfdb0b866cf977bebafcea08479808340ae03e492922b3e

@paritytech-rfc-bot
Contributor

The on-chain referendum has approved the RFC.

@paritytech-rfc-bot paritytech-rfc-bot bot merged commit fe8db0f into polkadot-fellows:main Sep 9, 2024
@anaelleltd anaelleltd added the Approved Has passed on-chain voting. label Sep 10, 2024
github-merge-queue bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Sep 23, 2024
Partially implements
#5048

- adds a core selection runtime API to cumulus and a generic way of
configuring it for a parachain
- modifies the slot based collator to utilise the claim queue and the
generic core selection

What's left to be implemented (in a follow-up PR):
- add the UMP signal for core selection into the parachain-system pallet

View the RFC for more context:
polkadot-fellows/RFCs#103

---------

Co-authored-by: command-bot <>