prospective-parachains rework: take II #4937
Conversation
…e-parachains-allow-forks
This is ready for an initial review. CI zombienet tests pass and all of my local zombienet tests work well. In the meantime, it would be good to get some in-depth reviews. I'd very much appreciate reviews that focus on the protocol changes and the design, rather than nitpicks about function names and structure (those can come at a later stage).
Great work, Alin! I did one pass focusing mainly on the logic and the PR looks good. I'll do another full pass when you are ready.
Took a first pass on the changes. Looks good in general; I could not find any logic errors w.r.t. the introduced changes in behaviour. I also appreciate the changes that go deeper into the original prospective-parachains subsystem architecture. Things are now much easier to understand.
pred(&elem.candidate_hash)
{
    res.push(elem.candidate_hash);
for elem in &self.best_chain.chain[base_pos..actual_end_index] {
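Out of context the quoted hunk is hard to follow; as a rough reconstruction (a sketch, assuming the predicate guards collection inside the loop over the best chain), it amounts to something like:

```rust
// Rough reconstruction of the quoted hunk, not the exact diff: walk the best
// chain between two positions and collect candidate hashes while the
// predicate still holds.
for elem in &self.best_chain.chain[base_pos..actual_end_index] {
    if !pred(&elem.candidate_hash) {
        break;
    }
    res.push(elem.candidate_hash);
}
```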
I would move this logic into find_ancestor_paths and just return the remaining chain from there.
candidate_hash: CandidateHash,
candidate: CommittedCandidateReceipt,
persisted_validation_data: PersistedValidationData,
) -> Result<Self, CandidateEntryError> {
Doc comment should mention that it fails on a zero-length cycle (ZeroLengthCycle).
There are multiple cases where this function could fail. I think it's better practice to have the user look at the possible Error variants, rather than mentioning them in doc comments (which can easily become outdated).
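For reference, a minimal sketch of that approach, with the failure modes carried by the error enum rather than listed in a doc comment (ZeroLengthCycle is named in the review; the second variant is illustrative):

```rust
/// Sketch of an error enum for the constructor discussed above. Callers
/// inspect these variants instead of relying on a doc-comment listing.
#[derive(Debug, PartialEq)]
enum CandidateEntryError {
    /// The candidate's output head equals its parent head (a 0-length cycle).
    ZeroLengthCycle,
    /// Illustrative second variant, standing in for other validation failures.
    PersistedValidationDataMismatch,
}
```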
}

res
Doesn't seem true, it will be dropped in handle_introduce_seconded_candidate.
what are you referring to?
…e-parachains-allow-forks
Will have a closer look tomorrow. From a quick pass it seems that advance_scope does not actually advance the scope; instead, the advancing logic still remains in handle_active_leaves_update. You mentioned that and there may be good reasons, but at the very least the function name seems off then.
…e-parachains-allow-forks
Nice! Approving modulo the todo.
self.can_add_candidate_as_potential(candidate)?;

// This clone is cheap, as it uses an Arc for the expensive stuff.
// We can't consume the candidate because other fragment chains may use it also.
Best to leave that decision to the caller. If we need a CandidateEntry and not a &CandidateEntry, we should just state so and leave it to the caller whether any cloning is required or not.
The idea was to not clone unless we have to. There's no need to clone if self.can_add_candidate_as_potential(candidate)? returns an Error.
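In other words, the pattern being defended is roughly the following (the check's name comes from the quoted diff; the surrounding function and the add helper are hypothetical, sketched for illustration):

```rust
// Sketch of the check-before-clone pattern: fail early, and only pay for the
// (already cheap, Arc-backed) clone once the candidate is known to be
// admissible. `add_unconnected_candidate` is a hypothetical helper here.
fn try_adding_seconded_candidate(
    &mut self,
    candidate: &CandidateEntry,
) -> Result<(), Error> {
    self.can_add_candidate_as_potential(candidate)?;
    // Only reached if the check above succeeded, so the clone is never wasted.
    self.add_unconnected_candidate(candidate.clone());
    Ok(())
}
```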
// form a chain with the latest included head.
fragment_chain.populate_chain(&mut candidates_pending_availability);

// TODO: return error if not all candidates were introduced successfully.
Reminder.
Done. I chose to log a message instead.
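A sketch of what that could look like (gum is the tracing wrapper used in the polkadot node code; the target string, the condition and the message here are illustrative, not the exact change in the PR):

```rust
// Instead of bubbling an error up, emit a log when some candidates pending
// availability could not be introduced into the new fragment chain.
if !candidates_pending_availability.is_empty() {
    gum::warn!(
        target: "parachain::prospective-parachains",
        "Not all candidates pending availability made it into the fragment chain"
    );
}
```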
Excellent work!
Resolves #4800
Problem
In #4035, we removed support for parachain forks and cycles and added support for backing unconnected candidates (candidates for which we don't yet know the full path to the latest included block), which is useful for elastic scaling (parachains using multiple cores).
Removing support for backing forks turned out to be a bad idea, as there are legitimate cases for a parachain to fork (if they have another consensus mechanism, for example BABE or PoW). This leads to validators getting lower backing rewards (depending on whether they back the winning fork or not) and higher pressure on only half of the backing group (during availability-distribution, for example). Since we don't yet have approval voting rewards, backing rewards are a pretty big deal (which may change in the future).
Description
A backing group is now allowed to back forks. Once a candidate becomes backed (has the minimum backing votes), we don't accept new forks unless they adhere to the new fork selection rule (have a lower candidate hash).
This helps with keeping the implementation simpler, since forks will only be taken into account for candidates which are not backed yet (only seconded).
Having this fork selection rule also helps with reducing the work backing validators need to do, since they have a shared way of picking the winning fork. Once they see a candidate backed, they can all decide to back a fork and not accept new ones.
But they still accept new ones during the seconding phase (until the backing quorum is reached).
Therefore, a block author which is not part of the backing group will likely not even see the forks (only the winning one).
Just as before, a parachain producing forks will still not be able to leverage elastic scaling but will still work with a single core. Also, cycles are still not accepted.
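A minimal sketch of the fork selection rule described above (the function names and the standalone hash type are illustrative, not the subsystem's actual API):

```rust
use std::cmp::Ordering;

// Illustrative stand-in for the candidate hash type, ordered byte-wise.
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
struct CandidateHash([u8; 32]);

/// Deterministic fork selection: between two candidates competing for the same
/// position in the chain, the one with the lower hash wins. Every validator
/// applies the same comparison, so the backing group converges on the same
/// winning fork without extra coordination.
fn fork_selection_rule(a: &CandidateHash, b: &CandidateHash) -> Ordering {
    a.cmp(b)
}

/// A new fork replaces an already-known competitor only if it wins the rule.
fn accept_new_fork(existing: &CandidateHash, new: &CandidateHash) -> bool {
    fork_selection_rule(new, existing) == Ordering::Less
}
```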
Some implementation details
`CandidateStorage` is no longer a subsystem-wide construct. It was previously holding candidates from all relay chain forks and complicated the code. Each fragment chain now holds its own candidate chain and its potential candidates. This should not increase the storage consumption, since the heavy candidate data is already wrapped in an Arc and shared. It does, however, allow for great simplifications and increases readability.

`FragmentChain`s now only create a chain with backed candidates and the fork selection rule. As said before, `FragmentChain`s are now also responsible for maintaining their own potential candidate storage.

Since we no longer have the subsystem-wide `CandidateStorage`, when getting a new leaf update, we use the storage of our latest ancestor, which may contain candidates seconded/backed that are still in scope.

When a candidate is backed, the fragment chains which hold it are recreated (due to the fork selection rule, it could trigger a "reorg" of the fragment chain).
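A rough sketch of the storage shape this implies, with the heavy candidate data behind an Arc so that sharing an entry across fragment chains only copies a pointer (field and type names here are illustrative, not the exact ones in the PR):

```rust
use std::{collections::HashMap, sync::Arc};

// Illustrative stand-ins for the real primitive types.
type CandidateHash = [u8; 32];
struct CommittedCandidateReceipt; // the heavy candidate data
struct PersistedValidationData;

/// A candidate entry: cloning it into several fragment chains (one per relay
/// chain fork) is cheap because the expensive parts are behind an Arc.
#[derive(Clone)]
struct CandidateEntry {
    candidate_hash: CandidateHash,
    candidate: Arc<CommittedCandidateReceipt>,
    persisted_validation_data: Arc<PersistedValidationData>,
}

/// Each fragment chain keeps its own best (backed) chain plus its own pool of
/// potential/unconnected candidates, instead of reading from a subsystem-wide
/// CandidateStorage.
struct FragmentChain {
    best_chain: Vec<CandidateEntry>,
    unconnected: HashMap<CandidateHash, CandidateEntry>,
}
```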
I generally tried to simplify the subsystem and not introduce unnecessary optimisations that would otherwise complicate the code and not gain us much (fragment chains wouldn't realistically ever hold many candidates).
TODO:
- [x] update metrics
- [x] update docs and comments
- [x] fix and add unit tests
- [x] tested with fork-producing parachain
- [x] tested with cycle-producing parachain
- [x] versi test
- [x] prdoc