prospective-parachains: allow chained candidates coming out-of-order #3541
Comments
Sounds like it would create additional complexity in
That's a good idea. candidate-backing is less complex than prospective-parachains, so adding this logic wouldn't hurt as much.
Although we never liked the restriction, we envisioned that initially B and C must come from the same collator.
We could have candidate C provide a Merkle proof into the header of B, or the whole header of B. If the parachain uses sufficiently parallel algorithms, then yes, B and C could be built without knowledge of one another. We could consider this "tree" scaling, instead of "elastic" scaling. One approach goes: parachains have short forks of "helper" blocks, maybe based off some common "master" parachain parent. A future "master" parachain block could reintegrate these forks by proving their state changes mesh correctly. All this feels like "future work" though.
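A toy illustration of the header-linkage check this hints at, assuming a stand-in `Header` type and a non-cryptographic placeholder hash (a real chain would use BLAKE2 or similar); none of these names come from the actual protocol:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a parachain block header; real headers carry more fields.
#[derive(Hash)]
struct Header {
    parent_hash: u64,
    number: u64,
    state_root: u64,
}

/// Placeholder hash for illustration only; a real chain would use a
/// cryptographic hash such as BLAKE2.
fn hash_header(h: &Header) -> u64 {
    let mut hasher = DefaultHasher::new();
    h.hash(&mut hasher);
    hasher.finish()
}

/// If candidate C carries the full header of B it claims to build on,
/// a validator can check the linkage without having seen statements for B.
fn links_to(header_of_b: &Header, header_of_c: &Header) -> bool {
    header_of_c.parent_hash == hash_header(header_of_b)
}
```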
That does not change anything, because that same collator would still send the collations to different backing groups. We just need to be more flexible and accept an "unconnected" fork, while being careful not to open up a spam vector.
Given the added complexity due to elastic scaling (and also more potential spam vectors), it would be good if we could drop support for parachain forks. Note we are not talking about relay chain forks here: it is clear that there will be a parachain fork for each relay chain fork, and we are not concerned about these, only about multiple parachain forks for a single relay chain fork.

We are pretty sure that (a) parachains have no good reason to fork and (b) it would be good to have this verified, hence I would suggest adding a metric to prospective-parachains recording the number of forks we see in the wild for parachains. The result should be improved validator performance, less complexity, and fewer spam vectors, at the cost of forky parachains taking a performance hit.

Thinking about this some more, actually I don't think we need metrics here:
So by not allowing forks, we still have not solved "unconnected" collations though. If they are not connected, they could be whatever: nodes which are not even collators could just provide us old collations they found, and we would need to believe this is an unconnected fork and accept it. To resolve this, I would suggest for the MVP to introduce the notion of extended validation:
The idea is that the validation of a candidate is only considered successful once we have seen statements for its dependencies. This still allows for parallel validation of candidates, but sequences the results, adding a bit of delay, which should not matter much with asynchronous backing.

Consequences: For the second and third cores in elastic scaling there will be no PoV distribution between validators. The collator needs to provide the collation to sufficiently many validators itself.

Reasoning: If we consider having the full chain part of a successful validation, then elastic scaling does not worsen the situation for validators. Yes, nodes can mess with us, but they could equally well just provide an invalid collation. It is arguably easier to provide invalid, resource-consuming collations now, but by making the completion of the chain part of the validation, the issue stays isolated on the validator, which can react to garbage candidates by dropping the reputation of the peer; especially with #616, validators will have proper means for that. If we instead distributed statements earlier, then garbage candidates would no longer be an isolated problem: other validators would also already have invested work which then needs to be discarded, and these validators could not meaningfully "punish" anybody.

Outlook: With proper dependency tracking we could distribute statements early, iff the dependency (the CandidateHash of the parent) was not only part of the descriptor, but part of the candidate commitments. We could make it so that those unconnected collations can still be put into a block: they won't be enacted, but the backers could still get rewards. Thus a misbehaving collator would again only harm the parachain (and it has to be a collator, as the provided collation was valid), and we can therefore again leave taking care of those nasty nodes to the parachain.

Implementation:
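A minimal sketch of the extended-validation bookkeeping described above, using a simplified `CandidateHash` stand-in; the types and method names are hypothetical, not the actual subsystem's API. The point is only that PVF execution can run in parallel while announcing the result is gated on the parent's statements:

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical candidate identifier (stand-in for the real `CandidateHash`).
type CandidateHash = [u8; 32];

/// Tracks candidates whose PVF execution succeeded but whose validation is
/// only considered *successful* once statements for the parent are seen.
struct ExtendedValidation {
    /// Locally validated candidates, keyed by the parent they depend on.
    pending: HashMap<CandidateHash, Vec<CandidateHash>>,
    /// Candidates for which backing statements have been seen.
    backed: HashSet<CandidateHash>,
}

impl ExtendedValidation {
    /// Called when local PVF execution of `candidate` succeeds.
    /// Returns true if the result may be announced immediately.
    fn on_validated(&mut self, candidate: CandidateHash, parent: CandidateHash) -> bool {
        if self.backed.contains(&parent) {
            true // dependency already backed: validation counts as successful
        } else {
            // Hold the result back until statements for `parent` arrive.
            self.pending.entry(parent).or_default().push(candidate);
            false
        }
    }

    /// Called when backing statements for `parent` are seen; returns the
    /// held-back candidates whose validation now completes.
    fn on_statements(&mut self, parent: CandidateHash) -> Vec<CandidateHash> {
        self.backed.insert(parent);
        self.pending.remove(&parent).unwrap_or_default()
    }
}
```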
Not sure I follow which problem we're trying to solve. Malicious collators trying to waste validators' resources? They can always add invalid parent head data and never reveal the grandparent candidate. But they could also just send an invalid candidate and create a new identity after we disconnect them. I guess it's a question of which candidates/collators we prioritize. Not sure if sending the full chain of head data helps here though. Could you elaborate on how it helps exactly? What about parachains that genuinely have forks, for example PoW parachains? In addition to that, do we want to support non-blockchain-based parachains in the future, e.g. DAG-based "chains"?
Completely reworked the comment. For the new identity #616 should help.
I don't think that PoW parachains make any sense, and if they existed, they would have poor performance anyway. All that ignoring forks does is reduce the parachain's performance if there are forks ... also, 6s block times for a proof-of-work parachain seem unrealistic anyway. We force them to be a chain ... for all intents and purposes, this does not limit flexibility in any way. In particular for the final version, all we need is that candidates form a chain ... given that candidates include the relay parent, this is almost guaranteed just because of this requirement. ParaDAGs could e.g. add a sequence number to ensure this in all cases (candidates based on the same relay parent).
PoW parachains could be ignored for many reasons, including them being too slow for elastic scaling, but more because they cannot really contact backing groups or do off-chain messaging in sensible ways.
There's an issue that backing groups share the candidates internally first, and only later with other validators, so you want to avoid this internal sharing for spam reasons; but if your solution works, then why not make collators do this anyway? It's worse latency with slow collators, I guess? We've a race for reporting attestations on-chain under your proposal: we've many groups that've validated candidates, but none of them know if they can gossip their candidates. We need maybe 1/4 second before each being gossiped triggers the next to be gossiped, so we'll only manage 4 elastic-scaling candidates per extra second after the validation time.
We should never pay backers until finality anyways, when we'd pay approval checkers too, but I suppose internally valid candidates would be approved, even though they never fit into the parachain. That's fine.
I do think there is some value in approving dangling candidates that never fit into any chain. We do not have to permit this for elastic scaling, but it really does provide a gateway for parachains where the collators have not really figured everything out themselves yet. As I've said elsewhere, I'd favor an "asynchronous accumulation" replacement for the accumulate function in the CoreJam proposal. It'd just permit these dangling "helper" blocks, which future parachain blocks could then integrate into the parachain using the parachain's own logic, not bound by the execution time of code on the relay chain.

At a high level, the big trade-off for dangling blocks is that we're limiting how parachains can handle their own economics: VCs, etc. cannot so easily subsidise collators. It's maybe fine if elastic scaling works with VC funding, but then CoreJam or whatever winds up much less friendly to VC funding. I dunno..

Ideas: If headers were hashed like

I've not followed core time closely enough, but can we now be sure that the relay chain gets paid for blocks which get approved but never fit into some chain? At least this must surely be doable.
I'd like to propose something in between. Let's accept a bounded number of forks per depth, but at the same time disallow cycles and force DAGs to include some nonce. This should remove complexity and make the prospective-parachains subsystem interface more straightforward, as we can't have the same candidate at multiple depths and/or relay parents. Parachains which are too forkful take a performance hit. That being said, to tackle the issue at hand, I'd go for a simpler approach in this

WDYT?
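A rough sketch of the bounded-forks idea (combined with the introduce-a-candidate-only-once rule discussed a few comments below); all names are hypothetical, and the real subsystem tracks considerably more state:

```rust
use std::collections::HashMap;

type CandidateHash = [u8; 32];

/// Hypothetical fork limiter: at most `max_forks` candidates per depth,
/// and a candidate hash may be introduced only once, which also rules out
/// cycles within the current view.
struct ForkLimiter {
    max_forks: usize,
    per_depth: HashMap<usize, Vec<CandidateHash>>,
}

impl ForkLimiter {
    fn try_introduce(&mut self, depth: usize, candidate: CandidateHash) -> bool {
        // Reject a hash already seen at any depth: no duplicates, and
        // therefore no cycles inside the current view.
        if self.per_depth.values().flatten().any(|c| *c == candidate) {
            return false;
        }
        let at_depth = self.per_depth.entry(depth).or_default();
        if at_depth.len() >= self.max_forks {
            // Too forkful: this parachain takes a performance hit.
            return false;
        }
        at_depth.push(candidate);
        true
    }
}
```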
Just realized: There is no risk of random nodes messing with us with historic candidates, as we are still limiting the allowed relay parents. So all valid candidates, even unconnected ones, must have been produced recently by an actual collator.
How could we even enforce that we disallow cycles? Just ignore them and hope that they won't blast out into bricking the relay chain?
Yeah, IMO for the general case we really cannot, but for what is in scope of the prospective parachains at any point in time we could:
Additionally, it might be good to allow the introduction of a candidate to the fragment tree only once.
Right now we cannot disallow them, but we don't need to provide a guarantee that a parachain having cycles works. So we can drop support in that sense (if it gains us something significant); we still need validators not to crash if people do it anyway ;-)
Sounds reasonable. Just dropping duplicates for whatever we have in view seems sensible, if it makes things easier.
I have some updates from my prospective-parachains immersion. As this issue says, we've been under the impression that, if candidate B (which builds on top of candidate A) is supplied to a backing group before the group finds out (via statement-distribution) about candidate A, candidate B will be discarded. Based on this, I've been rewriting prospective-parachains to allow for a number of "orphan" candidates (still a draft). I just discovered, though, that this is already somewhat tackled in collator-protocol. This fixes the problem but hurts pipelining: candidate B will not be validated and backed until candidate A is backed and the statements reach the other backing group. We can therefore get on with testing on a testnet and see how big of an issue this is. In the meantime, I'll get on with the changes. In local zombienet tests, this is not an issue yet, because the network is optimal and validation times are low.
Reworks prospective-parachains so that we allow a number of unconnected candidates (for which we don't know the parent candidate yet). Needed for elastic scaling: #3541. Without this, candidate B will not be validated and backed until candidate A (its parent) is validated and a backing statement reaches the validator.

Due to the high complexity of the subsystem, I rewrote parts of it so that we don't concern ourselves with candidates which form cycles or which form parachain forks. We now have "fragment chains" instead of "fragment trees". This greatly simplifies some of the code and is a compromise we can make. We just need to make sure that cycle-producing parachains don't brick the relay chain and that fork-producing parachains can still make some progress (on one core at least). The only forks that are allowed are those on the relay chain, obviously.

Unconnected candidates are kept in the `CandidateStorage`, and whenever a new candidate is introduced, we try to repopulate the chain with as many candidates as we can.

Also fixes #3219. Guide changes will be done as part of: #3699.

TODOs:
- [x] see if we can replace the `Cow` over the candidate commitments with an `Arc` over the entire `ProspectiveCandidate`. It's only being overwritten in unit tests. We can work around that.
- [x] finish fragment_chain unit tests
- [x] add more prospective-parachains subsystem tests
- [x] test with zombienet what happens if a parachain is creating cycles (it should not brick the relay chain)
- [x] test with zombienet a parachain that is creating forks. It should keep producing blocks from time to time (one bad collator should not DOS the parachain, even if throughput decreases)
- [x] add some more logs and metrics
- [x] add prdoc and remove the "silent" label

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Co-authored-by: Andrei Sandu <andrei-mihail@parity.io>
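A condensed sketch of the fragment-chain approach described above; `CandidateEntry` and `FragmentChain` are hypothetical stand-ins that only loosely mirror the real `CandidateStorage`:

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

/// Minimal stand-in for a candidate: its own head hash and its parent's.
struct CandidateEntry {
    output_head_hash: Hash,
    parent_head_hash: Hash,
}

/// One chain per relay-chain fork; unconnected candidates wait in storage.
struct FragmentChain {
    /// Head hash the next candidate in the chain must build on.
    chain_tip: Hash,
    chain: Vec<CandidateEntry>,
    /// Candidates whose parent is not connected yet, keyed by parent head.
    /// A second candidate with the same parent replaces the first, i.e.
    /// parachain forks are not kept.
    unconnected: HashMap<Hash, CandidateEntry>,
}

impl FragmentChain {
    /// Introduce a candidate, then greedily repopulate the chain with any
    /// previously unconnected candidates that now fit.
    fn introduce(&mut self, candidate: CandidateEntry) {
        self.unconnected
            .insert(candidate.parent_head_hash, candidate);
        // Keep appending while the tip matches a stored candidate's parent.
        while let Some(next) = self.unconnected.remove(&self.chain_tip) {
            self.chain_tip = next.output_head_hash;
            self.chain.push(next);
        }
    }
}
```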
Assume the latest backed candidate of a parachain is A.
With elastic-scaling enabled, one collator builds candidate B and another collator candidate C.
There's currently no restriction on the order in which these collations will be advertised and supplied to the soon-to-be block author. If prospective-parachains learns of candidate C before candidate B, candidate C will be discarded.
For elastic-scaling to properly work, we need to allow some number of disjoint trees to form, even if they don't build directly on top of the latest backed candidate.
The alternative is collator cooperation and hoping that network latency/availability does not defeat that.
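To make the ordering problem concrete, here is how the hypothetical `FragmentChain` sketch above would handle C arriving before B, under the same assumptions:

```rust
// C arrives before B: it is parked as unconnected rather than discarded,
// and is pulled into the chain as soon as B connects to A.
fn main() {
    let a_head = [0u8; 32];
    let b = CandidateEntry { parent_head_hash: a_head, output_head_hash: [1u8; 32] };
    let c = CandidateEntry { parent_head_hash: [1u8; 32], output_head_hash: [2u8; 32] };

    let mut chain = FragmentChain {
        chain_tip: a_head,
        chain: Vec::new(),
        unconnected: Default::default(),
    };
    chain.introduce(c); // parent unknown yet: kept as unconnected
    chain.introduce(b); // connects to A, then pulls C in as well
    assert_eq!(chain.chain.len(), 2); // A -> B -> C
}
```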