Provisioner: Elastic Scaling #3130

eskimor · 2024-01-30T10:52:22Z

Check how many cores are currently available for a parachain and fetch that many candidates for the paras inherent. This also needs a dependency chain check.
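To make the first half of this concrete, here is a minimal sketch of the per-para counting step. `ParaId`, `CoreState` and `backable_slots` are hypothetical, simplified stand-ins invented for illustration, not the real runtime API types, and the sketch only counts cores that are already free and scheduled (it ignores occupied cores that are about to be freed).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parachain id.
type ParaId = u32;

/// Hypothetical, simplified view of an availability core.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CoreState {
    /// No para is assigned to this core.
    Free,
    /// The core is empty and assigned to the para; a candidate can be backed on it.
    Scheduled(ParaId),
    /// The core holds a candidate of the para that is still pending availability.
    Occupied(ParaId),
}

/// Count, per para, how many cores could accept a newly backed candidate.
fn backable_slots(cores: &[CoreState]) -> HashMap<ParaId, usize> {
    let mut slots = HashMap::new();
    for core in cores {
        if let CoreState::Scheduled(para) = core {
            *slots.entry(*para).or_insert(0) += 1;
        }
    }
    slots
}
```

The resulting count per para would then be the number of candidates to request for that para.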

alindima · 2024-02-01T12:41:13Z

will need to also take into account changes made for fixing: #3141

alindima · 2024-02-01T13:19:37Z

looking into this, there are two dependency checks that need to be done:

1. check the dependencies between the candidates already occupying the cores. It's quite tricky to do this dependency check in the provisioner, because we don't have access to the parent_head_data (which is part of the PVD). We can offload this to the prospective-parachains subsystem, and supply a required_path that is not necessarily ordered. Prospective-parachains will order it for us. The alternative would be to keep fragment trees in the provisioner as well, which I don't want to do.
2. check that the candidates we propose for backing also form a chain. We don't need to do that because prospective-parachains only provides valid chains. And the runtime will also check this in paras_inherent. So we're good.
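For illustration, ordering an unordered set of candidates into a chain could look roughly like the sketch below. This is the kind of work that would live in prospective-parachains (which knows the parent head data), not in the provisioner. `Head`, `Candidate` and `order_into_chain` are hypothetical names made up for this example and do not mirror the real subsystem types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a head-data hash.
type Head = u64;

/// Hypothetical, minimal view of a candidate: the head it builds on
/// and the head it produces.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Candidate {
    parent_head: Head,
    output_head: Head,
}

/// Order `candidates` into a chain starting at `base` (the latest included para head).
/// Returns `None` if they do not form one single chain that uses every candidate.
fn order_into_chain(base: Head, candidates: &[Candidate]) -> Option<Vec<Candidate>> {
    // Index by parent head. Two candidates with the same parent would be a fork.
    let mut by_parent: HashMap<Head, Candidate> = HashMap::new();
    for c in candidates {
        if by_parent.insert(c.parent_head, *c).is_some() {
            return None;
        }
    }

    // Walk from the base head, consuming one candidate per step.
    let mut chain = Vec::with_capacity(candidates.len());
    let mut head = base;
    while let Some(next) = by_parent.remove(&head) {
        head = next.output_head;
        chain.push(next);
    }

    // Leftovers mean some candidate does not descend from `base`.
    if by_parent.is_empty() {
        Some(chain)
    } else {
        None
    }
}
```

With candidates 0->1, 1->2 and 2->3 passed in any order and a base head of 0, this would return them in chain order. A gap or a fork yields `None`.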

alindima · 2024-02-05T09:03:37Z

> check the dependencies between the candidates already occupying the cores. It's quite tricky to do this dependency check in the provisioner, because we don't have access to the parent_head_data (which is part of the PVD). We can offload this to the prospective-parachains subsystem, and supply a required_path that is not necessarily ordered. Prospective-parachains will order it for us. The alternative would be to keep fragment trees in the provisioner as well, which I don't want to do.

For this, I think we could use a similar algorithm as suggested here: #3131 (comment)

alindima · 2024-02-06T11:39:42Z

an important question to answer here is what do we do with candidates that are not included within one relay chain block.

Say we have A->B->C that are backed and pending availability and 3 cores for this parachain. A and B are included.
Say there's also a valid candidate D that descends from B.

What do we do now? C is still pending availability but we get the chance to back a new block.
Should it be:

1. another block that descends from B (which is D). This could be valuable because C could get timed out of the core. This would also require that in the inclusion runtime module, when a candidate is made available, we free any cores that would otherwise lead to invalid transitions (there's no point in keeping C in the core if D was already included). However, if C gets included first, we wasted a core by backing D.
2. or a block that descends from C (being optimistic that C will get included). This could lead to having all other cores depend on the availability of C, which could stall progress until C gets included or timed out.

I think the best option is 1

alindima · 2024-02-06T15:12:12Z

thought about the above a bit more:

option 1 would basically enable on-chain forks for parachains. We'd need code in the runtime and the provisioner to account for forks and free cores that build on top of an old fork once a competing fork is included. It also kind of modifies the meaning of the candidate timeout.

I think the complexity is not justified for option 1

sandreim · 2024-02-06T17:15:01Z

Yes, we'd want to avoid creating more complexity given the current status quo. Assuming candidate timeouts are very unlikely, my best bet would be an all-or-nothing approach:

- A, B, C get backed at RCB
- We start the onchain inclusion process for all cores of the para when any of the candidates become available.
- if only A, B have become available at RCB + 1, then we clear C's core (availability canceled/forced timeout) and only include A, B. Collators will have to rebuild on top of B and that is fine
- if only B or C is included, then yeah, the para has lost a relay chain slot but that's life
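A rough sketch of how the all-or-nothing rule could be expressed, under my reading of the steps above: once any candidate of the para is available, the available prefix of the chain is included and every other core of the para is cleared. `Outcome` and `all_or_nothing` are hypothetical names for this example only, and the handling of the "only B or C" case (nothing included, everything cleared) is an assumption.

```rust
/// Hypothetical outcome for one occupied core of a para.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    /// The candidate is enacted and its core is freed.
    Included,
    /// The core is freed without inclusion (forced timeout).
    Cleared,
    /// Nothing happened yet; the candidate stays on its core.
    StillPending,
}

/// `available[i]` says whether the i-th candidate of the backed chain
/// (in chain order, e.g. A, B, C) has become available.
fn all_or_nothing(available: &[bool]) -> Vec<Outcome> {
    // Nothing is available yet, so the inclusion process has not started.
    if !available.iter().any(|a| *a) {
        return vec![Outcome::StillPending; available.len()];
    }

    // Only the available prefix forms a chain from the current para head.
    let prefix = available.iter().take_while(|a| **a).count();

    (0..available.len())
        .map(|i| if i < prefix { Outcome::Included } else { Outcome::Cleared })
        .collect()
}
```

For A, B available and C not, this gives included, included, cleared. If only B is available, the prefix is empty and all three cores are cleared, which is the lost relay chain slot mentioned above.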

alindima · 2024-02-06T17:23:04Z

That's a good idea! Would sacrifice a bit of throughput in weird scenarios but is much less complex

alindima · 2024-02-09T10:03:23Z

I implemented a version of the policy described in option 2 above. See #3233 and read the PR description for a detailed explanation of the proposed runtime policy

#3130 builds on top of #3160

Processes the availability cores and builds a record of how many candidates it should request from prospective-parachains and their predecessors. Tries to supply as many candidates as the runtime can back. Note that the runtime changes to back multiple candidates per para are not yet done, but this paves the way for it.

The following backing/inclusion policy is assumed:

1. the runtime will never back candidates of the same para which don't form a chain with the already backed candidates, even if the others are still pending availability. We're optimistic that they won't time out and we don't want to back parachain forks (as the complexity would be huge).
2. if a candidate is timed out of the core before being included, all of its successors occupying a core will be evicted.
3. only the candidates which are made available and form a chain starting from the on-chain para head may be included/enacted and cleared from the cores. In other words, if para head is at A and the cores are occupied by B->C->D, and B and D are made available, only B will be included and its core cleared. C and D will remain on the cores, waiting for C to be made available or timed out. As point (2) above already says, if C is timed out, D will also be dropped.
4. The runtime will deduplicate candidates which form a cycle. For example if the provisioner supplies candidates A->B->A, the runtime will only back A (as the state output will be the same).

Note that if a candidate is timed out, we don't guarantee that in the next relay chain block the block author will be able to fill all of the timed out cores of the para. That increases complexity by a lot. Instead, the provisioner will supply N candidates where N is the number of candidates timed out, but doesn't include their successors, which will also be deleted by the runtime. This'll be backfilled in the next relay chain block.

Adjacent changes:

- Also fixes: #3141
- For non prospective-parachains, don't supply multiple candidates per para (we can't have elastic scaling without prospective parachains enabled). paras_inherent should already sanitise this input but it's more efficient this way.

Note: all of these changes are backwards-compatible with the non-elastic-scaling scenario (one core per para).
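As a small illustration of points (2) and (3) of the assumed policy, the sketch below processes one para's chain of occupied cores. `Status`, `Processed` and `process_chain` are hypothetical names invented for this example and are not taken from #3233 or the runtime. It assumes the chain is given in order, starting right after the on-chain para head.

```rust
/// Hypothetical state of a candidate occupying a core.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Available,
    Pending,
    TimedOut,
}

/// Hypothetical summary of what happens to one para's chain of occupied cores.
#[derive(Debug, PartialEq, Eq)]
struct Processed {
    /// Candidates enacted, counted from the front of the chain.
    included: usize,
    /// Candidates dropped, counted from the back of the chain.
    evicted: usize,
    /// Candidates left on their cores.
    remaining: usize,
}

/// `chain` lists the para's occupied cores in chain order (e.g. B, C, D).
fn process_chain(chain: &[Status]) -> Processed {
    // Point (3): only the available prefix may be included.
    let included = chain
        .iter()
        .take_while(|s| **s == Status::Available)
        .count();

    // Point (2): the first timed out candidate and all of its successors are evicted.
    let evicted = match chain.iter().position(|s| *s == Status::TimedOut) {
        Some(i) => chain.len() - i,
        None => 0,
    };

    Processed {
        included,
        evicted,
        remaining: chain.len() - included - evicted,
    }
}
```

For the B->C->D example above with B and D available and C pending, this reports one included candidate and two remaining. If C is timed out instead, C and D are both counted as evicted.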


eskimor mentioned this issue Jan 30, 2024

Elastic Scaling #1829

Open

eskimor added this to parachains team board Jan 30, 2024

github-project-automation bot moved this to Backlog in parachains team board Jan 30, 2024

alindima self-assigned this Jan 30, 2024

alindima moved this from Backlog to In Progress in parachains team board Feb 1, 2024

alindima mentioned this issue Feb 6, 2024

provisioner: allow multiple cores assigned to the same para #3233

Merged

eskimor moved this from In Progress to Review in progress in parachains team board Mar 1, 2024

alindima closed this as completed Mar 11, 2024

github-project-automation bot moved this from Review in progress to Completed in parachains team board Mar 11, 2024
