Provisioner: Elastic Scaling #3130

Closed
Tracked by #1829
eskimor opened this issue Jan 30, 2024 · 8 comments

eskimor (Member) commented Jan 30, 2024

Check how many cores are currently available for a parachain and fetch that many candidates for the paras inherent. Also make a dependency chain check on them.
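As a rough illustration of the first half of this task, the sketch below counts how many cores per para could take a new candidate. `ParaId`, `CoreState` and `backable_cores_per_para` are hypothetical, simplified names made up for this example; they are not the types used in polkadot-sdk.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a parachain id.
type ParaId = u32;

/// Hypothetical, simplified view of an availability core.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CoreState {
    /// Nothing is assigned to the core.
    Free,
    /// The core is free and scheduled for the given para.
    Scheduled(ParaId),
    /// The core holds a candidate of the given para that is pending availability.
    Occupied(ParaId),
}

/// Counts, per para, how many cores could accept a newly backed candidate.
/// In this sketch only `Scheduled` cores count; the result is the number of
/// candidates one would try to fetch for each para.
fn backable_cores_per_para(cores: &[CoreState]) -> HashMap<ParaId, usize> {
    let mut counts = HashMap::new();
    for core in cores {
        if let CoreState::Scheduled(para) = core {
            *counts.entry(*para).or_insert(0) += 1;
        }
    }
    counts
}
```

With two cores scheduled for para 1 and one for para 2, the map would hold `1 -> 2` and `2 -> 1`, so the provisioner would ask for up to two candidates for para 1.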

@alindima alindima self-assigned this Jan 30, 2024
@alindima alindima moved this from Backlog to In Progress in parachains team board Feb 1, 2024

alindima (Contributor) commented Feb 1, 2024

This will also need to take into account the changes made to fix #3141.

alindima (Contributor) commented Feb 1, 2024

looking into this, there are two dependency checks that need to be done:

  1. check the dependencies between the candidates already occupying the cores. It's quite tricky to do this dependency check in the provisioner, because we don't have access to the parent_head_data (which is part of the PVD). We can offload this to the prospective-parachains subsystem and supply a required_path that is not necessarily ordered. Prospective-parachains will order it for us. The alternative would be to keep fragment trees in the provisioner as well, which I don't want to do.
  2. check that the candidates we propose for backing also form a chain. We don't need to do that in the provisioner, because prospective-parachains only provides valid chains and the runtime will also check this in paras_inherent, so we're good.
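To make the "unordered required_path" idea concrete, here is a minimal sketch of ordering a set of candidates into a chain by linking each candidate's parent head to the previous candidate's output head. `HeadHash`, `Candidate` and `order_into_chain` are hypothetical names invented for this example, and this is not necessarily how prospective-parachains does it.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the hash of a para head.
type HeadHash = u64;

/// Hypothetical, simplified candidate: the head it builds on and the head it produces.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Candidate {
    parent_head: HeadHash,
    output_head: HeadHash,
}

/// Orders `candidates` into a chain starting at `para_head`.
/// Returns `None` if they do not form one single chain.
fn order_into_chain(para_head: HeadHash, candidates: &[Candidate]) -> Option<Vec<Candidate>> {
    // Index candidates by the head they build on. Two candidates sharing a
    // parent would be a fork, which this sketch rejects.
    let mut by_parent: HashMap<HeadHash, &Candidate> = HashMap::new();
    for c in candidates {
        if by_parent.insert(c.parent_head, c).is_some() {
            return None;
        }
    }

    // Walk from the on-chain head, consuming one candidate per step.
    let mut chain = Vec::with_capacity(candidates.len());
    let mut head = para_head;
    while let Some(c) = by_parent.remove(&head) {
        head = c.output_head;
        chain.push(c.clone());
    }

    // Anything left over is not connected to the chain.
    if by_parent.is_empty() {
        Some(chain)
    } else {
        None
    }
}
```

Because each candidate is removed from the index once visited, the walk terminates even if the heads form a cycle.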

alindima (Contributor) commented Feb 5, 2024

  1. check the dependencies between the candidates already occupying the cores. It's quite tricky to do this dependency check in the provisioner, because we don't have access to the parent_head_data (which is part of the PVD). We can offload this to the prospective-parachains subsystem and supply a required_path that is not necessarily ordered. Prospective-parachains will order it for us. The alternative would be to keep fragment trees in the provisioner as well, which I don't want to do.

For this, I think we could use a similar algorithm as suggested here: #3131 (comment)

alindima (Contributor) commented Feb 6, 2024

an important question to answer here is what do we do with candidates that are not included within one relay chain block.

Say we have A->B->C that are backed and pending availability and 3 cores for this parachain. A and B are included.
Say there's also a valid candidate D that descends from B.

What do we do now? C is still pending availability but we get the chance to back a new block.
Should it be:

  1. another block that descends from B (which is D). This could be valuable because C could get timed out of the core. This would also require that in the inclusion runtime module, when a candidate is made available, we free any cores that would otherwise lead to invalid transitions (there's no point in keeping C in the core if D was already included). However, if C gets included first, we wasted a core by backing D.
  2. or a block that descends from C (being optimistic that C will get included). This could lead to having all other cores depend on the availability of C, which could stall progress until C gets included or timed out.

I think the best option is 1

alindima (Contributor) commented Feb 6, 2024

thought about the above a bit more:

option 1 would basically enable on-chain forks for parachains. We'd need code in the runtime and the provisioner to account for forks and free cores that build on top of an old fork once a competing fork is included. It also kind of modifies the meaning of the candidate timeout.

I think the complexity is not justified for option 1

sandreim (Contributor) commented Feb 6, 2024

Yes, we'd want to avoid creating more complexity given the current status quo. Assuming candidate timeouts are very unlikely, my best bet would be an all-or-nothing approach:

  • A, B, C get backed at RCB
  • We start the onchain inclusion process for all cores of the para when any of the candidates become available.
  • if only A, B have become available at RCB + 1, then we clear C's core (availability canceled/forced timeout) and only include A, B. Collators will have to rebuild on top of B and that is fine
  • if only B or C is included, then yeah, the para has lost a relay chain slot but that's life
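One possible reading of the bullets above, as a small sketch: once any candidate of the para is available, the available prefix of the chain is included and every other core of the para is cleared. `Outcome` and `all_or_nothing` are hypothetical names for this example, and the exact semantics are an assumption rather than something the comment spells out.

```rust
/// Outcome of the hypothetical all-or-nothing step for one para.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// No candidate is available yet, so keep waiting.
    Wait,
    /// Include the first `included` candidates of the chain and clear the
    /// cores of the `dropped` remaining ones (forced timeout).
    Enact { included: usize, dropped: usize },
}

/// `available[i]` says whether the i-th candidate of the backed chain
/// (ordered from the oldest) has become available.
fn all_or_nothing(available: &[bool]) -> Outcome {
    if !available.iter().any(|&a| a) {
        return Outcome::Wait;
    }
    // Only a contiguous available prefix can be included.
    let included = available.iter().take_while(|&&a| a).count();
    Outcome::Enact { included, dropped: available.len() - included }
}
```

For A, B, C with only A and B available this gives `Enact { included: 2, dropped: 1 }`. With only B and C available it gives `Enact { included: 0, dropped: 3 }`, which is the "lost a relay chain slot" case.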

alindima (Contributor) commented Feb 6, 2024

That's a good idea! Would sacrifice a bit of throughput in weird scenarios but is much less complex

alindima (Contributor) commented Feb 9, 2024

I implemented a version of the policy described in option 2 above. See #3233 and read the PR description for a detailed explanation of the proposed runtime policy

github-merge-queue bot pushed a commit that referenced this issue Mar 1, 2024
#3130

builds on top of #3160

Processes the availability cores and builds a record of how many
candidates it should request from prospective-parachains and their
predecessors.
Tries to supply as many candidates as the runtime can back. Note that
the runtime changes to back multiple candidates per para are not yet
done, but this paves the way for it.

The following backing/inclusion policy is assumed:
1. the runtime will never back candidates of the same para which don't
form a chain with the already backed candidates, even if the others are
still pending availability. We're optimistic that they won't time out
and we don't want to back parachain forks (as the complexity would be
huge).
2. if a candidate is timed out of the core before being included, all of
its successors occupying a core will be evicted.
3. only the candidates which are made available and form a chain
starting from the on-chain para head may be included/enacted and cleared
from the cores. In other words, if para head is at A and the cores are
occupied by B->C->D, and B and D are made available, only B will be
included and its core cleared. C and D will remain on the cores, waiting
for C to be made available or timed out. As point (2) above already
says, if C is timed out, D will also be dropped.
4. The runtime will deduplicate candidates which form a cycle. For
example if the provisioner supplies candidates A->B->A, the runtime will
only back A (as the state output will be the same).
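Points (2) and (3) of the assumed policy can be sketched as a single pass over the chain of candidates occupying the cores of one para. `Status`, `StepResult` and `process_chain` are hypothetical names for illustration only; the actual runtime logic lives in the inclusion module and is not reproduced here.

```rust
/// Hypothetical status of a candidate occupying a core.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    Available,
    TimedOut,
}

/// How many candidates are included, evicted, or left on their cores.
#[derive(Debug, PartialEq, Eq)]
struct StepResult {
    included: usize,
    evicted: usize,
    remaining: usize,
}

/// `chain` lists the candidates occupying cores, ordered from the one that
/// builds directly on the on-chain para head.
fn process_chain(chain: &[Status]) -> StepResult {
    // Point 3: only the available prefix of the chain may be included.
    let included = chain.iter().take_while(|s| **s == Status::Available).count();

    // Point 2: the first timed-out candidate after that prefix is evicted
    // together with all of its successors.
    let rest = &chain[included..];
    let evicted = match rest.iter().position(|s| *s == Status::TimedOut) {
        Some(idx) => rest.len() - idx,
        None => 0,
    };

    StepResult { included, evicted, remaining: rest.len() - evicted }
}
```

For the B->C->D example with B and D available, `process_chain(&[Available, Pending, Available])` includes one candidate and leaves two on their cores. If C is timed out instead of pending, one is included and two are evicted.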

Note that if a candidate is timed out, we don't guarantee that in the
next relay chain block the block author will be able to fill all of the
timed out cores of the para. Guaranteeing that would increase complexity
by a lot. Instead, the provisioner will supply N candidates, where N is
the number of candidates timed out, not counting their successors, which
will also be deleted by the runtime. Those cores will be backfilled in
the next relay chain block.
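A tiny sketch of the counting rule in this note, under the assumption (not stated in the commit message) that already-free cores are counted alongside timed-out ones. `CoreStatus` and `candidates_to_supply` are hypothetical names.

```rust
/// Hypothetical status of a core assigned to one para, as seen by the
/// block author before the runtime has processed timeouts.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CoreStatus {
    /// The core is free and can take a candidate.
    Free,
    /// The core holds a candidate that is still pending availability.
    Pending,
    /// The core holds a candidate that is being timed out.
    TimedOut,
}

/// Number of candidates to supply for the para in this relay chain block.
/// Successors of a timed-out candidate are still `Pending` from this point
/// of view, so they are not counted, even though the runtime will evict them.
fn candidates_to_supply(cores: &[CoreStatus]) -> usize {
    cores
        .iter()
        .filter(|c| matches!(**c, CoreStatus::Free | CoreStatus::TimedOut))
        .count()
}
```

For cores `[TimedOut, Pending, Pending]` this returns 1, although the runtime will free all three cores. In the following block all three are `Free` and the count becomes 3, which is the backfill the note describes.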

Adjacent changes:
- Also fixes: #3141
- When prospective parachains are not enabled, don't supply multiple
candidates per para (we can't have elastic scaling without prospective
parachains enabled). paras_inherent should already sanitise this input
but it's more efficient this way.
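The "one candidate per para" restriction could look roughly like the sketch below. `BackedCandidate`, `select_candidates` and the boolean flag are hypothetical names for illustration, not the provisioner's actual API.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a parachain id.
type ParaId = u32;

/// Hypothetical, simplified backed candidate.
#[derive(Debug, Clone, PartialEq, Eq)]
struct BackedCandidate {
    para_id: ParaId,
    name: &'static str,
}

/// Without prospective parachains, keeps only the first candidate seen for
/// each para. With them enabled, the input is returned unchanged.
fn select_candidates(
    candidates: Vec<BackedCandidate>,
    prospective_parachains_enabled: bool,
) -> Vec<BackedCandidate> {
    if prospective_parachains_enabled {
        return candidates;
    }
    let mut seen = HashSet::new();
    // `insert` returns false for a para id that was already seen,
    // which filters out every later candidate of that para.
    candidates.into_iter().filter(|c| seen.insert(c.para_id)).collect()
}
```

Filtering on the node side only saves work; as the note says, the runtime is expected to sanitise the input anyway.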

Note: all of these changes are backwards-compatible with the
non-elastic-scaling scenario (one core per para).
@eskimor eskimor moved this from In Progress to Review in progress in parachains team board Mar 1, 2024
skunert pushed a commit to skunert/polkadot-sdk that referenced this issue Mar 4, 2024
…ch#3233)

@github-project-automation github-project-automation bot moved this from Review in progress to Completed in parachains team board Mar 11, 2024