Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

disputes runtime: don't fill up the block with only dispute votes #6919

Closed
sandreim opened this issue Mar 20, 2023 · 7 comments
Closed

disputes runtime: don't fill up the block with only dispute votes #6919

sandreim opened this issue Mar 20, 2023 · 7 comments
Labels
I3-bug Fails to follow expected behavior. U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible.

Comments

@sandreim
Copy link
Contributor

Here we drop processing anything else in the inherent and we only fill the block with disputes. As we've seen in #6862 this is not a good idea due to the expensive nature of the signature checks and could lead to high import block times when under load.

We should consider filling up the block up to 50% with disputes. This would lead to slower finality but will avoid the liveliness issues we have observed.

https://grafana.parity-mgmt.parity.io/goto/leq6DiBVz?orgId=1 shows such a dispute filled block taking a huge amount of time to build and import, all while the weight reported is < 2s.

Screenshot 2023-03-20 at 15 12 03

@sandreim sandreim added I3-bug Fails to follow expected behavior. U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible. labels Mar 20, 2023
@sandreim sandreim changed the title disputes runtime: don't fill up the block with disputes. disputes runtime: don't fill up the block with only dispute votes Mar 20, 2023
@eskimor
Copy link
Member

eskimor commented Mar 21, 2023

Worth mentioning, that this should never happen if the filtering in create_inherent was working correctly. Also if weights are correct, shouldn't this be fine?

@sandreim
Copy link
Contributor Author

My current thinking here is that due to the load the execution of the block would take more time then what the weights suggest. We also have this issue to look at improving this paritytech/polkadot-sdk#18

We're also working to measure how much individual signatures take in the runtime to get a better understanding.

@eskimor
Copy link
Member

eskimor commented Mar 22, 2023

No this is the wrong approach. We actually want a liveness issue if there are lot's of disputes. Better to have longer block times for parachains than an overloaded system. With validator disabling we should not even end up in this situation, but if we do this is a safeguard and should not be removed.

@sandreim
Copy link
Contributor Author

sandreim commented Mar 22, 2023

I don't agree with this approach of affecting liveness :). This somewhat implies that our system breaks under heavy dispute load and such load is never desirable. Which is not really the case as far as I have seen. I have no hard data to back this up, but based on what I observed during testing we'd still get slow down block production due to finality lag, or as it happened during the recent Polkadot issue, the system actually did really well (except the initializer heavy cost), there was no reason for affecting liveness. There are blocks during that period of time which only contain disputes such that this contributed to even longer parachain stalls.

@eskimor
Copy link
Member

eskimor commented Mar 22, 2023

If we need all the available block space for disputes, then we should use it for exactly that. Security trumps liveness.

The correct solution is to avoid disputes storms (e.g. via validator disabling + slashing), but if there is one, there is nothing more important to the system than to handle those disputes as effectively as possible. It absolutely must have a higher priority than adding even more work to the disputes system. It is worth mentioning that the last year's worth of work on disputes had precisely this objective: "Make it impossible for the system got get overloaded with disputes."

@sandreim
Copy link
Contributor Author

Ok, the strongest argument here is that the system is not designed to sustain the effort of everybody checking everything. Based on that, yeah it makes sense to have some intelligent back pressure.

I am proposing we keep the current clearing of candidates/bitfields but only when disputes require more weight than it is available for N consecutive blocks. N should be something reasonable as to absorb any flukes without affecting liveness but at the same time not overload the system.

WDYT ?

@burdges
Copy link
Contributor

burdges commented Mar 23, 2023

We typically have trouble in on-chain back pressure schemes once we require consensus upon the back pressure. I think block producers including more disputes and fewer parachain candidates sounds fine though.

If we've k tranche zero assignments, then a dispute costs slightly less CPU time and bandwidth than 1/kth of the candidates in a block. I do not know exactly what 50% means here, but arguably the block should be full at k disputes, which then suggests how much the dispute costs.

We could hold some computational resources in reserve, but doing so only effects the constant term. We could've permit one block producer to schedule many disputes, which effectively takes "weight" away form future block producers, but this brings liveness concerns. We could exploit slot under utilization to have more slots, but arguably doing so impacts this 1/k linearly in the opposite direction.

@sandreim sandreim closed this as not planned Won't fix, can't repro, duplicate, stale Mar 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
I3-bug Fails to follow expected behavior. U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible.
Projects
No open projects
Development

No branches or pull requests

3 participants