`paraInclusion` is massively overestimating its weight #849

ggwpez · 2022-03-23T15:14:01Z

Take a look at some recent Polkadot blocks. Even empty blocks consume ~500ms of weight.
Benchmarking such blocks shows that only a fraction (~4%) of the weight is actually needed.
That means paraInclusion is massively overestimating its weight cost.
@shawntabrizi noted that we could think about refunding weight here.

Output from benchmark-block on reference hardware with production profile and wasm executor:

Block 9556370 with     2 tx used   4.29% of its weight (    21,509,600 of    501,617,735 ns)    
Block 9556371 with     3 tx used   4.08% of its weight (    20,643,956 of    505,412,849 ns)    
Block 9556372 with     2 tx used   3.94% of its weight (    19,750,293 of    501,617,735 ns)    
Block 9556373 with     2 tx used   3.98% of its weight (    20,368,847 of    511,839,008 ns)    
Block 9556374 with     3 tx used   3.97% of its weight (    19,920,982 of    501,891,126 ns)    
Block 9556375 with     3 tx used   3.98% of its weight (    20,114,849 of    505,370,681 ns)

The text was updated successfully, but these errors were encountered:

shawntabrizi · 2022-03-24T20:31:57Z

I think probably we can solve this with a weight refund. My guess is this overestimate comes from the fact that we must assume a lot of things about how many parachains there are, how many are upgrading, how many are sending messages, etc...

If we can track those real numbers and stick it back into the benchmark function, then we can refund any unused weight.

burdges · 2022-03-28T12:16:35Z

It depends what weight means too, like does off chain work count? We only do availability when backing a parachain block, but paraInclusion triggers approval checks with like 40-50 nodes downloading and checking the parachain block. Also approval checks shall become the dominant rewards source for validators.

ggwpez · 2022-03-29T09:14:19Z

I'm not sure if I understand your question. The weight should reflect the time that it takes validators to import the block.
If you want to specifically incentivize something, you can modify the fee per weight for that.

Otherwise we have the problem like right now that the blocks are 25% full without much in them. @burdges

burdges · 2022-03-29T12:02:46Z

Alright cool. We de facto rate limit approvals, etc. by the number of availability cores anyways, which makes sense.

rphmeier · 2022-03-31T14:19:21Z

Yeah @burdges this only concerns the on-chain weight, not the implicit weight of coordinated off-chain stuff.

kianenigma · 2022-12-19T22:10:38Z

I think this should be tackled not too long in the future. Perhaps someone from FRAME team can give you a hand (az Zeke did last year), especially if someone knowledgable is willing to help out.

eskimor · 2023-10-08T09:20:52Z

Indeed. We need a better way of tracking weight. The current top-level benchmarking of enter does not work at all. There are too many parameters to consider, including internal state and effects. Also the benchmarking system works poorly with many parameters.

We might need to go back to (semi-) manual tracking. E.g. benchmarking at building blocks and combining them. This would reduce the number of parameters needed for any single benchmark significantly and might make "proper" weight tracking actually feasible and manageable.

burdges · 2023-10-08T09:35:47Z

Why was this so heavy in the first place? Just on-chain messages? If so, we could bill for inclusion (cheap) and message count (expensive).

eskimor · 2023-10-08T10:07:02Z

The way it is calculated is just severely messed up, this combined with benchmarking doing weird stuff if you have many parameters. We can chat, if you are interested.

* Cleanup readme docs * fix broken test

* Start generalizing rialto-millau commands. * cargo fmt --all * Introduce generic balance. * Unify message payloads. * cargo fmt --all * init - generic * Attempt to unify send message. * Start moving things around. * cargo fmt --all * Move init-bridge. * cargo fmt --all * Improve UX of bridge argument. * Fix clippy. * Fix docs and scripts. * Add docs. * Apply suggestions from code review Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com> * Fix copyright. * Add issue numbers. * More todos. * Update comments. Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com>

kianenigma · 2024-04-05T13:31:22Z

related: #959

* Start generalizing rialto-millau commands. * cargo fmt --all * Introduce generic balance. * Unify message payloads. * cargo fmt --all * init - generic * Attempt to unify send message. * Start moving things around. * cargo fmt --all * Move init-bridge. * cargo fmt --all * Improve UX of bridge argument. * Fix clippy. * Fix docs and scripts. * Add docs. * Apply suggestions from code review Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com> * Fix copyright. * Add issue numbers. * More todos. * Update comments. Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com>

sandreim · 2024-07-13T11:31:14Z

This needs to be fixed soon, at 500 parachain validators on Kusama we see more than 70% weight usage.

eskimor · 2024-07-13T17:46:20Z

@ordian you have been looking into this and related issues already, right?

ordian · 2024-07-24T13:13:31Z

Overestimation

Looking at https://www.polkadot-weigher.com/history, we see the current block weight numbers for Kusama are around 75%:

A typical block on Kusama has around 500 bitfields (with around 20 candidates included) and around 20 candidates backed in its ParaInherent, which consumes almost all of the weight.

Similarly, Polkadot's weight is around 42% with 300 bitfields and 20 backed candidates per block.

Importing these blocks in practice takes much less time suggesting a benchmarking overhead over 10x.

Back-of-the-envelope

Looking at the bench numbers for kusama, we have:

type	weight	rocksdb	paritydb
bitfield	480us + 27r + 17w	2.855ms	1.546ms
backed	1409us + 30r + 16w	3.759ms	2.449ms

The weights differ significantly based on whether the weight is set to RocksDB or ParityDB backend (currently, it's rocksdb).

Given that a typical block will contain n_validators bitfields and n_cores backed candidates, the bottleneck seems to be the bitfields (we expect 1k n_validators and 200 n_cores in the not so distant future). Also as we can see from the db backend difference, the bottleneck is in db ops, not CPU (ParityDB weights are lower than RocksDB's).

Causes

Currently, the way we calculate the weight of processing a bitfield and a backed candidate is by running the whole enter function.
The are 2 main problems with this approach:

The weight of processing an empty inherent is non-trivial and is in included in the cost for each bitfield/backed candidate.
The cost of processing n bitfields != n * the cost of processing bitfields for 1 candidate. For the example, the number of signatures does not depend on the number of included candidates, but only on the number of validators. Some of the DB reads are shared between all candidates.
So the benchmark for bitfields seems to be incorrect.

Short-term fix

A proposed short-term fix is to address problem 1 and switch to ParityDB weights as it should be the default.

The estimated weights with the proposed fix should be (note that the bench weights would remain the same, but we subtract the weight of the empty parainherent when processing enter):

type	weight	rocksdb	paritydb
empty	121us + 15r + 5w	0.996ms	0.491ms
bitfield	359us + 12r + 12w	1.859ms	1.058ms
backed	1288us + 15r + 11w	2.763ms	1.958ms

Given these numbers, current Kusama blocks should be estimated to weigh 1.859ms * 500 + 20 * 2.763ms = 0.985s = 49%
we shaved off 25% of the total weight
With the switch to ParityDB db weights it would be 0.568s = 28%
another 21% shaved off

With these (projected) numbers we should be able to scale to 1k validators with
200 (async backing) cores with para inherent weight being 1.5s = 75%

See #5082 which addresses point 1.

We can optimize this further by addressing the point 2 in a follow-up PR (being worked on).

Long-term fix

Instead of running the whole enter function in order to estimate the cost of processing individual bitfields/backed candidates, it would be better to refactor the code into sections that processes disputes, bitfields and backed candidates and only benchmark specific subsection/function. Luckily such a refactoring was deemed necessary for other reasons and is also being worked on.

Conclusion

We are not blocked on ParaInherent weight overestimation for scaling to 1k and 200 cores with a simple fix (#5082).

closes #849 ## Context For the background on this and the long-term fix, see #849 (comment). ## Changes * The weigh files are renamed from `runtime_(parachains|common).*` to `polkadot_runtime_(parachains|common).*`. The reason for it is the renaming introduced in #4633. The new weight command and files are generated now include `polkadot_` prefix. * The WeightInfo for `paras_inherent` now includes `enter_empty` which calculates the cost of processing an empty parachains inherent. This cost is subtracted dynamically when calculating other weights (so the other weights remain the same) ## Benefits See #849 (comment), but TL;DR is that we are not blocked on weights for scaling the number of validators and cores further. Resolved questions: - [x] why new benchmarks for westend are doing fewer db IOPS? Is it due polkadot-sdk update (db IOPS diff)? or the bench setup is no longer valid? https://github.com/polkadot-fellows/runtimes/blob/7723274a2c5cbb10213379271094d5180716ca7d/relay/polkadot/src/weights/runtime_parachains_paras_inherent.rs#L131-L196 Answer: see background section of #5270 TODOs: - [x] Rerun benchmarks for Rococo and Westend - [x] PRDoc --------- Co-authored-by: command-bot <>

alexggh · 2024-09-12T10:47:57Z

Hi,

I just observed that on kusama the paraInherent weight halfed after we enacted Update Kusama v1.3.0 it went from around 73% before the enactment to around 35% after the enactment, the number of the candidates does not seem to change, so I guess the weights changed.

That's good news, but was that expected ?

ordian · 2024-09-12T10:57:14Z

Yes, that's expected. See #5082 (comment)

ggwpez added the I10-optimisation label Mar 23, 2022

kianenigma added the U2-some_time_soon label Dec 19, 2022

ggwpez mentioned this issue Mar 14, 2023

Polkadot Incident 11.03.2023 paritytech/polkadot#6862

Closed

Sophia-Gold transferred this issue from paritytech/polkadot Aug 24, 2023

the-right-joyce added I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. and removed I10-optimisation labels Aug 25, 2023

eskimor added this to Parachains-core Oct 8, 2023

github-project-automation bot moved this to To do in Parachains-core Oct 8, 2023

the-right-joyce added this to parachains team board Oct 12, 2023

the-right-joyce moved this to Backlog in parachains team board Oct 12, 2023

claravanstaden pushed a commit to Snowfork/polkadot-sdk that referenced this issue Dec 8, 2023

Misc cleanups (paritytech#849)

54da049

* Cleanup readme docs * fix broken test

kianenigma added the T1-FRAME This PR/Issue is related to core FRAME, the framework. label Apr 5, 2024

eskimor mentioned this issue Apr 5, 2024

Cleanup the runtime #4002

Open

sandreim assigned ordian Jul 15, 2024

ordian mentioned this issue Jul 19, 2024

short-term fix for para inherent weight overestimation #5082

Merged

3 tasks

ordian closed this as completed in #5082 Aug 29, 2024

github-project-automation bot moved this from Backlog to Completed in parachains team board Aug 29, 2024

ordian mentioned this issue Aug 29, 2024

Refactor paras_inherent::Pallet::enter #5520

Open

3 tasks

ordian mentioned this issue Sep 17, 2024

Extrinsic accepted by the node but not included into the block #5313

Open

alexggh mentioned this issue Sep 30, 2024

Enabling 1k validators & 200 cores on polkadot #5867

Open

21 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`paraInclusion` is massively overestimating its weight #849

`paraInclusion` is massively overestimating its weight #849

ggwpez commented Mar 23, 2022

shawntabrizi commented Mar 24, 2022

burdges commented Mar 28, 2022

ggwpez commented Mar 29, 2022

burdges commented Mar 29, 2022

rphmeier commented Mar 31, 2022

kianenigma commented Dec 19, 2022

eskimor commented Oct 8, 2023

burdges commented Oct 8, 2023 •

edited

Loading

eskimor commented Oct 8, 2023

kianenigma commented Apr 5, 2024

sandreim commented Jul 13, 2024

eskimor commented Jul 13, 2024

ordian commented Jul 24, 2024 •

edited

Loading

alexggh commented Sep 12, 2024

ordian commented Sep 12, 2024

paraInclusion is massively overestimating its weight #849

paraInclusion is massively overestimating its weight #849

Comments

ggwpez commented Mar 23, 2022

shawntabrizi commented Mar 24, 2022

burdges commented Mar 28, 2022

ggwpez commented Mar 29, 2022

burdges commented Mar 29, 2022

rphmeier commented Mar 31, 2022

kianenigma commented Dec 19, 2022

eskimor commented Oct 8, 2023

burdges commented Oct 8, 2023 • edited Loading

eskimor commented Oct 8, 2023

kianenigma commented Apr 5, 2024

sandreim commented Jul 13, 2024

eskimor commented Jul 13, 2024

ordian commented Jul 24, 2024 • edited Loading

Overestimation

Back-of-the-envelope

Causes

Short-term fix

Long-term fix

Conclusion

alexggh commented Sep 12, 2024

ordian commented Sep 12, 2024

`paraInclusion` is massively overestimating its weight #849

`paraInclusion` is massively overestimating its weight #849

burdges commented Oct 8, 2023 •

edited

Loading

ordian commented Jul 24, 2024 •

edited

Loading