-
Notifications
You must be signed in to change notification settings - Fork 709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
short-term fix for para inherent weight overestimation #5082
Conversation
bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=westend |
…=rococo --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent
…=westend --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent
* master: Bump slotmap from 1.0.6 to 1.0.7 (#5096) feat: introduce pallet-parameters to Westend to parameterize inflation (#4938) Bump openssl from 0.10.64 to 0.10.66 (#5107) Bump lycheeverse/lychee-action from 1.9.1 to 1.10.0 (#5094) docs: remove the duplicate word (#5095) Prepare PVFs if node is a validator in the next session (#4791) Update parity publish (#5105)
The CI pipeline was cancelled due to failure one of the required jobs. |
What is the long term fix? |
* master: (27 commits) Bridges improved tests and nits (#5128) Fix misleading comment about RewardHandler in epm config (#3095) Introduce a workflow updating the wishlist leaderboards (#5085) membership: Restructure pallet into separate files (#4536) Fix after ring-proof api change (#5126) Bump paritytech/review-bot from 2.4.0 to 2.5.0 (#5057) Bump docker/login-action from 3.0.0 to 3.3.0 (#5109) Bump docker/build-push-action from 5.1.0 to 6.5.0 (#5108) Bump peter-evans/create-pull-request from 5.0.0 to 6.1.0 (#5093) Tx Payment: drop ED requirements for tx payments with exchangeable asset (#4488) Remove `pallet-getter` usage from pallet-transaction-payment (#4970) pallet macro: do not generate try-runtime related code when frame-support doesn't have try-runtime. (#5099) fix(chain-spec): ChainSpecBuilder with object as default genesis (#4345) Migrate BEEFY BLS crypto to bls12-381 curve (#4931) Bump clap from 4.5.9 to 4.5.10 in the known_good_semver group (#5120) Use jobserver in wasm-builder to limit concurrency of spawned cargo processes (#4946) include events for voting (#4613) [subsystem-bench] Add mocks for own assignments triggering (#5042) Remove not-audited warning (#5114) hotfix: blockchain/backend: Skip genesis leaf to unblock syncing (#5103) ...
polkadot/runtime/rococo/src/weights/runtime_parachains_paras_inherent.rs
Outdated
Show resolved
Hide resolved
benchmark.bitfields.clear(); | ||
benchmark.backed_candidates.clear(); | ||
benchmark.disputes.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these really needed if we just called build()
on a default BenchBuilder
instance?
polkadot/runtime/rococo/src/weights/runtime_parachains_paras_inherent.rs
Outdated
Show resolved
Hide resolved
* master: (51 commits) Remove unused feature gated code from the minimal template (#5237) make polkadot-parachain startup errors pretty (#5214) Coretime auto-renew (#4424) network/strategy: Backoff and ban overloaded peers to avoid submitting the same request multiple times (#5029) Fix frame crate usage doc (#5222) beefy: Tolerate pruned state on runtime API call (#5197) rpc: Enable ChainSpec for polkadot-parachain (#5205) Add an adapter for configuring AssetExchanger (#5130) Replace env_logger with sp_tracing (#5065) Adjust sync templates flow to use new release branch (#5182) litep2p/discovery: Publish authority records with external addresses only (#5176) Run UI tests in CI for some other crates (#5167) Remove `pallet::getter` usage from the pallet-balances (#4967) pallet-timestamp: `UnixTime::now` implementation logs error only if called at genesis (#5055) [CI] Cache try-runtime check (#5179) [Backport] version bumps and the prdocs reordering from stable2407 (#5178) [subsystem-benchmark] Update availability-distribution-regression-bench baseline after recent subsystem changes (#5180) Remove pallet::getter usage from proxy (#4963) Remove pallet::getter macro usage from pallet-election-provider-multi-phase (#4487) Review-bot@2.6.0 (#5177) ...
bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=westend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but there is something I don't clearly understand, as far as I can tell we are using fallback_max_validators()
in a lot of places, don't we need to adjust those to get realistic results for 1000 validators ?
* master: (35 commits) Fix OurViewChange small race (#5356) Make ticket non-optional and add ensure_successful method to Consideration trait (#5359) [tests] dedup test code, add more tests, improve naming and docs (#5338) Stop running the wishlist workflow on forks (#5297) Migrate foreign assets v3::Location to v4::Location (#4129) Minor clean up (#5284) [Pools] Ensure members can always exit the pool gracefully (#4998) StorageWeightReclaim: set to node pov size if higher (#5281) [Bot] Add prdoc generation (#5331) Small nits found accidentally along the way (#5341) Create subsystem-benchmarks.yml (#5325) Bump libp2p-identity from 0.2.8 to 0.2.9 (#5232) Bump authoring duration for async backing to 2s. (#5195) Fix spelling issues (#5206) Bump the known_good_semver group across 1 directory with 3 updates (#5315) `polkadot-node-core-pvf-common`: Fix test compilation error (#5310) ci: Paused `cmd-action` commenter (#5287) Remove unnecessary mut (#5318) chain-spec: minor clarification on the genesis config patch (#5324) fix av-distribution Jaeger spans mem leak (#5321) ...
* master: Remove redundant minimal template workspace (#5330) approval-distribution: Fix handling of conclude (#5375) More logs in `is_potential_spam` from `dispute-coordinator` (#5252) Fix zombienet bridges test (#5373) Update Readme of the `polkadot` crate (#5326) allow for `u8` to be used as hold/freeze reason (#5348) Moving cargo check for runtimes to GHA (#5340) Update links in the documentation (#5175) fix visibility for `pallet_nfts` types used as call arguments (#3634) Correct some typos in crates' descriptions (#5262) Aura: Ensure we are building on each relay chain fork (#5352) Update Identity pallet README.md (#5183) Bump trie-db from 0.29.0 to 0.29.1 (#5231) [Coretime] Always include UnpaidExecution, not just when revenue is > 0 (#5369) [Pools] fix derivation of pool account (#4999) Upgrade accidentally downgraded deps (#5365) [Pools] Fix issues with member migration to `DelegateStake` (#4822) Unify `no_genesis` check (#5360) [CI] Fix prdoc command (#5358) Beefy: add benchmarks for `report_fork_voting()` (#5188)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for a short term fix this is good. @alexggh do have a point about the 1000 validators but it can be a follow up or part of the proper fix.
bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=westend |
@ordian Command
|
…=westend --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent
@ordian Command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empirically, how much is the weight consumed by paras inherent in Polkadot and Kusama RC now, vs. how much will it be when this is deployed? (and which one is closer to reality -- although this is hard to measure 🙈)
I've provided some back-of-the-envelope calculations in this comment: #849 (comment), but TL;DR is that the dominant cost are from |
* master: (39 commits) short-term fix for para inherent weight overestimation (#5082) CI: Add backporting bot (#4795) Fix benchmark failures when using `insecure_zero_ed` flag (#5354) Command bot GHA v2 - /cmd <cmd> (#5457) Remove pallet::getter usage from treasury (#4962) Bump blake2b_simd from 1.0.1 to 1.0.2 (#5404) Bump rustversion from 1.0.14 to 1.0.17 (#5405) Bridge zombienet tests: remove old command (#5434) polkadot-parachain: Add omni-node variant with u64 block number (#5269) Refactor verbose test (#5506) Use umbrella crate for minimal template (#5155) IBP Coretime Polkadot bootnodes (#5499) rpc server: listen to `ipv6 socket` if available and `--experimental-rpc-endpoint` CLI option (#4792) Update approval-voting-regression-bench (#5504) change try-runtime rpc domains (#5443) polkadot-parachain-bin: Remove contracts parachain (#5471) Add feature to allow Aura collator to use full PoV size (#5393) Adding stkd bootnodes (#5470) Make `PendingConfigs` storage item public (#5467) frame-omni-bencher maintenance (#5466) ...
On top of #5082. ## Background Previously, before #3479, we would [include](https://github.com/paritytech/polkadot-sdk/blame/75074952a859f90213ea25257b71ec2189dbcfc1/polkadot/runtime/parachains/src/builder.rs#L508C12-L508C44) the cost enacting the candidate into the cost of processing a single bitfield. [Now](https://github.com/paritytech/polkadot-sdk/blame/dd48544a573dd02da2082cec1dda7ce735e2e719/polkadot/runtime/parachains/src/builder.rs#L529) it is different, although the benchmarks seems to be not-up-to date. Including the cost of enacting a candidate into a processing a single bitfield cost was incorrect, since we multiple that by the number of bitfields we have. Instead, we should separate calculate the cost of processing a single bitfield without enactment, and multiple the cost of enactment by the actual number of processed candidates (which is limited by the number cores, not validators). ## Bench Previously, the weight of `enact_candidate` was calculated manually (without a benchmark) and then neglected: https://github.com/paritytech/polkadot-sdk/blob/dd48544a573dd02da2082cec1dda7ce735e2e719/polkadot/runtime/parachains/src/inclusion/mod.rs#L584 In this PR, we have a benchmark for it and it's based on the number of ump and sent hrmp messages as well as whether the candidate has a runtime upgrade (new_validation_code). The differences from the previous attempt paritytech/polkadot#6929 are that * we don't include the cost of enactment into the cost of processing a backed candidate. The reason for it is that enactment happens not in the same block as backing (typically the next one), since we process bitfields before backing votes. * we don't take into account the size of the runtime upgrade, the benchmark weight doesn't seem to depend much on it, but rather whether there was one or not. Similarly to the previous attempt, we don't account for dmp messages (fixed cost). Also we don't account properly for received hrmp messages (hrmp_watermark) because the cost of it depends on the runtime state and can't be statically deduced in the benchmark (unless we pass the information about channels as benchmark u32 arguments). The total weight cost of processing a parainherent now includes the cost of enactment of each candidate, but we don't do filtering based on that (because we enact after processing bitfields and making other changes to the storage). ## Numbers ``` Reads = 7 + (0 * u) + (3 * h) + (8 * c) Writes = 10 + (1 * u) + (3 * h) + (7 * c) ``` In addition, there is a fixed cost of a few of ms (!) per candidate. This might result a full block slightly overflowing its weight with 200 enacted candidates, which in turn could prevent non-mandatory transactions from being included in a block. Given our modest limits on max ump and hrmp messages: ``` maxUpwardMessageNumPerCandidate: 16 hrmpMaxMessageNumPerCandidate: 10 ``` and the fact that runtime upgrades are can't happen very frequently (`validation_upgrade_cooldown`), we might only go over the limits in case of many disputes. TODOs: - [x] Fix the overweight test - [x] Generate the weights for Westend and Rococo - [x] PRDoc --------- Co-authored-by: command-bot <> Co-authored-by: Alin Dima <alin@parity.io>
closes #849
Context
For the background on this and the long-term fix, see #849 (comment).
Changes
runtime_(parachains|common).*
topolkadot_runtime_(parachains|common).*
. The reason for it is the renaming introduced in Unify dependency aliases #4633. The new weight command and files are generated now includepolkadot_
prefix.paras_inherent
now includesenter_empty
which calculates the cost of processing an empty parachains inherent. This cost is subtracted dynamically when calculating other weights (so the other weights remain the same)Benefits
See #849 (comment), but TL;DR is that we are not blocked on weights for scaling the number of validators and cores further.
Resolved questions:
Is it due polkadot-sdk update (db IOPS diff)?
or the bench setup is no longer valid?
https://github.com/polkadot-fellows/runtimes/blob/7723274a2c5cbb10213379271094d5180716ca7d/relay/polkadot/src/weights/runtime_parachains_paras_inherent.rs#L131-L196
Answer: see background section of inclusion: bench
enact_candidate
weight #5270TODOs: