
[benchmark-cli] Wrong proof_size calculation leading to exaggerated weights #13765

Closed
agryaznov opened this issue Mar 30, 2023 · 1 comment · Fixed by #13766
@agryaznov (Contributor) commented Mar 30, 2023

This was found while investigating a 5x proof_size weight for one of the pallet_contracts API functions.
This bug was introduced in #11637.

Context

Results of benchmarks are processed with writer::process_storage_results() in the following way.

It loops through all the storage keys of all results:

for result in results.iter().rev() {
    for (key, reads, writes, whitelisted) in &result.keys {

and multiplies each benchmark result for every key, adjusting each copy's proof_size per key, since the adjustment depends on:

  • PoV estimation mode
  • single read PoV overhead
  • number of reads

// Add the additional trie layer overhead for every new prefix.
if *reads > 0 {
    prefix_result.proof_size += 15 * 33 * additional_trie_layers as u32;
}
storage_per_prefix.entry(prefix.clone()).or_default().push(prefix_result);

The result is the storage_per_prefix data, which was originally used only to generate comments listing the storage keys touched during each benchmark.
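
For context, here is a simplified sketch of that data flow. The types and names below are illustrative stand-ins, not the actual frame-benchmarking-cli definitions: every key touched by a benchmark receives its own copy of the run-wide result, adjusted only by that key's trie-layer overhead.

use std::collections::HashMap;

// Illustrative stand-in for the real per-run benchmark result.
#[derive(Clone)]
struct PrefixResult {
    proof_size: u32, // measured for the WHOLE run, i.e. for all touched keys together
}

fn process_storage_results_sketch(
    results: &[PrefixResult],
    keys: &[(Vec<u8>, u32)], // (storage prefix, number of reads)
    additional_trie_layers: u8,
) -> HashMap<Vec<u8>, Vec<PrefixResult>> {
    let mut storage_per_prefix: HashMap<Vec<u8>, Vec<PrefixResult>> = HashMap::new();
    for result in results.iter().rev() {
        for (prefix, reads) in keys {
            // Every key gets a full copy of the run-wide result...
            let mut prefix_result = result.clone();
            // ...adjusted only by that key's trie-layer read overhead.
            if *reads > 0 {
                prefix_result.proof_size += 15 * 33 * additional_trie_layers as u32;
            }
            storage_per_prefix.entry(prefix.clone()).or_default().push(prefix_result);
        }
    }
    storage_per_prefix
}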

The Bug

In #11637 this data started to be used to calculate the resulting proof_size formula written into weights.rs, but in the wrong way:

Step 1 (almost right). We find the average base proof_size value and slope for each component by running regression analysis over the benchmark results collected for that component in storage_per_prefix (see above).

let proof_size_per_components = storage_per_prefix
    .iter()
    .map(|(prefix, results)| {
        let proof_size = analysis_function(results, BenchmarkSelector::ProofSize)
            .expect("analysis function should return proof sizes for valid inputs");

(This is almost right, but not entirely, because the values we run the regression on are not per-key benchmark results: they originate from benchmark results covering all the keys and are then adjusted for a single key, see below for details.)

Step 2 (wrong). The resulting base_calculated_proof_size and component_calculated_proof_size values are calculated as a simple sum of the values for all prefixes in the per-prefix benchmark results.

for (_, slope, base) in proof_size_per_components.iter() {
    base_calculated_proof_size += base;
    for component in slope.iter() {
        let mut found = false;
        for used_component in used_calculated_proof_size.iter_mut() {
            if used_component.name == component.name {
                used_component.slope += component.slope;

But wait a minute. To recap, the path of these proof_size values is:

  1. They were first calculated for all keys in each benchmark run.
  2. Then we multiplied them for each key, adjusting each copy by the read overhead of that key
    (which is weird, because we are adding the PoV overhead of a single key to the proof_size of all the keys).
  3. Then we summed up those multiplied values.

And this is wrong. It leads to exaggerated weights when written into weights.rs:

{{#each benchmark.component_calculated_proof_size as |cp|}}
.saturating_add(Weight::from_parts(0, {{cp.slope}}).saturating_mul({{cp.name}}.into()))
{{/each}}
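
To see the exaggeration concretely, here is a minimal numeric sketch. The figures are made up purely for illustration, and the small trie-layer overhead is ignored:

fn main() {
    // One benchmark run touches 3 storage keys and measures a run-wide
    // proof_size of 1_000 bytes.
    let run_wide_proof_size: u32 = 1_000;
    let touched_keys: u32 = 3;

    // Step 2 sums the per-prefix regression results, but each per-prefix value
    // already reflects the whole run, so the base comes out ~3x too large.
    let base_calculated_proof_size: u32 =
        (0..touched_keys).map(|_| run_wide_proof_size).sum();

    assert_eq!(base_calculated_proof_size, 3_000); // what ends up in weights.rs
    assert_eq!(run_wide_proof_size, 1_000);        // what the weight should reflect
}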

Suggested Fix

Instead of a simple sum, base_calculated_proof_size and component_calculated_proof_size should be calculated as averages or medians over all keys.
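
A minimal sketch of that aggregation, assuming the per-prefix regression results are available as plain numbers (the names and types here are illustrative, not the actual frame-benchmarking-cli code):

// Take the median of the per-prefix base proof_size values instead of summing them.
fn median_base_proof_size(mut per_prefix_bases: Vec<u32>) -> u32 {
    if per_prefix_bases.is_empty() {
        return 0;
    }
    per_prefix_bases.sort_unstable();
    per_prefix_bases[per_prefix_bases.len() / 2]
}

fn main() {
    // Values that Step 2 would previously have summed to 3_150.
    let bases = vec![1_000, 1_050, 1_100];
    assert_eq!(median_base_proof_size(bases), 1_050);
}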

@agryaznov (Contributor, Author):

The current workaround is to use #[skip_meta] on the affected benchmarks, which skips the per-key analysis.
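
For reference, a minimal sketch of how that looks in the classic benchmarks! macro syntax (the benchmark name and body are illustrative, and this lives inside a pallet's benchmarking module):

benchmarks! {
    // #[skip_meta] tells the CLI not to analyze the storage keys touched by
    // this benchmark, so the per-key proof_size aggregation described above
    // is skipped entirely.
    #[skip_meta]
    my_storage_heavy_call {
        let caller: T::AccountId = whitelisted_caller();
    }: _(RawOrigin::Signed(caller))
}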
