
trace_decoder: Tweak batch size for small blocks #576

Merged: 2 commits into develop from small_batch_sizes on Sep 1, 2024

Conversation

Nashtare (Collaborator)

Simply halves the block into two chunks when the initial batch size is too large, to avoid wasting resources on dummy payloads.

closes #457

Follow-up work mentioned in #518 will improve on this, but it requires additional benchmarking / analysis of how much our instances can optimally handle.
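
For illustration, a minimal sketch of the halving fallback, assuming a hypothetical helper; this is not the actual trace_decoder API, just the shape of the heuristic:

// Hypothetical sketch: if the configured batch size would cover the whole
// block in a single batch (leaving the prover to pad with dummy payloads),
// shrink it so the block splits into two halves instead.
fn effective_batch_size(n_txns: usize, configured: usize) -> usize {
    if n_txns <= configured {
        n_txns.div_ceil(2).max(1) // halve the block, never going below 1
    } else {
        configured
    }
}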

Nashtare added this to the Performance Tuning milestone on Aug 31, 2024
Nashtare self-assigned this on Aug 31, 2024
github-actions bot added the "crate: trace_decoder" label (Anything related to the trace_decoder crate.) on Aug 31, 2024
0xaatif (Contributor) left a comment

There's a bunch of documentation in these issues. Could the knowledge there be transferred to the code?

It would be great to embed what motivates particular operations.

Nashtare requested a review from 0xaatif on Aug 31, 2024 at 18:12

0xaatif commented Sep 1, 2024

How urgent is this change?

I'd lean toward an approach like this (though I haven't worked out the right approach for padding):

// Assumes `TxnInfo` (the trace_decoder transaction type) is in scope.
use std::num::NonZero;

use itertools::Itertools;

/// Break `txns` into batches of length `hint`, prioritising creating at least two batches.
fn batch(mut txns: Vec<TxnInfo>, hint: NonZero<usize>) -> Vec<Vec<TxnInfo>> {
    // How many batches chunking at `hint` would yield.
    let n_batches = txns.len().div_ceil(hint.get());
    match (txns.len(), n_batches) {
        // Enough batches at `hint`: chunk as requested.
        (_, 2..) => txns
            .into_iter()
            .chunks(hint.get())
            .into_iter()
            .map(FromIterator::from_iter)
            .collect(),
        // Not enough batches at `hint`, but enough real transactions: halve the block.
        (2.., ..2) => {
            let second = txns.split_off(txns.len() / 2);
            vec![txns, second]
        }
        (0 | 1, _) => todo!("not enough real transactions - pad"),
    }
}
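
For reference, a hypothetical usage (construction of `txns` elided): five transactions with a hint of 10 fall through to the middle arm and come back as two batches of two and three:

// Hypothetical: `txns` holds 5 `TxnInfo`s; hint = 10 yields a single chunk,
// so the fallback arm splits the block in half instead.
let batches = batch(txns, NonZero::new(10).unwrap());
assert_eq!(batches.iter().map(Vec::len).collect::<Vec<_>>(), vec![2, 3]);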

This would slot in well with my planned refactoring of the backend (and simple heuristics that might be forthcoming in #518):

let batches = middle::middle(
    state,
    storage,
    txn_info
        .into_iter()
        .chunks(batch_size.get())
        .into_iter()
        .map(FromIterator::from_iter)
        .collect(),
    &mut code,
)?;

I'm happy to merge this as an interim measure if required - that refactoring is blocked on testing :^)


Nashtare commented Sep 1, 2024

> How urgent is this change?

It's not urgent, though it affects IMX benchmarks: both their test load chain and real chain have a relatively low number of txns per block, often inducing dummy proofs with the current approach.

> I'm happy to merge this as an interim measure if required

Ideally this approach would be replaced by dynamic batching based on txn state access count and state trie growth, though that requires additional measurements before assigning proper weights to determine the optimal batch sizes, hence this standalone PR.
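
For concreteness, one possible shape of that dynamic batching, with entirely hypothetical weights and field names; the real coefficients would come out of the measurements mentioned above:

/// Hypothetical per-transaction cost inputs.
struct TxnCost {
    state_accesses: usize,
    trie_growth: usize,
}

/// Greedily pack transaction indices into batches whose weighted cost
/// stays under `budget`. The weights below are placeholders, not measured.
fn dynamic_batches(costs: &[TxnCost], budget: usize) -> Vec<Vec<usize>> {
    const W_ACCESS: usize = 1; // placeholder weight for state accesses
    const W_GROWTH: usize = 4; // placeholder weight for trie growth

    let mut batches: Vec<Vec<usize>> = vec![Vec::new()];
    let mut acc = 0;
    for (i, c) in costs.iter().enumerate() {
        let w = W_ACCESS * c.state_accesses + W_GROWTH * c.trie_growth;
        // Start a new batch once the running cost would exceed the budget,
        // unless the current batch is still empty.
        if acc + w > budget && !batches.last().unwrap().is_empty() {
            batches.push(Vec::new());
            acc = 0;
        }
        batches.last_mut().unwrap().push(i);
        acc += w;
    }
    batches
}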

Nashtare merged commit 9ba2eda into develop on Sep 1, 2024 (15 checks passed)
Nashtare deleted the small_batch_sizes branch on Sep 1, 2024 at 14:49
Closed issue: perf(continuations): Adapt batching for small blocks (#457)