Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add default value for loaded-accounts-data-size #1355

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

tao-stones
Copy link

@tao-stones tao-stones commented May 14, 2024

Problem

Compute budget instruction set_loaded_accounts_data_size_limit was introduced since v1.16, transaction sender can use it to set limits of accounts to load for transaction. It uses MAX_LOADED_ACCOUNTS_DATA_SIZE_BYTES = 64MiB as default value if set_loaded_accounts_data_size_limit is absent from transaction during initial rollout, so no known transactions in mnb would fail without adapting the new instruction. It should use a reasonable default value that is lesser than MAX, as users start to adapting this instruction.

Summary of Changes

  • add and use DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_BYTES in absence of set_loaded_accounts_data_size_limit

Feature Gate Issue: #1568

@tao-stones tao-stones force-pushed the imp-add-default-loaded-accounts-data-size branch from e979dfd to f41a7b0 Compare May 16, 2024 20:35
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.1%. Comparing base (aa2f078) to head (f41a7b0).
Report is 141 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #1355     +/-   ##
=========================================
- Coverage    82.1%    82.1%   -0.1%     
=========================================
  Files         886      886             
  Lines      236417   236424      +7     
=========================================
- Hits       194266   194253     -13     
- Misses      42151    42171     +20     

@tao-stones
Copy link
Author

using mod ledger-tool replayed several random mnb ledgers, the max loaded account size a transaction had is about 10MB, which is below 32MB default. Still needs more data to get a sense how big the impact is to user if lower default to 32MB.

Adding loaded_accounts_data_size, per transaction, to metrics seems too much in general, but perhaps can run it on canary? Or maybe @brooksprumo already have something like that collected?

@bw-solana @apfitzge for visibility

@bw-solana
Copy link

using mod ledger-tool replayed several random mnb ledgers, the max loaded account size a transaction had is about 10MB, which is below 32MB default. Still needs more data to get a sense how big the impact is to user if lower default to 32MB.

Adding loaded_accounts_data_size, per transaction, to metrics seems too much in general, but perhaps can run it on canary? Or maybe @brooksprumo already have something like that collected?

@bw-solana @apfitzge for visibility

@KirillLykov has mentioned some data for understanding distribution of account sizes, but not sure if this was for all accounts or distribution for which ones are being accessed by transactions.

If we don't have it, I would be in favor of some histogram for storing this information and maybe reporting per slot (mean, median, max)

@KirillLykov
Copy link

using mod ledger-tool replayed several random mnb ledgers, the max loaded account size a transaction had is about 10MB, which is below 32MB default. Still needs more data to get a sense how big the impact is to user if lower default to 32MB.
Adding loaded_accounts_data_size, per transaction, to metrics seems too much in general, but perhaps can run it on canary? Or maybe @brooksprumo already have something like that collected?
@bw-solana @apfitzge for visibility

@KirillLykov has mentioned some data for understanding distribution of account sizes, but not sure if this was for all accounts or distribution for which ones are being accessed by transactions.

If we don't have it, I would be in favor of some histogram for storing this information and maybe reporting per slot (mean, median, max)

I have data for all the accounts sizes on mainnet, so I cannot say how much data one txs uses at max.

@tao-stones
Copy link
Author

Assuming accounts size data @KirillLykov collected were directly from mnb accounts db, then mean could be severely skewed by abandoned small accounts (eg accounts created by not used). Still very helpful info tho, thanks @KirillLykov

@KirillLykov
Copy link

KirillLykov commented May 28, 2024

Assuming accounts size data @KirillLykov collected were directly from mnb accounts db, then mean could be severely skewed by abandoned small accounts (eg accounts created by not used). Still very helpful info tho, thanks @KirillLykov

The data I've published in the slack chat is obtained from the large file which has pubkeys for accounts (thanks @grooviegermanikus for collecting), so it might be possible to extract information about accounts last time have been used. If you are interested in this data, I can share it with you.

@tao-stones
Copy link
Author

tao-stones commented May 28, 2024

I added histogram for loaded accounts size per tx, as @bw-solana suggested early, testing against an random mnb snapshot, gives this (in bytes):

loaded accounts data size stats: 
75-percentile 3,779, 
90-percentile 1,689,256, 
95-percentile 2,042,627,
min 36, 
max 10,980,688, 
mean 345,718, 
stddev 813,011, 
entries 4694,

mean + 1*stddev = 1,158,729;

This doesn't line up with @KirillLykov per account data:

Mean account size is ~350b and std is ~6k.

with 64 max accounts per tx, (350 + 6,000) * 64 = 406,400. I'd think this is mainly due to accounts actively used tend to have larger than average accounts size.

@tao-stones
Copy link
Author

tao-stones commented May 30, 2024

Some more data point:

for 432167 slots 251851770 to 252288093

loaded accounts data size stats: 75-percentile 3748, 90-percentile 1103102, 95-percentile 2281702, min 14, max 14302577, mean 431086, stddev 1368588, entries 1236440,

for 9763 slots 245285309 to 245295300

loaded accounts data size stats: 75-percentile 240649, 90-percentile 1954546, 95-percentile 2617246, min 14, max 19042141, mean 530225, stddev 1327948, entries 320815,

@tao-stones tao-stones force-pushed the imp-add-default-loaded-accounts-data-size branch from f41a7b0 to e5fad37 Compare May 31, 2024 16:09
@tao-stones
Copy link
Author

Given the data @KirillLykov published on slack, and histogram from mainnet-beta ledger (above), I'd like to propose the default to be somewhere at 95 percentile: 2MiB. With this default limit, we'd expect about 5-10% transactions to fail due to TransactionError::MaxLoadedAccountsDataSizeExceeded until compute-budget instruction set_loaded_accounts_data_size_limit is included in transaction.

I prepared this write-up to facilitate dev-comm.

@tao-stones tao-stones marked this pull request as ready for review May 31, 2024 16:17
@tao-stones tao-stones requested review from apfitzge and bw-solana May 31, 2024 16:20
@tao-stones tao-stones force-pushed the imp-add-default-loaded-accounts-data-size branch 5 times, most recently from 6bf9c1a to d57dfd0 Compare June 4, 2024 00:21
@tao-stones tao-stones added the feature-gate Pull Request adds or modifies a runtime feature gate label Jun 5, 2024
@danenbm
Copy link

danenbm commented Jun 7, 2024

Given the data @KirillLykov published on slack, and histogram from mainnet-beta ledger (above), I'd like to propose the default to be somewhere at 95 percentile: 2MiB. With this default limit, we'd expect about 5-10% transactions to fail due to TransactionError::MaxLoadedAccountsDataSizeExceeded until compute-budget instruction set_loaded_accounts_data_size_limit is included in transaction.

I prepared this write-up to facilitate dev-comm.

The problem statement states that there should be a reasonable default that is lesser than MAX, but I am not able to infer what specific problem this change solves.

I looked in the attached write-up and found the comment:

The background calculation:

  • 1MiB accounts memory usage cost is about 250 CU.
  • currently 64MiB adds 16,000 CU to transaction, that could significantly under pack block
  • 2MiB adds insignificant 500CU
  • 10MiB adds 2,500 CU per tx.

Is that the root cause for the change? To more efficiently pack blocks that may currently be under-packed?

@bw-solana
Copy link

Is that the root cause for the change? To more efficiently pack blocks that may currently be under-packed?

Sort of. Right now, it doesn't matter because we don't factor in data bytes to block limits.
When we do start factoring in data bytes to block limits, we could start underpacking if the default value is too large.

@tao-stones
Copy link
Author

Is that the root cause for the change? To more efficiently pack blocks that may currently be under-packed?

Sort of. Right now, it doesn't matter because we don't factor in data bytes to block limits. When we do start factoring in data bytes to block limits, we could start underpacking if the default value is too large.

Correct. To add more color, large default value will "over reserve" block space during packing (here), which could prevent other banking threads to schedule more transactions, especially close to the end of slot, potentially under pack block.

compute-budget/src/compute_budget_processor.rs Outdated Show resolved Hide resolved
runtime-transaction/src/runtime_transaction.rs Outdated Show resolved Hide resolved
svm/src/account_loader.rs Outdated Show resolved Hide resolved
// ComputeBudgetDetails does not concern with loaded_accounts_data_size_limit, hence safe
// to hardcode its feature gate `default_loaded_accounts_data_size_limit` as activated
let use_default_loaded_accounts_data_size = true;
let compute_budget_limits = process_compute_budget_instructions(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we change this variable to compute_budget_details?

ComputeBudgetLimits does in fact deal with the loaded account size; it's ComputeBudgetDetails that doesn't.

(though this naming is confusing since I'd maybe intuitively think Details has more details?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea think ..._deails might be more confusing. process_compute_budget_instructions(...) -> Result< ComputeBudgetLimits, ...>, ...._limits reflects better that those are the limits we got from processing instructions.

replace default() with new_with(feature_gate_status);
add public getter with feature_gate status and making consts private;
plumb feature-gate status to process_compute_budget_instructions()
@tao-stones tao-stones force-pushed the imp-add-default-loaded-accounts-data-size branch from 507ef56 to 8f9ea1e Compare June 28, 2024 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-gate Pull Request adds or modifies a runtime feature gate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants