Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow bank hash file to contain details for multiple slots #34516

Merged
merged 7 commits into from
Dec 27, 2023

Conversation

steviez
Copy link
Contributor

@steviez steviez commented Dec 19, 2023

Problem

The original implementation for bank hash details files (#32632) only allows for outputting a single slot at a time. It is advantageous to allow the schema to support multiple slots. One such usescase is a proposed process described in more detail in #34246 (comment). The idea would be:

  • Generate the bank hashes for a bunch of slots in a ledger using a "known good" version
  • Retain the ledger and bank hash file
  • Later, use the ledger + good hashes for testing, whether manual testing of a new feature or in some automated fashion
    • Ie, run the test branch and check computed hashes against good hashes, divergent could be spit out to reduce triage time

Summary of Changes

Adjust the BankHashDetails struct to allow for details on multiple banks. Note that while the structure can now handle multiple banks, the method exposed to write the file still only allows for passing in a single bank. Allowing for multiple banks (and potentially optional fields) can be added in a follow up PR.

The old json object looked like this:

{
  "version": "1.18.0 (src:00000000; feat:142935575, client:SolanaLabs)",
  "account_data_encoding": "base64",
  "slot": 236734421,
  "bank_hash": "D9VzYkV7XxWhWAtATJYWrTayTtAfXbyUKSgSpg5nTRxW",
  "parent_bank_hash": "4YMdLwwqch2PKv9tA3LNyfmPJcFc7S6UXptjGYLoH3MM",
  "accounts_delta_hash": "Et7xGHroFgPiWkWXnL9BGnRNrbDBYmooRvT22fENjdfr",
  "signature_count": 1808,
  "last_blockhash": "9TPtRt3mCgnQJScfnRsAW5G8NAWmHKj48AWHXAg4MpC3",
  "accounts": [

With this PR, the structure now has an array for the per-slot details:

{
  "version": "1.18.0 (src:00000000; feat:142935575, client:SolanaLabs)",
  "account_data_encoding": "base64",
  "bank_hash_details": [
    {
      "slot": 236734421,
      "bank_hash": "D9VzYkV7XxWhWAtATJYWrTayTtAfXbyUKSgSpg5nTRxW",
      "parent_bank_hash": "4YMdLwwqch2PKv9tA3LNyfmPJcFc7S6UXptjGYLoH3MM",
      "accounts_delta_hash": "Et7xGHroFgPiWkWXnL9BGnRNrbDBYmooRvT22fENjdfr",
      "signature_count": 1808,
      "last_blockhash": "9TPtRt3mCgnQJScfnRsAW5G8NAWmHKj48AWHXAg4MpC3",
      "accounts": [ 

Here is a jq query that will translate the old format (v1.16/v1.17) to something that would be directly diff-able with the output of this PR:

jq '{version: .version, account_data_encoding: .account_data_encoding, bank_hash_details: [{slot: .slot, bank_hash: .bank_hash, parent_bank_hash: .parent_bank_hash, accounts_delta_hash: .accounts_delta_hash, signature_count: .signature_count, last_blockhash: .last_blockhash, accounts: .accounts}]}'

Add an outer struct, that contains most of the previous struct in a
Vec<_>. Also, update the existing function to output a file to use the
new struct.

The only functional change is the change to the json structure that is
output.
Also, make the function handle picking a filename for variable number of
bank hash details
Copy link

codecov bot commented Dec 19, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (d2b5afc) 81.8% compared to head (7ac8858) 81.9%.
Report is 30 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #34516   +/-   ##
=======================================
  Coverage    81.8%    81.9%           
=======================================
  Files         822      822           
  Lines      221470   221496   +26     
=======================================
+ Hits       181363   181413   +50     
+ Misses      40107    40083   -24     

@steviez
Copy link
Contributor Author

steviez commented Dec 20, 2023

@lheeger-jump - FYI for FD team; ripatel mentioned you've been relying more on other means at the moment, but incase you had any workflows that depends on this.

pub bank_hash_details: Vec<BankHashSlotDetails>,
}

impl BankHashDetails {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd maybe rename this to something like BankHashDetailsSet or something that indicates we are looking at a set (or really a vector) of bank hashes/slots. From the name, it's not totally clear that this holds more than one bank hash

Copy link
Contributor Author

@steviez steviez Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally hesitant to use Set or any other specific container name in the type name unless the container is purely that. Ie, something like:

type BankHashDetailsMap = HashMap<Slot, BankHashDetails>

In this instance, we have additional contents.

We are providing (plural) details about each bank, so having BankHashDetails with an inner Vec<BankHashDetail> doesn't fit either (ie the inner Details also needs to be plural).

Both you and @seanyoung had suggestions so maybe we should adjust the name. I see your points but I'm not in love with the alternatives that we've come up with either. I'm just now offering my rebuttal so I'll give you and Sean the chance to read/think/respond. But, if we don't agree on something that we all like, I think we should push forward as-is. A reader could also go inspect the inner type if they want more information. I'll double check comments as well

This type name will not show up in the produced json, and it is only used within this source file at the moment. So, it will be very easy to update down the road, and pushing now will allow subsequent work (potentially by both Greg and Sean, and maybe me too) to build on top.

Thoughts?

Copy link
Contributor

@gregcusack gregcusack Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya totally get that and agree. your update looks good to me. ship it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BankHashDetailsSummary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BankHashDetailsSummary?

Use this name for the outer, and then use BankHashSlotDetails or BankHashDetails for the inner ?

seanyoung
seanyoung previously approved these changes Dec 21, 2023
Copy link
Contributor

@seanyoung seanyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

runtime/src/bank/bank_hash_details.rs Show resolved Hide resolved
Copy link
Contributor

@gregcusack gregcusack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@steviez
Copy link
Contributor Author

steviez commented Dec 27, 2023

The only remaining item of discussion was naming. I'm going to push this in as-is so other things can build on top of it, and we can continue discussion on naming at a later point if desired

@steviez steviez merged commit 1f7c714 into solana-labs:master Dec 27, 2023
35 checks passed
@steviez steviez deleted the bh_multi_slot branch December 27, 2023 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants