Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store finalized block roots in database (3s startup) #3320

Merged
merged 3 commits into from
Jan 30, 2022
Merged

Conversation

arnetheduck
Copy link
Member

When the chain has finalized a checkpoint, the history from that point
onwards becomes linear - this is exploited in .era files to allow
constant-time by-slot lookups.

In the database, we can do the same by storing finalized block roots in
a simple sparse table indexed by slot, bringing the two representations
closer to each other in terms of conceptual layout and performance.

Doing so has a number of interesting effects:

  • mainnet startup time is improved 3-5x (3s on my laptop)
  • the first startup might take slightly longer as the new index is
    being built - ~10s on the same laptop
  • we no longer rely on the beacon block summaries to load the full dag -
    this is a lot faster because we no longer have to look up each block by
    parent root
  • a collateral benefit is that we no longer need to load the full
    summaries table into memory - we get the RSS benefits of DAG loading: don't preload summaries in memory #3164 without
    the CPU hit.

Other random stuff:

  • simplify forky block generics
  • fix withManyWrites multiple evaluation
  • fix validator key cache not being updated properly in chaindag
    read-only mode
  • drop pre-altair summaries from kvstore
  • recreate missing summaries from altair+ blocks as well (in case
    database has lost some to an involuntary restart)
  • print database startup timings in chaindag load log
  • avoid allocating superfluos state at startup
  • use a recursive sql query to load the summaries of the unfinalized
    blocks

@github-actions
Copy link

github-actions bot commented Jan 25, 2022

Unit Test Results

     12 files  ±0     806 suites  +4   31m 12s ⏱️ - 7m 58s
1 652 tests +1  1 604 ✔️ +1    48 💤 ±0  0 ±0 
9 665 runs  +4  9 561 ✔️ +4  104 💤 ±0  0 ±0 

Results for commit fba2cae. ± Comparison against base commit 29e2169.

♻️ This comment has been updated with latest results.

yield res
elif (let blck = db.getMergeBlock(res.root); blck.isSome()):
res.summary = blck.get().message.toBeaconBlockSummary()
yield res
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you can place a single yield after the if/elif/else section here. Reducing the number of yeilds in an inline iterator is beneficial for reducing the overall code size.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -433,19 +454,59 @@ proc init*(T: type ChainDAGRef, cfg: RuntimeConfig, db: BeaconChainDB,
withBlck(genesisBlock): BlockRef.init(genesisBlockRoot, blck.message)

var
headRef: BlockRef
backfillBlocks = newSeq[Eth2Digest](tailRef.slot.int)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this have to be tailRef.slot.int - backfill.slot semantically?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no: the seq is there to answer by-slot requests from the rest and libp2p api and needs to cover the entire genesis->tail range (for which we have no blockref) - backfill.slot identifies the "valid" part of this seq.

in other words: after backfilling is done, this seq holds an in-memory mapping of slot->root which we then use to load blocks from the database with.

Copy link
Member Author

@arnetheduck arnetheduck Jan 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, this is the seq we talked about that potentially could be removed in future releases and replaced with a database query, in a future release - it's still a bit early for that though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the name misleading then? My first interpretation of backfillBlocks will be "the blocks that we backfilled"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't finalizedBlocks the sequence that starts from genesis?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one, perhaps confusing inconsistency is that finalizedBlocks in the database covers all finalized blocks, while the in-memory structure is split in two, at the tail. I don't have good ideas for how to make that more clear though, right now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why does backfillBlocks need to go to genesis? Shouldn't it cover only the weak subjectivity horizon?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe in the future? right now, the de-facto standard is to backfill to genesis because that's the only point from which you reliably can regenerate states (there's no standard way to distribute states, except the debug REST api) - also, it's needed for clients such as .. say .. nimbus, that don't support checkpoint syncing: a hard-fork is needed before genesis support is dropped.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, and this gets us to my original question. Isn't the size of backfillBlocks semantically tailRef.slot.int - backfill.slot? (this works OK now because backfill.slot is zero and it will work OK in the future).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still no - backfill.slot is the "first" slot for which we have a block in the database. on checkpoint sync start, it will have the value of tail.slot, then it moves towards genesis. we don't want to reallocate the seq every time a block is backfilled, so we allocate the full seq at once.

When the chain has finalized a checkpoint, the history from that point
onwards becomes linear - this is exploited in `.era` files to allow
constant-time by-slot lookups.

In the database, we can do the same by storing finalized block roots in
a simple sparse table indexed by slot, bringing the two representations
closer to each other in terms of conceptual layout and performance.

Doing so has a number of interesting effects:

* mainnet startup time is improved 3-5x (3s on my laptop)
* the _first_ startup might take slightly longer as the new index is
being built - ~10s on the same laptop
* we no longer rely on the beacon block summaries to load the full dag -
this is a lot faster because we no longer have to look up each block by
parent root
* a collateral benefit is that we no longer need to load the full
summaries table into memory - we get the RSS benefits of #3164 without
the CPU hit.

Other random stuff:

* simplify forky block generics
* fix withManyWrites multiple evaluation
* fix validator key cache not being updated properly in chaindag
read-only mode
* drop pre-altair summaries from `kvstore`
* recreate missing summaries from altair+ blocks as well (in case
database has lost some to an involuntary restart)
* print database startup timings in chaindag load log
* avoid allocating superfluos state at startup
* use a recursive sql query to load the summaries of the unfinalized
blocks
* fix yields
* dispose minIdStmt
@zah zah merged commit d583e8e into unstable Jan 30, 2022
@zah zah deleted the fin-block-db branch January 30, 2022 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants