
Optimise historic sequential state lookups #3873

Closed
AgeManning opened this issue Jan 12, 2023 · 8 comments
Labels: database, optimization

Comments

@AgeManning (Member)

Description

When searching for sequential historical states, Lighthouse repeatedly reconstructs the state from the last checkpoint.

We could optimize requests of this kind by temporarily caching the reconstructed state so it can serve subsequent calls, avoiding a significant amount of redundant state reconstruction.

@AgeManning added the enhancement label on Jan 12, 2023
@int88 (Contributor) commented Apr 6, 2023

@AgeManning I'd like to work on this, if @michaelsproul hasn't started :)

@michaelsproul (Member)

I haven't started, that would be great!

Have a look in the beacon_node/store crate. I think that would be the most natural place to add the cache, and it should be used in get_cold_state, either as the result or as the starting point for the result.

@int88 (Contributor) commented Apr 8, 2023

The most naive solution to this issue might be:

Add an LRU cache for states in HotColdDB, just like block_cache for blocks.

In get_state(), look up state_cache first; if the state is not found, load it from the DB and update state_cache afterwards.
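A minimal std-only sketch of this idea, assuming hypothetical stand-ins for Lighthouse's real types (the StateCache, Slot, and BeaconState names here are illustrative, not the actual API):

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical stand-ins for Lighthouse's real types.
type Slot = u64;

#[derive(Clone, Debug, PartialEq)]
struct BeaconState {
    slot: Slot,
}

/// A tiny LRU cache keyed by slot, analogous to the proposed state_cache.
struct StateCache {
    capacity: usize,
    map: HashMap<Slot, BeaconState>,
    order: VecDeque<Slot>, // front = least recently used
}

impl StateCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    fn get(&mut self, slot: Slot) -> Option<BeaconState> {
        if self.map.contains_key(&slot) {
            // Move this slot to the back (most recently used).
            self.order.retain(|s| *s != slot);
            self.order.push_back(slot);
            self.map.get(&slot).cloned()
        } else {
            None
        }
    }

    fn put(&mut self, state: BeaconState) {
        let slot = state.slot;
        if self.map.insert(slot, state).is_some() {
            self.order.retain(|s| *s != slot);
        }
        self.order.push_back(slot);
        if self.map.len() > self.capacity {
            if let Some(evicted) = self.order.pop_front() {
                self.map.remove(&evicted);
            }
        }
    }
}

/// Sketch of the proposed get_state flow: try the cache first,
/// otherwise "load from the DB" and populate the cache.
fn get_state(cache: &mut StateCache, slot: Slot) -> BeaconState {
    if let Some(state) = cache.get(slot) {
        return state; // cache hit
    }
    let state = BeaconState { slot }; // placeholder for the real DB load
    cache.put(state.clone());
    state
}

fn main() {
    let mut cache = StateCache::new(2);
    get_state(&mut cache, 8128);
    get_state(&mut cache, 8160);
    get_state(&mut cache, 8192); // evicts slot 8128 (least recently used)
    assert!(cache.get(8128).is_none());
    assert!(cache.get(8160).is_some());
}
```

As the discussion below notes, a plain LRU keyed on exact slot only helps when the same state is requested twice; the interesting part is using nearby cached states as replay starting points.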

@michaelsproul WDYT?

@int88 (Contributor) commented Apr 11, 2023

ping @michaelsproul :)

@michaelsproul (Member)

I don't think a simple LRU cache on its own will be very useful, because it will only accelerate lookups for the exact same state. The way state lookups work is by retrieving a base state (a "restore point") and then replaying blocks on top of it. I'll write up how a single state lookup works so you can see what I mean:

Imagine the user wants to load the state at slot 8128. The --slots-per-restore-point param is set to 8192, so there are full states (restore points) stored at slot 0 and slot 8192. We can't use the restore point from slot 8192 because we can't apply blocks in reverse (they are not diffs), so we have to use the state at slot 0. We load all the blocks from 1-8128 and apply them on top of the slot 0 state. This yields the desired state at slot 8128, but also passes over every state in between, at slots 1, 2, 3, ..., 8127.

Now imagine that the user wants to look up the state one epoch after the previous one, i.e. the state at slot 8160. Currently Lighthouse will load the state at slot 0 again and replay all the blocks from 1-8160. This is a huge waste, because if we had saved the state from slot 8128 we could replay just the last epoch of blocks, which is going to be much faster.
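The arithmetic in that example can be checked directly. With restore points every 8192 slots, a lookup at slot s replays s - (s / 8192) * 8192 blocks from the restore point below it; with a cached state, it replays only the gap. A quick sketch (the helper names here are illustrative, not Lighthouse functions):

```rust
// Blocks replayed when reconstructing `target` from the nearest restore
// point at or below it, with restore points every `srp` slots.
fn replay_from_restore_point(target: u64, srp: u64) -> u64 {
    target - (target / srp) * srp
}

// Blocks replayed when starting from a cached state at `cached`
// (assumes cached <= target, since blocks can't be applied in reverse).
fn replay_from_cached(target: u64, cached: u64) -> u64 {
    target - cached
}

fn main() {
    // The example above: --slots-per-restore-point = 8192.
    assert_eq!(replay_from_restore_point(8128, 8192), 8128); // replay from slot 0
    assert_eq!(replay_from_restore_point(8160, 8192), 8160); // again from slot 0
    assert_eq!(replay_from_cached(8160, 8128), 32);          // one epoch of blocks
}
```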

To be able to take advantage of that cached state we need two things:

  • An LRU cache for recently looked-up states.
  • Changes to load_cold_state so that it uses the cache where possible instead of the nearest restore point.

The algorithm for using the cache could plug in at that point to choose a different low_restore_point state from the cache.

To find a suitable state in the cache we could use one of two obvious algorithms:

  1. Iterate through the cache to find the cached state with the highest slot that is <= slot (the slot of the state we want to reconstruct). This is probably good because it's O(cache_size) and we expect the cache to be small.
  2. Iterate backwards from the target slot until we find a state in the cache. This is O(slots_per_restore_point) and probably slower, but has the advantage that it can short-circuit without iterating the whole range.

I probably prefer approach (1) for now, especially as it's probably only feasible to keep <32 states in the cache.
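Approach (1) can be sketched as a linear scan over the cached slots. This is a sketch under stated assumptions: the CachedState type and best_cached_state function are hypothetical, and a real cache would store full states rather than just slots:

```rust
use std::collections::HashMap;

type Slot = u64;

#[derive(Clone, Debug, PartialEq)]
struct CachedState {
    slot: Slot,
}

/// Approach (1): scan the whole cache for the state with the highest
/// slot that is <= the target slot. O(cache_size), which is fine for a
/// cache expected to hold fewer than 32 entries.
fn best_cached_state(cache: &HashMap<Slot, CachedState>, target: Slot) -> Option<&CachedState> {
    cache
        .iter()
        .filter(|(slot, _)| **slot <= target)
        .max_by_key(|(slot, _)| **slot)
        .map(|(_, state)| state)
}

fn main() {
    let mut cache = HashMap::new();
    for slot in [0, 4096, 8128] {
        cache.insert(slot, CachedState { slot });
    }
    // Target 8160: the slot-8128 state is the best starting point
    // (replay 32 blocks instead of 8160 from the slot-0 restore point).
    assert_eq!(best_cached_state(&cache, 8160).map(|s| s.slot), Some(8128));
    // Target 4000: only the slot-0 state is usable.
    assert_eq!(best_cached_state(&cache, 4000).map(|s| s.slot), Some(0));
    // Empty cache: fall back to the restore point.
    assert!(best_cached_state(&HashMap::new(), 100).is_none());
}
```

If the scan finds nothing (or nothing better than the nearest restore point), load_cold_state would fall back to its existing restore-point path.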

@michaelsproul added the database and optimization labels and removed the enhancement label on Apr 11, 2023
@int88 (Contributor) commented Apr 11, 2023

@michaelsproul appreciate your detailed explanation :)

So the cache would only work for states in the cold DB? Since the hot DB saves a full state on each epoch boundary, replaying there would not be too wasteful.

WDYT?

@michaelsproul (Member)

Yeah, let's just target the cold DB. You're right that replaying in the hot DB isn't so bad, and we have another long-running project to make the caching of hot states more efficient (#3206).

bors bot pushed a commit that referenced this issue May 5, 2023
## Issue Addressed

#3873

## Proposed Changes

Add a cache to optimise historical state lookup.

## Additional Info

N/A


Co-authored-by: Michael Sproul <micsproul@gmail.com>
@michaelsproul (Member)

Implemented in #4228

ghost pushed a commit to oone-world/lighthouse that referenced this issue Jul 13, 2023