Improve block processing performance during re-org #2805
Labels: A1; major-task (a significant amount of work or conceptual task); optimization (something to make Lighthouse run more efficiently)
Description
Consider the following re-org that frustrates Lighthouse's attempts to process blocks quickly:
Let `n` be a slot on an epoch boundary (`n % 32 == 0`).

- For slot `n`, the preemptive state advance occurs as normal.
- Block `n` arrives super late (12s+), consuming the advanced state.
- Block `n + 1` arrives on time, but builds upon the parent at slot `n - 1`. It's going to be super slow to process because its parent state is missing from the cache, meaning:
  a) We need to load the full state for slot `n - 1` from disk (a few hundred ms).
  b) We need to transition that state through an epoch boundary (200ms).
  c) We need to store the state for slot `n` on disk. It is different from the slot `n` state with block `n` applied, and presently we store every epoch boundary state.

Example
Here's an instance of this behaviour that I observed at slot `n=2485472` on mainnet, resulting in block processing taking 2.5s instead of the usual 80ms (median) or 456ms (99th percentile) (metrics from sigp/lighthouse-metrics#31).

Even though the block arrived on time, taking 2.5s to process it meant that any attestations at this slot would have been missed (if running on this node).
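The cache miss in this scenario can be illustrated with a toy model. This is only a sketch under stated assumptions: the `StateKey` tuple, the `HashMap`-based cache, and the function names are hypothetical and are not Lighthouse's actual data structures. It models the advanced state as an entry that block `n` removes when it is imported, and then checks whether any state descending from block `n - 1` is still available for block `n + 1`.

```rust
use std::collections::HashMap;

/// Matches the `n % 32 == 0` condition in the description.
const SLOTS_PER_EPOCH: u64 = 32;

/// Hypothetical cache key: (slot of the latest applied block, slot of the state).
/// `(n - 1, n)` means "post-state of block n - 1, advanced through to slot n".
type StateKey = (u64, u64);

fn is_epoch_boundary(slot: u64) -> bool {
    slot % SLOTS_PER_EPOCH == 0
}

/// Replays the re-org on a toy in-memory cache and reports whether the parent
/// state needed by block `n + 1` (built on block `n - 1`) is still cached.
fn parent_state_cached_after_reorg(n: u64) -> bool {
    // The toy cache stores labels rather than real beacon states.
    let mut cache: HashMap<StateKey, &'static str> = HashMap::new();

    // Step 1: the preemptive state advance produces the state for slot n.
    let advanced: StateKey = (n - 1, n);
    cache.insert(advanced, "block n - 1 state advanced to slot n");

    // Step 2: the late block n consumes (removes) the advanced state and
    // leaves its own post-state in the cache instead.
    let consumed = cache.remove(&advanced);
    assert!(consumed.is_some());
    cache.insert((n, n), "post-state of block n");

    // Step 3: block n + 1 builds on block n - 1, so it needs a state that
    // descends from block n - 1, either advanced to slot n or not advanced.
    let unadvanced: StateKey = (n - 1, n - 1);
    cache.contains_key(&advanced) || cache.contains_key(&unadvanced)
}

fn main() {
    let n: u64 = 2_485_472; // the mainnet slot from the example
    assert!(is_epoch_boundary(n));

    let hit = parent_state_cached_after_reorg(n);
    println!("slot {}: parent state for block n + 1 cached? {}", n, hit);
}
```

In this model the lookup fails, which corresponds to the slow path in steps a) to c): load the slot `n - 1` state from disk, transition it through the epoch boundary, and store the resulting slot `n` state.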
Additional Info
This behaviour should be quite rare, due to the infrequency of re-orgs and late blocks on mainnet (at worst ~4% of blocks are late, with very few being 12s+ late). However, if proposer boosting is adopted, we may see more re-orgs of this type, where a proposer intentionally orphans the previous block despite it having been published.