stagedsync: fix bor heimdall mining flow #9149

Merged: 11 commits, Jan 9, 2024

@taratorio (Member) commented Jan 5, 2024

The mining loop is currently broken for the Polygon chain. This PR fixes it.

High level changes:

  • Introduces a new Bor<->Heimdall stage specifically for the needs of the mining flow
  • Extracts common logic from the Bor<->Heimdall sync and mining stages into shared functions
  • Removes the mine flag from the Bor<->Heimdall sync stage
  • Extends the current StartMining function to prefetch span zero, if needed, before the mining loop is started
  • Fixes Bor to read span zero (instead of span 1) from Heimdall when the span is not initially set in the local smart contract that the Spanner uses
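
The last bullet can be illustrated with a minimal sketch. This is not the code from the PR: `spanIDToFetch` and its parameters are hypothetical names, and the "next span is current ID plus one" rule for the already-set case is an assumption made only to give the unset case something to contrast with.

```go
package main

import "fmt"

// spanIDToFetch is a hypothetical helper, not an Erigon or Bor API.
// localSpanSet reports whether the local contract already holds a span.
// When it does not, the sketch asks Heimdall for span zero rather than
// span 1, which is the behaviour the PR description says it fixes.
func spanIDToFetch(localSpanID uint64, localSpanSet bool) uint64 {
	if !localSpanSet {
		// Nothing stored locally yet: start from span zero.
		return 0
	}
	// Assumption for illustration: otherwise fetch the following span.
	return localSpanID + 1
}

func main() {
	fmt.Println(spanIDToFetch(0, false)) // span not set locally
	fmt.Println(spanIDToFetch(0, true))  // span zero already set
}
```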

Tested with the devnet "state-sync" scenario:
[Screenshot 2024-01-05 at 17:41:23]


// Whitelist service is called to check if the bor chain is on the canonical chain according to milestones
whitelistService := whitelist.GetWhitelistingService()
if whitelistService != nil && !whitelistService.IsValidChain(headerNum, []*types.Header{header}) {

Contributor:

As discussed, we might not need this check at all in the mining stage, as the block passes through the sync stages anyway. But if the plan is to refactor that part and prevent the block from going through every sync stage when it's mined by the node itself, then yeah, we can keep this.

Member Author:

@manav2401 blocks are broadcast to the p2p network as soon as they come out of the mining loop (in addition to being communicated to the sync loop). I think this check is here to prevent the miner from broadcasting a bad block; I've left it in to keep the same logic as before my refactor.

Contributor:

Alright, yeah, makes sense. It's very rare, but it may happen if the node is out of sync and still tries to produce a block. Thanks!

Member Author:

yes, we can always revise this later once we do bigger refactors - will keep it in mind

@taratorio taratorio merged commit 74ec3a9 into devel Jan 9, 2024
7 checks passed
@taratorio taratorio deleted the fix-bor-mining-loop branch January 9, 2024 11:37
taratorio pushed a commit that referenced this pull request Jan 29, 2024
This PR fixes two issues that are most visible in multi-client
devnet setups for Polygon.

Context: The span logic differs for the first 2 spans. Bor/Erigon
makes the first commit for a span at the start of a sprint (i.e. if the
sprint length is 16, it calls `commitSpan` at block 16 in bor consensus
for the first time). Span 0 is hard coded in the genesis contracts, so
it needs to commit the span with `id=1` on that block (see equivalent
code in bor
[here](https://github.com/maticnetwork/bor/blob/v1.2.3/consensus/bor/bor.go#L1150-L1152)).
At that point, the 1st span must be available. Bor fetches it on the
fly, while Erigon processes it in a separate stage and stores it in a
snapshot. Hence, Erigon needs to fetch the 1st span as an exception
while still in span 0, and must make sure that this doesn't block
the processing of any previous blocks.

Based on the changes in #9149, the span ID used to fetch and commit
the span was wrong, and span 1 needs to be loaded explicitly in the
1st sprint.
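
The first-sprint exception described above can be sketched as follows. This is an illustration only: `firstSpanCommit` is a hypothetical helper, it models just the special case from the commit message (first `commitSpan` at the block equal to the sprint length, committing span 1 because span 0 comes from genesis), and it deliberately says nothing about how later spans are committed.

```go
package main

import "fmt"

// firstSpanCommit is a hypothetical helper, not a Bor or Erigon function.
// It reports whether blockNum is the block at which the first span commit
// happens and, if so, which span ID is committed there.
func firstSpanCommit(blockNum, sprintLen uint64) (spanID uint64, ok bool) {
	if sprintLen == 0 || blockNum != sprintLen {
		// Not the start of the first full sprint; later commits follow
		// the regular span logic, which this sketch does not model.
		return 0, false
	}
	// Span 0 is hard coded in the genesis contracts, so the first
	// commit is for span 1.
	return 1, true
}

func main() {
	for _, n := range []uint64{15, 16, 32} {
		id, ok := firstSpanCommit(n, 16)
		fmt.Println(n, id, ok)
	}
}
```

Because span 1 has to be committed while the chain is still inside span 0, it must already be available locally by that block, which is why the commit message calls for loading it explicitly during the first sprint.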