Cannot initialize BABE when epoch data is not stored in database #3886

Open
1 task
EclesioMeloJunior opened this issue Apr 13, 2024 · 1 comment · May be fixed by #4105
Assignees
Labels
C-complex Complex changes across multiple modules. Possibly will require additional research. S-babe issues related to block production functionality. T-bug this issue covers unexpected and/or wrong behaviour.
Milestone

Comments

@EclesioMeloJunior
Member

EclesioMeloJunior commented Apr 13, 2024

Describe the bug

The log error:

CRITICAL failed to run block production engine: cannot handle epoch: cannot initiate and get epoch handler: failed to initiate epoch: cannot get epoch data and start slot: getting epoch data for epoch 19: failed to get epoch data from memory: epoch not found in memory map: 19	pkg=babe

Motivation

  • The critical error occurs when a new epoch starts while its data is still held only in memory because no block has been finalized yet. That state is not a problem in itself, since block finalization can take some time.
  • If the node is shut down in that state, it hits the error on the next restart: epoch not found in memory map

Explanation about the problem

  • The problem arises because Gossamer keeps this epoch information in memory until a finalization happens.
  • It does so because in-memory maps make it easy to hold and access information that must be fork-aware (that is, different information has to be stored and read for different forks of the chain).
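
To illustrate why a restart loses the data, here is a minimal sketch of a fork-aware in-memory map. It is not Gossamer's actual nextEpochMap: the names `Hash`, `EpochData`, `forkAwareMap`, `store` and `retrieve` are hypothetical, and the sketch skips the ancestry lookup a real implementation would need to match a block to its fork.

```go
package main

import (
	"errors"
	"fmt"
)

// Hash stands in for a 32-byte block hash (hypothetical simplification).
type Hash [32]byte

// EpochData is a placeholder for types.NextEpochData; the field is illustrative only.
type EpochData struct {
	Randomness [32]byte
}

// ErrEpochNotInMemory mirrors the message seen in the log above.
var ErrEpochNotInMemory = errors.New("epoch not found in memory map")

// forkAwareMap maps epoch number -> hash of the block that announced the
// data -> the data itself, so each fork can carry its own value.
type forkAwareMap map[uint64]map[Hash]EpochData

// store records data for an epoch on a given fork.
// The receiver must be a non-nil map (for example forkAwareMap{}).
func (m forkAwareMap) store(epoch uint64, fork Hash, data EpochData) {
	if m[epoch] == nil {
		m[epoch] = make(map[Hash]EpochData)
	}
	m[epoch][fork] = data
}

// retrieve returns the data for an epoch on a given fork, or an error
// when the epoch (or the fork) is unknown to this process.
func (m forkAwareMap) retrieve(epoch uint64, fork Hash) (EpochData, error) {
	forks, ok := m[epoch]
	if !ok {
		return EpochData{}, fmt.Errorf("%w: %d", ErrEpochNotInMemory, epoch)
	}
	data, ok := forks[fork]
	if !ok {
		return EpochData{}, fmt.Errorf("no data for epoch %d on fork %x", epoch, fork[:4])
	}
	return data, nil
}
```

A freshly created `forkAwareMap{}`, which is what a restarted process begins with, returns the "epoch not found in memory map" error for any epoch whose data was never finalized and therefore never written to disk.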

Suggested approach

  • Store this information on disk as well as keeping it in memory for fast lookup. When the node starts (fresh or not), read the information from disk and load it back into memory, so that BABE can start in the correct epoch.
  • You can use the current disk database for this, with the following format: the key is the string ${epoch_number}:${fork_hash} and the value is the SCALE-encoded types.NextEpochData or types.NextConfigDataV1. See the nextEpochMap type in dot/state/epoch.go, which is the one that must stay in sync with the disk.
  • Clean up the database once a finalization happens. Since you are keeping information about forks, you must remove it once those forks are pruned. You don't need to keep the entry for the finalized fork either, since it will be stored again in another key/value format.
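
The three points above could be sketched roughly as follows. This is only an illustration under several assumptions: `KV`, `memKV`, `epochStore` and all function names are hypothetical and not part of Gossamer, the values are treated as already SCALE-encoded bytes, the fork hash is hex-encoded in the key, and `Finalize` assumes that finalization settles every fork of the given epoch.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// Hash stands in for a 32-byte block hash (hypothetical simplification).
type Hash [32]byte

// KV is a hypothetical, minimal view of the node's disk database.
type KV interface {
	Put(key string, value []byte)
	Delete(key string)
	Each(fn func(key string, value []byte))
}

// memKV is an in-memory stand-in for the disk database, for this sketch only.
type memKV map[string][]byte

func (kv memKV) Put(k string, v []byte) { kv[k] = v }
func (kv memKV) Delete(k string)        { delete(kv, k) }
func (kv memKV) Each(fn func(string, []byte)) {
	for k, v := range kv {
		fn(k, v)
	}
}

// epochKey builds the "${epoch_number}:${fork_hash}" key.
func epochKey(epoch uint64, fork Hash) string {
	return fmt.Sprintf("%d:%s", epoch, hex.EncodeToString(fork[:]))
}

// parseEpochKey is the inverse of epochKey.
func parseEpochKey(key string) (uint64, Hash, error) {
	var fork Hash
	num, hexHash, ok := strings.Cut(key, ":")
	if !ok {
		return 0, fork, fmt.Errorf("malformed key %q", key)
	}
	epoch, err := strconv.ParseUint(num, 10, 64)
	if err != nil {
		return 0, fork, fmt.Errorf("parsing epoch in key %q: %w", key, err)
	}
	raw, err := hex.DecodeString(hexHash)
	if err != nil || len(raw) != len(fork) {
		return 0, fork, fmt.Errorf("malformed fork hash in key %q", key)
	}
	copy(fork[:], raw)
	return epoch, fork, nil
}

// epochStore keeps the in-memory map and the disk database in sync.
type epochStore struct {
	db  KV
	mem map[uint64]map[Hash][]byte
}

// newEpochStore reloads every persisted entry into memory on startup.
func newEpochStore(db KV) (*epochStore, error) {
	s := &epochStore{db: db, mem: make(map[uint64]map[Hash][]byte)}
	var firstErr error
	db.Each(func(key string, value []byte) {
		epoch, fork, err := parseEpochKey(key)
		if err != nil {
			if firstErr == nil {
				firstErr = err
			}
			return
		}
		s.cache(epoch, fork, value)
	})
	if firstErr != nil {
		return nil, firstErr
	}
	return s, nil
}

func (s *epochStore) cache(epoch uint64, fork Hash, encoded []byte) {
	if s.mem[epoch] == nil {
		s.mem[epoch] = make(map[Hash][]byte)
	}
	s.mem[epoch][fork] = encoded
}

// Store writes to disk first, then to memory.
func (s *epochStore) Store(epoch uint64, fork Hash, encoded []byte) {
	s.db.Put(epochKey(epoch, fork), encoded)
	s.cache(epoch, fork, encoded)
}

// Get reads from memory only; disk is consulted just once, at startup.
func (s *epochStore) Get(epoch uint64, fork Hash) ([]byte, bool) {
	v, ok := s.mem[epoch][fork]
	return v, ok
}

// Finalize drops every fork entry of the epoch from disk and memory.
func (s *epochStore) Finalize(epoch uint64) {
	for fork := range s.mem[epoch] {
		s.db.Delete(epochKey(epoch, fork))
	}
	delete(s.mem, epoch)
}
```

Creating a second `epochStore` over the same `KV` simulates a restart: the entries written before the shutdown are available again through `Get`. A real database shared with other state would probably also need a table or key prefix so that startup only iterates over these entries, and the finalized epoch data would be persisted through the existing finalized-state path, which this sketch leaves out.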

Acceptance criteria

  • Make sure that Gossamer can provide the correct epoch data (simulate using forks) when it restarts after a shutdown that happened before finalization
@EclesioMeloJunior EclesioMeloJunior self-assigned this Apr 13, 2024
@EclesioMeloJunior EclesioMeloJunior added C-complex Complex changes across multiple modules. Possibly will require additional research. S-babe issues related to block production functionality. T-bug this issue covers unexpected and/or wrong behaviour. labels Apr 13, 2024
@P1sar P1sar removed this from the Block production. BABE milestone Apr 15, 2024
@ramiroJCB
Contributor

@EclesioMeloJunior hello, is there an easy way to replicate this behavior?


4 participants