Erigon 3: Purified states representation #13227

Merged
merged 63 commits into from
Jan 8, 2025

Giulio2002
Contributor

Purification of states works by pruning dangling nodes out of the historical states.

The process is to collect all keys into a temporary MDBX database as a mapping node -> layer, then iterate over each node again and remove every node in the historical states DB whose layer does not match its node -> layer entry.
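
The two passes described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: it uses an in-memory map in place of the temporary MDBX, the `Layer` type and the `latestLayerIndex` and `purify` helpers are hypothetical names, and it assumes that the "matching" layer for a key is the last layer in file order that contains it.

```go
package main

import "fmt"

// Layer is a hypothetical stand-in for one state file: key -> value.
type Layer map[string]string

// latestLayerIndex is the first pass: for every key, remember the index of
// the last layer that contains it (assumption: later layers supersede
// earlier ones). The PR stores this mapping in a temporary MDBX instead.
func latestLayerIndex(layers []Layer) map[string]int {
	idx := make(map[string]int)
	for i, layer := range layers {
		for key := range layer {
			idx[key] = i
		}
	}
	return idx
}

// purify is the second pass: rebuild every layer, keeping a key only in the
// layer recorded for it by the first pass. Other copies are dangling.
func purify(layers []Layer) []Layer {
	idx := latestLayerIndex(layers)
	out := make([]Layer, len(layers))
	for i, layer := range layers {
		out[i] = make(Layer)
		for key, value := range layer {
			if idx[key] == i {
				out[i][key] = value
			}
		}
	}
	return out
}

func main() {
	layers := []Layer{
		{"a": "1", "b": "2"}, // "b" is superseded by the next layer
		{"b": "3"},
	}
	for i, layer := range purify(layers) {
		fmt.Printf("layer %d keeps %d key(s)\n", i, len(layer))
	}
}
```

In this toy input the copy of `b` in the first layer is dropped, because the mapping built in the first pass points `b` at the second layer.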

@AskAlexSharov
Collaborator

AskAlexSharov commented Dec 27, 2024

I plan on having this done only once every 2 years: the main problem with such commands is that nobody knows they exist. For example, we have the "skip jump-dest analysis" feature in core/skip_analysis.go: there was a CLI command, and a release.md doc with a "run this command" step. Now I can't find release.md and can't find the command. The same will happen with your command: no comments, you didn't add a step to release.md, etc.

Aha, while writing this comment I accidentally found RELEASE_INSTRUCTIONS.md, and it has a section on state checkChangeSets. A few months ago I removed the state checkChangeSets command because it was not compatible with E3 and we have the erigon snapshots integrity command. And now I find that this command is part of the release process, and nobody complained about it.

:-)

@Giulio2002
Contributor Author

Giulio2002 commented Dec 27, 2024

I can do some tweaks so that it can be added as part of the automation.

The main modification will be adding a check on L0 that looks at its last modification date: if it is more than 3 months old, we purify. How does that sound? Naturally, if we regenerate snapshots, we need to force it through. Another approach (perhaps better) is to require a minimum skip ratio for L0, say >10%.
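
A rough sketch of the two trigger conditions proposed here, just to make the options concrete. Everything in it is an assumption: the names `maxL0Age`, `minSkipRatio`, `shouldPurifyByAge` and `shouldPurifyByRatio` are hypothetical, "3 months" is approximated as 90 days, and the skip ratio is taken to mean the fraction of L0 entries that purification would drop.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// Hypothetical thresholds taken from the discussion.
	maxL0Age     = 90 * 24 * time.Hour // ">3 months", approximated as 90 days
	minSkipRatio = 0.10                // ">10%"
)

// shouldPurifyByAge implements the first proposal: purify when L0 was last
// modified more than maxL0Age ago, or when forced (for example after
// regenerating snapshots).
func shouldPurifyByAge(l0Age time.Duration, force bool) bool {
	return force || l0Age > maxL0Age
}

// shouldPurifyByRatio implements the second proposal: purify only when the
// share of L0 entries that would be skipped exceeds minSkipRatio.
func shouldPurifyByRatio(skipped, total int) bool {
	if total == 0 {
		return false // nothing to purify in an empty file
	}
	return float64(skipped)/float64(total) > minSkipRatio
}

func main() {
	fmt.Println(shouldPurifyByAge(120*24*time.Hour, false))
	fmt.Println(shouldPurifyByAge(30*24*time.Hour, true))
	fmt.Println(shouldPurifyByRatio(15, 100))
}
```

The ratio-based check has the advantage of not depending on file timestamps, which can change for reasons unrelated to the file contents.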

I also forgot to specify the benefits:

ethmainnet: -73 GB
polygon: -147 GB

@Giulio2002 Giulio2002 enabled auto-merge (squash) December 31, 2024 12:29
@Giulio2002
Contributor Author

Giulio2002 commented Jan 4, 2025

Posting your questions and my answers from Discord here:

  • Do you see any way to make it more deterministic?

Not really.

  • Do you see any way to make it more human-mistake-proof?

I made it so: it is fully automatic and you just need to run the purify command.

  • Which domains do you plan to purify? Is commitment alone enough? Or all that pass min-skip-ratio-l0? Only L0? Do we always purify only L0?

We purify L0, L1, L2, ... L0 has the biggest benefit, and commitment is enough.

  • We need to add a step to release.md, or we will forget the purify command.

Yes, I did that.

  • This line feels a bit confusing: StringVar(&purifyDir, "purifiedDomain", "purified-output", "")

That is the output dir, which is only relevant with --replace-in-datadir=false.

  • If we release purified domains, we will have (almost) no backup of the non-purified .kv files in case we need to roll back. Let's explicitly back up our R2 buckets before releasing the purified files, and maybe put a meaningful label on this backup (the DevOps team has a button for this).

I think this is not so important, as functionally they are the same.

  • Then I will try running the purify command on the ethmainnet snapshotter.

Did it for you.

@AskAlexSharov
Collaborator

“Did it for you”: ah, I didn’t try it. I left it for you.

@Giulio2002 Giulio2002 merged commit 62c8e81 into main Jan 8, 2025
13 checks passed
@Giulio2002 Giulio2002 deleted the ololo branch January 8, 2025 21:03