Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Resharding] Offline prototype for using Flat Storage to reconstruct trie #9105

Closed
Tracked by #8992
robin-near opened this issue May 24, 2023 · 4 comments · Fixed by #9161
Closed
Tracked by #8992

[Resharding] Offline prototype for using Flat Storage to reconstruct trie #9105

robin-near opened this issue May 24, 2023 · 4 comments · Fixed by #9161
Assignees

Comments

@robin-near
Copy link
Contributor

As background context, there are three candidate solutions for the resharding problem: (S1) restructure the trie so that shard range corresponds to a trie range; (S2) when resharding, shallow copy the account subtries while prefixing each trie node with account ID or prefix; (S3) when resharding, always reconstruct new tries from flat storage.

(S1) requires a protocol upgrade and data migration; (S2) requires a data migration, and both of these data migrations require rebuilding the entire trie. Iterating through the old trie to rebuild the new trie is infeasible and will not complete within an epoch. Therefore, all three solutions require the ability to reconstruct a trie from flat storage; the only difference being that (S3) requires it permanently and (S1) and (S2) only require it for the initial migration.

For this task, we need to build a prototype, an offline tool to reconstruct tries using flat storage. The goal is to (G1) show that it works; (G2) estimate how long it takes. This will help us make a decision for whether (S3) is good enough for the foreseeable future, or we need to pursue (S1) or (S2).

In addition to the ability to reconstructing tries, we also need the ability to snapshot the flat storage state at the beginning of an epoch.

There is existing work by @Longarithm to use flat storage to perform state sync, so this should already include these abilities.

@robin-near
Copy link
Contributor Author

Existing work for state sync mentioned above: #8927

@Longarithm
Copy link
Member

@nikurt's work on snapshots in the beginning of an epoch: #9090

@shreyan-gupta
Copy link
Contributor

shreyan-gupta commented Jun 13, 2023

Implemented a batched version of flat storage to trie with batch size as 500 MB so as to not overwhelm memory usage. Please see attached PR.

I ran this using the storage from a mainnet canary node as that has flat state, and using a standard n2-standard-8 VM with 32 GB RAM, but I noticed the neard process only uses one processor.

Shard Time to build trie Size of trie
shard0 6 min 6 sec 8.0 GB
shard1 5 min 31 sec 6.6 GB
shard2 4 min 41 sec 6.1 GB
shard3 13 min 23 sec 18 GB

Couple of observations

  • We can iterate through the flat storage and build the trie in a reasonable amount of time.
  • Varying the batch size didn't lead to too much difference in the processing time.
  • Having a very large batch size, 2 GB, lead to out of memory error.
  • Typically most of the time goes in writing to the trie storage. Fetching/iterating through the flat storage is quick.
  • The time taken is typically proportional to the number of entries (iterated over) and not the size of the entries.

This hopefully means it should be feasible to do a one time migration for the trie and flat storage for validator nodes. We still however need to think about the practicality of archival nodes.

@wacban
Copy link
Contributor

wacban commented Jun 14, 2023

This hopefully means it should be feasible to do a one time migration for the trie and flat storage for validator nodes. We still however need to think about the practicality of archival nodes.

This also means that we likely can afford to perform resharding by reconstructing trie from flat storage every time. The resharding will be happening in the background and the time constraint is that catchup + resharding fit within one epoch with some healthy margin.

practicality of archival nodes

This issue is primarily about the resharding strategy so it should be orthogonal to the archival data. When resharding we keep the old data as is so there won't be a need for any lengthy migrations on archival nodes. Archival nodes are important to consider but they only come into play when considering the storage structure. Let us consider it as part of that workflow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants