[Resharding] Offline prototype for using Flat Storage to reconstruct trie #9105

robin-near · 2023-05-24T18:24:38Z

As background context, there are three candidate solutions for the resharding problem: (S1) restructure the trie so that shard range corresponds to a trie range; (S2) when resharding, shallow copy the account subtries while prefixing each trie node with account ID or prefix; (S3) when resharding, always reconstruct new tries from flat storage.

(S1) requires a protocol upgrade and data migration; (S2) requires a data migration, and both of these data migrations require rebuilding the entire trie. Iterating through the old trie to rebuild the new trie is infeasible and will not complete within an epoch. Therefore, all three solutions require the ability to reconstruct a trie from flat storage; the only difference being that (S3) requires it permanently and (S1) and (S2) only require it for the initial migration.

For this task, we need to build a prototype: an offline tool that reconstructs tries from flat storage. The goals are to (G1) show that it works and (G2) estimate how long it takes. This will help us decide whether (S3) is good enough for the foreseeable future or whether we need to pursue (S1) or (S2).
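To make the (S3) idea concrete, here is a minimal sketch of routing entries to new shards by account range before each shard's trie is rebuilt. It assumes shards are delimited by a sorted list of boundary accounts and that each entry is keyed by a plain account ID; real flat storage keys are raw trie keys, so extracting the account ID would need an extra step that is not shown. The names `shard_for_account` and `split_by_shard` are hypothetical and are not taken from nearcore.

```rust
/// Returns the index of the shard that owns `account_id`.
/// Assumption: `boundary_accounts` is sorted, and shard `i` covers the
/// accounts in `[boundary[i - 1], boundary[i])`.
fn shard_for_account(account_id: &str, boundary_accounts: &[&str]) -> usize {
    // Count how many boundaries are less than or equal to the account.
    boundary_accounts.partition_point(|boundary| *boundary <= account_id)
}

/// Splits (account ID, value) entries into one bucket per shard.
/// Each bucket could then be fed to a trie builder for that shard.
fn split_by_shard(
    entries: Vec<(String, Vec<u8>)>,
    boundary_accounts: &[&str],
) -> Vec<Vec<(String, Vec<u8>)>> {
    // N boundaries delimit N + 1 shards.
    let mut shards = vec![Vec::new(); boundary_accounts.len() + 1];
    for (account_id, value) in entries {
        let shard = shard_for_account(&account_id, boundary_accounts);
        shards[shard].push((account_id, value));
    }
    shards
}
```

With boundaries `["f", "p"]`, an account such as `alice` would land in the first bucket and `zed` in the last.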

In addition to the ability to reconstruct tries, we also need the ability to snapshot the flat storage state at the beginning of an epoch.

There is existing work by @Longarithm to use flat storage to perform state sync, so this should already include these abilities.

robin-near · 2023-05-24T18:26:08Z

Existing work for state sync mentioned above: #8927

Longarithm · 2023-05-24T18:30:40Z

@nikurt's work on snapshots in the beginning of an epoch: #9090

shreyan-gupta · 2023-06-13T21:45:56Z

Implemented a batched version of the flat-storage-to-trie conversion, with a batch size of 500 MB so as not to exhaust memory. Please see the attached PR.
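For readers who have not opened the PR, the batching idea can be sketched roughly as follows. This is not the code from the PR: `for_each_batch`, the `Entry` alias, and the `commit` callback (standing in for the trie write) are hypothetical, and the size accounting simply adds key and value lengths.

```rust
/// A flat storage entry, modelled here as raw key and value bytes.
type Entry = (Vec<u8>, Vec<u8>);

/// Groups entries into batches of at most `max_batch_bytes` and hands each
/// batch to `commit`. Returns the number of batches committed.
/// An entry larger than the limit still gets a batch of its own.
fn for_each_batch<I, F>(entries: I, max_batch_bytes: usize, mut commit: F) -> usize
where
    I: IntoIterator<Item = Entry>,
    F: FnMut(&[Entry]),
{
    let mut batch: Vec<Entry> = Vec::new();
    let mut batch_bytes = 0usize;
    let mut num_batches = 0usize;

    for (key, value) in entries {
        let entry_bytes = key.len() + value.len();
        // Flush first if this entry would push the batch over the limit.
        if !batch.is_empty() && batch_bytes + entry_bytes > max_batch_bytes {
            commit(&batch);
            num_batches += 1;
            batch.clear();
            batch_bytes = 0;
        }
        batch_bytes += entry_bytes;
        batch.push((key, value));
    }

    // Commit whatever is left after the iterator is exhausted.
    if !batch.is_empty() {
        commit(&batch);
        num_batches += 1;
    }
    num_batches
}
```

Only one batch is held in memory at a time, which is the property that bounds peak usage by the batch size instead of by the shard size.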

I ran this against the storage from a mainnet canary node, since that has flat state, on an n2-standard-8 VM with 32 GB RAM. I noticed that the neard process only uses one processor.

Shard	Time to build trie	Size of trie
shard0	6 min 6 sec	8.0 GB
shard1	5 min 31 sec	6.6 GB
shard2	4 min 41 sec	6.1 GB
shard3	13 min 23 sec	18 GB

A couple of observations:

We can iterate through the flat storage and build the trie in a reasonable amount of time.
Varying the batch size did not make much difference to the processing time.
A very large batch size (2 GB) led to an out-of-memory error.
Most of the time goes into writing to the trie storage; fetching and iterating through the flat storage is quick.
The time taken is roughly proportional to the number of entries iterated over, not to the size of the entries.

This hopefully means it should be feasible to do a one-time migration of the trie and flat storage for validator nodes. However, we still need to think about the practicality for archival nodes.

wacban · 2023-06-14T19:48:18Z

> This hopefully means it should be feasible to do a one time migration for the trie and flat storage for validator nodes. We still however need to think about the practicality of archival nodes.

This also means that we can likely afford to perform resharding by reconstructing the trie from flat storage every time. The resharding will happen in the background, and the time constraint is that catchup plus resharding must fit within one epoch with a healthy margin.
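As a rough sanity check of that constraint, the sketch below adds up the per-shard build times reported earlier in this thread and compares them against an epoch budget. The 12-hour epoch, the one-hour catchup time, and the 50% margin are placeholder assumptions, not measurements, and the function names are hypothetical. The offline timings also rebuild whole shards on a single machine, so they only approximate what live resharding would cost.

```rust
use std::time::Duration;

/// Build times in seconds per shard, copied from the table in this thread.
const SHARD_BUILD_SECS: [(&str, u64); 4] = [
    ("shard0", 6 * 60 + 6),
    ("shard1", 5 * 60 + 31),
    ("shard2", 4 * 60 + 41),
    ("shard3", 13 * 60 + 23),
];

/// Total time if shards are rebuilt one after another.
fn serial_total(times: &[(&str, u64)]) -> Duration {
    Duration::from_secs(times.iter().map(|(_, secs)| *secs).sum())
}

/// Lower bound if every shard were rebuilt in parallel: the slowest shard.
fn parallel_bound(times: &[(&str, u64)]) -> Duration {
    Duration::from_secs(times.iter().map(|(_, secs)| *secs).max().unwrap_or(0))
}

/// True if catchup plus resharding fits in the epoch after reserving
/// `margin` (a fraction between 0 and 1) of the epoch as headroom.
fn fits_in_epoch(resharding: Duration, catchup: Duration, epoch: Duration, margin: f64) -> bool {
    (resharding + catchup).as_secs_f64() <= epoch.as_secs_f64() * (1.0 - margin)
}
```

Under these assumptions the serial total is 1781 seconds (29 min 41 sec) and the parallel bound is 803 seconds (13 min 23 sec), so `fits_in_epoch(serial_total(&SHARD_BUILD_SECS), Duration::from_secs(3600), Duration::from_secs(12 * 3600), 0.5)` evaluates to `true`.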

> practicality of archival nodes

This issue is primarily about the resharding strategy, so it should be orthogonal to the archival data. When resharding, we keep the old data as is, so there won't be a need for any lengthy migrations on archival nodes. Archival nodes are important to consider, but they only come into play when considering the storage structure. Let us consider them as part of that workflow.

robin-near mentioned this issue May 24, 2023

🔷 [Tracking issue] Resharding v2 #8992

Closed

shreyan-gupta self-assigned this May 24, 2023

shreyan-gupta linked a pull request Jun 12, 2023 that will close this issue

[resharding] Offline tool to construct trie from flat storage #9161

Merged

near-bulldozer bot closed this as completed in #9161 Jun 16, 2023
