[Resharding] Investigate the migration feasibility of <account id> <node hash> #9204

Closed
wacban opened this issue Jun 15, 2023 · 1 comment
Labels
T-core Team: issues relevant to the core team

wacban commented Jun 15, 2023

  • figure out the right way to perform the migration
  • benchmark how long it could take
  • keep in mind archival nodes and cold storage archival nodes
@wacban wacban self-assigned this Jun 20, 2023
@wacban wacban added the T-core Team: issues relevant to the core team label Jun 20, 2023

wacban commented Jun 22, 2023

As discussed, it does not seem feasible to perform the migration for an archival node.

The main issue is that in order to migrate a record stored under the key `shard_uid node_hash` to a record stored under the key `account_id node_hash`, we need to know what the account id is. This information is not stored in the value for most (all?) value types, so there is no easy way to recover it by simply iterating the State column in RocksDB.
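
For illustration, the two key layouts under discussion look roughly like this (function names and exact byte widths are illustrative, not the actual nearcore definitions):

```rust
/// Current State column key: shard_uid followed by the node hash.
fn old_state_key(shard_uid: &[u8; 8], node_hash: &[u8; 32]) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + 32);
    key.extend_from_slice(shard_uid);
    key.extend_from_slice(node_hash);
    key
}

/// Proposed key: the account id whose subtree the node belongs to, then the node hash.
/// The account id is not recoverable from the stored value, which is the crux of the problem.
fn new_state_key(account_id: &str, node_hash: &[u8; 32]) -> Vec<u8> {
    let mut key = Vec::with_capacity(account_id.len() + 32);
    key.extend_from_slice(account_id.as_bytes());
    key.extend_from_slice(node_hash);
    key
}
```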

The only way I can think of to find the account id is to iterate the trie starting from a state root and migrate every node in the trie. For an RPC node this may still be possible: migrate all of the nodes at the current head using flat storage, then wait 5-6 epochs for garbage collection to remove the unmigrated data.
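
To make that concrete, here is a minimal, self-contained sketch of the trie-walk idea; `TrieNode`, `parse_account_id`, and the key construction are hypothetical stand-ins rather than nearcore APIs. The point is that the account id only becomes known from the path taken from the root, never from the stored value:

```rust
// Toy trie node; nearcore's real trie nodes are more structured.
struct TrieNode {
    hash: [u8; 32],
    value: Vec<u8>,
    // child subtrees keyed by the trie-key fragment that leads to them
    children: Vec<(Vec<u8>, TrieNode)>,
}

/// Toy parser: treat everything up to a `,` separator as the account id.
/// (Real trie keys have a column byte and more structure.)
fn parse_account_id(key_prefix: &[u8]) -> Option<String> {
    let end = key_prefix.iter().position(|&b| b == b',')?;
    String::from_utf8(key_prefix[..end].to_vec()).ok()
}

/// Walk the trie, carrying the trie key accumulated so far. Once the prefix is
/// long enough to identify the account id, the node can be written back under
/// the new `account_id ++ node_hash` key. Nodes above that boundary (spanning
/// several accounts) would need separate handling and are skipped here.
fn migrate_node(node: &TrieNode, key_prefix: &[u8], out: &mut Vec<(Vec<u8>, Vec<u8>)>) {
    if let Some(account_id) = parse_account_id(key_prefix) {
        let mut new_key = account_id.into_bytes();
        new_key.extend_from_slice(&node.hash);
        out.push((new_key, node.value.clone()));
    }
    for (fragment, child) in &node.children {
        let mut child_prefix = key_prefix.to_vec();
        child_prefix.extend_from_slice(fragment);
        migrate_node(child, &child_prefix, out);
    }
}
```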

Unfortunately I believe this approach would not work for an archival node, and I can't think of any good way around it.

One potential alternative is to keep the historical data in the old format: old data stays under the old keys and newer data is written under the new keys. We would then need to either duplicate a lot of the data or be prepared to fall back to the old key format on reads, incurring up to a 2x slowdown.
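
Roughly, the read path with a fallback would look like the sketch below (the `Db` type and `get_node` method are hypothetical, not nearcore's storage API). Every miss on the new key costs a second lookup, which is where the ~2x slowdown comes from:

```rust
use std::collections::HashMap;

// Stand-in for the RocksDB State column.
struct Db {
    state: HashMap<Vec<u8>, Vec<u8>>,
}

impl Db {
    fn get_node(
        &self,
        account_id: &str,
        shard_uid: &[u8; 8],
        node_hash: &[u8; 32],
    ) -> Option<&Vec<u8>> {
        // Try the new-format key first: account_id ++ node_hash.
        let mut new_key = account_id.as_bytes().to_vec();
        new_key.extend_from_slice(node_hash);
        if let Some(value) = self.state.get(&new_key) {
            return Some(value);
        }
        // Fall back to the old-format key: shard_uid ++ node_hash.
        let mut old_key = shard_uid.to_vec();
        old_key.extend_from_slice(node_hash);
        self.state.get(&old_key)
    }
}
```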

I'm very much open to new ideas, but as far as I can see there is no feasible solution right now.

It's worth mentioning that archival nodes may be replaced by read-rpc in the future, but that seems unlikely to happen before we need this for resharding.

@wacban wacban closed this as completed Mar 11, 2024