[ReshardingV3] Storage - FlatState #11909

Closed · 4 tasks
Longarithm opened this issue Aug 8, 2024 · 4 comments

Answer questions

  • How to split state in the db (inc migration if needed)
  • How to clean state after state sync
  • How to generate parts for state sync
  • How to clean state after resharding (parent state)
Longarithm commented Aug 9, 2024

Prerequisite: Reworking flat state key prefix

Current Situation

  • All keys are prefixed with ShardUId
  • ShardUId is redefined after every resharding, requiring rewriting all keys

Goals

  • Make changes as local as possible
  • Avoid migration for CPs and RPC nodes if tracked shards don't change

Proposed Changes

  1. For keys with account IDs:
    • Remove the ShardUId prefix entirely
  2. For other keys:
    • Replace the ShardUId prefix with ShardId or another stable way to index shards (TBD)
    • Short-term solution: Set prefix to [255] + ShardUId to separate these keys
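A minimal sketch of the resulting key encoding under these assumptions (the function names and the exact ShardUId serialization are illustrative, not the actual nearcore API):

    /// Short-term prefix separating keys without account ids (taken from the
    /// proposal above; this constant does not exist in nearcore today).
    const NON_ACCOUNT_KEY_PREFIX: u8 = 255;

    /// Trie keys that contain an account id are stored as-is, with no ShardUId prefix.
    fn flat_state_key_with_account_id(trie_key: &[u8]) -> Vec<u8> {
        trie_key.to_vec()
    }

    /// Trie keys without an account id get the short-term prefix [255] + ShardUId.
    fn flat_state_key_without_account_id(shard_uid_bytes: &[u8], trie_key: &[u8]) -> Vec<u8> {
        let mut key = Vec::with_capacity(1 + shard_uid_bytes.len() + trie_key.len());
        key.push(NON_ACCOUNT_KEY_PREFIX);
        key.extend_from_slice(shard_uid_bytes);
        key.extend_from_slice(trie_key);
        key
    }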

Implementation Details

Current implementation

Currently, a query to get all keys for shard_uid in the range [L, R) is equivalent to iter_flat_state_entries(shard_uid_bytes + L, shard_uid_bytes + R). We needed that because of the "fixed accounts" feature in the shard layout. As that feature is now removed, the ShardUId prefix is no longer required.

We only need to modify iter_flat_state_entries, which accepts shard_uid, from, to and returns an iterator over flat state entries for the requested range.
The major use cases are

  • Loading memtrie, where the whole range for shard is required
  • State part generation. Could be served through memtrie, but we don't support concurrent access yet.

Except for the second case, the range is (None, None) (the whole range).
To switch to a range of actual trie keys, replace the first None with [0] + boundary_account_bytes(i) and the second None with [TRIE_KEY_MAX_TYPE_BYTE + 1].
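As a sketch, the trie-key-space bounds that replace (None, None) for shard i could be computed as follows (helper and constant names are assumptions for illustration, not the actual nearcore symbols; the resulting bounds still go through the translation described in the next section):

    /// Sketch: the (from, to) trie-key bounds that replace (None, None) when
    /// iterating the whole range for shard `shard_index`. Names are assumed.
    fn whole_shard_bounds(
        shard_index: usize,
        boundary_account_bytes: impl Fn(usize) -> Vec<u8>,
        trie_key_max_type_byte: u8,
    ) -> (Vec<u8>, Vec<u8>) {
        // Lower bound: key type 0 followed by the shard's left boundary account.
        let from = [vec![0u8], boundary_account_bytes(shard_index)].concat();
        // Upper bound: one past the largest trie key type byte.
        let to = vec![trie_key_max_type_byte + 1];
        (from, to)
    }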

Translating TrieKey range [L, R) to DBCol::FlatState queries

The translation depends on the first byte of the keys (the trie key type):

  1. If L[0] > R[0]: Return vec![]

  2. If L[0] == R[0]: One query

    if L[0] type contains account id:
        Store::iter_range(L, R)
    else:
        Store::iter_range(shard_uid_prefix + L, shard_uid_prefix + R)

  3. If L[0] < R[0]: Multiple queries

    if L[0] contains account id:
        # or iter_range(L, [L[0] + 1]) if this is the last shard
        iter_range(L, [L[0]] + boundary_account_bytes(i + 1))
    else:
        iter_range(shard_uid_prefix + L, shard_uid_prefix + [L[0] + 1])

    for B in L[0] + 1..=R[0] - 1:
        if B contains account id:
            # or [B + 1] as the upper bound if this is the last shard
            iter_range([B] + boundary_account_bytes(i), [B] + boundary_account_bytes(i + 1))
        else:
            iter_range(shard_uid_prefix + [B], shard_uid_prefix + [B + 1])

    if R[0] contains account id:
        iter_range([R[0]] + boundary_account_bytes(i), R)
    else:
        iter_range(shard_uid_prefix + [R[0]], shard_uid_prefix + R)
    
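Putting the three cases together, a minimal Rust sketch of the translation. All helper names (key_type_has_account_id, boundary accounts, shard_uid_prefix) are assumptions for illustration, not the actual nearcore API:

    /// Translate a trie key range [l, r) into (from, to) byte ranges to query
    /// in DBCol::FlatState under the proposed key scheme.
    /// `lower_boundary`/`upper_boundary` are boundary_account_bytes(i) and
    /// boundary_account_bytes(i + 1); `upper_boundary` is None for the last shard.
    fn trie_key_range_to_db_ranges(
        l: &[u8],
        r: &[u8],
        shard_uid_prefix: &[u8],
        lower_boundary: &[u8],
        upper_boundary: Option<&[u8]>,
        key_type_has_account_id: impl Fn(u8) -> bool,
    ) -> Vec<(Vec<u8>, Vec<u8>)> {
        let mut ranges = Vec::new();
        if l[0] > r[0] {
            return ranges;
        }
        // Upper bound of the shard's account range inside key type `b`.
        let account_upper = |b: u8| -> Vec<u8> {
            match upper_boundary {
                Some(boundary) => [&[b][..], boundary].concat(),
                // Last shard: go up to the end of the key type.
                None => vec![b + 1],
            }
        };
        // Keys without account ids keep the [255] + ShardUId prefix.
        let prefixed = |key: &[u8]| [shard_uid_prefix, key].concat();

        if l[0] == r[0] {
            // Case 2: a single query within one key type.
            if key_type_has_account_id(l[0]) {
                ranges.push((l.to_vec(), r.to_vec()));
            } else {
                ranges.push((prefixed(l), prefixed(r)));
            }
            return ranges;
        }
        // Case 3: head range for key type l[0] ...
        if key_type_has_account_id(l[0]) {
            ranges.push((l.to_vec(), account_upper(l[0])));
        } else {
            ranges.push((prefixed(l), prefixed(&[l[0] + 1][..])));
        }
        // ... full per-type ranges for every key type strictly between l[0] and r[0] ...
        for b in l[0] + 1..r[0] {
            if key_type_has_account_id(b) {
                ranges.push(([&[b][..], lower_boundary].concat(), account_upper(b)));
            } else {
                ranges.push((prefixed(&[b][..]), prefixed(&[b + 1][..])));
            }
        }
        // ... and a tail range for key type r[0].
        if key_type_has_account_id(r[0]) {
            ranges.push(([&[r[0]][..], lower_boundary].concat(), r.to_vec()));
        } else {
            ranges.push((prefixed(&[r[0]][..]), prefixed(r)));
        }
        ranges
    }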

Migration Process

  1. For keys with account IDs:
    • Remove the first 8 bytes (the ShardUId prefix) from the key
  2. For keys without account IDs:
    • Prepend the key with [255]
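A rough sketch of the per-key migration transform, assuming the current FlatState key layout is an 8-byte ShardUId followed by the trie key (the helper name key_type_has_account_id is illustrative):

    /// Sketch of migrating one existing FlatState key (8-byte ShardUId prefix +
    /// trie key) to the proposed layout. Names are illustrative, not nearcore API.
    fn migrate_flat_state_key(
        old_key: &[u8],
        key_type_has_account_id: impl Fn(u8) -> bool,
    ) -> Vec<u8> {
        let (shard_uid_bytes, trie_key) = old_key.split_at(8);
        if key_type_has_account_id(trie_key[0]) {
            // Keys with account ids: drop the ShardUId prefix entirely.
            trie_key.to_vec()
        } else {
            // Keys without account ids: prepend [255] and keep the ShardUId.
            let mut key = Vec::with_capacity(1 + old_key.len());
            key.push(255u8);
            key.extend_from_slice(shard_uid_bytes);
            key.extend_from_slice(trie_key);
            key
        }
    }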

Resharding State Update (continuous)

  1. Copying state:
    • We copy state from one shard to another, either for a specific big account or for a range of small accounts
    • Directly commit new state items to FlatState
    • Translation from trie key to flat state key:
      • With account IDs: No changes required
      • Without account IDs: Prepend the key with shard_uid_prefix
  2. Removal:
    • Once resharding completes, we need to remove the range of accounts from the old shard
    • Issue range removals in the same fashion as described in the previous section (see the sketch below)

For instant resharding, we are likely to know the range of accounts to remove as well.
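As a usage sketch of the removal step, the cleanup could reuse the range translation above; delete_range stands in for whatever deletion primitive the store actually exposes, and all names are assumptions:

    /// Sketch: after resharding, remove the account range [l_trie_key, r_trie_key)
    /// that left the parent shard, reusing trie_key_range_to_db_ranges from the
    /// earlier sketch. `delete_range` is a stand-in for the store's deletion API.
    fn cleanup_parent_account_range(
        l_trie_key: &[u8],
        r_trie_key: &[u8],
        shard_uid_prefix: &[u8],
        lower_boundary: &[u8],
        upper_boundary: Option<&[u8]>,
        key_type_has_account_id: impl Fn(u8) -> bool,
        mut delete_range: impl FnMut(&[u8], &[u8]),
    ) {
        let ranges = trie_key_range_to_db_ranges(
            l_trie_key,
            r_trie_key,
            shard_uid_prefix,
            lower_boundary,
            upper_boundary,
            key_type_has_account_id,
        );
        for (from, to) in ranges {
            // One deletion per DB range produced by the translation.
            delete_range(from.as_slice(), to.as_slice());
        }
    }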

Alternative ideas

This is definitely nontrivial logic, so other helpful alternatives could be considered:

  • Store keys without account ids in a separate column. This allows dropping the [255] prefix, but the drawback is that two columns must now be locked for snapshots.
  • Require the state part generator to request parts whose keys all belong to the same type. This would allow fewer DB requests in one iter_flat_state_entries call, but the logic to create state parts would become more complicated.
  • We could change the trie key "layout" in the FlatState column. For example, in the trie, all keys with account ids have the form [B] + account id + suffix. Inside FlatState, we could swap these parts of the key and store account id + [B] + suffix. If we had no keys without account ids, this would allow us to get all keys for a shard in a single range query [boundary_account_bytes(i), boundary_account_bytes(i+1)) without touching keys from other shards (see the sketch after this list).
    However, this approach would still change the order of iteration over trie keys.
  • Migrate the trie itself to a better format, in the same fashion as described in the previous item. This would be very reasonable, but the requirements are unfortunate:
    • each state root has to be fully recomputed;
    • the old format would still have to be supported to replay old blocks.
  • As a slightly easier migration, we could consider starting all account id keys from 0... and non-account id keys from 128..., which could make the logic a bit cleaner.
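To illustrate the key-swap idea from the third bullet, a sketch of the re-ordering; recovering the account id length from a trie key is glossed over here (it is passed in explicitly), and the function name is illustrative:

    /// Sketch of the key-swap alternative: store `account_id + [B] + suffix`
    /// instead of `[B] + account_id + suffix` inside the FlatState column.
    /// In practice the account id length must be recoverable from the key;
    /// here it is passed in to keep the sketch simple.
    fn swap_key_layout(trie_key: &[u8], account_id_len: usize) -> Vec<u8> {
        let key_type = trie_key[0];
        let account_id = &trie_key[1..1 + account_id_len];
        let suffix = &trie_key[1 + account_id_len..];
        let mut swapped = Vec::with_capacity(trie_key.len());
        swapped.extend_from_slice(account_id);
        swapped.push(key_type);
        swapped.extend_from_slice(suffix);
        swapped
    }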

wacban commented Aug 16, 2024

@Longarithm In Instant Resharding, how fast do you think we can split the queued and buffered receipts? Would we need any additional tricks to make it work?

Longarithm commented:

I think we should update these receipts lazily, as described here: #11966

Longarithm commented:

Superseded by #12174

Longarithm closed this as not planned on Oct 16, 2024.