Support block replayability with enabled flat storage #8741

Closed
Longarithm opened this issue Mar 16, 2023 · 3 comments · Fixed by #11767
Labels
A-storage Area: storage and databases

Comments

@Longarithm
Member

With flat storage enabled, we currently don't support the state-viewer apply-range command. See the original discussions here and here.

So far we know that:

  • it is annoying to some developers, as this command is useful to them for replaying blocks and getting proxies for performance estimation;
  • replaying blocks is an actual use case for some NEAR users, though there has been no critical demand for it so far.

So this is considered lower priority than #7327. However, let's provide several options to solve it.

Recreate FS

Just create FS from scratch using the trie. @mm-near once confirmed that this takes < 1h on RPC nodes.

Concerns

  • If shard size is too big, FS creation time grows with it.
  • Waiting for 1h to replay a single block is annoying.

Recreate "partial" FS

Some kind of hack: store all trie keys ever read while processing each specific block. If we need to replay a block, create a "partial" FS that includes only these keys and execute the block against it; no issue should occur.

Concerns

  • We don't store these "trie keys ever read" now. If we have some range in history for which they are not stored, it is not clear how to create this "snapshot".
  • Disk usage increases.

Fallback to Trie

Falling back to the Trie makes a lot of sense; however, it would spoil the chunk cache. This can be solved if we specifically disable the chunk cache during storage_read and during the other storage ops where we use FS as well.

Concerns

  • The design of the chunk cache will become even more hacky. It is already annoying to enable/disable the chunk cache before/after function call execution.
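
A minimal sketch of the cache-bypass idea, as a toy Python model rather than nearcore code. The names `ChunkCache`, `Store`, `trie_get` and the per-block dictionaries are all hypothetical stand-ins; the only point is that a read served by the trie fallback must leave the chunk cache exactly as a flat storage read would have left it.

```python
from contextlib import contextmanager


class ChunkCache:
    """Toy stand-in for the per-chunk cache of trie lookups (hypothetical)."""

    def __init__(self):
        self.enabled = True
        self.entries = {}

    def put(self, key, value):
        # Only remember the lookup while the cache is enabled.
        if self.enabled:
            self.entries[key] = value

    @contextmanager
    def disabled(self):
        # Temporarily switch the cache off and restore the previous mode after.
        previous = self.enabled
        self.enabled = False
        try:
            yield
        finally:
            self.enabled = previous


class Store:
    """Toy state store: flat storage at one head, trie for every block."""

    def __init__(self, trie_by_block, flat_head, flat_values):
        self.trie_by_block = trie_by_block  # block height -> {key: value}
        self.flat_head = flat_head          # the only block FS can serve
        self.flat_values = flat_values
        self.cache = ChunkCache()

    def trie_get(self, block, key):
        # A normal trie read populates the chunk cache as a side effect.
        value = self.trie_by_block[block].get(key)
        self.cache.put(key, value)
        return value

    def storage_read(self, block, key):
        if block == self.flat_head:
            # Flat storage path: the chunk cache is never touched.
            return self.flat_values.get(key)
        # Fallback path: read from the trie, but keep the chunk cache
        # untouched so later reads see the same cache state as with FS.
        with self.cache.disabled():
            return self.trie_get(block, key)
```

With this shape, `storage_read` on an old block returns the trie value while `cache.entries` stays empty, whereas a direct `trie_get` still fills the cache. The extra enable/disable toggle is exactly the added hackiness the concern above refers to.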

FS rollback

If the block to replay is not far away, we can roll back flat storage until its head is the same as the desired block.

Concerns

  • We don't support rollback yet. Storing "old" values in deltas is technically annoying but can be done in, say, 2-3 weeks.
  • If the block is too far away (years ago?), rollback takes a lot of time.
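
To make the "old values in deltas" idea concrete, here is a rough Python sketch under simplifying assumptions: a single linear chain, a plain dictionary as flat storage, and `None` meaning "key absent". The class `FlatStorage` and its methods are hypothetical and do not mirror the real nearcore types.

```python
class FlatStorage:
    """Toy flat storage that keeps undo information for each applied block."""

    def __init__(self, head, values):
        self.head = head            # height of the block the values reflect
        self.values = dict(values)  # key -> value at the head
        self.undo_log = []          # (previous head, {key: old value or None})

    def apply_block(self, height, changes):
        # Remember the old value of every key the block changes.
        old = {key: self.values.get(key) for key in changes}
        self._write(changes)
        self.undo_log.append((self.head, old))
        self.head = height

    def rollback_to(self, target):
        # Undo blocks one by one, newest first, until the head matches.
        while self.head != target:
            if not self.undo_log:
                raise ValueError(f"no undo data to reach block {target}")
            previous_head, old = self.undo_log.pop()
            self._write(old)
            self.head = previous_head

    def _write(self, changes):
        # A value of None deletes the key; anything else overwrites it.
        for key, value in changes.items():
            if value is None:
                self.values.pop(key, None)
            else:
                self.values[key] = value
```

The loop in `rollback_to` does work proportional to the number of blocks between the current head and the target, which is why a target years in the past would be slow.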
@Longarithm
Member Author

With the in-memory trie implemented, we have an option that shouldn't consume a crazy amount of time.
Say, in 1 hour we create an in-memory trie for the shard from archival node data, then create flat storage for the shard, then replay the chunk. This will be helpful for replaying incidents.

@robin-near
Contributor

On the "Recreate partial FS" idea, can we run these transactions once, mark all the storage that is needed to run them, populate partial FS, and then execute them again for real?
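
Roughly, the two-pass flow could look like the Python sketch below. It is a toy under loud assumptions: transactions are modelled as read-only functions over a store, the trie is a plain dictionary, and `RecordingStore`, `PartialFlatStorage` and `replay` are hypothetical names, not existing nearcore APIs.

```python
class RecordingStore:
    """First pass: serve reads from the trie and record every key touched."""

    def __init__(self, trie):
        self.trie = trie
        self.touched = set()

    def get(self, key):
        self.touched.add(key)
        return self.trie.get(key)


class PartialFlatStorage:
    """Second pass: flat storage holding only the keys recorded earlier."""

    def __init__(self, trie, keys):
        # Absent keys are stored as None so the replay sees the same result.
        self.values = {key: trie.get(key) for key in keys}

    def get(self, key):
        if key not in self.values:
            # A read outside the recorded set means the dry run missed it.
            raise KeyError(f"key {key!r} was not recorded in the dry run")
        return self.values[key]


def replay(transactions, trie):
    # Pass 1: dry run against the trie, only to learn which keys are read.
    recorder = RecordingStore(trie)
    for tx in transactions:
        tx(recorder)
    # Pass 2: build the partial FS and execute the transactions for real.
    partial = PartialFlatStorage(trie, recorder.touched)
    return [tx(partial) for tx in transactions]
```

A real version would also have to handle writes, since a later read in the same block can depend on an earlier write, and the sketch leaves that out.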

@Longarithm
Member Author

Longarithm commented Apr 12, 2024

The configuration removed in #10490 could be helpful for replaying chunks without having flat storage at all.

github-merge-queue bot pushed a commit that referenced this issue Apr 18, 2024
It was broken after #10961 because format of command changed. Fixing the
format.
Note that it is not a complete solution because of #8741, but the issue
may occur only if charged costs are complicated, which is unlikely the
case for the test.

Nayduck https://nayduck.nearone.org/#/run/50

Co-authored-by: Longarithm <the.aleksandr.logunov@gmail.com>
github-merge-queue bot pushed a commit that referenced this issue Jul 12, 2024
This PR fixes commands such as `neard view-state apply` and `neard
view-state apply-range` when used on a block height that doesn't have
flat storage built for it.

It works by re-enabling the feature of accessing trie nodes without
paying gas costs (removed in #10490).

Probably it also fixes #8741.
2 participants