Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flat Storage #399

Merged
merged 29 commits into from
Mar 21, 2023
Merged

Flat Storage #399

merged 29 commits into from
Mar 21, 2023

Conversation

mzhangmzz
Copy link
Contributor

@mzhangmzz mzhangmzz commented Sep 29, 2022

Proposal of Flat Storage to improve state storage reads, which aims to make state storage more stable and make fees more predictable, and sketch the plan for future improvements.

@render
Copy link

render bot commented Sep 29, 2022

@frol frol added WG-protocol Protocol Standards Work Group should be accountable S-draft/needs-author-revision A NEP in the DRAFT stage that needs an author revision. labels Sep 30, 2022
Copy link
Contributor

@akhi3030 akhi3030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial feedback. In general, looks very good to me.

neps/nep-0399.md Outdated Show resolved Hide resolved
neps/nep-0399.md Outdated Show resolved Hide resolved
neps/nep-0399.md Outdated Show resolved Hide resolved
neps/nep-0399.md Outdated Show resolved Hide resolved
neps/nep-0399.md Outdated Show resolved Hide resolved
neps/nep-0399.md Show resolved Hide resolved
neps/nep-0399.md Outdated Show resolved Hide resolved
neps/nep-0399.md Outdated Show resolved Hide resolved
neps/nep-0399.md Outdated
Since one value ref takes 40 bytes, limit of total size of changed value refs in a block
is then 3.2 MiB.

To sum it up, we will have < 54 MiB for one block, and ~1.1 GiB for 20 blocks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • How do we ensure that FSDs are kept in memory? Will we use a memory only data struct for them like hash maps?
  • Do we know how many memory accesses we need to check if a worst case FSD of 54 MiB has a given key or not?

Copy link
Contributor Author

@mzhangmzz mzhangmzz Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Yes. They will be stored in FlatStateDelta. They are also persisted on disk, but we don't read from disk during block processing. We only read from disk for this column when node first starts and FlatStorage loads deltas for all blocks after the flat head into memory.
  • Most of the times 2 because in most case there is no fork, and the chain head should be at last final block + 2. Worst case, it can be up to he number of blocks between the last final block to this block. If we implement the idea of setting gas limit to zero for blocks that are further than X blocks from the final block, then the number of in memory hashmap lookup will be X.

@mzhangmzz mzhangmzz marked this pull request as ready for review October 5, 2022 02:54
@mzhangmzz mzhangmzz requested a review from a team as a code owner October 5, 2022 02:54
@frol frol added S-draft/needs-moderator-review A NEP in the DRAFT stage that needs a moderator review. and removed S-draft/needs-author-revision A NEP in the DRAFT stage that needs an author revision. labels Oct 5, 2022
@frol
Copy link
Collaborator

frol commented Oct 5, 2022

  • The NEP is well-formatted
  • Reference implementation (@near/wg-protocol should we hold this NEP in S-draft/needs-implementation or are you good to transition it to S-review/needs-wg-review?)

@frol frol added S-draft/needs-implementation A NEP in the DRAFT stage that needs implementation. and removed S-draft/needs-moderator-review A NEP in the DRAFT stage that needs a moderator review. labels Oct 5, 2022
neps/nep-0399.md Outdated Show resolved Hide resolved
neps/nep-0399.md Outdated

However, the consensus algorithm doesn’t provide any guarantees in the distance of blocks
that we need to process since it could be arbitrarily long for a block to be finalized.
To solve this problem, we make another proposal (TODO: attach link for the proposal) to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This TODO needs to be resolved (or removed) before accepting.

Copy link
Contributor Author

@mzhangmzz mzhangmzz Oct 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to write separate NEP for this. WIP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the additional NEP will cover this topic in detail. I am curious about the case where h or one its children has many outgoing branches vs. h has a single deep branch. Will the proposal below work just fine in both cases and ensure that we do not store too many FSD in memory?

mfornet
mfornet previously approved these changes Oct 6, 2022
neps/nep-0399.md Outdated
FlatStorage only needs to store at most X FSDs in memory.

### FSD size estimation
To set the value of X, we need to see how many block deltas can fit in memory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance on abnormal conditions of the network (maybe during an attack), that this value of X is not large enough, and the client will return a different value after reading than the one expected?

As I understand, we are fetching this value from a non-finalized block, so there is a chance for this to happen, even if highly improbable. How would the system react in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! This is actually the part that I left out about the proposal to set gas limit to zero for blocks with height larger than the latest final block’s height + X. If we implement that proposal, blocks that are further than X blocks away from the last final block can't include any new chunks, in other words, they are just empty blocks. This is a way to allow the blockchain to recover in case of some abnormal conditions. With that change, FlatStorage only need to store deltas for blocks from the last final block to last final block + X, thus at most X blocks. Any other block further than last final block + X will just have an empty delta.

@mfornet mfornet self-requested a review October 6, 2022 15:10
Copy link
Member

@mfornet mfornet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't mean to approve, it was a misclick.

@mfornet mfornet dismissed their stale review October 6, 2022 15:12

Misclick

@mfornet mfornet self-requested a review October 6, 2022 15:13
Co-authored-by: Marcelo Fornet <mfornet94@gmail.com>
@ori-near ori-near added the A-NEP A NEAR Enhancement Proposal (NEP). label Oct 13, 2022
@jakmeier jakmeier mentioned this pull request Nov 9, 2022
26 tasks
@ori-near ori-near added S-review/needs-wg-to-assign-sme A NEP that needs working group to assign two SMEs. and removed S-draft/needs-implementation A NEP in the DRAFT stage that needs implementation. labels Nov 18, 2022
@ori-near
Copy link
Contributor

As the moderator, I would like to thank @mzhangmzz for submitting this NEP, and for the @near/wg-protocol working group members for your review. Based on the voting indications above, it seems like this proposal is close to a decision. We are therefore scheduling the fourth Protocol Work Group call, where this NEP can enter the final voting stage. Anyone can discuss the technical details by adding your comments to this discussion. And join the call to learn about the final decision and how to get more involved.

Meeting Info
📅 Date: Thursday, March 16, 4 PM UTC
✏️ Register

@ori-near
Copy link
Contributor

Thank you to everyone who attended the fourth Protocol Working Group meeting last Thursday! The working group members reviewed the NEP and reached the following consensus:

Status: Approved

Meeting Recording:
https://youtu.be/4VxRoKwLXIs

@mzhangmzz and @Longarithm Thank you for authoring this NEP!

Next Steps:

  • NEP moderator to embed the benefits & concerns into the NEP document
  • @near/nep-moderators To approve and merge the NEP

@frol
Copy link
Collaborator

frol commented Mar 20, 2023

@mzhangmzz Are we ready to merge this NEP? I just want to double-check that there are no outstanding last-minute fixes.

@frol frol added S-approved A NEP that was approved by a working group. and removed S-voting/final-comment-period A NEP in the final REVIEW stage. Last chance to bring up concerns before the voting call. labels Mar 20, 2023
Looogarithm and others added 6 commits March 21, 2023 02:59
Co-authored-by: Marcelo Fornet <mfornet94@gmail.com>
Co-authored-by: Marcelo Fornet <mfornet94@gmail.com>
Co-authored-by: Marcelo Fornet <mfornet94@gmail.com>
Co-authored-by: Marcelo Fornet <mfornet94@gmail.com>
@Longarithm
Copy link
Member

We are ready to merge. I applied Bowen's feedback about mentioning protocol change explicitly, and I agree that it is worth a separate section. Also shortened "storage costs" sections because now we have better-written proposals.

@frol frol merged commit 3046258 into master Mar 21, 2023
@frol frol deleted the flat_storage branch March 21, 2023 07:57
@robert-zaremba
Copy link
Contributor

In Cosmos I made a research for a similar problem. I proposed a two DB model:

  • state store: direct kv store for fast reads
  • state commitment: a Sparse Mekle Tree, with potential future updates.
    Early benchmarks showed ~ 50-70% performance improvements in mixed load. There was one try to implement it, but essentially it was not managed correctly, and the integration failed. Currently there is new working group to revisit the design and implementation.

Note, that the state design in Cosmos SDK is a bit different: currently each module can have it's own store (which leads to potential problems related to failures caused by not atomic writes).

The ADR: https://github.com/cosmos/cosmos-sdk/blob/main/docs/architecture/adr-040-storage-and-smt-state-commitments.md
The Research: https://paper.dropbox.com/doc/State-commitments-and-storage-review--B0_4gHURRnDy1Z1azKY4WfbXAg-wKl2RINZWD9I0DUmZIFwQ

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-NEP A NEAR Enhancement Proposal (NEP). S-approved A NEP that was approved by a working group. WG-protocol Protocol Standards Work Group should be accountable
Projects
Status: APPROVED NEPs
Development

Successfully merging this pull request may close these issues.