-
Notifications
You must be signed in to change notification settings - Fork 632
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flat storage MVP #7327
Comments
First step towards flat storage: #7327 Continue work here because #7295 was broken. Here we start building it under protocol feature without adding it to nightly. The goal is to support it for localnet with one node which never sends skip approvals, so later we could extend it to more nodes and eventually implement a migration to mainnet. Now, everyone should be able to build neard with this feature, set `max_block_production_delay` to something big and run localnet. I currently couldn't make all tests work with enabled feature, e.g. `test_care_about_shard`, but I open it for review because I don't think it affects the idea. ## Testing * Check that encoding and decoding value reference gives expected results. `ValueRef` can be reused in other places in code * Check that flat state is used in regular trie handler and not used in view trie handler * Check that after block processing, getting value from regular trie and view trie gives the same result
First step towards flat storage: #7327 Continue work here because #7295 was broken. Here we start building it under protocol feature without adding it to nightly. The goal is to support it for localnet with one node which never sends skip approvals, so later we could extend it to more nodes and eventually implement a migration to mainnet. Now, everyone should be able to build neard with this feature, set `max_block_production_delay` to something big and run localnet. I currently couldn't make all tests work with enabled feature, e.g. `test_care_about_shard`, but I open it for review because I don't think it affects the idea. ## Testing * Check that encoding and decoding value reference gives expected results. `ValueRef` can be reused in other places in code * Check that flat state is used in regular trie handler and not used in view trie handler * Check that after block processing, getting value from regular trie and view trie gives the same result
As per the TODO, take `&[u8; 36]` in ValueRef::decode and stop using Cursor. This makes the method considerably simpler without making the call site more complex. While dealing with ValueRef, stop using Cursor in account_id_to_shard_id as well. Issue: near#7327
As per the TODO, take `&[u8; 36]` in ValueRef::decode and stop using Cursor. This makes the method considerably simpler without making the call site more complex. While dealing with ValueRef, stop using Cursor in account_id_to_shard_id as well. Issue: #7327
@Longarithm is the progress in this tracking issue up to date? |
Yes. The PR for catchups needs some polishing to be merged #8193, tests for migration and proper flat state removal are TODO, other items weren't started yet. |
As per the TODO, take `&[u8; 36]` in ValueRef::decode and stop using Cursor. This makes the method considerably simpler without making the call site more complex. While dealing with ValueRef, stop using Cursor in account_id_to_shard_id as well. Issue: near#7327
Progress update (created together with @Longarithm):
Total progress for Q1: 45% |
Yes, closing |
We completed the work for flat storage reads. There are some remaining tasks to release it in 1.34, which I listed in #8827. We plan to spend some time on it in Q2, as 1.34 is expected to happen in Q2. |
The code is literally removing `protocol_feature_flat_state` and moving feature to stable protocol. We also disable `test_state_sync` as this is part of refactor we can do in Q2. ## Feature to stabilize Here we stabilize Flat Storage for reads, which means that all state reads in the client during block processing will query flat storage instead of Trie. Flat Storage is another index for blockchain state, reducing number of DB accesses for state read from `2 * key.len()` in the worst case to 2. This will trigger background creation of flat storage, using 8 threads and finishing in 15h for RPC node and 2d for archival node. After that all non-contract reads will go through flat storage, for which special "chunk views" will be created. When protocol upgrade happens, contracts reads will go through flat storage as well. Also compute costs will change as Option 3 suggests [here](#8006 (comment)). It is to be merged separately, but we need to ensure that both costs change and flat storage go into next release together. ## Context Find more details in: - Overview: https://near.github.io/nearcore/architecture/storage/flat_storage.html - Approved NEP: https://github.com/near/NEPs/blob/master/neps/nep-0339.md - Tracking issue: #7327 ## Testing and QA * Flat storage structs are covered by unit tests; * Integration tests check that chain behaviour is preserved and costs are changed as expected; * Flat storage spent ~2 months in betanet with assertion that flat and trie `ValueRef`s are the same; * We were running testnet/mainnet nodes for ~2 months with the same assertion. We checked that performance is not degraded, see e.g. https://nearinc.grafana.net/d/Vg9SREA4k/flat-storage-test?orgId=1&var-chain_id=mainnet&var-node_id=logunov-mainnet-fs-1&from=1677804289279&to=1678088806154 checking that even with finality lag of 50 blocks performance is not impacted. Small exception is that we updated data layout several times during development, but we checked that results are unchanged. ## Checklist - [x] Include compute costs after they are merged - #8924 - [x] https://nayduck.near.org/#/run/2916 - [x] Update CHANGELOG.md to include this protocol feature in the `Unreleased` section.
The code is literally removing `protocol_feature_flat_state` and moving feature to stable protocol. We also disable `test_state_sync` as this is part of refactor we can do in Q2. ## Feature to stabilize Here we stabilize Flat Storage for reads, which means that all state reads in the client during block processing will query flat storage instead of Trie. Flat Storage is another index for blockchain state, reducing number of DB accesses for state read from `2 * key.len()` in the worst case to 2. This will trigger background creation of flat storage, using 8 threads and finishing in 15h for RPC node and 2d for archival node. After that all non-contract reads will go through flat storage, for which special "chunk views" will be created. When protocol upgrade happens, contracts reads will go through flat storage as well. Also compute costs will change as Option 3 suggests [here](#8006 (comment)). It is to be merged separately, but we need to ensure that both costs change and flat storage go into next release together. ## Context Find more details in: - Overview: https://near.github.io/nearcore/architecture/storage/flat_storage.html - Approved NEP: https://github.com/near/NEPs/blob/master/neps/nep-0339.md - Tracking issue: #7327 ## Testing and QA * Flat storage structs are covered by unit tests; * Integration tests check that chain behaviour is preserved and costs are changed as expected; * Flat storage spent ~2 months in betanet with assertion that flat and trie `ValueRef`s are the same; * We were running testnet/mainnet nodes for ~2 months with the same assertion. We checked that performance is not degraded, see e.g. https://nearinc.grafana.net/d/Vg9SREA4k/flat-storage-test?orgId=1&var-chain_id=mainnet&var-node_id=logunov-mainnet-fs-1&from=1677804289279&to=1678088806154 checking that even with finality lag of 50 blocks performance is not impacted. Small exception is that we updated data layout several times during development, but we checked that results are unchanged. ## Checklist - [x] Include compute costs after they are merged - #8924 - [x] https://nayduck.near.org/#/run/2916 - [x] Update CHANGELOG.md to include this protocol feature in the `Unreleased` section.
The code is literally removing `protocol_feature_flat_state` and moving feature to stable protocol. We also disable `test_state_sync` as this is part of refactor we can do in Q2. ## Feature to stabilize Here we stabilize Flat Storage for reads, which means that all state reads in the client during block processing will query flat storage instead of Trie. Flat Storage is another index for blockchain state, reducing number of DB accesses for state read from `2 * key.len()` in the worst case to 2. This will trigger background creation of flat storage, using 8 threads and finishing in 15h for RPC node and 2d for archival node. After that all non-contract reads will go through flat storage, for which special "chunk views" will be created. When protocol upgrade happens, contracts reads will go through flat storage as well. Also compute costs will change as Option 3 suggests [here](#8006 (comment)). It is to be merged separately, but we need to ensure that both costs change and flat storage go into next release together. ## Context Find more details in: - Overview: https://near.github.io/nearcore/architecture/storage/flat_storage.html - Approved NEP: https://github.com/near/NEPs/blob/master/neps/nep-0339.md - Tracking issue: #7327 ## Testing and QA * Flat storage structs are covered by unit tests; * Integration tests check that chain behaviour is preserved and costs are changed as expected; * Flat storage spent ~2 months in betanet with assertion that flat and trie `ValueRef`s are the same; * We were running testnet/mainnet nodes for ~2 months with the same assertion. We checked that performance is not degraded, see e.g. https://nearinc.grafana.net/d/Vg9SREA4k/flat-storage-test?orgId=1&var-chain_id=mainnet&var-node_id=logunov-mainnet-fs-1&from=1677804289279&to=1678088806154 checking that even with finality lag of 50 blocks performance is not impacted. Small exception is that we updated data layout several times during development, but we checked that results are unchanged. ## Checklist - [x] Include compute costs after they are merged - #8924 - [x] https://nayduck.near.org/#/run/2916 - [x] Update CHANGELOG.md to include this protocol feature in the `Unreleased` section.
The code is literally removing `protocol_feature_flat_state` and moving feature to stable protocol. We also disable `test_state_sync` as this is part of refactor we can do in Q2. ## Feature to stabilize Here we stabilize Flat Storage for reads, which means that all state reads in the client during block processing will query flat storage instead of Trie. Flat Storage is another index for blockchain state, reducing number of DB accesses for state read from `2 * key.len()` in the worst case to 2. This will trigger background creation of flat storage, using 8 threads and finishing in 15h for RPC node and 2d for archival node. After that all non-contract reads will go through flat storage, for which special "chunk views" will be created. When protocol upgrade happens, contracts reads will go through flat storage as well. Also compute costs will change as Option 3 suggests [here](#8006 (comment)). It is to be merged separately, but we need to ensure that both costs change and flat storage go into next release together. ## Context Find more details in: - Overview: https://near.github.io/nearcore/architecture/storage/flat_storage.html - Approved NEP: https://github.com/near/NEPs/blob/master/neps/nep-0339.md - Tracking issue: #7327 ## Testing and QA * Flat storage structs are covered by unit tests; * Integration tests check that chain behaviour is preserved and costs are changed as expected; * Flat storage spent ~2 months in betanet with assertion that flat and trie `ValueRef`s are the same; * We were running testnet/mainnet nodes for ~2 months with the same assertion. We checked that performance is not degraded, see e.g. https://nearinc.grafana.net/d/Vg9SREA4k/flat-storage-test?orgId=1&var-chain_id=mainnet&var-node_id=logunov-mainnet-fs-1&from=1677804289279&to=1678088806154 checking that even with finality lag of 50 blocks performance is not impacted. Small exception is that we updated data layout several times during development, but we checked that results are unchanged. ## Checklist - [x] Include compute costs after they are merged - #8924 - [x] https://nayduck.near.org/#/run/2916 - [x] Update CHANGELOG.md to include this protocol feature in the `Unreleased` section.
Owner: Alex L
Requires completion of READ only flat storage. Tracking issue for flat storage #7327
LoE for Q1 includes:
Picture: https://miro.com/app/board/uXjVP3ziycI=/
Implementation roadmap for Flat Storage (FS)
This section is kept up-to-date to provide a high-level progress overview for anyone not directly involved in the implementation.
We used it when we didn't have github tracking issues for subtask. Detailed subtasks are also tracked on Jira.
Basic flat storage (DONE)
Data migration (DONE)
test and handle corner cases in migration, see Improve flat storage tests coverage #8500Fully operational read-only FS (DONE)
Add docs and update existing documentation to include flat storage (DONE)
There is also a blocker for shipping: #8032. Without it, undercharging coming from non-charged TTNs can't be controlled, but TTN parameter = 10 would help.
Links
Existing documentation:
https://nearinc.atlassian.net/wiki/spaces/EAP/pages/58556568/Flat+Storage
https://docs.google.com/document/d/1Tla2JtHHc6rJwZPUOF0skoaLiExrcV1PW2F65llfHG8/edit#heading=h.2r15ioo581ig
Related works and discussions:
Main flat storage discussion channel
https://nearinc.atlassian.net/browse/CRT-8 and subsequently linked tasks
https://nearinc.atlassian.net/browse/CRT-86
Picture of flat storage structure
https://near.zulipchat.com/#narrow/stream/329092-pagoda.2Fstorage.2Fprivate/topic/Action.20receipt.20undercharging
The text was updated successfully, but these errors were encountered: