
feat: support block catchups for flat storage #8193

Merged: 37 commits into master on Dec 21, 2022

Conversation

@Longarithm Longarithm (Member) commented Dec 8, 2022

This PR enables flat storage on nightly, which adds it to the CI. We also add block catchup support here, because it is implicitly required by our nayduck tests.

Idea

You can read about catchups here: https://github.com/near/nearcore/blob/master/docs/architecture/how/sync.md

  • First, we remove obsolete flat storage data. This is critical, because any leftover key in flat storage may lead to incorrect transaction processing, which could lead to a kickout or slashing in the future. For now we just remove keys naively to pass nayduck tests, but when we enable catchups on mainnet/testnet, we have to optimize this part.
  • While applying state parts and setting trie key-value items, we set the corresponding flat storage key-value items as well.
  • When we finalize the prepared shard state, we set the flat storage head and create FlatStorageState.
  • During block catchups we check whether a tracked shard is new, and if so, we update the flat storage head for it (a simplified sketch of this flow follows the list).
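
For illustration, here is a minimal, self-contained sketch of the order of operations above. All type and method names below are hypothetical stand-ins, not the actual nearcore API:

```rust
// Hypothetical, simplified model of the state-sync / catchup flow for flat storage.
struct FlatStorage {
    head: Option<[u8; 32]>, // block hash the flat view corresponds to
    kv: std::collections::BTreeMap<Vec<u8>, Vec<u8>>,
}

impl FlatStorage {
    // Step 1: drop all leftover keys so stale values can never be served.
    fn remove_all(&mut self) {
        self.head = None;
        self.kv.clear();
    }

    // Step 2: mirror every trie key-value written while applying a state part.
    fn set_from_state_part(&mut self, entries: &[(Vec<u8>, Vec<u8>)]) {
        for (k, v) in entries {
            self.kv.insert(k.clone(), v.clone());
        }
    }

    // Step 3: once the shard state is finalized, pin the flat head to the sync block.
    fn finalize(&mut self, sync_block_hash: [u8; 32]) {
        self.head = Some(sync_block_hash);
    }

    // Step 4: during block catchup, move the head forward for a newly tracked shard.
    fn update_head(&mut self, new_block_hash: [u8; 32]) {
        self.head = Some(new_block_hash);
    }
}
```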

cc @mzhangmzz @pugachAG

Testing

  • A couple of existing catchup tests, e.g. test_catchup_gas_price_change.
  • Manual check of affected nayduck tests. python3 pytest/tests/sanity/block_production.py passes with this change and doesn't pass without it.
  • Note that some nayduck tests still fail after that: https://nayduck.near.org/#/run/2785. I would merge the PR as is and introduce fixes later, because those failures seem to be orthogonal to catchups, but catchup support is necessary to have a chance of passing these tests.

@Longarithm Longarithm marked this pull request as ready for review December 8, 2022 23:13
@Longarithm Longarithm requested a review from a team as a code owner December 8, 2022 23:13
@Longarithm Longarithm self-assigned this Dec 9, 2022
@@ -3119,6 +3149,10 @@ impl Chain {
        num_parts: u64,
        state_parts_task_scheduler: &dyn Fn(ApplyStatePartsRequest),
    ) -> Result<(), Error> {
        // Before working with state parts, remove existing flat storage data.
        let epoch_id = self.get_block_header(&sync_hash)?.epoch_id().clone();
        self.runtime_adapter.remove_flat_storage_state_for_shard(shard_id, &epoch_id)?;
Contributor:

Could you create an issue mentioning that we need to support removing flat storage keys when we enable state sync, and also add the issue link in a comment?

Contributor:

I'd also suggest moving remove_flat_storage_state_for_shard to apply_state_part because it is time-consuming. schedule_apply_state_parts is supposed to be lightweight because it happens in ClientActor. The time-consuming work should happen in apply_state_parts, which happens in a different thread.
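
Roughly, the intended split could look like this (illustrative names and types only, not the nearcore code):

```rust
use std::thread;

// Runs inside the client actor: must stay lightweight, so it only spawns work.
fn schedule_apply_state_parts(shard_id: u64, num_parts: u64) -> thread::JoinHandle<()> {
    thread::spawn(move || apply_state_parts(shard_id, num_parts))
}

// Runs on a separate thread: heavy work is fine here.
fn apply_state_parts(shard_id: u64, num_parts: u64) {
    remove_flat_storage_data(shard_id); // time-consuming deletion moved out of the actor
    for part_id in 0..num_parts {
        apply_state_part(shard_id, part_id);
    }
}

fn remove_flat_storage_data(_shard_id: u64) { /* delete all flat storage keys for the shard */ }
fn apply_state_part(_shard_id: u64, _part_id: u64) { /* write trie and flat storage entries */ }
```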

Contributor:

Or, if you think this logic is only used for now and will be completely removed later when we enable flat storage for state sync/catchup for real, not just for nayduck tests, please also add a comment there mentioning that this logic will be removed later.

@mzhangmzz mzhangmzz (Contributor) commented Dec 12, 2022

If I understand it correctly, here we reuse the migration logic for creating flat state after state parts are applied, so the state sync + block catchup process will consist of the following steps: 1) state sync downloads state parts and stores them on disk; 2) set_state_finalize calls update_status, which creates a new process that iterates the State column to build the FlatState column asynchronously; 3) blocks are caught up, but at this point the FlatState column may not be ready yet. The deltas for these blocks will be applied in a different process as well, just like how we migrate flat storage in parallel to regular block processing.

This creates several problems. 1) FlatStorage is not ready to use when state sync is finished. It is not a problem now, while FlatStorage is optional. However, when we fully enable Flat Storage and state sync, a node must be ready to process new blocks after state sync is done, and that means FlatStorage should be ready. 2) Similarly, when catching up blocks, FlatStorage may not be ready. We also cannot have that once FlatStorage is fully enabled. 3) It is inefficient to apply state parts and create the FlatState column in two different processes, and that can cause the whole block catchup process to take more than one epoch. Currently state sync already takes close to or even more than one epoch to finish, and we can't have that. Adding 5 more hours to migrate flat storage will make it worse. Alternatively, we should create the FlatState column right after we download the state parts and apply them to the State column. This will add minimal delay, because instead of iterating the State column on disk, we are just reading the state parts from memory.
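
As a rough illustration of this alternative (hypothetical types, not the real state sync code), feeding both columns from the in-memory part could look like:

```rust
use std::collections::BTreeMap;

struct StatePart {
    entries: Vec<(Vec<u8>, Vec<u8>)>, // decoded key-value pairs of this part
}

fn apply_part(
    part: &StatePart,
    state_column: &mut BTreeMap<Vec<u8>, Vec<u8>>,
    flat_state_column: &mut BTreeMap<Vec<u8>, Vec<u8>>,
) {
    for (key, value) in &part.entries {
        // One pass over the in-memory part feeds both columns, so flat storage
        // is ready as soon as the last part is applied.
        state_column.insert(key.clone(), value.clone());
        flat_state_column.insert(key.clone(), value.clone());
    }
}
```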

It seems that a lot of the code here will need to be replaced when we actually want to support state sync. Now that we moved the FlatStorage timeline to Q1, and I think we will also want to work on state sync in Q1, I would advocate that even if we don't want to support state sync fully now, we should at least implement it in a way that is more compatible with the eventual implementation. Given that this PR is a non-trivial change by itself, it doesn't seem worth adding the code here just for the sake of passing tests and removing it later in a few months.

@mm-near mm-near (Contributor) commented Dec 12, 2022

@mzhangmzz - you said that "1) FlatStorage is not ready to use when state sync is finished." -- but I thought it is ready when state sync has finished. Or am I missing something?

@mzhangmzz mzhangmzz (Contributor) left a comment

@Longarithm @mm-near yes sorry I misunderstood the code. I got confused by the call of try_create_flat_storage_state_for_shard and I missed the change in apply_state_parts. The code is correct, so please ignore my previous comment. I added a few suggestions on how we can make the logic easier to understand.

nearcore/src/runtime/mod.rs (resolved)
            store_update.commit()?;
        }

        match self.runtime_adapter.try_create_flat_storage_state_for_shard(
Contributor:

why do we need to call try_create_flat_storage_state_for_shard here? Isn't it true that FlatStorageState is for sure not ready here? Is it just for checking that we are not in the process of flat storage migration? If so, can we add a comment here?

Contributor:

To make it less confusing, maybe it is better to split try_create_flat_storage_state_for_shard into two methods check_flat_storage_state_creation_status and create_flat_storage_state_for_shard. Otherwise, it is hard to tell if we are just checking the status here or actually trying to create the flat storage state.
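
A rough sketch of what that split could look like (hypothetical signatures, not the existing nearcore API):

```rust
enum FlatStorageCreationStatus {
    Ready,      // flat storage exists and can be used
    InProgress, // background migration is still populating FlatState
    DontCreate, // flat storage is disabled for this shard
}

trait FlatStorageManager {
    // Pure read: report where creation stands, with no side effects.
    fn check_flat_storage_state_creation_status(&self, shard_id: u64) -> FlatStorageCreationStatus;

    // Explicit write: actually construct the FlatStorageState for the shard.
    fn create_flat_storage_state_for_shard(&self, shard_id: u64);
}
```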

Contributor:

What will happen in the following scenario: we release the code to mainnet, a node starts the flat storage migration process, but the node is also behind by more than an epoch, so it triggers the state sync process as well?

@Longarithm Longarithm (Member, Author) Dec 19, 2022

I think there is some solution, like "interrupt the existing creation process and create everything using state sync". I would even say that the DB is corrupted in such a case and the node owner needs to download a dump. However, the DB may not look corrupted to the node owner, so they may not even notice the problem...
So it looks like a separate non-trivial question outside the scope of this PR.

Contributor:

I think it is ok if state sync and flat storage migration are not compatible, because the migration process only happens once and nodes can sync without state sync (for example, by downloading a snapshot). But let's implement something in the code that explicitly makes the node panic if it tries to state sync while the flat storage migration is in progress, and tells the owner to download from the snapshot (which should already have the flat storage ready).

        // If flat storage doesn't exist, update its creation status.
        match &mut self.flat_storage_creator {
            Some(flat_storage_creator) => {
                flat_storage_creator.update_status(shard_id, &self.store)?;
@mzhangmzz mzhangmzz (Contributor) Dec 13, 2022

What if we move the call to flat_storage_creator.update_status outside of block processing? Technically, we don't need to call this function every time we process a block. We only need to call it periodically to update the progress of flat state migration. Also, if the node is not processing any new blocks (e.g. it is in state sync), flat storage creation will not make progress either. I'd suggest creating a new event in ClientActor that is triggered periodically (say every 100ms) and calls update_status (see `fn check_triggers(&mut self, ctx: &mut Context<ClientActor>) -> Duration`).
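
A minimal sketch of what such a periodic trigger could look like (the 100ms interval comes from this suggestion; everything else is a hypothetical stand-in for the ClientActor code):

```rust
use std::time::{Duration, Instant};

struct FlatStorageCreator;

impl FlatStorageCreator {
    fn update_status(&mut self, shard_id: u64) {
        // Poll the background migration and advance its state machine if needed.
        let _ = shard_id;
    }
}

struct Client {
    flat_storage_creator: Option<FlatStorageCreator>,
    last_flat_storage_poll: Instant,
}

impl Client {
    // Called from the actor's periodic check_triggers loop rather than from block post-processing.
    fn check_triggers(&mut self, tracked_shards: &[u64]) {
        if self.last_flat_storage_poll.elapsed() >= Duration::from_millis(100) {
            if let Some(creator) = &mut self.flat_storage_creator {
                for &shard_id in tracked_shards {
                    creator.update_status(shard_id);
                }
            }
            self.last_flat_storage_poll = Instant::now();
        }
    }
}
```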

@Longarithm Longarithm (Member, Author) Dec 19, 2022

It makes sense, but let's do it in a separate PR.
I think the current impl assumes that all flat state changes are saved to disk before FS creation. In other words, if we call try_create_flat_storage_state_for_shard inside update_status, then the following events may happen:

  1. FlatStorageState::new is called
  2. save_flat_state_changes is called - but deltas are not added to FSS
  3. FSS is added to flat state factory

This is fixable but I'm not sure if it is the only issue.

Contributor:

Ah, are you worried about potential race conditions between the triggers of update_status and save_flat_state_changes? I think the scenario you described won't happen because we are not moving update_status to a different thread; it is still inside ClientActor. It's just triggered differently: instead of being triggered at block processing time, it is triggered periodically every 100ms.

Member Author:

Oh, for some reason I thought that we could save flat state changes in a thread spawned for applying a chunk, but it always happens in the main thread. And I didn't realize that catchup is also there; only the heavy work is moved outside.
So I'll just try to move update_status as you suggested.

Member Author:

Created an issue for that: #8250

chain/chain/src/chain.rs (outdated, resolved)
@Longarithm Longarithm mentioned this pull request Dec 16, 2022
            shard_id,
            true,
        ) {
            self.update_flat_storage_for_block(&block, shard_id)?;
Contributor:

Can we add a boolean argument require_flat_storage_ready to update_flat_storage_for_block so that when it is true, update_flat_storage_for_block requires the flat storage to be ready instead of allowing it to be created? We can set it to true here to make sure that when catchup happens, flat storage is not in migration.
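
Something along these lines (the signature is hypothetical and only illustrates the intended behaviour of the flag):

```rust
#[derive(Debug)]
enum FlatStorageError {
    StillBeingCreated { shard_id: u64 },
}

fn update_flat_storage_for_block(
    shard_id: u64,
    flat_storage_ready: bool,
    require_flat_storage_ready: bool,
) -> Result<(), FlatStorageError> {
    if !flat_storage_ready {
        if require_flat_storage_ready {
            // During catchup we insist the migration has already finished.
            return Err(FlatStorageError::StillBeingCreated { shard_id });
        }
        // Otherwise fall through and let the creation path handle it.
        return Ok(());
    }
    // ... advance the flat storage head for this block ...
    Ok(())
}
```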

Member Author:

Mentioned this in issue for migration support: #8250

@mzhangmzz mzhangmzz (Contributor) left a comment

Overall looks good! Please address the state sync compatibility concern by either creating a tracking issue or addressing it in this PR.

@near-bulldozer near-bulldozer bot merged commit ff1ff05 into master Dec 21, 2022
@near-bulldozer near-bulldozer bot deleted the fs-nightly-catchup branch December 21, 2022 12:09
near-bulldozer bot pushed a commit that referenced this pull request Jan 13, 2023
As discussed in #8193 (comment), flat storage creation makes more sense inside `Client` and `check_triggers`. `update_status` is a job that should be triggered periodically, and it doesn't have to be tied to the completion of block processing.

To support that, we introduce the config option `flat_storage_creation_period`, which defines the frequency with which the creation status update is triggered. Node owners could raise it if this work, executed in the main thread, turns out to be time-consuming for some reason.
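
For illustration, such an option might be wired up roughly like this (the field name comes from this description; the struct and default value are assumptions, not the actual nearcore config):

```rust
use std::time::Duration;

struct ClientConfig {
    // How often the client polls flat storage creation status from the main thread.
    flat_storage_creation_period: Duration,
}

impl Default for ClientConfig {
    fn default() -> Self {
        // Node owners can raise this if the status update is too costly.
        Self { flat_storage_creation_period: Duration::from_millis(100) }
    }
}
```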

We also fix `TestEnv::restart` a bit, because now we can call `cares_about_shard` in the newly created client, and it fails, as described here: #8269.

P.S. This makes #8254 unnecessary because `Client` already has information about the validator signer, which is even more convenient.

## Testing

* test `test_flat_storage_creation` needed minor changes and still passes;
* https://nayduck.near.org/#/run/2811: nayduck test `python3 pytest/tests/sanity/repro_2916.py` passes now - without the change, a node crashed on restart trying to create FS for a non-tracked shard.
nikurt pushed a commit to nikurt/nearcore that referenced this pull request Jan 15, 2023