feat: support block catchups for flat storage #8193
Conversation
@@ -3119,6 +3149,10 @@ impl Chain {
    num_parts: u64,
    state_parts_task_scheduler: &dyn Fn(ApplyStatePartsRequest),
) -> Result<(), Error> {
    // Before working with state parts, remove existing flat storage data.
    let epoch_id = self.get_block_header(&sync_hash)?.epoch_id().clone();
    self.runtime_adapter.remove_flat_storage_state_for_shard(shard_id, &epoch_id)?;
Could you create an issue mentioning that we need to support removing flat storage keys when we enable state sync, and also add the issue link in a comment?
I'd also suggest moving `remove_flat_storage_state_for_shard` to `apply_state_part`, because it is time-consuming. `schedule_apply_state_parts` is supposed to be lightweight because it happens in ClientActor; the time-consuming work should happen in `apply_state_parts`, which happens in a different thread.
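Roughly, the split being suggested - as a standalone sketch with stand-in types and dummy bodies, not nearcore's real signatures: scheduling stays cheap in the client thread, and the expensive flat storage cleanup moves into the part-application work on another thread.

```rust
use std::thread;

struct RuntimeAdapter;

impl RuntimeAdapter {
    // Stand-in for the expensive cleanup the review suggests moving.
    fn remove_flat_storage_state_for_shard(&self, shard_id: u64) {
        println!("removing flat storage data for shard {shard_id} (slow)");
    }
}

struct ApplyStatePartsRequest {
    shard_id: u64,
    num_parts: u64,
}

// Runs in ClientActor: should stay cheap, so it only builds the request.
fn schedule_apply_state_parts(shard_id: u64, num_parts: u64) -> ApplyStatePartsRequest {
    ApplyStatePartsRequest { shard_id, num_parts }
}

// Runs in a separate thread: the heavy work, including the flat storage cleanup.
fn apply_state_parts(runtime: &RuntimeAdapter, request: &ApplyStatePartsRequest) {
    runtime.remove_flat_storage_state_for_shard(request.shard_id);
    for part_id in 0..request.num_parts {
        println!("applying state part {part_id}");
    }
}

fn main() {
    let request = schedule_apply_state_parts(0, 4);
    let handle = thread::spawn(move || apply_state_parts(&RuntimeAdapter, &request));
    handle.join().unwrap();
}
```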
Or, if you think this logic is only used for now and will be completely removed later when we enable flat storage for state sync/catchup for real, not just for nayduck tests, please also add a comment there mentioning that this logic will be removed later.

If I understand it correctly, here we reuse the migration logic for creating flat state after state parts are applied, so the state sync + block catchup process will consist of the following steps: 1) state sync downloads state parts and stores them on disk; 2) after the parts are applied, the migration logic creates the FlatState column.

This creates several problems. 1) FlatStorage is not ready to use when state sync is finished. It is not a problem now, while FlatStorage is optional. However, when we fully enable FlatStorage and state sync, a node must be ready to process new blocks after state sync is done, and that means FlatStorage should be ready. 2) Similarly, when catching up blocks, FlatStorage may not be ready. We also cannot have that when FlatStorage is fully enabled. 3) It is inefficient to apply state parts and create the FlatState column in two different processes, and that can cause the whole block catchup process to take more than one epoch. Currently state sync already takes close to or even more than one epoch to finish, and we can't have that; adding 5 more hours to migrate flat storage will make it worse.

Alternatively, we should create the FlatState column right after we download the state parts, while applying them into the State column. This adds minimal delay, because instead of iterating the State column on disk, we are just reading from the state parts in memory (see the sketch below).

It seems that a lot of the code here will need to be replaced when we actually want to support state sync. Now that we moved the FlatStorage timeline to Q1, and I think we will also want to work on state sync in Q1, I would advocate that even if we don't want to support state sync fully now, we should at least implement it in a way that is more compatible with the eventual implementation. Given that this PR is a non-trivial change by itself, it doesn't seem worth it to add code here just for the sake of passing tests and remove it in a few months.
@mzhangmzz - you said that "1) FlatStorage is not ready to use when state sync is finished." -- but I thought that it is ready when state sync has finished. Or am I missing something?
@Longarithm @mm-near yes sorry I misunderstood the code. I got confused by the call of `try_create_flat_storage_state_for_shard` and I missed the change in `apply_state_parts`. The code is correct, so please ignore my previous comment. I added a few suggestions on how we can make the logic easier to understand.
store_update.commit()?;
}

match self.runtime_adapter.try_create_flat_storage_state_for_shard(
Why do we need to call `try_create_flat_storage_state_for_shard` here? Isn't it true that FlatStorageState is for sure not ready here? Is it just for checking that we are not in the process of flat storage migration? If so, can we add a comment here?
To make it less confusing, maybe it is better to split `try_create_flat_storage_state_for_shard` into two methods, `check_flat_storage_state_creation_status` and `create_flat_storage_state_for_shard`. Otherwise, it is hard to tell if we are just checking the status here or actually trying to create the flat storage state.
What will happen in the following scenario: we release the code into mainnet, a node starts the flat storage migration process, but the node is also behind by more than an epoch, so it triggers the state sync process as well?
I think there is some solution, like "interrupt the existing creation process and create everything using state sync". I would even say that the DB is corrupted in such a case and the node owner needs to download a dump. However, the DB may not look corrupted to the node owner, so they may not even notice a problem...
So it looks like a separate non-trivial question outside of this PR.
I think it is ok if state sync and the flat storage migration are not compatible, because the migration process only happens once and nodes can sync without state sync (for example, by downloading a snapshot). But let's implement something in the code that explicitly makes the node panic if it tries to state sync while the flat storage migration is in progress, and tells the node owner to download from the snapshot (which should already have the flat storage ready).
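A minimal sketch of such a guard, with illustrative names rather than an actual nearcore function:

```rust
// Hypothetical guard: refuse to start state sync while the flat storage
// migration is still in progress and point the node owner at a snapshot.
fn start_state_sync(flat_storage_migration_in_progress: bool) {
    if flat_storage_migration_in_progress {
        panic!(
            "State sync is not supported while the flat storage migration is running; \
             please restart from a snapshot that already has flat storage created."
        );
    }
    // ... proceed with the normal state sync flow ...
}

fn main() {
    start_state_sync(false); // fine
    // start_state_sync(true); // would abort with the message above
}
```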
// If flat storage doesn't exist, update its creation status.
match &mut self.flat_storage_creator {
    Some(flat_storage_creator) => {
        flat_storage_creator.update_status(shard_id, &self.store)?;
What if we move the call to `flat_storage_creator.update_status` outside of block processing? Technically, we don't need to call this function every time we process a block; we only need to call it periodically to update the progress of the flat state migration. And if the node is not processing any new blocks (if it is in state sync), flat storage creation will not make progress either. I'd suggest creating a new event in ClientActor that is triggered periodically (say, every 100ms) and calls `update_status` (see `fn check_triggers(&mut self, ctx: &mut Context<ClientActor>) -> Duration` in nearcore/chain/client/src/client_actor.rs, line 1112 at 63d77d4).
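A standalone sketch of the periodic-trigger idea (stand-in types, not the real ClientActor): the status update still runs from the single client thread, just on a timer rather than per processed block.

```rust
use std::time::{Duration, Instant};

struct FlatStorageCreator {
    last_update: Instant,
    period: Duration,
}

impl FlatStorageCreator {
    fn update_status(&mut self) {
        println!("advancing flat storage creation status");
    }
}

// Analogue of ClientActor::check_triggers: called periodically from the
// client thread, it runs whatever jobs are due.
fn check_triggers(creator: &mut FlatStorageCreator) {
    if creator.last_update.elapsed() >= creator.period {
        creator.update_status();
        creator.last_update = Instant::now();
    }
}

fn main() {
    let mut creator = FlatStorageCreator {
        last_update: Instant::now() - Duration::from_millis(200),
        period: Duration::from_millis(100),
    };
    // In the real client this would be driven by the actor's timer loop.
    check_triggers(&mut creator);
}
```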
It makes sense, but let's do it in a separate PR.

I think the current impl assumes that all flat state changes are saved on disk before FS creation. In other words, we can call `try_create_flat_storage_state_for_shard` inside `update_status`, and then the following events may happen:

1. `FlatStorageState::new` is called;
2. `save_flat_state_changes` is called - but the deltas are not added to the FSS;
3. the FSS is added to the flat state factory.

This is fixable, but I'm not sure if it is the only issue.
Ah, are you worried about potential race conditions between the triggers of `update_status` and `save_flat_state_changes`? I think the scenario you described won't happen because we are not moving `update_status` to a different thread - it is still inside `ClientActor`. It's just triggered differently: instead of being triggered at block processing time, it is triggered periodically, every 100ms.
Oh, for some reason I thought that we can save flat state changes in a thread spawned for applying a chunk, but it always happens in the main thread. And I didn't realize that catchup is also there - only the heavy work is moved outside.

So I'll just try to move `update_status` as you suggested.
Created an issue for that: #8250
    shard_id,
    true,
) {
    self.update_flat_storage_for_block(&block, shard_id)?;
Can we add a boolean argument `require_flat_storage_ready` to `update_flat_storage_for_block`, so that if it is true, `update_flat_storage_for_block` will require the flat storage to be ready instead of allowing it to still be under creation? We can set it to true here, to make sure that when catchup happens, flat storage is not in migration.
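A hypothetical sketch of how the flag could behave (illustrative signature, not the actual method): when the flag is set, a flat storage that is still being created becomes an error rather than an allowed state.

```rust
fn update_flat_storage_for_block(
    shard_id: u64,
    flat_storage_ready: bool,
    require_flat_storage_ready: bool,
) -> Result<(), String> {
    if !flat_storage_ready && require_flat_storage_ready {
        return Err(format!(
            "flat storage for shard {shard_id} must be ready during catchup, but it is still being created"
        ));
    }
    // Otherwise it is fine for flat storage to still be under creation here.
    Ok(())
}

fn main() {
    assert!(update_flat_storage_for_block(0, true, true).is_ok());
    assert!(update_flat_storage_for_block(0, false, true).is_err());
    assert!(update_flat_storage_for_block(0, false, false).is_ok());
}
```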
Mentioned this in the issue for migration support: #8250
Overall looks good! Please address the concern about compatibility with state sync by either creating a tracking issue or addressing it in this PR.
As discussed in #8193 (comment), flat storage creation makes more sense inside `Client` and `check_triggers`. `update_status` is a job which should be triggered periodically, and it doesn't have to be connected with the finishing of block processing. To support that, we introduce the config option `flat_storage_creation_period`, which defines the frequency with which the creation status update will be triggered. Node owners can change it to higher values if this work, which is executed in the main thread, turns out to be time-consuming for some reason.

Also, we fix `TestEnv::restart` a bit, because now we can call `cares_about_shard` in a newly created client, and it fails, as described here: #8269. P.S. It makes #8254 unnecessary because `Client` already has information about the validator signer, which is even more convenient.

## Testing

* test `test_flat_storage_creation` needed minor changes and still passes;
* https://nayduck.near.org/#/run/2811: nayduck test `python3 pytest/tests/sanity/repro_2916.py` passes now - without the change, a node crashed on restart trying to create FS for a non-tracked shard.
This PR enables flat storage on nightly, which adds it to the CI. We also add block catchup support here, because it is implicitly required by our nayduck tests.

## Idea

You can read about catchups here: https://github.com/near/nearcore/blob/master/docs/architecture/how/sync.md

FlatStorageState

cc @mzhangmzz @pugachAG

## Testing

* `test_catchup_gas_price_change`
* `python3 pytest/tests/sanity/block_production.py` passes with this change and doesn't pass without it.