Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KeyValueRuntime created in TestEnv::restart is incomplete #8269

Open
Longarithm opened this issue Dec 22, 2022 · 0 comments
Open

KeyValueRuntime created in TestEnv::restart is incomplete #8269

Longarithm opened this issue Dec 22, 2022 · 0 comments
Labels
T-core Team: issues relevant to the core team

Comments

@Longarithm
Copy link
Member

In test_request_chunk_restart, when env.restart(0); is called, the Client instance is created again using setup_client function. The goal of test (apparently) is to check that client can respond to partial encoded chunk request even after restart.

But actually KeyValueRuntime created there is incomplete. It stores all data in memory, like hash_to_next_epoch, and fills it when get_epoch_and_valset is called, so there is some implicit assumption that it is called for all previous blocks. But for restart it is not the case, and if we call cares_about_shard, it leads to crash: https://buildkite.com/nearprotocol/nearcore/builds/23485#018535ca-c681-4f03-9a9f-bce3ce014ad5

@Longarithm Longarithm added the T-core Team: issues relevant to the core team label Dec 22, 2022
near-bulldozer bot pushed a commit that referenced this issue Jan 13, 2023
As discussed in #8193 (comment), flat storage creation makes more sense inside `Client` and `check_triggers`. `update_status` is a job which should be triggered periodically, and it doesn't have to be connected with finishing of block processing.

To support that, we introduce config option `flat_storage_creation_period` which defines frequency with which creation status update will be triggered. Node owners could change it to higher values if this work executed in main thread is time consuming for some reasion.

Also we fix `TestEnv::restart` a bit, because now we can call `cares_about_shard` in newly created client, and it fails, as described here: #8269.

P.S. It makes #8254 not necessary because `Client` already has information about validator signer, what is even more convenient.

## Testing

* test `test_flat_storage_creation` needed minor changes and still passes;
* https://nayduck.near.org/#/run/2811: nayduck test `python3 pytest/tests/sanity/repro_2916.py` passes now - without a change, a node crashed on restart trying to create FS for non-tracked shard.
nikurt pushed a commit to nikurt/nearcore that referenced this issue Jan 15, 2023
As discussed in near#8193 (comment), flat storage creation makes more sense inside `Client` and `check_triggers`. `update_status` is a job which should be triggered periodically, and it doesn't have to be connected with finishing of block processing.

To support that, we introduce config option `flat_storage_creation_period` which defines frequency with which creation status update will be triggered. Node owners could change it to higher values if this work executed in main thread is time consuming for some reasion.

Also we fix `TestEnv::restart` a bit, because now we can call `cares_about_shard` in newly created client, and it fails, as described here: near#8269.

P.S. It makes near#8254 not necessary because `Client` already has information about validator signer, what is even more convenient.

## Testing

* test `test_flat_storage_creation` needed minor changes and still passes;
* https://nayduck.near.org/#/run/2811: nayduck test `python3 pytest/tests/sanity/repro_2916.py` passes now - without a change, a node crashed on restart trying to create FS for non-tracked shard.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-core Team: issues relevant to the core team
Projects
None yet
Development

No branches or pull requests

1 participant