-
Notifications
You must be signed in to change notification settings - Fork 622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimise memory usage while loading genesis #4269
Conversation
posvyatokum
commented
May 2, 2021
- Adding option of streaming records to Genesis
- Applying genesis records in chunks (not only in case of streaming)
Probably should adjust the chunk size. With this patch, the node is starting in 15 minutes, without it -- in 8. As for memory, the new code consumes 3 GB (old code -- 14 GB), so there is room to go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some Rust-related comments (although most of them are "oh, that's actually harder than I thought, I see now why you do this the way you did").
One thing I am worrying a lot is that now processing records might fail with an io::Error
and we do not reflect that in the types. We just unwrap inside.I feel that's a significant change of geneissis contract, and that's its better to expose the fact we can fail at the call site. But I see how this will make this already quite a big change larger. I don't necessary want to block on this, but it does seems that silencing IO errors that way increases the technical debt substantionaly.
runtime/runtime/src/lib.rs
Outdated
let mut genesis_state_applier = | ||
GenesisStateApplier::new(tries, shard_id, validators, config); | ||
genesis_state_applier.apply(genesis, shard_account_ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, at least it seems that this dosen't need to be a stateful object, as we apply all the stuff in one go anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I've done what you wanted, it still looks weird. Also this part of code moved to runtime/runtime/src/genesis.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the peak memory usage when neard loads testnet genesis after this change?
genesis.config.total_supply = get_initial_supply(&genesis.records.as_ref()); | ||
genesis | ||
} | ||
|
||
pub fn new_as_is( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the best name :)
5 GB |
} | ||
} | ||
|
||
fn stream_records_from_file( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is logically static method of RecordsProcessor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(not really important): I'd say it's the other way around -- this function is the public API, and RecordsProcessor
is it's private implementation detail. If it were a separate module (it doesn't have to be), it's best written as
// genesis_streaming.rs
pub fn stream_records_from_file(
reader: impl Read,
mut callback: impl FnMut(StateRecord),
) -> serde_json::Result<()> {
let mut deserializer = serde_json::Deserializer::from_reader(reader);
let records_processor = RecordsProcessor { sink: &mut callback };
deserializer.deserialize_any(records_processor)
}
struct RecordsProcessor<F> { ... }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left couple of comments, nothing blocking!
fn stream_records_with_callback( | ||
&self, | ||
callback: impl FnMut(StateRecord), | ||
) -> std::io::Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's idiomatic to use std::io;
and use io::Result
and io::Error
|
||
pub fn for_each_record(&self, mut callback: impl FnMut(&StateRecord)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I am still not feeling completely comfortable with just .expect
ing errors here. This signature looks as if it's a pure in-memory function that can't fail. I suggest adding a comment here explaining that this actually does disk IO and can panic if the genesis file is removed.
On the over hand, we already have validate_genesis
which handles errors via panics, so in that sense this doesn't make the code worse...
} | ||
} | ||
|
||
fn stream_records_from_file( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(not really important): I'd say it's the other way around -- this function is the public API, and RecordsProcessor
is it's private implementation detail. If it were a separate module (it doesn't have to be), it's best written as
// genesis_streaming.rs
pub fn stream_records_from_file(
reader: impl Read,
mut callback: impl FnMut(StateRecord),
) -> serde_json::Result<()> {
let mut deserializer = serde_json::Deserializer::from_reader(reader);
let records_processor = RecordsProcessor { sink: &mut callback };
deserializer.deserialize_any(records_processor)
}
struct RecordsProcessor<F> { ... }
@@ -299,11 +375,16 @@ impl GenesisJsonHasher { | |||
|
|||
impl Genesis { | |||
pub fn new(config: GenesisConfig, records: GenesisRecords) -> Self { | |||
let mut genesis = Self { config, records, phantom: PhantomData }; | |||
let mut genesis = | |||
Self { config, records, records_file: PathBuf::new(), phantom: PhantomData }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder it this should be records_file: None
? That is, use Option
to relfect in the type system that we might not have records_file at all.
core/primitives/src/state_record.rs
Outdated
match &state_record { | ||
StateRecord::Account { account_id, .. } | ||
| StateRecord::AccessKey { account_id, .. } | ||
| StateRecord::Contract { account_id, .. } | ||
| StateRecord::ReceivedData { account_id, .. } | ||
| StateRecord::Data { account_id, .. } => account_id, | ||
StateRecord::PostponedReceipt(receipt) | StateRecord::DelayedReceipt(receipt) => { | ||
&receipt.receiver_id | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice idiom!
core/primitives/src/state_record.rs
Outdated
@@ -147,3 +147,16 @@ fn to_printable(blob: &[u8]) -> String { | |||
} | |||
} | |||
} | |||
|
|||
pub fn state_record_to_account_id(state_record: &StateRecord) -> &AccountId { | |||
match &state_record { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match &state_record { | |
match state_record { |
state_record
is already a ref
pub fn new(genesis_config: &'a GenesisConfig) -> Self { | ||
return Self { | ||
genesis_config, | ||
total_supply: 0, | ||
staked_accounts: HashMap::new(), | ||
account_ids: HashSet::new(), | ||
access_key_account_ids: HashSet::new(), | ||
contract_account_ids: HashSet::new(), | ||
}; | ||
} | ||
|
||
pub fn process_record(&mut self, record: &StateRecord) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn new(genesis_config: &'a GenesisConfig) -> Self { | |
return Self { | |
genesis_config, | |
total_supply: 0, | |
staked_accounts: HashMap::new(), | |
account_ids: HashSet::new(), | |
access_key_account_ids: HashSet::new(), | |
contract_account_ids: HashSet::new(), | |
}; | |
} | |
pub fn process_record(&mut self, record: &StateRecord) { | |
fn new(genesis_config: &'a GenesisConfig) -> Self { | |
return Self { | |
genesis_config, | |
total_supply: 0, | |
staked_accounts: HashMap::new(), | |
account_ids: HashSet::new(), | |
access_key_account_ids: HashSet::new(), | |
contract_account_ids: HashSet::new(), | |
}; | |
} | |
fn process_record(&mut self, record: &StateRecord) { |
pub
is not needed here. I also find it useful to not have unnecessary pub
s. That way, when you see a pub
, you know that its a cross-crate interface.
.validators | ||
.clone() | ||
.into_iter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.validators | |
.clone() | |
.into_iter() | |
.validators | |
.iter() | |
.cloned() |
This will be a tiiiny bit better, as the vector itself won't be cloned
neard/src/genesis_validate.rs
Outdated
} | ||
|
||
/// Validate genesis config and records. Panics if genesis is ill-formed. | ||
pub fn validate_genesis(genesis: &Genesis) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have any existing conventions around this, but I personally prefer relatively strongly to keep module's public API at the top. That way, when you read a module, you can understand at a glance it's API: https://github.com/rust-analyzer/rust-analyzer/blob/master/docs/dev/style.md#order-of-items
runtime/runtime/src/lib.rs
Outdated
@@ -2251,6 +2122,8 @@ mod tests { | |||
|
|||
#[test] | |||
fn test_delete_key_add_key() { | |||
use near_primitives::account::AccessKey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is alerady inside the cfg(test)
module, so you can put the import at the top of the module:
#[cfg(test)]
mod tests {
use super::*;
use near_crypto::{InMemorySigner, KeyType, Signer};
use near_primitives::errors::ReceiptValidationError;
use near_primitives::hash::hash;
use near_primitives::profile::ProfileData;
use near_primitives::test_utils::{account_new, MockEpochInfoProvider};
use near_primitives::transaction::{
AddKeyAction, DeleteKeyAction, FunctionCallAction, TransferAction,
};
use near_primitives::types::MerkleHash;
use near_primitives::version::PROTOCOL_VERSION;
use near_primitives::account::AccessKey; // new import
use near_store::test_utils::create_tries;
use near_store::StoreCompiledContractCache;
use std::sync::Arc;
use testlib::runtime_utils::{alice_account, bob_account};
I just started a fresh testnet node based on 1.20.0-rc.2 release (which includes this PR, I believe), and observed that it consumed 10GB of RAM on |