-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Construct bank from snapshot dir #30171
Construct bank from snapshot dir #30171
Conversation
This will be rebased after PR #30099 is committed, so the reduced changes can be ready for the review. |
9996b3d
to
833d881
Compare
833d881
to
e53b6fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Largely happy with this.
I think we had it in the original PR (pre-split) description, but could you add what the time difference is between using snapshot from archive vs from dir?
runtime/src/serde_snapshot.rs
Outdated
assert!( | ||
next_append_vec_id.checked_sub(1).is_some(), | ||
"subtraction underflow" | ||
); | ||
let max_append_vec_id = next_append_vec_id - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some new case that requires we check this? If so, why not something like this?
assert!( | |
next_append_vec_id.checked_sub(1).is_some(), | |
"subtraction underflow" | |
); | |
let max_append_vec_id = next_append_vec_id - 1; | |
let max_append_vec_id = next_append_vec_id.checked_sub(1).unwrap(); |
I think based on the unwrap panic it should be obvious, no real need (imo) to specify "subtraction underflow".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. updated it.
runtime/src/snapshot_utils.rs
Outdated
.unwrap(); | ||
|
||
for account_path in account_paths { | ||
for file in fs::read_dir(account_path).unwrap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by the time we're here, we have already verified that these account paths exist, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they went through the step to append "run/", and the run sub directories were created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid unwrap
outside of tests. An expect
with a message is an option, or, this function could return a Result
and the caller could put a single expect
on its invocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed all unwraps.
let _ = &self | ||
.next_append_vec_id | ||
.fetch_max(next_appendvec_id as u32, Ordering::Relaxed); | ||
} | ||
let (slot, slot_complete) = self.insert_slot_storage_file(path, filename); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are from Dir
, we call get_slot_and_append_vec_id
above, and then call it again internally within insert_slot_storage_file
.
We should just call it once above, regardless of snapshot_from
, and pass the slot to insert_slot_storage_file
instead of filename
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
So I now call get_slot_and_append_vec_id once, removed the same call from insert_slot_storage_file, and just pass slot into it.
Then I find that the filename parameter is no longer needed.
Then the parameter list of insert_slot_storage_file is the same as insert_storage_file, and there is almost nothing in insert_slot_storage_file other than calling insert_storage_file.
So, to further simplify it, I removed the insert_slot_storage_file function.
&self.next_append_vec_id, | ||
&self.num_collisions, | ||
)?; | ||
let storage_entry = match &self.snapshot_from { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment out of scope for this PR.
We're checking snapshot_from
for every entry - I wonder what the perf impact of storing generic fns for this would be, though I suspect minimal since I'd hope branch prediction is doing a good job here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can move "match &self.snapshot_from" out of the entry loop. I tried the following, to define two different closures, and then just make the closure call inside the entry loop. But somehow I got some compilation error.
let process_entry = match &self.snapshot_from {
SnapshotFrom::Archive =>
|path: &Path,
current_len: usize,
old_appendvec_id: usize|
-> Result<Arc<AccountStorageEntry>, std::io::Error> {
remap_and_reconstruct_single_storage(
slot,
old_appendvec_id,
current_len,
path,
&self.next_append_vec_id,
&self.num_collisions,
)
},
SnapshotFrom::Dir =>
|path: &Path,
current_len: usize,
old_appendvec_id: usize|
-> Result<Arc<AccountStorageEntry>, std::io::Error> {
reconstruct_single_storage(&slot, path, current_len, old_appendvec_id as u32)
},
};
The compilation error is
error[E0308]: `match` arms have incompatible types
--> runtime/src/snapshot_utils/snapshot_storage_rebuilder.rs:322:17
|
305 | let process_entry = match &self.snapshot_from {
| ------------------------- `match` arms have incompatible types
306 | SnapshotFrom::Archive =>
307 | |path: &Path,
| __________________-
| | _________________|
| ||
308 | || current_len: usize,
309 | || old_appendvec_id: usize|
310 | || -> Result<Arc<AccountStorageEntry>, std::io::Error> {
| ||____________________________________________________________________- the expected closure
... |
318 | | )
319 | | },
| |__________________- this is found to be of type `[closure@runtime/src/snapshot_utils/snapshot_storage_rebuilder.rs:307:17: 310:69]`
...
322 | / |path: &Path,
323 | | current_len: usize,
324 | | old_appendvec_id: usize|
325 | | -> Result<Arc<AccountStorageEntry>, std::io::Error> {
326 | | reconstruct_single_storage(&slot, path, current_len, old_appendvec_id as u32)
327 | | },
| |_________________^ expected closure, found a different closure
|
= note: expected closure `[closure@runtime/src/snapshot_utils/snapshot_storage_rebuilder.rs:307:17: 310:69]`
found closure `[closure@runtime/src/snapshot_utils/snapshot_storage_rebuilder.rs:322:17: 325:68]`
= note: no two closures, even if identical, have the same type
= help: consider boxing your closure and/or using it as a trait object
I see that two closures have the exactly the same input and return types. But it seems the compiler does not accept it. Should I box it or use a trait?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried, somehow boxing doesn't help.
= note: expected struct `Box<[closure@runtime/src/snapshot_utils/snapshot_storage_rebuilder.rs:307:17: 310:69]>`
found struct `Box<[closure@runtime/src/snapshot_utils/snapshot_storage_rebuilder.rs:323:17: 326:69]>`
= note: no two closures, even if identical, have the same type
= help: consider boxing your closure and/or using it as a trait object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah don't worry for now, like I said it's out of scope for this PR. We can come back to this once we have it working, it's just a potential for speed up, not even sure it will help in a meaningful way.
runtime/src/snapshot_utils.rs
Outdated
let account_paths = account_paths.to_vec(); | ||
let account_paths_set: HashSet<PathBuf> = HashSet::from_iter(account_paths.clone()); | ||
|
||
for dir_symlink in fs::read_dir(accounts_hardlinks).unwrap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these unwraps safe, or should we map err into SnapshotError::Io
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minimally please do not use unwrap()
outside of tests. This applies for the whole impl in this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to expect
runtime/src/snapshot_utils.rs
Outdated
let slot_deltas: Vec<BankSlotDelta> = bincode::options() | ||
.with_limit(MAX_SNAPSHOT_DATA_FILE_SIZE) | ||
.with_fixint_encoding() | ||
.allow_trailing_bytes() | ||
.deserialize_from(stream)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this feels like it's pulled/copied from somewhere else and we should probably wrap the deserialization in a function instead of duplicating the logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is copied from the function immediately above -- fn rebuild_bank_from_unarchived_snapshots.
I think in the future we could always rebuild bank from a snapshot dir, so the functions building bank directly from archive could be removed. In that case, they will be no duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, but we haven't removed it yet so we do have duplicated code 😄
if we refactor or change slot_delta
(de)serialization before remove above function, then we have a higher chance of missing one of these than if we have a common function for doing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still open/unresolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I can put this into a function as a standalone function. Will codedev consider this new function as not tested, and asks for adding a test function? Should I generate a status_cache file to test it?
fn deserialize_status_cache(status_cache_path) {
deserialize_snapshot_data_file(&status_cache_path, |stream| {
info!(
"Rebuilding status cache from {}",
status_cache_path.display()
);
let slot_deltas: Vec<BankSlotDelta> = bincode::options()
.with_limit(MAX_SNAPSHOT_DATA_FILE_SIZE)
.with_fixint_encoding()
.allow_trailing_bytes()
.deserialize_from(stream)?;
Ok(slot_deltas)
})?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function just does one thing -- calling another function with the deserializer closure. Should the closure be factor out instead to be a named function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added fn deserialize_status_cache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't finish the review; will pick it up again tomorrow. Here's a few things I saw:
Sure. Added into the PR description. |
Awesome thanks! 92s -> 5s is going to be pretty awesome 🚀 |
That part is about 90 sec. This PR does not touch that part of the code, so there is no change. I think we could possibly optimize this by building index at the background, but serving the needed requests first. That's beyond of the scope of this PR. |
runtime/src/snapshot_utils.rs
Outdated
.unwrap(); | ||
|
||
for account_path in account_paths { | ||
for file in fs::read_dir(account_path).unwrap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid unwrap
outside of tests. An expect
with a message is an option, or, this function could return a Result
and the caller could put a single expect
on its invocation.
db35360
to
f4a0fc7
Compare
Codecov Report
@@ Coverage Diff @@
## master #30171 +/- ##
========================================
Coverage 81.5% 81.5%
========================================
Files 723 723
Lines 203436 203603 +167
========================================
+ Hits 165863 166021 +158
- Misses 37573 37582 +9 |
bf9a873
to
036e311
Compare
d99c12f
to
686154e
Compare
02cc217
to
3df5ea6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nearly there - just some minor comments about propating errors
full_snapshot_untar_us: measure_build_storage.as_us(), | ||
incremental_snapshot_untar_us: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't do it in this PR - but we should rename this field. We're not untaring anything here. This is just rebuild time.
snapshot_info: &BankSnapshotInfo, | ||
account_paths: &[PathBuf], | ||
next_append_vec_id: Arc<AtomicAppendVecId>, | ||
) -> Result<AccountStorageMap> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduced several ways to panic, but should be returning a result. Let's catch those expects
by returning an Err
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed expect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missed this. working on it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the existing examples all use unwrap for send().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apfitzge I'm OK with unwrap
on send
. I don't think we could recover/do anything useful if the channel goes down. Sort of like how we always unwrap locking mutexes. Wdyt?
@@ -2260,8 +2444,23 @@ fn bank_fields_from_snapshots( | |||
}) | |||
} | |||
|
|||
fn deserialize_status_cache(status_cache_path: &Path) -> Result<Vec<BankSlotDelta>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good!
Another scenario I've been wondering about is w.r.t. subsequent incremental snapshot generation after starting up from a snapshot directory. Right now we use the snapshot archives to seed into the background services that allows them to correctly generate incremental snapshots after boot. I think we'll want tests that exercise this usage with the fastboot code.
Specifically something like this:
- Run a validator for a bit
- Ensure a bank snapshot has been taken
- Stop the validator and do a fastboot restart from the snapshot directory
- Ensure the background services can generate an incremental snapshot
I think this can happen in another PR; just needs to be in/tested before we enable/allow fastboot for the real validator. And curious, maybe you've already tested this out?
if self.snapshot_from == SnapshotFrom::Dir { | ||
// Keep track of the highest append_vec_id in the system, so the future append_vecs | ||
// can be assigned a unique id. This is only needed when loading from a snapshot | ||
// dir. When loading from a snapshot archive, the max of the appendvec IDs is | ||
// updated in remap_append_vec_file() | ||
self.next_append_vec_id | ||
.fetch_max((append_vec_id + 1) as AppendVecId, Ordering::Relaxed); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Very helpful comment!
- I think this impl is fine for now. Maybe in the future we refactor out the
fetch_max
so we don't do an atomic RMW every iteration of the loop, and instead compute a local max over the whole loop and then do a simple atomicfetch_max
at the end. (Not for this PR)
In the earliest implementation (PR 28745), background service was working fine and generating archives after booting from the snapshot directory (probably it generates a full snapshot archive first). Will check this again in the next PR in which the arg snapshot_from_dir will be added, and the background service will be running with the initial bank constructed from a snapshot dir. |
f80ddad
to
a633130
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Thanks a lot! |
Problem
To boot faster, we want to boot from a bank snapshot directory, instead of an archive. This requires getting meta information from the snapshot directories indicating the snapshot versions and the completion statuses, to help identify a good directory to boot from.
Summary of Changes
Add the function bank_from_snapshot_dir() and all the supporting functions and data structures, to construct a bank from a bank snapshot directory.
Add the test function.
This is a split PR of #28745. It is the next step after #30099, which completes the bank snapshot directory.
Testing done:
On mainnet, snapshot untar took 92.4s
When the snapshot directory is used, the archive untar is skipped, so the deserialization is reduced to 5.5s
Fixes #