
fix: proper flat storage deltas removal #8718

Merged (16 commits) on Mar 17, 2023

Conversation

@Longarithm Longarithm (Member) commented Mar 13, 2023

Proper deltas removal during flat storage creation and normal operation. Resolves #8655.

The original issue was that we removed deltas exactly twice. It was already resolved around #8683. The reason: we cached blocks and deltas separately, and the flat head was cached whereas the flat head's delta was not, since we don't need it. But we still iterated over all cached blocks during GC and executed store_helper::remove_delta for each block hash. This means remove_delta was called twice for a block: 1) when we moved the flat head TO it; 2) when we moved the flat head FROM it.

Now we call get_all_deltas_metadata on flat storage creation, guarantee that all cached deltas exist, and iterate over the existing delta metadata, not over blocks.
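
A minimal, self-contained sketch of the idea (toy types and names; get_all_deltas_metadata and remove_delta are real store_helper functions mentioned above, but the signatures below are simplified for illustration):

use std::collections::HashMap;

type BlockHash = [u8; 32];

#[derive(Clone, Copy)]
struct DeltaMetadata {
    block_hash: BlockHash,
    height: u64,
}

/// Toy stand-in for the deltas column of the store.
struct DeltaStore {
    deltas: HashMap<BlockHash, DeltaMetadata>,
}

impl DeltaStore {
    /// Analogue of store_helper::get_all_deltas_metadata: enumerate the
    /// deltas that actually exist in storage.
    fn all_deltas_metadata(&self) -> Vec<DeltaMetadata> {
        self.deltas.values().copied().collect()
    }

    /// Analogue of store_helper::remove_delta.
    fn remove_delta(&mut self, hash: &BlockHash) {
        self.deltas.remove(hash);
    }

    /// GC: drop every delta strictly below the flat head. Because we iterate
    /// over stored metadata rather than over blocks, each delta is removed
    /// at most once.
    fn gc_deltas_below(&mut self, flat_head_height: u64) {
        let obsolete: Vec<BlockHash> = self
            .all_deltas_metadata()
            .into_iter()
            .filter(|m| m.height < flat_head_height)
            .map(|m| m.block_hash)
            .collect();
        for hash in &obsolete {
            self.remove_delta(hash);
        }
    }
}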

However, now we need to GC the deltas created during background creation. We do it in two steps:

  1. once we have caught up a single block, its delta is no longer needed;
  2. when we create the FS object, there can be some fork blocks remaining. We don't ever want to load their deltas, so it's better to do the cleanup right now. The theoretical estimate of their metadata size is fairly small, and in practice we have 2-10 such blocks, so this step is actually needed (see the sketch after this list). After cleanup, in my experiments only 2 deltas remained in the DB, as expected.
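
A toy sketch of step 2 (illustrative only, not the actual flat_storage_creator code): when the FlatStorage object is constructed, deltas belonging to fork blocks we no longer need are dropped immediately.

use std::collections::{HashMap, HashSet};

type BlockHash = [u8; 32];

/// Drop cached deltas whose block is not among the blocks we still need,
/// i.e. blocks above the flat head that may still be processed.
fn gc_fork_deltas(
    deltas: &mut HashMap<BlockHash, Vec<u8>>,
    needed_blocks: &HashSet<BlockHash>,
) {
    deltas.retain(|block_hash, _| needed_blocks.contains(block_hash));
}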

Side fix: tests revealed that when flat storage is removed, we don't remove its deltas, and it's better to do so.

At the same time, I introduce CACHED_CHANGES_LIMIT and CACHED_CHANGES_SIZE_LIMIT to warn users when there is too much data cached in flat storage.

Testing

Adjusted test_flat_storage_creation_sanity to check this behaviour: create a fork block in the meantime; check in the logs that its delta is actually GC-d; check that the previous behaviour fails the test.

@Longarithm Longarithm self-assigned this Mar 13, 2023
@Longarithm Longarithm linked an issue Mar 13, 2023 that may be closed by this pull request
@Longarithm Longarithm changed the title from "draft: proper deltas removal during creation" to "feat: proper flat storage deltas removal" Mar 16, 2023
@Longarithm Longarithm changed the title from "feat: proper flat storage deltas removal" to "fix: proper flat storage deltas removal" Mar 16, 2023
@Longarithm Longarithm marked this pull request as ready for review March 16, 2023 12:27
@Longarithm Longarithm requested a review from a team as a code owner March 16, 2023 12:27
pub fn remove_all_deltas(store_update: &mut StoreUpdate, shard_uid: ShardUId) {
let key_from = shard_uid.to_bytes();
let mut key_to = key_from;
key_to[7] += 1;
@pugachAG pugachAG (Contributor) commented Mar 16, 2023

What if key_to[7] is u8::MAX?
I suggest implementing a fn next_shard_prefix(cur: &[u8]) -> Vec<u8> function which does +1 with proper carry-over handling. Then we can reuse it for flat state removal as well.

@Longarithm Longarithm (Member Author) replied:

I see. There would be no impact because we have < 256 shards, but I don't want it to suddenly fail at some point. Introduced ShardUId::next_shard_prefix(bytes) because it makes more sense as a slice operation.

Side note: little-endian is quite annoying here, because I can't just increment the shard id. Still, we can rely on the fact that all shard uids are unique.
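
For illustration, a self-contained sketch of a prefix increment with proper carry handling (this mirrors the reviewer's suggestion; the actual ShardUId::next_shard_prefix may differ, and the Option return type here is an assumption to handle the all-0xff case):

/// Treat the key bytes as a big-endian number and add 1, carrying toward the
/// front, so the result is the smallest byte string lexicographically greater
/// than every key starting with the given prefix.
fn next_prefix(cur: &[u8]) -> Option<Vec<u8>> {
    let mut next = cur.to_vec();
    for byte in next.iter_mut().rev() {
        if *byte == u8::MAX {
            // 0xff wraps to 0x00 and the carry propagates to the previous byte.
            *byte = 0;
        } else {
            *byte += 1;
            return Some(next);
        }
    }
    // All bytes were 0xff: there is no strictly greater prefix of the same length.
    None
}

For example, next_prefix(&[0x00, 0xff]) yields Some(vec![0x01, 0x00]), while next_prefix(&[0xff, 0xff]) yields None.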

/// Expected limits for in-memory stored changes, under which flat storage must keep working.
/// If they are exceeded, warnings are displayed. Flat storage still will work, but its
/// performance will slow down, and eventually it can cause OOM error.
const CACHED_CHANGES_LIMIT: usize = 50;
@pugachAG pugachAG (Contributor) commented:

Why do we want to track this separately? As far as I understand, there is no risk in having more than 50 small deltas cached, and producing a warning in this case is a false alarm, at least in the context of flat storage. I suggest removing that and only tracking the total size of cached changes.

@Longarithm Longarithm (Member Author) replied:

As far as I remember, there is a risk of node slowdown if there are too many deltas, even if they are small, so it is also worth a warning.

@pugachAG pugachAG (Contributor) commented:

Please elaborate on what those risks are. In my understanding the bottleneck is hash map access, but then this value should be much larger: I expect a main memory access to take ~100ns (which is the pessimistic case with a processor cache miss). Even assuming we need 2 memory accesses for each hash map lookup and that the acceptable flat storage latency is 100us, we can still support 500 deltas without major performance degradation.
In general I'm against unnecessary tracking/warnings because it introduces maintenance overhead for the future. If you insist on keeping it, then please make sure to write an elaborate comment describing the calculation behind the value as well as what could happen when we exceed it. For example, the current comment "eventually it can cause OOM error" behind CACHED_CHANGES_LIMIT is very misleading, because that is only applicable to CACHED_CHANGES_SIZE_LIMIT.
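
For reference, the back-of-the-envelope arithmetic behind the 500 figure above (one hash map lookup per cached delta per read; the estimate is revised downward in the next comment):

100 us budget / (2 accesses x 100 ns per access) = 100,000 ns / 200 ns = 500 lookups, i.e. roughly 500 cached deltas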

@pugachAG pugachAG (Contributor) commented:

Ah, looks like @jakmeier has just proven me wrong: #8006 (comment); we can reasonably support only around 100 deltas 😞

@Longarithm Longarithm (Member Author) replied:

Tbh I didn't know the exact numbers, but now I can happily add them to the comments.

@pugachAG pugachAG (Contributor) commented:

Let's keep it at 100 and add a link to Jakob's table in the comment


@@ -283,6 +291,15 @@ impl FlatStorage {
block_hash,
CachedFlatStateDelta { metadata: delta.metadata, changes: Arc::new(cached_changes) },
);
let cached_changes_num_items = guard.metrics.cached_changes_num_items.get() as usize;
@pugachAG pugachAG (Contributor) commented:

Please don't use Prometheus metrics as a data source, that is super hacky :) Use cached_changes.len() and cached_changes.total_size() instead.

@Longarithm Longarithm (Member Author) replied:

Btw, this line was even wrong: I wanted to count the number of deltas, not the number of items in them. Fixed.
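
A toy sketch of the suggested approach (names and the size limit value are illustrative assumptions, not the actual FlatStorage code): compute the counts directly from the in-memory map rather than reading them back from Prometheus metrics.

use std::collections::HashMap;

const CACHED_CHANGES_LIMIT: usize = 100;
// Assumed value, purely for illustration.
const CACHED_CHANGES_SIZE_LIMIT: u64 = 512 * 1024 * 1024;

/// Warn if too many changes are cached; counts and sizes are taken directly
/// from the cached deltas instead of from metrics.
fn warn_if_over_limits(deltas: &HashMap<[u8; 32], Vec<u8>>) {
    let cached_deltas = deltas.len();
    let cached_changes_size: u64 = deltas.values().map(|changes| changes.len() as u64).sum();
    if cached_deltas >= CACHED_CHANGES_LIMIT || cached_changes_size >= CACHED_CHANGES_SIZE_LIMIT {
        eprintln!(
            "Flat storage cached too many changes: {} deltas, {} bytes",
            cached_deltas, cached_changes_size
        );
    }
}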

chain/chain/src/flat_storage_creator.rs (resolved conversation)
@pugachAG pugachAG (Contributor) left a review comment:

update_delta_metrics is pretty neat!
overall LGTM, thanks for addressing my comments.

let cached_deltas = self.deltas.len();
let mut cached_changes_num_items = 0;
let mut cached_changes_size = 0;
for (_, changes) in self.deltas.iter() {
@pugachAG pugachAG (Contributor) commented:

super nit: for changes in self.deltas.values() should work

@near-bulldozer near-bulldozer bot merged commit fddc418 into near:master Mar 17, 2023
Successfully merging this pull request may close these issues.

Proper deletion of FlatStateDeltas