Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Consensus] Garbage Collection - 2 #19385

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open

Conversation

akichidis
Copy link
Contributor

@akichidis akichidis commented Sep 16, 2024

Description

This is the second part of Garbage Collection which implements the following ticked next steps as outlined on the previous PR

  • BlockManager to respect the gc_round when accepting blocks and trigger clean ups for new gc rounds
  • Skip blocks that are received which are <= gc_round
  • Not propose ancestors that are <= gc_round
  • Subscriber to ask for blocks from gc_round when last_fetched_round < gc_round for a peer to prevent us from fetching unnecessary blocks

Next steps:

  • Re-propose GC'ed transactions (probably not all of them)
  • Implement new timestamp approach so ancestor verification is not needed
  • Harden testing for GC & edge cases

Test plan

CI/PT


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

Copy link

vercel bot commented Sep 16, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 20, 2024 10:38am
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Sep 20, 2024 10:38am
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Sep 20, 2024 10:38am
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Sep 20, 2024 10:38am

Comment on lines +124 to +125
let blocks_to_accept = self
.verify_block_timestamps_and_accept(iter::once(block).chain(unsuspended_blocks));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically most of the previous code has been extracted to the verify_block_timestamps_and_accept method. This won't be needed once we refactor to the new timestamp approach

@@ -114,69 +114,15 @@ impl BlockManager {
missing_blocks.extend(ancestors_to_fetch);
continue;
}
TryAcceptResult::Processed => continue,
TryAcceptResult::Processed | TryAcceptResult::Skipped => continue,
Copy link
Contributor Author

@akichidis akichidis Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the decision to let received blocks with block.round <= gc_round to be allowed received up to here and not straight rejected on the AuthorityService mostly for two reasons:

  1. round_prober: this is using the highest_received_rounds keeps getting updated on the received blocks. Dropping blocks early enough due to gc (ex let's assume that we indeed have a slow proposer A, or there was some temporary network issue) , will not update those structs and might effectively make the proposer A halt on propose and not be able to recover from that as received blocks will keep getting rejected.
  2. metrics in block manager that are updated per received block. That's useful insight to have and we might don't want to lose at this point.

Another point to keep in mind, it's the amnesia recovery. Since now we'll not store blocks < gc_round we might have a case of a node whose blocks get rejected , and then tries to recover from a db wipe out. Since none of the earlier blocks would have been stored the node will try to propose from much earlier. That's quite nit and I wouldn't worry now. Just adding this here to keep in mind to show all the affected paths from this decision.

Base automatically changed from akichidis/consensus-gc-1 to main September 20, 2024 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant