Lr/double quorum #3922
if block_number == 0 || epoch_height == 0 {
    false
} else {
    block_number % epoch_height == 0
As mentioned on Zulip, I think this should be `block_number + 1`.
Nvm now I understand that block 0 is a special case.
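For readers following the thread, here is a minimal standalone sketch of the check under discussion. The function name `is_epoch_boundary` and the plain `u64` parameters are assumptions made for illustration; only the condition itself comes from the snippet above.

```rust
/// Hypothetical helper wrapping the condition from the reviewed snippet.
/// The name and the `u64` parameter types are illustrative assumptions.
fn is_epoch_boundary(block_number: u64, epoch_height: u64) -> bool {
    if block_number == 0 || epoch_height == 0 {
        // Block 0 is the special case mentioned in the thread, and an
        // epoch height of 0 must be rejected before the modulo below,
        // because `x % 0` panics in Rust.
        false
    } else {
        block_number % epoch_height == 0
    }
}
```

With an epoch height of 10, blocks 10 and 20 satisfy the check, while blocks 0 and 9 do not.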
I think it makes sense overall, just some comments
crates/hotshot/src/lib.rs
Outdated
/// Next epoch highest QC that was seen
next_epoch_high_qc: Option<NextEpochQuorumCertificate2<TYPES>>,
This is a bit unclear to me (also, it's not clear we need it in the hotshot initializer; does it cause problems to default it to `None`?).
Can the comment be updated to be more descriptive about where this should come from?
We might need it if a node restarted during epoch transition and is about to propose. I've adjusted the comment.
// If we haven't upgraded to Epochs just return None right away
if self
    .upgrade_lock
    .version(self.view_number)
You can use `version_infallible` here, I think.
good point, thank you!
if let Some(next_epoch_qc) = self.consensus.read().await.next_epoch_high_qc() {
    if next_epoch_qc.data.leaf_commit == high_qc.data.leaf_commit {
        // We have it already, no reason to wait
        return Some(next_epoch_qc.clone());
    }
};
I'm not sure about this. Can't we pass this directly to this function call as an `Option<>` (like we pass `high_qc`)? Do we have to lock consensus state here?
I'm not sure what you mean here. I've created an issue to follow up on the unresolved comments from this PR: #3978
if justify_qc.view_number() != next_epoch_justify_qc.view_number()
    || justify_qc.data.epoch != next_epoch_justify_qc.data.epoch
    || justify_qc.data.leaf_commit != next_epoch_justify_qc.data.leaf_commit
{
    bail!("Next epoch justify qc exists but it's not equal with justify qc.");
}
Can we replace this with `ensure!()`? I feel it's easier to read the check that way.
maybe the one right after this as well
In general, I agree. I used `bail!` here to be consistent with the surrounding code. We might consider creating a task for rewriting all the `bail` / `ensure` statements in this function. For now I've added your comment to the follow-up issue: #3978
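To make the `bail!` versus `ensure!` trade-off concrete, here is a small self-contained sketch of the same equality check written both ways. The two macros are local stand-ins defined only so the example compiles on its own (the project uses its own `bail!` / `ensure!`), and `QcSummary` with its `u64` fields is a hypothetical simplification of the certificate types.

```rust
// Stand-in macros for illustration only; not the project's definitions.
macro_rules! bail {
    ($msg:expr) => {
        return Err($msg.to_string())
    };
}
macro_rules! ensure {
    ($cond:expr, $msg:expr) => {
        if !$cond {
            return Err($msg.to_string());
        }
    };
}

/// Hypothetical, simplified view of the fields compared in the review.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct QcSummary {
    view: u64,
    epoch: u64,
    leaf_commit: u64,
}

/// `bail!` style: the condition describes the failure case.
fn check_with_bail(a: QcSummary, b: QcSummary) -> Result<(), String> {
    if a.view != b.view || a.epoch != b.epoch || a.leaf_commit != b.leaf_commit {
        bail!("Next epoch justify qc exists but it's not equal with justify qc.");
    }
    Ok(())
}

/// `ensure!` style: the condition describes the invariant that must hold.
fn check_with_ensure(a: QcSummary, b: QcSummary) -> Result<(), String> {
    ensure!(
        a.view == b.view && a.epoch == b.epoch && a.leaf_commit == b.leaf_commit,
        "Next epoch justify qc exists but it's not equal with justify qc."
    );
    Ok(())
}
```

Both functions accept and reject the same inputs; the difference is only whether the reader sees the negated condition or the invariant itself.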
@@ -157,6 +166,65 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>> VidTaskState<TYPES, I> {
                return None;
            }

            HotShotEvent::QuorumProposalSend(proposal, _) => {
Is there a reason we can't do this on `BlockRecv`? (Is it because we aren't currently passing the block number there?)
Yes, there is a reason. This is related to the discussion we had about the epoch acquired at the moment `BlockRecv` is issued being unreliable. In the next PR I've slightly changed the view change logic, and it seems that, thanks to that change, the epoch in `BlockRecv` has become reliable. I need to test it some more and, if everything works as expected, I will adjust the code here as well.
crates/types/src/consensus.rs
Outdated
    &mut self,
    high_qc: NextEpochQuorumCertificate2<TYPES>,
) -> Result<()> {
    ensure!(
Can we rewrite this to avoid the `unwrap` call even if we know it's fine?
e.g. `let Some(qc) = self.next_epoch... else { return Ok(()); }`
Of course, I've changed it.
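As a rough illustration of the suggested pattern, the sketch below contrasts an `unwrap`-based check with a `let ... else` version. Everything here is hypothetical: `State`, the `u64` view number standing in for the certificate, the error text, and in particular what happens when nothing is stored yet (this sketch stores the new value, whereas the reviewer's one-liner simply returns `Ok(())`). It is not the code that was merged.

```rust
/// Hypothetical state holding only the view number of the stored QC.
struct State {
    next_epoch_high_view: Option<u64>,
}

impl State {
    /// Original style: correct, but relies on `unwrap` after an `is_some` check.
    fn update_with_unwrap(&mut self, view: u64) -> Result<(), String> {
        if self.next_epoch_high_view.is_some() && view <= self.next_epoch_high_view.unwrap() {
            return Err("stored QC is not older than the new one".to_string());
        }
        self.next_epoch_high_view = Some(view);
        Ok(())
    }

    /// `let ... else` style: the `None` case is handled explicitly and the
    /// bound value is used afterwards without any `unwrap`.
    fn update_without_unwrap(&mut self, view: u64) -> Result<(), String> {
        let Some(current) = self.next_epoch_high_view else {
            // Assumption for this sketch: with nothing stored, accept the value.
            self.next_epoch_high_view = Some(view);
            return Ok(());
        };
        if view <= current {
            return Err("stored QC is not older than the new one".to_string());
        }
        self.next_epoch_high_view = Some(view);
        Ok(())
    }
}
```

The `let ... else` form requires Rust 1.65 or newer.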
crates/hotshot/src/lib.rs
Outdated
@@ -493,6 +494,7 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> SystemContext<T

        let api = self.clone();
        let view_number = api.consensus.read().await.cur_view();
        let epoch = api.consensus.read().await.cur_epoch();
let consensus_reader = api.consensus.read().await;
let view_number = consensus_reader.cur_view();
let epoch = consensus_reader.cur_epoch();
drop(consensus_reader);
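The suggestion above takes the read lock once instead of twice. A possible motivation, not stated in the thread, is that with two separate acquisitions a writer could update the state in between, so the view and epoch might come from different snapshots. The sketch below shows the single-acquisition pattern using `std::sync::RwLock` as a synchronous stand-in for the async lock in the real code; the `Consensus` struct and its public fields are simplified assumptions.

```rust
use std::sync::RwLock;

/// Hypothetical, simplified consensus state.
struct Consensus {
    cur_view: u64,
    cur_epoch: u64,
}

/// Reads both values under a single lock acquisition so they are
/// guaranteed to come from the same state.
fn snapshot(consensus: &RwLock<Consensus>) -> (u64, u64) {
    let reader = consensus.read().expect("lock poisoned");
    let view = reader.cur_view;
    let epoch = reader.cur_epoch;
    // Release the lock explicitly before doing any further work.
    drop(reader);
    (view, epoch)
}
```

The explicit `drop` matters most when more code follows in the same scope, since the guard would otherwise be held until the end of that scope.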
Generally looks good. I suggested one change regarding locking, but I wouldn't consider it too critical as I'll be making a cleanup pass later on for locking behavior.
I'd also address @ss-es comments before merging!
I actually answered Salman's comments earlier today but forgot to publish my answer. I'll adjust the lock / drop part.
Closes #<ISSUE_NUMBER>
This PR:
- Updates the `QuorumProposal` and `Leaf2` types to include the optional next epoch justify QC.
- Adds `NextEpochQuorumData2`. It's a thin wrapper around `QuorumData2`. The additional type makes it possible to distinguish between vote and certificate type aliases.
- Adds `NextEpochQc2Formed` to signal that a QC has been formed using votes from the nodes in the next epoch.
- Changes the `Membership` trait. This is required because the thresholds depend on the number of nodes in the given epoch.
- Updates `OverallSafetyTask`. It now uses the total number of nodes and the threshold directly from the `Membership` object. This is required to properly test the types with the variable stake table.
This PR does not:
Key places to review: