Relax the assumption of receiving own CONVERGE messages synchronously #334

Merged
merged 2 commits into from
Jun 11, 2024

@masih masih commented Jun 11, 2024

The gpbft implementation implicitly assumes that CONVERGE messages a participant broadcasts to itself are delivered immediately. In practice this assumption does not hold, because of the complexity introduced by deferred signing and asynchronous message delivery.

The changes here relax this assumption by explicitly notifying the local converge state that the self participant has begun the CONVERGE step, providing the self proposal and its justification. The code then falls back to this data whenever a search of the converge state yields no results because messages are still being delivered asynchronously. Further, the code ignores the self converge value once at least one broadcast message has been received.

Additionally, the changes remove zero latency for messages to self in simulations, to assert more strongly that synchronous message delivery to self is no longer required (neither for GMessage nor for alarms).

Fixes #316
Reverts #318
Relates to #103 (comment)

@masih masih requested review from anorth and Kubuxu June 11, 2024 11:49

codecov bot commented Jun 11, 2024

Codecov Report

Attention: Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.

Project coverage is 83.59%. Comparing base (945b2c7) to head (60f23e6).


@@            Coverage Diff             @@
##             main     #334      +/-   ##
==========================================
+ Coverage   83.48%   83.59%   +0.11%     
==========================================
  Files          15       15              
  Lines        1623     1634      +11     
==========================================
+ Hits         1355     1366      +11     
  Misses        158      158              
  Partials      110      110              
| Files | Coverage Δ |
|---|---|
| gpbft/gpbft.go | 85.02% <81.81%> (+0.25%) ⬆️ |

@masih masih force-pushed the masih/async-recieve-self-ok branch 2 times, most recently from 1d885d8 to 64f5861 Compare June 11, 2024 13:20

Kubuxu commented Jun 11, 2024

Seems good to me as an approach

@anorth anorth left a comment

Nice, this is better than #331.

gpbft/gpbft.go Outdated
// has begun the CONVERGE_PHASE. This notification ensures that the convergeState
// of a round does not rely on messages broadcast by a participant destined for
// itself to be delivered synchronously. See HasSelfBegunConverge.
func (c *convergeState) NotifySelfConvergeBegun(value ECChain, justification *Justification) {

Nit: rather than the terminology of notifications in the method names and comments, I suggest something simple and direct like SetSelfValue, HasSelfValue.

gpbft/gpbft.go Outdated
@@ -1281,16 +1308,34 @@ func (c *convergeState) FindMaxTicketProposal(table PowerTable) ConvergeValue {
}
}
}

// Check if self participant has entered CONVERGE phase.

I think we can focus the documentation here a little, but the code looks good.

I would say that if the node has received any CONVERGE messages, it finds the best ticket from those. That will include the node's own messages, once they are delivered asynchronously. It falls back to the self value if nothing has been received yet. Note that this introduces a brief possibility of a node ignoring its own value if it has not yet received the ticket for it.

I don't think we need to talk about checks on whether the phase has started (goes with renaming the method) or make statements about what is assumed to have been sent down at this level. This is a change in focus of the comments from what other code is assumed to have done to direct statements about what this code does.
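
The behaviour described in this comment (best ticket among received messages, self value only as a fallback) could be sketched roughly as follows. This is not the code under review: the types are simplified stand-ins, `Receive` is a hypothetical helper, `SetSelfValue`/`HasSelfValue` follow the naming suggested above, the real method takes a `PowerTable` that is omitted here, and "larger ticket bytes win" is an assumed ordering for illustration only.

```go
package main

import (
	"bytes"
	"fmt"
)

// ECChain and ConvergeValue are simplified stand-ins for the gpbft types;
// their fields here are assumptions of this sketch.
type ECChain []string

type ConvergeValue struct {
	Chain  ECChain
	Ticket []byte
}

// IsValid reports whether the value carries a proposal at all.
func (v ConvergeValue) IsValid() bool { return len(v.Chain) > 0 }

type convergeState struct {
	self     *ConvergeValue  // set locally when CONVERGE begins
	received []ConvergeValue // filled as broadcasts arrive asynchronously
}

// SetSelfValue records the local proposal without waiting for its broadcast.
func (c *convergeState) SetSelfValue(chain ECChain) {
	c.self = &ConvergeValue{Chain: chain}
}

// HasSelfValue reports whether a local proposal has been recorded.
func (c *convergeState) HasSelfValue() bool { return c.self != nil }

// Receive stores a delivered CONVERGE value (hypothetical helper).
func (c *convergeState) Receive(v ConvergeValue) {
	c.received = append(c.received, v)
}

// FindMaxTicketProposal picks the best ticket among received values and
// falls back to the self value only when nothing has been received yet.
func (c *convergeState) FindMaxTicketProposal() ConvergeValue {
	var best ConvergeValue
	for _, v := range c.received {
		// Assumed ordering: the lexicographically larger ticket wins.
		if !best.IsValid() || bytes.Compare(v.Ticket, best.Ticket) > 0 {
			best = v
		}
	}
	if !best.IsValid() && c.HasSelfValue() {
		return *c.self
	}
	return best
}

func main() {
	c := &convergeState{}
	c.SetSelfValue(ECChain{"self-proposal"})
	fmt.Println("nothing received:", c.FindMaxTicketProposal().Chain)

	// Another participant's message arrives before our own broadcast does,
	// so the self value is briefly ignored.
	c.Receive(ConvergeValue{Chain: ECChain{"other-proposal"}, Ticket: []byte{1}})
	fmt.Println("other received:", c.FindMaxTicketProposal().Chain)

	// Our own message is finally delivered and competes by ticket.
	c.Receive(ConvergeValue{Chain: ECChain{"self-proposal"}, Ticket: []byte{2}})
	fmt.Println("own received:", c.FindMaxTicketProposal().Chain)
}
```

The middle step in `main` is the edge case noted above: between receiving someone else's message and receiving its own, the node may select a value other than its own even if its own ticket would win.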

gpbft/gpbft.go Outdated

// Check if self participant has entered the CONVERGE step, and whether the chain
// matches the self proposal. This clause covers an edge-case where self
// participant has not received broadcasts about its own CONVERGE messages yet.

I think these comments are unnecessary. The code does what it says it will do.

@@ -40,10 +40,10 @@ func NewLogNormal(seed int64, mean time.Duration) *LogNormal {
 // distribution. Latency from one participant to another may be asymmetric and
 // once generated remains constant for the lifetime of a simulation.
 //
-// Note, where from and to are the same or mean configured latency is not larger
-// than zero the latency sample will always be zero.
+// Note, mean configured latency is not larger than zero the latency sample will

Suggested change
-// Note, mean configured latency is not larger than zero the latency sample will
+// Note, when mean configured latency is not larger than zero the latency sample will

@anorth anorth force-pushed the masih/async-recieve-self-ok branch from 71af777 to 174bb85 Compare June 11, 2024 22:45
@anorth anorth force-pushed the masih/async-recieve-self-ok branch from 174bb85 to 60f23e6 Compare June 11, 2024 22:46
@anorth anorth enabled auto-merge June 11, 2024 22:47
@anorth anorth added this pull request to the merge queue Jun 11, 2024
Merged via the queue into main with commit 3029d7a Jun 11, 2024
6 of 7 checks passed
@anorth anorth deleted the masih/async-recieve-self-ok branch June 11, 2024 22:53

Successfully merging this pull request may close these issues.

Receiving alarms must take precedence over receiving broadcasted messages
3 participants