Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receiving alarms must take precedence over receiving broadcasted messages #316

Closed
masih opened this issue Jun 7, 2024 · 4 comments · Fixed by #318 or #334
Closed

Receiving alarms must take precedence over receiving broadcasted messages #316

masih opened this issue Jun 7, 2024 · 4 comments · Fixed by #318 or #334
Assignees
Labels
integration F3 <> EC client integration safety

Comments

@masih
Copy link
Member

masih commented Jun 7, 2024

The GPBFT algorithm assumes that when an alarm is triggered it will get delivered at the designated time before a broadcasted message that so happens to arrive at the same time.

This assumption does not hold in the current simulations, where all messages including alarms use the same queue for delivery prioritised by deliverAt timestamp regardless of whether they are an alarm or broadcasted messages.

  • Respect the priority of alarms in simulation.
  • Document this requirement at gpbft.Clock.
@masih masih added this to F3 Jun 7, 2024
@masih masih self-assigned this Jun 7, 2024
@masih masih added safety integration F3 <> EC client integration labels Jun 7, 2024
@masih masih moved this to In progress in F3 Jun 7, 2024
masih added a commit that referenced this issue Jun 7, 2024
Document the implicit assumption that the delivery of alarms to a
participant must take precedence over broadcast messages.

Adjust simulation message queue to respect the alarm ordering.

Fixes #316
@masih masih moved this from In progress to In review in F3 Jun 7, 2024
masih added a commit that referenced this issue Jun 7, 2024
Document the implicit assumption that the delivery of alarms to a
participant must take precedence over broadcast messages.

Adjust simulation message queue to respect the alarm ordering.

Fixes #316
github-merge-queue bot pushed a commit that referenced this issue Jun 7, 2024
…318)

Document the implicit assumption that the delivery of alarms to a
participant must take precedence over broadcast messages.

Adjust simulation message queue to respect the alarm ordering.

Fixes #316
@masih masih closed this as completed in #318 Jun 7, 2024
@github-project-automation github-project-automation bot moved this from In review to Done in F3 Jun 7, 2024
@anorth
Copy link
Member

anorth commented Jun 10, 2024

This spawned from #242, but that says the protocol...

assumes that when it sends a message, it is going to receive that message before it receives anything else

I.e. messages first, in the particular case of a send to self. So on the surface this strikes me as not the right fix.

It's also not documented in the code why the alarm-before-message assumption exists, and e.g. what would have to change for it to no longer be required (currently, it's just asserted with no justification). This should be documented somewhere in gpbft or the host API.

(The "why" for messages first is: so a send to self is received as if synchronously)

@anorth
Copy link
Member

anorth commented Jun 10, 2024

@Kubuxu noted in #318 that for message-over-alarm precedence:

This is hard or almost impossible to guarantee in an environment where signing is async and can fail. I will open an issue about it.

Elsewhere:

There is no guarantee that anyone will sign the message we requested broadcast for, and even if there is someone that will sign it, it would require us to block everything until the do.

Alternatively, we can try to go back to a pattern of internally-delivering sends-to-self at a post-validation step. This will be difficult without knowing the participant's own ID (which was the goal #103 that resulted in async self-sends in the first place).

@masih
Copy link
Member Author

masih commented Jun 11, 2024

(The "why" for messages first is: so a send to self is received as if synchronously)

One can say the same about alarm. right?

This is my interpretation of the FIP, in that alarm triggering system is not considered a network call. I also acknowledge that the FIP can be interpreted in a different way and it does not dictate a strict ordering of messages vs alerts.

Thank you for clarifying the need for ordering of messages to self; I think both messages to self and alarm should be synchronous.

masih added a commit that referenced this issue Jun 11, 2024
The gpbft implementation implicitly assumes that broadcast of `CONVERGE`
messages to self are delivered immediately. In practice this assumption
does not hold because of the complexity in deferred signing and async
message delivery.

The changes here relax this assumption by explicitly notifying the local
converge state that the self participant has begun the `CONVERGE` step,
providing self proposal and justification for the proposal. The code
then considers the given data whenever search in converge state does not
bear any results, caused by asynchronous message delivery. Further, the
code ignores the self converge value once at least one broadcast message
is received.

Fixes #316
Reverts #318
Relates to #103 (comment)
masih added a commit that referenced this issue Jun 11, 2024
The gpbft implementation implicitly assumes that broadcast of `CONVERGE`
messages to self are delivered immediately. In practice this assumption
does not hold because of the complexity in deferred signing and async
message delivery.

The changes here relax this assumption by explicitly notifying the local
converge state that the self participant has begun the `CONVERGE` step,
providing self proposal and justification for the proposal. The code
then considers the given data whenever search in converge state does not
bear any results, caused by asynchronous message delivery. Further, the
code ignores the self converge value once at least one broadcast message
is received.

Additionally, the changes remove zero-latency for messages to self in
simulations to make a stronger assertion that synchronous message
delivery to self is no longer required (neither for `GMessage` nor
alarms).

Fixes #316
Reverts #318
Relates to #103 (comment)
masih added a commit that referenced this issue Jun 11, 2024
The gpbft implementation implicitly assumes that broadcast of `CONVERGE`
messages to self are delivered immediately. In practice this assumption
does not hold because of the complexity in deferred signing and async
message delivery.

The changes here relax this assumption by explicitly notifying the local
converge state that the self participant has begun the `CONVERGE` step,
providing self proposal and justification for the proposal. The code
then considers the given data whenever search in converge state does not
bear any results, caused by asynchronous message delivery. Further, the
code ignores the self converge value once at least one broadcast message
is received.

Additionally, the changes remove zero-latency for messages to self in
simulations to make a stronger assertion that synchronous message
delivery to self is no longer required (neither for `GMessage` nor
alarms).

Fixes #316
Reverts #318
Relates to #103 (comment)
@anorth
Copy link
Member

anorth commented Jun 11, 2024

(The "why" for messages first is: so a send to self is received as if synchronously)

One can say the same about alarm. right?

I don't think there's any need for synchronous alarms. If code setting an alarm wants synchronous triggering, it should just call the trigger directly. Alarms are specifically to invoke asynchrony. The code (but not the FIP) uses "alarm at now" to invoke asynchrony with no delay just to unify the execution path into some bits of code, like starting a protocol instance.

It would be much simpler if messages-to-self could also use synchronous invocation by direct method call (and they used to). But that became impossible when we removed knowledge of the participant ID from the protocol code.

anorth pushed a commit that referenced this issue Jun 11, 2024
The gpbft implementation implicitly assumes that broadcast of `CONVERGE`
messages to self are delivered immediately. In practice this assumption
does not hold because of the complexity in deferred signing and async
message delivery.

The changes here relax this assumption by explicitly notifying the local
converge state that the self participant has begun the `CONVERGE` step,
providing self proposal and justification for the proposal. The code
then considers the given data whenever search in converge state does not
bear any results, caused by asynchronous message delivery. Further, the
code ignores the self converge value once at least one broadcast message
is received.

Additionally, the changes remove zero-latency for messages to self in
simulations to make a stronger assertion that synchronous message
delivery to self is no longer required (neither for `GMessage` nor
alarms).

Fixes #316
Reverts #318
Relates to #103 (comment)
github-merge-queue bot pushed a commit that referenced this issue Jun 11, 2024
…ly (#334)

* Relax the assumption of receiving own `CONVERGE` messages synchronously

The gpbft implementation implicitly assumes that broadcast of `CONVERGE`
messages to self are delivered immediately. In practice this assumption
does not hold because of the complexity in deferred signing and async
message delivery.

The changes here relax this assumption by explicitly notifying the local
converge state that the self participant has begun the `CONVERGE` step,
providing self proposal and justification for the proposal. The code
then considers the given data whenever search in converge state does not
bear any results, caused by asynchronous message delivery. Further, the
code ignores the self converge value once at least one broadcast message
is received.

Additionally, the changes remove zero-latency for messages to self in
simulations to make a stronger assertion that synchronous message
delivery to self is no longer required (neither for `GMessage` nor
alarms).

Fixes #316
Reverts #318
Relates to #103 (comment)

* Adjust naming and comments.

---------

Co-authored-by: Alex North <445306+anorth@users.noreply.github.com>
@github-project-automation github-project-automation bot moved this from In review to Done in F3 Jun 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment