Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relay pending packets in the relayer loop #709

Merged
merged 3 commits into from
Feb 25, 2021
Merged

Relay pending packets in the relayer loop #709

merged 3 commits into from
Feb 25, 2021

Conversation

ancazamfir
Copy link
Collaborator

Closes: #561

Description

  • added the new relay_pending_packets() that queries the chains for the unreceived packets and acks, builds and sends transactions. This function uses the exact methods used by the corresponding CLIs. It adds a retry around these.
  • changed relay_from_events() to perform relaying of pending packets after events subscription but before the first event is processed. The height for queries and client updates must be the height of the first event minus one.
  • noticed that the signer for msg timeout is wrong in master after my last commit and fixed that. We need to add a timeout test to the CI (Add test for MsgTimeout in CI #707)
  • also opened Add CI test for the relayer loop #708 for relayer loop CI

For contributor use:

  • Updated the Unreleased section of CHANGELOG.md with the issue.
  • If applicable: Unit tests written, added test to CI.
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Updated relevant documentation (docs/) and code comments.
  • Re-reviewed Files changed in the Github PR explorer.

@codecov-io
Copy link

codecov-io commented Feb 24, 2021

Codecov Report

Merging #709 (b2d3d7f) into master (b1b37f5) will increase coverage by 30.2%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #709      +/-   ##
=========================================
+ Coverage    13.6%   43.8%   +30.2%     
=========================================
  Files          69     150      +81     
  Lines        3752    9821    +6069     
  Branches     1374       0    -1374     
=========================================
+ Hits          513    4309    +3796     
- Misses       2618    5512    +2894     
+ Partials      621       0     -621     
Impacted Files Coverage Δ
modules/src/address.rs 100.0% <ø> (ø)
...application/ics20_fungible_token_transfer/error.rs 0.0% <ø> (ø)
...ion/ics20_fungible_token_transfer/msgs/transfer.rs 0.0% <ø> (ø)
modules/src/events.rs 0.0% <ø> (ø)
modules/src/handler.rs 100.0% <ø> (ø)
modules/src/ics02_client/client_def.rs 48.3% <ø> (ø)
modules/src/ics02_client/client_type.rs 79.1% <ø> (+31.5%) ⬆️
modules/src/ics02_client/context.rs 100.0% <ø> (ø)
modules/src/ics02_client/error.rs 100.0% <ø> (ø)
modules/src/ics02_client/events.rs 15.7% <ø> (+15.7%) ⬆️
... and 255 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7011aa8...ff602fc. Read the comment docs.

Copy link
Member

@romac romac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that's much less code than I thought it would take :)

Left a suggestion inline, but feel free to ignore.

relayer/src/link.rs Outdated Show resolved Hide resolved
fn relay_from_events(&mut self) -> Result<(), LinkError> {
fn relay_pending_packets(&mut self) -> Result<(), LinkError> {
println!("clearing old packets on {:#?}", self.channel);
for _i in 0..MAX_ITER {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More general question, which does not be addressed in this PR: would it be sensible to introduce a delay or even implement some kind of exponential backoff strategy when retrying an operation which has failed (in this case, relaying the pending packets)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh, we've been avoiding the whole error handling for some time. We have this pattern everywhere! We blindly retry in a loop for all errors. We need to sit down and see how to deal with each error. We have RPC, light client, on-chain and internal errors. The most complex ones are the on-chain errors that are a bit more subtle (leaving aside errors in the relayer code) and also they are not typed (we just get an error message and the reason is embedded in a string). The retry with exponential backoff may apply for some cases, in some cases we should not retry (e.g. a transaction has already been submitted by another relayer), in other cases we should rebuild stuff, etc. I think we should start a little doc and try to cover all errors and what actions we should take.
Because the problem is so broad I would leave this handling here for later.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #712 for error handling.

ancazamfir and others added 2 commits February 25, 2021 17:05
Co-authored-by: Romain Ruetschi <romain@informal.systems>
@ancazamfir ancazamfir merged commit 3e513db into master Feb 25, 2021
@ancazamfir ancazamfir deleted the anca/restart branch February 25, 2021 16:48
hu55a1n1 pushed a commit to hu55a1n1/hermes that referenced this pull request Sep 13, 2022
* Added relaying of pending packets in the relayer loop

* Apply Romain's suggestion

* Cargo.lock

Co-authored-by: Romain Ruetschi <romain@informal.systems>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cleanup pending packets when (re)starting the relayer loop
3 participants