Workers should retry indefinitely #1290

Closed
5 tasks
Tracked by #1350
greg-szabo opened this issue Aug 16, 2021 · 3 comments · Fixed by #1340
Labels
E: gravity External: related to Gravity DEX I: logic Internal: related to the relaying logic O: usability Objective: cause to improve the user experience (UX) and ease using the product
Milestone

Comments

@greg-szabo
Member

Crate

hermes

Summary of Bug

A sentry node restart took too long, and this killed the Hermes worker:

Aug 16 13:34:34 hermes hermes[25520]: Aug 16 13:34:34.641 ERROR [cosmoshub-4:transfer/channel-141 -> osmosis-1] worker: handling command encountered error: failed during query to chain id osmosis-1 with underlying error: Light client error for RPC address http://10.10.51.151:26657/: node at http://10.10.51.151:26657/ running chain osmosis-1 not caught up path=packet::channel-141/transfer:cosmoshub-4->osmosis-1
Aug 16 13:34:34 hermes hermes[25520]: Aug 16 13:34:34.642 ERROR [core-1:transfer/channel-6 -> osmosis-1] worker: handling command encountered error: failed during query to chain id osmosis-1 with underlying error: Light client error for RPC address http://10.10.51.151:26657/: node at http://10.10.51.151:26657/ running chain osmosis-1 not caught up path=packet::channel-6/transfer:core-1->osmosis-1
Aug 16 13:34:34 hermes hermes[25520]: Aug 16 13:34:34.876 ERROR [cosmoshub-4:transfer/channel-141 -> osmosis-1] worker: handling command encountered error: failed during query to chain id osmosis-1 with underlying error: Light client error for RPC address http://10.10.51.151:26657/: node at http://10.10.51.151:26657/ running chain osmosis-1 not caught up path=packet::channel-141/transfer:cosmoshub-4->osmosis-1
Aug 16 13:34:34 hermes hermes[25520]: Aug 16 13:34:34.876 ERROR [packet::channel-141/transfer:cosmoshub-4->osmosis-1#103] worker aborted with error: Packet worker failed after 7 retries

Version

0.6.1

Steps to Reproduce

Acceptance Criteria

Workers should retry indefinitely.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate milestone (priority) applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@greg-szabo greg-szabo changed the title workers should retry infinetly workers should retry infinitely Aug 16, 2021
@greg-szabo greg-szabo added this to the 08.2021 milestone Aug 17, 2021
@adizere adizere added I: logic Internal: related to the relaying logic O: usability Objective: cause to improve the user experience (UX) and ease using the product labels Aug 17, 2021
@adizere
Member

adizere commented Aug 17, 2021

Looking at logs from Gravity DEX/Emeris, the same problem shows up there. We'll work on this ASAP, thanks for reporting, Greg!

@romac romac changed the title workers should retry infinitely Workers should retry indefinitely Aug 17, 2021
@romac
Member

romac commented Aug 17, 2021

As discussed during today's call, there are multiple, mostly orthogonal solutions for this:

a) Retry indefinitely
b) Make retry limit/strategy configurable
c) Detect if the node is in fast sync mode and back off temporarily
d) Add backup nodes to configuration and automatically fall back

We eventually want (b), (c), and (d), with (b) enabling (a), but will start with (a) directly for now.
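
To illustrate how (b) can enable (a), here is a minimal sketch of a retry loop whose limit is optional. It is not Hermes's actual implementation: `RetryPolicy`, `delay_for`, and `retry` are hypothetical names, and the capped exponential backoff is an assumption made for this example rather than a strategy chosen in this issue.

```rust
use std::time::Duration;

/// Hypothetical retry policy for illustration; not a Hermes type.
#[derive(Clone, Debug)]
pub struct RetryPolicy {
    /// `None` retries indefinitely (option a);
    /// `Some(n)` gives up after `n` retries (option b).
    pub max_retries: Option<u32>,
    /// Delay before the first retry.
    pub initial_delay: Duration,
    /// Upper bound on the delay between retries.
    pub max_delay: Duration,
}

impl RetryPolicy {
    /// Delay before retry number `attempt` (0-based).
    /// The delay doubles on each attempt and is capped at `max_delay`.
    pub fn delay_for(&self, attempt: u32) -> Duration {
        let factor = 2u32.saturating_pow(attempt);
        self.initial_delay.saturating_mul(factor).min(self.max_delay)
    }
}

/// Runs `op` until it succeeds or the retry budget of `policy` is exhausted.
/// `sleep` is passed in so the loop can be exercised without real waiting.
pub fn retry<T, E>(
    policy: &RetryPolicy,
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt: u32 = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => {
                // With a finite limit, return the last error once it is reached.
                if let Some(max) = policy.max_retries {
                    if attempt >= max {
                        return Err(err);
                    }
                }
                sleep(policy.delay_for(attempt));
                attempt = attempt.saturating_add(1);
            }
        }
    }
}
```

In a worker, `sleep` would typically be `std::thread::sleep`. With `max_retries: None`, a node that is temporarily "not caught up" only delays the worker instead of aborting it, while `Some(7)` would give up after 7 retries, as in the log above.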

@greg-szabo
Member Author

greg-szabo commented Aug 17, 2021 via email

@ancazamfir ancazamfir added the E: gravity External: related to Gravity DEX label Sep 1, 2021
@adizere adizere modified the milestones: 08.2021, 09.2021 Sep 6, 2021
@romac romac mentioned this issue Sep 14, 2021
11 tasks

4 participants