Modified packet worker to use stubborn strategy #1340

adizere · 2021-09-10T08:56:49Z

Closes: #1290

Description

The following describes worker behavior after this PR:

packet worker:
- old behavior: used a default retry strategy that retried for 6 iterations and then gave up;
  - the individual retry backoff varied between 200ms and 500ms
  - the cumulative backoff time was 2 seconds
- new behavior: uses a stubborn strategy, which retries indefinitely and never gives up
  - the individual backoff time (between subsequent retries) starts at 1 second and increases steadily by 10ms at every step
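
A minimal sketch of the new backoff schedule described above, for illustration only. The names `stubborn_delays` and `retry_stubbornly` are hypothetical and are not the `worker_stubborn_strategy` function from this PR; the sketch only assumes the two numbers stated above (1 second initial delay, 10ms growth per step). The `sleep` argument is injected so the loop can be exercised without actually waiting.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the stubborn schedule: an unbounded sequence of
/// delays that starts at 1s and grows by 10ms at every step.
fn stubborn_delays() -> impl Iterator<Item = Duration> {
    let initial = Duration::from_secs(1);
    let step = Duration::from_millis(10);
    // `successors` keeps yielding for as long as the closure returns `Some`;
    // `saturating_add` never fails, so the sequence never ends.
    std::iter::successors(Some(initial), move |prev| Some(prev.saturating_add(step)))
}

/// Calls `op` until it succeeds, invoking `sleep` with the next delay after
/// every failure. Returns the successful value and the number of failures.
fn retry_stubbornly<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> (T, usize) {
    let mut failures = 0;
    for delay in stubborn_delays() {
        match op() {
            Ok(value) => return (value, failures),
            Err(_) => {
                failures += 1;
                sleep(delay);
            }
        }
    }
    unreachable!("the delay sequence is unbounded")
}
```

In a real worker, `sleep` could simply be `std::thread::sleep`; passing it in is a design choice made here to keep the sketch testable.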

The rationale for this choice is as follows. I expect that past a certain duration (I chose 6 hours in this PR), Hermes might as well give up, as there is little chance that the connection can become healthy again without operator intervention. I am not sure if my intuition is appropriate, though.

Unchanged behavior:

client worker:
- this worker is in charge of regularly updating clients (unless there is already channel traffic, which will trigger client updates), once every 2/3 of the trusting period
- this worker is already stubborn: it never quits unless the supervisor instructs it to do so, or if the client is expired or frozen

channel and connection workers:
- these are in charge of finishing up initialized (but incomplete) connection handshakes or initialized channel handshakes
- these workers don't need to retry stubbornly, because their activity is time-sensitive
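
To make the client worker's refresh timing concrete, here is a small sketch of the 2/3 computation. The function name `refresh_interval` is hypothetical, and the 14-day trusting period in the usage note is only an illustrative value, not one taken from this PR.

```rust
use std::time::Duration;

/// Hypothetical helper: the interval between client refreshes, taken as
/// 2/3 of the client's trusting period.
fn refresh_interval(trusting_period: Duration) -> Duration {
    // Multiply before dividing to limit rounding loss.
    trusting_period * 2 / 3
}
```

For example, with an assumed trusting period of 14 days, `refresh_interval` returns 806400 seconds, that is, 9 days and 8 hours.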

For contributor use:

Added a changelog entry, using unclog.
If applicable: Unit tests written, added test to CI.
Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work.
Updated relevant documentation (docs/) and code comments.
Re-reviewed Files changed in the GitHub PR explorer.

adizere · 2021-09-10T08:58:56Z

relayer/src/worker/retry_strategy.rs

+    use crate::worker::retry_strategy::{worker_stubborn_strategy, worker_default_strategy};
+
+    #[test]
+    fn default_strategy() {


This test is almost identical to this existing test:
https://github.com/informalsystems/ibc-rs/blob/85446586daf6dba652271eef31dae29f35da86c2/relayer/src/util/retry.rs#L153

I only realized that after writing it, though.

I still find it useful to keep around, because it helps to understand the exact details of the strategy we use in practice.
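
As an illustration of what such a test can pin down, here is a sketch of a bounded schedule in the spirit of the old default strategy. It is not the code or the test from this PR: `bounded_delays` is a hypothetical helper, and the 100ms growth step is an assumption. Only the 200ms to 500ms range, the 2 second total, and the 6 iterations come from the PR description.

```rust
use std::time::Duration;

/// Hypothetical bounded schedule: delays grow from `initial` by `step`, each
/// delay is capped at `max_delay`, and the schedule stops once the delays add
/// up to `max_total`.
fn bounded_delays(
    initial: Duration,
    step: Duration,
    max_delay: Duration,
    max_total: Duration,
) -> Vec<Duration> {
    let mut delays = Vec::new();
    let mut elapsed = Duration::ZERO;
    let mut next = initial;
    while elapsed < max_total {
        // Cap the individual delay, then truncate it so that the running
        // total never exceeds the overall budget.
        let delay = next.min(max_delay).min(max_total - elapsed);
        if delay.is_zero() {
            // A zero delay would never advance `elapsed`; stop instead of looping forever.
            break;
        }
        delays.push(delay);
        elapsed += delay;
        next = next.saturating_add(step);
    }
    delays
}
```

Under these assumed parameters (200ms initial, 100ms step, 500ms cap, 2s total), the schedule has 6 entries: 200, 300, 400, 500 and 500 ms, followed by a final delay truncated to 100 ms so that the total lands exactly on 2 seconds. The truncated last entry is a consequence of how this sketch enforces the budget.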

hu55a1n1

👌💯

* Modified packet worker to use stubborn strategy
* Longer retry strategy, bigger backoff
* Fmt & clippy
* Made retry indefinite
* Better comments
* changelog

adizere added 2 commits September 10, 2021 10:52

Modified packet worker to use stubborn strategy

3428577

Longer retry strategy, bigger backoff

4177a32

adizere requested review from ancazamfir and romac as code owners September 10, 2021 08:56

adizere commented Sep 10, 2021

Fmt & clippy

94c290d

adizere changed the title from "Modified packet worker to use more stubborn strategy" to "Modified packet worker to use stubborn strategy" Sep 10, 2021

adizere requested a review from mircea-c September 10, 2021 12:42

adizere marked this pull request as draft September 15, 2021 15:16

adizere added 3 commits September 16, 2021 17:46

Made retry indefinite

7217771

Better comments

c7c9650

changelog

d95beec

adizere marked this pull request as ready for review September 16, 2021 15:53

hu55a1n1 approved these changes Sep 21, 2021

adizere merged commit 66049e2 into master Sep 22, 2021

adizere deleted the adi/1290_stubborn_workers branch September 22, 2021 08:38
