Add PubSub publish timeout #8

rnarubin · 2022-04-22T21:20:35Z

This adds a timeout to publishing messages to PubSub, as well as
improves some logging in the default retry implementation.

The initial timeout implementation here is done a semver-compatible way,
and with a relatively simple configurable fixed timeout. In the next
major semver I'd like to make the configuration required, and
potentially add some time-based retrying and timeout backoff.

This adds a timeout to publishing messages to PubSub, as well as improves some logging in the default retry implementation. The initial timeout implementation here is done a semver-compatible way, and with a relatively simple configurable fixed timeout. In the next major semver I'd like to make the configuration required, and potentially add some time-based retrying and timeout backoff.

blogle · 2022-04-25T20:03:43Z

In the next major semver I'd like to make the configuration required, and potentially add some time-based retrying and timeout backoff.

I see logic for exponential backoff and that code is part of the pubsub client struct. Are we just not using that logic yet, or is this referring to something else?

blogle · 2022-04-25T19:38:22Z

src/pubsub/publish_sink.rs

+                let publish = client.publish(tonic_request);
+
+                // apply a timeout to the publish
+                let publish_fut = tokio::time::timeout(timeout, publish).map_err(


I don't know all that much about grpc... do we need both a timeout on the client and server side?

Yes. Client side is the more obvious one, in case the server never responds, or a network problem, or whatever. Server side is more of a good convention; if the server takes too long and knows the client will timeout, the server is allowed to free resources after that time, which can help responsiveness overall when under load.

blogle · 2022-04-25T19:39:42Z

src/pubsub/publish_sink.rs

+
+                // apply a timeout to the publish
+                let publish_fut = tokio::time::timeout(timeout, publish).map_err(
+                    |tokio::time::error::Elapsed { .. }| {


if we hit the server side timeout, does tokio return the same time::error:Elapsed error here?

Server timeout would result in an error from the wrappee publish future. That's why the match is on a double Result<Result<T, Status> Status>. I'd have used flatten but it's not stable yet rust-lang/rust#70142

blogle · 2022-04-25T19:56:09Z

src/retry_policy/exponential_backoff.rs

+                    backoff_ms = %interval.as_millis(),
+                    remaining_attempts = match self.intervals.size_hint() {
+                        (_lower_bound, Some(upper_bound)) => upper_bound,
+                        (lower_bound, None) => lower_bound


I don't grok why the lower bound of the iterator would be indicative of the remaining attempts

I guess it's not exactly indicative. I thought about having "unknown" if there's no upper bound. In practice this will never matter because the only possible iterator will always have an upper bound (.take(max_retries.unwrap_or(usize::MAX)))

rnarubin · 2022-04-25T20:16:50Z

In the next major semver I'd like to make the configuration required, and potentially add some time-based retrying and timeout backoff.

I see logic for exponential backoff and that code is part of the pubsub client struct. Are we just not using that logic yet, or is this referring to something else?

There's exponential backoff for retries (i.e. try again after increasing sleep). Timeouts could also use such a backoff (i.e. wait for a response with increasing timeout), instead of the current fixed timeout, and potentially the same or similar backoff code. However the current retry logic is count-limited and not time-limited, which means if the backoff is expontential and the timeout is exponential, retrying a series of timed-out requests will take N attempts which might add to many minutes of time. If retries were time-limited, then there'd be a better overall cap to the possible delay from a series of timeouts.

Having time-limited retries is a worthwhile change, but will take some work and I want to get these timeouts in place first.

rnarubin requested a review from blogle April 22, 2022 21:20

rnarubin self-assigned this Apr 22, 2022

rnarubin added 2 commits April 22, 2022 16:21

Bump version to 0.7.3

b220eea

rnarubin force-pushed the better_publish_timeout branch from 0bd0ded to b220eea Compare April 22, 2022 23:22

blogle reviewed Apr 25, 2022

View reviewed changes

blogle approved these changes Apr 26, 2022

View reviewed changes

rnarubin merged commit 53b8cae into standard-ai:master Apr 26, 2022

rnarubin deleted the better_publish_timeout branch April 26, 2022 18:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add PubSub publish timeout #8

Add PubSub publish timeout #8

rnarubin commented Apr 22, 2022

blogle commented Apr 25, 2022

blogle Apr 25, 2022

rnarubin Apr 25, 2022

blogle Apr 25, 2022

rnarubin Apr 25, 2022

blogle Apr 25, 2022

rnarubin Apr 25, 2022

rnarubin commented Apr 25, 2022

Add PubSub publish timeout #8

Add PubSub publish timeout #8

Conversation

rnarubin commented Apr 22, 2022

blogle commented Apr 25, 2022

blogle Apr 25, 2022

Choose a reason for hiding this comment

rnarubin Apr 25, 2022

Choose a reason for hiding this comment

blogle Apr 25, 2022

Choose a reason for hiding this comment

rnarubin Apr 25, 2022

Choose a reason for hiding this comment

blogle Apr 25, 2022

Choose a reason for hiding this comment

rnarubin Apr 25, 2022

Choose a reason for hiding this comment

rnarubin commented Apr 25, 2022