Unify timeouts #156

reiddraper · 2014-02-18T22:02:38Z

There are three different types of timeouts used. They are not consistently named. The three timeouts are:

gen_server:call/3. This is sometimes called the call_timeout, and if not provided, defaults to 60 seconds. I think it should always be infinity.
Request timeout. This timeout is most often the last parameter to a given exported function. Sometimes it's also specified as the 'timeout' value in an options proplist. When requests are queued up (see Request timers are started too early #155), the timer is started. If the timer goes off before the response has been read from the socket, the caller received an {error, timeout} response. I think this value should always be an integer (ie. never infinity).
Remote timeout. This is a value that is encoded in the protocol buffers request, and is used on the Riak side as a timeout. Some functions (inconsistently) also call this value the 'Timeout' (last parameter), and hardcode the 'Request timeout' to infinity. I think this value should never be specified instead of the 'Request timeout' in the function parameters, but be called remote_timeout in an options property. This value can be an integer or infinity. Some examples of functions that hardcode infinity Request timeouts and instead take in the server timeout are cs_bucket_fold, stream_list_buckets and stream_list_keys.

Because of this confusion, some tests have been removed because they were failing (and could be fixed easily). ie. 82e7964.

Mentioning folks who've touched this code recently: @evanmcc, @russelldb.

See also #155.

evanmcc · 2014-02-13T22:29:11Z

A sketch of how this might work:

all requests made via a function that hardcodes the gen_server timeout to infinity
optional final parameter is the preferred way going forward, refactor everything to have it and use it
final parameter sets the remote_timeout and the request_timeout, setting the latter a little bit longer than the former
discourage but allow timeouts via opts
call_timeout ignored

To make this work correctly, #155 would need to be fixed beforehand.

reiddraper · 2014-02-13T22:47:05Z

+1 @evanmcc. @andrewjstone points out another fun issue, which is that the 'request timer' is actually reset each time bytes are received from the socket, which means the request timer is not really an overall request timer, as an individual response might be made up of 100s of messages from the gen_tcp server.

reiddraper · 2014-02-13T22:55:39Z

@jonmeredith might also have some opinions, as it looks like he was one of the original client authors.

okeuday · 2014-02-14T18:28:57Z

@reiddraper At least in the 1.4.1 version of the client, it is possible to feed a timeout value of infinity and have the client die due to being unable to encode an integer. I did this with a get_index function call where both timeouts were set to infinity. I am not sure if this is still a problem, but it seems like it should be based on the typespecs and the discussion. If a particular timeout needs to always be an integer, it would be best if the type for the timeout variable reflected this fact... otherwise it leads to this odd error.

reiddraper · 2014-02-14T18:57:41Z

@okeuday thanks. I'll make sure the typespecs all reflect the changes I make.

This test was 'fixed' in 82e7964, but further changes will allow it to pass as-is.

As described in #156, there are several types of timeouts in the client. The timeout that is generally provided as the last argument to client operations is used to create timers which prevent us from waiting for every on messages for TCP data (from gen_tcp). There are several cases where this timeout was hardcoded to infinity. This can cause the client to hang on these requests for a (mostly) unbounded time. Even when using a gen_server timeout, the gen_server itself will continue to wait for the message to come, with no timeout. Further, because of #155, we simply use the `ServerTimeout` as the `RequestTimeout`, if there is not a separate `RequestTimeout`. It's possible that the `RequestTimeout` can fire before the `ServerTimeout` (this timeout is remote), but we'd otherwise just be picking some random number to be the difference between them. Addressing #155 will shed more light on this.

The server has its own timeout mechanism for receiving TCP data, so infinity gen_server timeouts are sufficient, barring any unknown bugs where the server would never reply. Some functions still accept `CallTimeout`s, but they're ignored.

* Functions that have been part of a tag and use `CallTimeout` are marked deprecated * Functions that are part of the 2.0 additions and haven't been tagged * yet are removed. See 5aa1ab0 for an explanation of why all gen_server:calls are now using infinity timeouts.

Backport of 5aa1ab0 As described in #156, there are several types of timeouts in the client. The timeout that is generally provided as the last argument to client operations is used to create timers which prevent us from waiting for every on messages for TCP data (from gen_tcp). There are several cases where this timeout was hardcoded to infinity. This can cause the client to hang on these requests for a (mostly) unbounded time. Even when using a gen_server timeout, the gen_server itself will continue to wait for the message to come, with no timeout. Further, because of #155, we simply use the `ServerTimeout` as the `RequestTimeout`, if there is not a separate `RequestTimeout`. It's possible that the `RequestTimeout` can fire before the `ServerTimeout` (this timeout is remote), but we'd otherwise just be picking some random number to be the difference between them. Addressing #155 will shed more light on this.

coderoshi · 2014-02-19T00:00:30Z

src/riakc_pb_socket.erl

 -spec set_bucket(pid(), bucket(), bucket_props(), timeout(), timeout()) -> ok | {error, term()}.
-set_bucket(Pid, Bucket, BucketProps, Timeout, CallTimeout) ->
+set_bucket(Pid, Bucket, BucketProps, Timeout, _CallTimeout) ->


maybe this could print an annoying deprecation message?

There is an @deprecated tag on L574, which has historically been how we've dealt with this (release notes would be ideal I think).

coderoshi · 2014-02-19T03:58:34Z

+1 Everything looks good, tests pass after some changes I had to make for yokozuna. I'll put those in a separate PR.

reiddraper · 2014-02-19T16:42:21Z

Is that a +1, or should I wait you the yoko PR?

coderoshi · 2014-02-19T16:46:25Z

That's a +1. Yoko PR pending

Unify timeouts

reiddraper added the Enhancement label Feb 13, 2014

reiddraper added 4 commits February 14, 2014 17:55

Reintroduce failing queue_disconnected_test test

cf248b1

This test was 'fixed' in 82e7964, but further changes will allow it to pass as-is.

Use infinity gen_server calls

65a6b7d

The server has its own timeout mechanism for receiving TCP data, so infinity gen_server timeouts are sufficient, barring any unknown bugs where the server would never reply. Some functions still accept `CallTimeout`s, but they're ignored.

reiddraper mentioned this pull request Feb 18, 2014

Never use infinity request timeouts #160

Merged

coderoshi reviewed Feb 19, 2014
View reviewed changes

reiddraper added a commit that referenced this pull request Feb 19, 2014

Merge pull request #156 from basho/feature/unify-timeouts

12d97a6

Unify timeouts

reiddraper merged commit 12d97a6 into master Feb 19, 2014

reiddraper deleted the feature/unify-timeouts branch February 19, 2014 16:48

seancribbs mentioned this pull request May 30, 2014

Request timers are started too early #155

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unify timeouts #156

Unify timeouts #156

reiddraper commented Feb 18, 2014

evanmcc commented Feb 13, 2014

reiddraper commented Feb 13, 2014

reiddraper commented Feb 13, 2014

okeuday commented Feb 14, 2014

reiddraper commented Feb 14, 2014

coderoshi Feb 19, 2014

reiddraper Feb 19, 2014

coderoshi commented Feb 19, 2014

reiddraper commented Feb 19, 2014

coderoshi commented Feb 19, 2014

Unify timeouts #156

Unify timeouts #156

Conversation

reiddraper commented Feb 18, 2014

evanmcc commented Feb 13, 2014

reiddraper commented Feb 13, 2014

reiddraper commented Feb 13, 2014

okeuday commented Feb 14, 2014

reiddraper commented Feb 14, 2014

coderoshi Feb 19, 2014

Choose a reason for hiding this comment

reiddraper Feb 19, 2014

Choose a reason for hiding this comment

coderoshi commented Feb 19, 2014

reiddraper commented Feb 19, 2014

coderoshi commented Feb 19, 2014