
Request timers are started too early #155

Open
reiddraper opened this issue Feb 13, 2014 · 2 comments

Comments

@reiddraper
Contributor

riakc_pb_socket uses {active, once} to receive TCP data as messages. To support timeouts on reading this data, it also sends itself messages using erlang:send_after. Since only one request can be outstanding at a time, concurrent requests are queued and processed FIFO. However, the timer for an individual request is started when the request is queued, not when it is actually sent to Riak: we call send_after inside new_request, at which point the request may only have been queued. This has two consequences:

  • The timer may go off during a different request, in which case the queued request is removed.
  • The timer may go off even though we have been waiting on TCP data for less than the timeout.

This may actually be on purpose, but to me it conflates a TCP read timeout with an 'overall request' timeout, which would include time spent waiting in the queue.
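To make the problem concrete, here is a minimal sketch (hypothetical module, record, and message names; not the actual riakc_pb_socket code) contrasting a timer started when the request is created and queued with one started only when the request is written to the socket:

```erlang
-module(timer_sketch).
-export([new_request/3, send_request/2]).

-record(request, {ref, msg, from, timeout, tref}).
-record(state, {sock, active}).

%% Current behaviour: the timeout starts counting as soon as the request
%% record is built, even if it then sits behind another request in the queue.
new_request(Msg, From, Timeout) ->
    Ref = make_ref(),
    Tref = erlang:send_after(Timeout, self(), {req_timeout, Ref}),
    #request{ref = Ref, msg = Msg, from = From, timeout = Timeout, tref = Tref}.

%% Alternative: start the timer only when the request actually hits the wire,
%% so it measures time spent waiting on TCP data rather than time in the queue.
send_request(#request{ref = Ref, msg = Msg, timeout = Timeout} = Req,
             #state{sock = Sock} = State) ->
    ok = gen_tcp:send(Sock, Msg),  %% protobuf encoding omitted for brevity
    Tref = erlang:send_after(Timeout, self(), {req_timeout, Ref}),
    State#state{active = Req#request{tref = Tref}}.
```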

@reiddraper
Contributor Author

@jonmeredith looks like you wrote this about four years ago. Any recollection whether it was designed this way on purpose?

reiddraper added a commit that referenced this issue Feb 18, 2014
As described in #156, there are several types of timeouts in the client.
The timeout that is generally provided as the last argument to client
operations is used to create timers which prevent us from waiting forever
on messages for TCP data (from gen_tcp). There are several cases
where this timeout was hardcoded to infinity. This can cause the client
to hang on these requests for a (mostly) unbounded time. Even when using
a gen_server timeout, the gen_server itself will continue to wait for
the message to arrive, with no timeout. Further, because of #155, we
simply use the `ServerTimeout` as the `RequestTimeout` if there is not
a separate `RequestTimeout`. It's possible that the `RequestTimeout` can
fire before the `ServerTimeout` (the `ServerTimeout` is remote), but we'd
otherwise just be picking some arbitrary number as the difference
between them. Addressing #155 will shed more light on this.
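For a purely hypothetical illustration of the defaulting the commit message describes (made-up function and message shapes, not the real riakc_pb_socket API), where the server-side timeout doubles as the local TCP-read timeout when no separate request timeout is given:

```erlang
-module(timeout_defaults).
-export([get/4, get/5]).

%% With only one timeout supplied, reuse it as both the remote server
%% timeout and the local request (TCP read) timeout.
get(Pid, Bucket, Key, ServerTimeout) ->
    get(Pid, Bucket, Key, ServerTimeout, ServerTimeout).

get(Pid, Bucket, Key, ServerTimeout, RequestTimeout) ->
    %% ServerTimeout is carried in the request and enforced on the Riak node;
    %% RequestTimeout drives the local erlang:send_after timer that bounds
    %% how long we wait for TCP data from gen_tcp.
    Req = {get, Bucket, Key, ServerTimeout},
    gen_server:call(Pid, {req, Req, RequestTimeout}).
```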
reiddraper added a commit that referenced this issue Feb 18, 2014
Backport of 5aa1ab0

@seancribbs
Contributor

@reiddraper Has this been adequately resolved by #156 and #160?
