Confusing behavior when trying to set deadlines on a gRPC call #1013

nielm · 2021-05-26T09:50:36Z

I want to set a server-side deadline on a gRPC call so that the call is cancelled if it takes too long, This works in Java/GAX by setting the totalTimeout in RetrySettings, so the server-side appears to be working correctly.

Attempting to do the same in Node.js/GAX seems to be very confusing, and not matching the documentation:

TL/DR:

it seems that the CallOptions.timeout which is documented as a client-side timeout seems to also work as a server-side timeout, but only when RetryOptions are specified, which seems to go against the documentation.
Certain values for RetryOptions give unexpected errors when specified together.
RetryOptions are ignored completely unless at least one RetryCode is given.
Cannot set both totalTimeoutMillis and maxRetries in BackoffSettings
BackoffSettings.totalTimeoutMillis seems to be completely ignored

https://github.com/googleapis/gax-nodejs/blob/master/src/gax.ts#L157

Demonstration of CallOptions.timeout on DEADLINE_EXCEEDED

callOptions = {
  retry: {
    backoffSettings: {
      totalTimeoutMillis: 200
    }
  }
}

Has no effect - the long-running gRPC is not timed out.

callOptions = {
  timeout: 200
}

Has no effect - the long-running gRPC is not timed out.

callOptions = {
  timeout: 200,
  retry: {}
  }
}

Has no effect - the long-running gRPC is not timed out.

callOptions = {
  timeout: 200,
  retry: {
    retryCodes: [],
    }
  }

DEADLINE_EXCEEDED Raised

These tests indicate that the CallOptions.timeout is being passed as a deadline to the server (when it is documented as a client-side timeout), BUT only when a CallOptions.retry.retryCodes object is specified. The values in the CallOptions.retry object seem to be ignored.

Demonstration of totalTimeoutMillis

callOptions = {
  timeout: 200000,
  retry: {
    retryCodes: [],
    backoffSettings: {
      totalTimeoutMillis: 200
    }
  }
}

Has no effect - the long-running gRPC is not timed out.

callOptions = {
  timeout: 200000,
  retry: {
    retryCodes: [ 99999 ],
      backoffSettings: {
        totalTimeoutMillis: 200
     }
  }
}

Has no effect - the long-running gRPC is not timed out.
totalTimeoutMillis seems to be ignored

Demonstration of CallOptions.retry on DEADLINE_EXCEEDED

callOptions = {
  retry: {
    retryCodes: [ 99999 ],
      backoffSettings: {
        totalTimeoutMillis: 200
     }
  }
}

Has no effect - the long-running gRPC is not timed out.

callOptions = {
  retry: {
    retryCodes: [ 99999 ],
      backoffSettings: {
        maxRetries: 1,
        totalTimeoutMillis: 200
     }
  }
}

Client-side error:

Error: Cannot set both totalTimeoutMillis and maxRetries in backoffSettings

This makes no sense, as it seems perfectly possible to have retries and a total timeout (and the documentation implies this: regardless of the retrying attempts made meanwhile.

callOptions = {
  retry: {
    retryCodes: [ 99999 ],
      backoffSettings: {
        maxRetries: 1,
        initialRpcTimeoutMillis: 200,
        maxRpcTimeoutMillis: 200
     }
  }
}

DEADLINE_EXCEEDED Raised (woohoo!)

callOptions = {
  timeout: 200,
  retry: {
    retryCodes: [ 99999 ],
      backoffSettings: {
        maxRetries: 1,
        initialRpcTimeoutMillis: 200000,
        maxRpcTimeoutMillis: 200000
     }
  }
}

Has no effect - the long-running gRPC is not timed out.
This does correspond to the docs - that CallOptions.timeout is ignored for retryable calls.

callOptions = {
  retry: {
    retryCodes: [ ],
      backoffSettings: {
        maxRetries: 1,
        initialRpcTimeoutMillis: 200,
        maxRpcTimeoutMillis: 200
     }
  }
}

Has no effect - the long-running gRPC is not timed out.

So at least 1 retryCode must be specified

Environment details

OS: Debian Rodete (20210511.01.04RD)
node: v12.21.0
npm: 7.5.2
google-gax@2.13.0
grpc/grpc-js@1.3.2
grpc-gcp@0.3.3

The text was updated successfully, but these errors were encountered:

skuruppu · 2021-05-26T22:52:04Z

@alexander-fenster if you could please chime in here, that would be great. I filed #902 a while back, which is related.

@AVaksman you had some ideas on how to fix some of these issues, so please chime in.

@hengfengli, FYI.

nielm · 2021-05-27T09:30:31Z

Adding some more debugging:

export GRPC_VERBOSITY=DEBUG GRPC_TRACE=all

allows me to see the the grpc-timeout header and status responses from GRPC.

It seems to be set inconsistently, and depending on the call, even when retries are not specified, the call is still retried...

For example, using the CallOptions with what seems to be the most reliable way to set the deadline:

callOptions = {
  timeout: 100,
  retry: {
      retryCodes: [ ]
    }
}

A Spanner commit will have thegrpc-timeout set correctly to ~97ms and will abort immediately

A Spanner ExecuteSql (database.run) will also have thegrpc-timeout set correctly to ~97ms, BUT will retry 2 more times with 2sec, then 4sec backoff before returning a failure back to the client.

Fully specifying the CallOptions.retry without specifying a timeout:

callOptions = {
  retry: {
      retryCodes: [99999],
      backoffSettings: {
         maxRetries: 1,
         initialRetryDelayMillis: 1,
         retryDelayMultiplier: 1,
         maxRetryDelayMillis: 100,
         initialRpcTimeoutMillis: 100,
         maxRpcTimeoutMillis: 100,
         totalTimeoutMillis: 100,
         rpcTimeoutMultiplier: 1,
       }
   }
}

results in the default grpc-timeout header value of 1hr being used, and no deadline exceeded. Specifying a timeout results in the same behaviour above, with 2 retries over 6 seconds.

It seems the retry object in CallOptions is completely ignored for ExecuteStreamingSQL

alexander-fenster · 2021-07-09T17:26:18Z

Assigning to @summer-ji-eng for this sprint as discussed.

hkdevandla · 2022-01-21T22:20:17Z

Folks, what is the status here? Is this issue still in effect or resolved? Thanks!

summer-ji-eng · 2022-01-22T00:26:46Z

This has been fixed #1100. Close for now.

Feel free to open for any further question.

dzou · 2023-03-28T02:51:29Z

So what is the proper usage of the call options param now to enable timeouts?

yoshi-automation added the triage me I really want to be triaged. label May 27, 2021

nielm mentioned this issue May 27, 2021

Undocumented ability to set RequestOptions or CallOptions on many APIs googleapis/nodejs-spanner#1378

Closed

yoshi-automation added the 🚨 This issue needs some love. label May 31, 2021

alexander-fenster assigned summer-ji-eng Jul 9, 2021

skuruppu mentioned this issue Aug 5, 2021

Upgrading retry-request to 4.2.2 makes some requests take a lot longer googleapis/nodejs-spanner#1444

Closed

summer-ji-eng assigned alicejli Aug 16, 2021

alicejli mentioned this issue Sep 15, 2021

fix: edit timeout and retry logic #1100

Merged

yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Nov 14, 2021

yoshi-automation added the 🚨 This issue needs some love. label Nov 22, 2021

summer-ji-eng closed this as completed Jan 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Confusing behavior when trying to set deadlines on a gRPC call #1013

Confusing behavior when trying to set deadlines on a gRPC call #1013

nielm commented May 26, 2021 •

edited

Loading

skuruppu commented May 26, 2021

nielm commented May 27, 2021

alexander-fenster commented Jul 9, 2021

hkdevandla commented Jan 21, 2022

summer-ji-eng commented Jan 22, 2022

dzou commented Mar 28, 2023

Confusing behavior when trying to set deadlines on a gRPC call #1013

Confusing behavior when trying to set deadlines on a gRPC call #1013

Comments

nielm commented May 26, 2021 • edited Loading

Demonstration of CallOptions.timeout on DEADLINE_EXCEEDED

Demonstration of totalTimeoutMillis

Demonstration of CallOptions.retry on DEADLINE_EXCEEDED

Environment details

skuruppu commented May 26, 2021

nielm commented May 27, 2021

alexander-fenster commented Jul 9, 2021

hkdevandla commented Jan 21, 2022

summer-ji-eng commented Jan 22, 2022

dzou commented Mar 28, 2023

nielm commented May 26, 2021 •

edited

Loading