Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timeout_millis is ignored if retry settings are provided #902

Closed
skuruppu opened this issue Sep 17, 2020 · 11 comments · Fixed by #1100
Closed

timeout_millis is ignored if retry settings are provided #902

skuruppu opened this issue Sep 17, 2020 · 11 comments · Fixed by #1100
Assignees
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@skuruppu
Copy link

The Spanner team has recently been through an effort to make the timeout and retry settings consistent across all client library languages. The result of which is now in spanner_client_config.json.

We need to set different timeout_millis for each RPC method. But we also set specific retry settings for certain groups of methods.

We found that in gax-nodejs, if retry settings are set, then timeout_millis is ignored.

  • gax-nodejs/src/gax.ts

    Lines 158 to 159 in aa5487b

    * @param {number} settings.timeout - The client-side timeout for API calls.
    * This parameter is ignored for retrying calls.
  • const retry = thisSettings.retry;
    if (retry && retry.retryCodes && retry.retryCodes.length > 0) {
    return retryable(
    func,
    thisSettings.retry!,
    thisSettings.otherArgs as GRPCCallOtherArgs,
    thisSettings.apiName
    );
    }
    return addTimeoutArg(
    func,
    thisSettings.timeout,
    thisSettings.otherArgs as GRPCCallOtherArgs
    );

This essentially means that all of our methods have a timeout_millis of 60 secs now instead of the method-specific ones we intended to set.

Would this be something that can be fixed in Node.js to respected the per-method timeout_millis?

CC @hengfengli

@AVaksman AVaksman self-assigned this Sep 17, 2020
@alexander-fenster
Copy link
Contributor

Hi @skuruppu,

We at ACTools are considering implementing a new (simpler) "logical timeout" logic this month and this code will likely be changed and/or rewritten. I'll work on this refactor. Thank you for mentioning your new Spanner config, I'll make sure it's processed correctly with my changes.

@JustinBeckwith JustinBeckwith added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Sep 17, 2020
@AVaksman
Copy link
Contributor

initial_rpc_timeout_millis is an optional parameter and is used in place of timeout_millis when retry_codes are provided.

With changes from #896 (when 2.8.1 is released) a workaround would be to not to provide initial_rpc_timeout_millis in the config file, then the settings.timeout value will default to timeout_millis value which is separately provided for each method in the config file.

Need to verify with logic in other language APIs.

@hengfengli
Copy link

spanner_client_config.json is only used by nodejs and is generated from grpc_service_config.json. If we can change the generator to not generate initial_rpc_timeout_millis, then I think it would work.

But how about maxRpcTimeoutMillis and totalTimeoutMillis? If we set a timeout_millis which has a higher value than them, what will happen then?

In Java, we set initial_rpc_timeout_millis = maxRpcTimeoutMillis = totalTimeoutMillis = timeout_millis code.

@AVaksman
Copy link
Contributor

totalTimeoutMillis => deadline
maxRpcTimeoutMillis => maxTimeout
timeout_millis => timeout

The smaller value will be used

const timeoutCal =
timeout && timeoutMult ? timeout * timeoutMult : 0;
const rpcTimeout = maxTimeout ? maxTimeout : 0;
const newDeadline = deadline ? deadline - now.getTime() : 0;
timeout = Math.min(timeoutCal, rpcTimeout, newDeadline);

@hengfengli
Copy link

hengfengli commented Sep 21, 2020

But we expect the timeout_millis should be respected, e.g., ExecuteStreamingSql's timeout should be 3600s, instead of a smaller value.

@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Dec 21, 2020
@alexander-fenster
Copy link
Contributor

(going back to this old issue - apologize for its slipping through my fingers!)

@skuruppu @hengfengli With the gax version with Alex's fixes released, do you folks still need changes here in this logic?

The "logical timeout" refactor that I mentioned earlier in this issue got postponed but will still happen sooner or later, I would probably not want to make any major changes to the current timeout code (which is already more complicated than it should be).

@skuruppu
Copy link
Author

Thanks for getting back to us @alexander-fenster.

I think even with Alex's fixes, it would still mean that we have to remove initial_rpc_timeout_millis from the config. But this is not possible to do just for Node.js since other languages use this value for a different purpose and we'll be then changing the settings across all languages.

Sorry if I misunderstood something.

@alexander-fenster
Copy link
Contributor

Assigning to @summer-ji-eng for this sprint as discussed.

@alicejli
Copy link
Contributor

Hi @skuruppu - the latest version of gax should fix this: https://github.com/googleapis/gax-nodejs/releases/tag/v2.25.4

Once nodejs-spanner is updated to use the latest version of gax, please give it a try and let me know if that fixes the issue. Thanks!

@skuruppu
Copy link
Author

Thanks a lot @alicejli. We're still at v2.17.1 which is surprising since you've had so many releases since then. I thought the bots updated this automatically. In any case, we can try this manually.

@hkdevandla hkdevandla removed the 🚨 This issue needs some love. label Sep 24, 2021
@alicejli
Copy link
Contributor

@skuruppu I'm going to close this issue as I assume the latest version of gax fixes this. Please feel free to re-open if not. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants