Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry batchGet RESOURCE_EXHAUSTED #1387

Closed
IchordeDionysos opened this issue Jan 4, 2021 · 4 comments · Fixed by #1544
Closed

Retry batchGet RESOURCE_EXHAUSTED #1387

IchordeDionysos opened this issue Jan 4, 2021 · 4 comments · Fixed by #1544
Assignees
Labels
api: firestore Issues related to the googleapis/nodejs-firestore API. priority: p3 Desirable enhancement or fix. May not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@IchordeDionysos
Copy link
Contributor

I've experienced the RESOURCE_EXHAUSTED error recently a lot. So, I wanted to dig into the error and understand why occurs and how it could be fixed.

I've found this answer from a Google Cloud engineer suggesting that this is a quota error, however, I did not find any quota that is exhausted...

To verify that this is indeed a temporary issue (as this kept occurring for the same read document), I've validated that I could read the document outside of the script where it occurs.

Under that assumption, it is a temporary issue and a quota issue. I thought why doesn't it get retried?

I've also found this issue: #909
Where @schmidt-sebastian confirms that some issues do get retried, but not RESOURCE_EXHAUSTED is there a reason why this error does not get retried?

From another issue for Node PubSub, it suggests that usually also a RESOURCE_EXHAUSTED error will be retried (at least by other libraries).
googleapis/google-cloud-node#2661 (comment)

So I guess we should also retry on RESOURCE_EXHAUSTED errors to make them occur less!

Environment details

  • OS: Compute Engine image (debian-9-stretch-v20200902)
  • Node.js version: 12.20.0
  • npm version: 6.14.8
  • @google-cloud/firestore version: 4.7.1
@IchordeDionysos
Copy link
Contributor Author

@schmidt-sebastian what are your thoughts on this?

@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Jan 5, 2021
@product-auto-label product-auto-label bot added the api: firestore Issues related to the googleapis/nodejs-firestore API. label Jan 5, 2021
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Jan 9, 2021
@dmahugh dmahugh added priority: p3 Desirable enhancement or fix. May not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. labels Jan 14, 2021
@yoshi-automation yoshi-automation removed triage me I really want to be triaged. 🚨 This issue needs some love. labels Jan 14, 2021
@schmidt-sebastian
Copy link
Contributor

We could retry on Resource Exhausted, but it is a little more complicated then just changing a setting. We currently rely on a shared retry configuration for all of our retries (https://github.com/googleapis/nodejs-firestore/blob/master/dev/src/v1/firestore_client_config.json). This retry configuration configures all error codes that we deem retryable, as well as the backoff between attempts. In this configuration, it is currently not possible to use a longer retry interval for Resource Exhausted, and using the existing retry intervals would further starve your resources.

If we do address this, it would mean that we have to add custom backoff handling for this error code. I am not opposed to this, but I don't know when we will have the bandwidth.

Thanks for filing this issue!

@IchordeDionysos
Copy link
Contributor Author

IchordeDionysos commented Jan 15, 2021

@schmidt-sebastian is in the config file already such a thing to consider various retry configs based on the error code/group?

Because there would be the option to add additional retry configurations other than default.

{
  "retry_params": {
    "default": {
      "initial_retry_delay_millis": 100,
      "retry_delay_multiplier": 1.3,
      "max_retry_delay_millis": 60000,
      "initial_rpc_timeout_millis": 60000,
      "rpc_timeout_multiplier": 1,
      "max_rpc_timeout_millis": 60000,
      "total_timeout_millis": 600000
    }
  }
}

So I suppose if one would tackle this issue, they would create a new retry config there. That's straightforward.

But it looks like there wouldn't be a straight forward option to tailor the retry config based on the error code, right now.
Or am I missing something?

I am asking to understand what would be amount of work that would go into this.

But from looking at how the retry config works is probably a bigger issue as this would involve changing the service definitions in gax right?

@schmidt-sebastian
Copy link
Contributor

We currently don't have the ability to use different retry settings based on error codes. We can use a different retry configuration for different methods, but unfortunately, that is not enough here. Any change here would need to plumbed through to GAX in all of our supported languages. If we just wanted to address this for Firestore and just in Node, we could add special handling for Resource Exhausted in this repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: firestore Issues related to the googleapis/nodejs-firestore API. priority: p3 Desirable enhancement or fix. May not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants