feat: retry certain RESOURCE_EXHAUSTED errors observed during ReadRows and report retry attempts #1257

Merged
merged 1 commit into from
Aug 24, 2021

Conversation

esert-g
Contributor

@esert-g esert-g commented Aug 20, 2021

The BigQuery Storage Read service will start returning a retryable RESOURCE_EXHAUSTED error in the next few weeks when a read session's parallelism is considered excessive, so this PR expands the retry handling logic for ReadRows with two changes:

  1. If a ReadRows request fails with a RESOURCE_EXHAUSTED error and the error has an associated RetryInfo, it is now considered retryable, and the retry delay is set according to the RetryInfo.
  2. If the client decides to retry, it now notifies the user through the provided RetryAttemptListener object. This is useful as a negative feedback mechanism for future SplitReadStream requests, which in turn reduces the likelihood of receiving the new retryable RESOURCE_EXHAUSTED error.
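
The first rule above can be sketched roughly as follows. This is a JDK-only illustration, not the code in this PR: `StatusCode`, `RetryDecision`, and `ResourceExhaustedRule` are hypothetical stand-ins for the gRPC status, the `Errors.IsRetryableStatusResult` type, and the check in `Errors.isRetryableStatus`, and the server-provided RetryInfo delay is modeled as an `Optional<Duration>`.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical stand-in for the few gRPC status codes this sketch cares about. */
enum StatusCode { UNAVAILABLE, RESOURCE_EXHAUSTED, INVALID_ARGUMENT }

/** Result of classifying one failed ReadRows attempt. */
final class RetryDecision {
  final boolean retryable;
  final Duration retryDelay; // null when the server did not mandate a delay

  RetryDecision(boolean retryable, Duration retryDelay) {
    this.retryable = retryable;
    this.retryDelay = retryDelay;
  }
}

final class ResourceExhaustedRule {
  private ResourceExhaustedRule() {}

  /**
   * Models only the new rule: RESOURCE_EXHAUSTED is retryable when, and only when,
   * the error carries a RetryInfo delay. Any other retry rules the real client
   * applies are deliberately not modeled here.
   */
  static RetryDecision classify(StatusCode code, Optional<Duration> serverRetryDelay) {
    if (code == StatusCode.RESOURCE_EXHAUSTED && serverRetryDelay.isPresent()) {
      return new RetryDecision(true, serverRetryDelay.get());
    }
    return new RetryDecision(false, null);
  }
}
```

A RESOURCE_EXHAUSTED error without RetryInfo stays non-retryable in this sketch, which is what distinguishes the new throttling signal from an ordinary quota failure.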

@esert-g esert-g requested review from a team and shollyman August 20, 2021 21:55
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Aug 20, 2021
@product-auto-label product-auto-label bot added the api: bigquerystorage Issues related to the googleapis/java-bigquerystorage API. label Aug 20, 2021
@shollyman shollyman added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 23, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 23, 2021
@shollyman shollyman added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 23, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 23, 2021
Contributor

@shollyman shollyman left a comment

LGTM, with some minor nits.

private RetryAttemptListener readRowsRetryAttemptListener = null;

/**
* If a non null readRowsRetryAttemptListener is provided, client will call onRetryAtempt function
Contributor

nit: s/onRetryAtempt/onRetryAttempt here and in the other versions.

Contributor Author

Fixed
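
For readers wondering how a retry listener could act as the negative feedback mechanism mentioned in the PR description, here is one possible shape. Everything in it is an assumption made for illustration: `RetryListener` is a simplified stand-in for `RetryAttemptListener` (whose real callback signature is not shown in this thread), and `SplitBackoff` with its threshold is a hypothetical consumer-side policy.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified stand-in for the listener the client calls before each retry. */
interface RetryListener {
  void onRetryAttempt(String reason);
}

/**
 * Hypothetical policy: once enough retries have been observed, stop asking the
 * service to split streams further, so read parallelism stops growing.
 */
final class SplitBackoff implements RetryListener {
  private final AtomicInteger retries = new AtomicInteger();
  private final int maxRetriesBeforePausing;

  SplitBackoff(int maxRetriesBeforePausing) {
    this.maxRetriesBeforePausing = maxRetriesBeforePausing;
  }

  @Override
  public void onRetryAttempt(String reason) {
    // Called from the client's retry path, possibly on another thread.
    retries.incrementAndGet();
  }

  /** True while the observed retry count is still below the threshold. */
  boolean shouldSplit() {
    return retries.get() < maxRetriesBeforePausing;
  }

  int retryCount() {
    return retries.get();
  }
}
```

A caller would check `shouldSplit()` before issuing another SplitReadStream request; a production policy would likely also decay the counter over time, which this sketch omits.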

public Duration retryDelay = null;
}

private static final Metadata.Key<RetryInfo> KEY_RETRY_INFO =
Contributor

Interesting, we'll need to see if we have compatible key resolvers for other langs. I've not seen this before, but apparently it's the descriptor's full name plus a "-bin" suffix?

Contributor Author

That's exactly what it does. I'm having a hard time finding external docs on why it is supposed to be like that, but you can find other libraries that interact with GCP services using the same keys, e.g. https://github.com/googleapis/google-cloud-go/blob/master/spanner/retry.go#L33
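
As a small illustration of the naming convention discussed here: gRPC treats metadata keys ending in "-bin" as binary-valued headers, and the key for a proto message is its descriptor full name plus that suffix. The helper below is a hypothetical JDK-only sketch of that rule (`BinaryMetadataKeys` and `keyNameFor` are made-up names); it assumes key names are lower-cased, as grpc-java does when it normalizes metadata key names.

```java
import java.util.Locale;

final class BinaryMetadataKeys {
  /** Suffix that marks a gRPC metadata entry as binary-valued. */
  static final String BINARY_SUFFIX = "-bin";

  private BinaryMetadataKeys() {}

  /** Derives the metadata key name from a proto descriptor's full name. */
  static String keyNameFor(String protoFullName) {
    // Assumption: names are normalized to lower case before use on the wire.
    return protoFullName.toLowerCase(Locale.ROOT) + BINARY_SUFFIX;
  }
}
```

For the message in question, `BinaryMetadataKeys.keyNameFor("google.rpc.RetryInfo")` yields `google.rpc.retryinfo-bin`. Because the rule depends only on the proto's full name, clients in other languages can look up the same trailer key without a Java-specific resolver.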

Errors.IsRetryableStatusResult result = Errors.isRetryableStatus(status, metadata);
if (result.isRetryable) {
// If result.retryDelay isn't null, we know exactly how long we must wait, so both regular
// and randomized delays are the same.
Contributor

Should there still be variance for the randomized delay, i.e. result.retryDelay + jitter? It looks like the previous impl didn't jitter either, so this can likely be ignored if it hasn't been a source of issues.

Contributor Author

I don't think it is needed in this case.
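
To make the behavior under discussion concrete, here is a rough JDK-only sketch of the delay selection. It is not the PR's code: `RetryDelays` and `next` are hypothetical names, `java.time.Duration` stands in for whatever duration type the client uses, and the fallback branch (a random delay below the current backoff) is an assumed stand-in for the client's regular backoff policy.

```java
import java.time.Duration;
import java.util.Random;

/** Pair of delays for the next attempt: the nominal one and the one actually slept. */
final class RetryDelays {
  final Duration retryDelay;
  final Duration randomizedDelay;

  private RetryDelays(Duration retryDelay, Duration randomizedDelay) {
    this.retryDelay = retryDelay;
    this.randomizedDelay = randomizedDelay;
  }

  /**
   * @param serverDelay delay taken from RetryInfo, or null if the server gave none
   * @param backoff current backoff interval used when there is no server delay
   */
  static RetryDelays next(Duration serverDelay, Duration backoff, Random rng) {
    if (serverDelay != null) {
      // The server said exactly how long to wait, so no jitter is applied.
      return new RetryDelays(serverDelay, serverDelay);
    }
    // Assumed fallback: pick a random delay in [0, backoff) to spread out retries.
    long boundMillis = Math.max(1L, backoff.toMillis());
    long jitteredMillis = (long) (rng.nextDouble() * boundMillis);
    return new RetryDelays(backoff, Duration.ofMillis(jitteredMillis));
  }
}
```

Adding jitter on top of a server-mandated delay would be a one-line change in the first branch; the sketch leaves it out to match the behavior described in the code comment above.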

@esert-g esert-g force-pushed the retry_attempts branch 2 times, most recently from c8c2e70 to e6f0db2 Compare August 24, 2021 17:33
Handle certain RESOURCE_EXHAUSTED errors and report the retry attempts.
@shollyman shollyman added the automerge Merge the pull request once unit tests and other checks pass. label Aug 24, 2021
@shollyman shollyman changed the title Retry certain RESOURCE_EXHAUSTED errors observed during ReadRows and report retry attempts feat: retry certain RESOURCE_EXHAUSTED errors observed during ReadRows and report retry attempts Aug 24, 2021
@gcf-merge-on-green gcf-merge-on-green bot merged commit d56e1ca into googleapis:master Aug 24, 2021
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Merge the pull request once unit tests and other checks pass. label Aug 24, 2021
gcf-merge-on-green bot pushed a commit that referenced this pull request Aug 24, 2021
shubhwip pushed a commit to shubhwip/java-bigquerystorage that referenced this pull request Oct 7, 2023
Labels
api: bigquerystorage Issues related to the googleapis/java-bigquerystorage API. cla: yes This human has signed the Contributor License Agreement.