feat: add Batcher#close(timeout) and Batcher#cancelOutstanding #3141
🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use GitHub's automerge feature with squashing, or use -- conventional-commit-lint bot
API compatibility is failing with:
But it is wrong because the Batcher interface is marked as
Co-authored-by: Blake Li <blakeli@google.com>
Ok, everything should be ready
…sh thread and read by the user thread during cancel()
Quality Gate failed for 'gapic-generator-java-root'. Failed conditions: see analysis details on SonarCloud.
Quality Gate failed for 'java_showcase_integration_tests'. Failed conditions: see analysis details on SonarCloud.
🤖 I have created a release *beep* *boop*

---

<details><summary>2.45.0</summary>

## [2.45.0](v2.44.0...v2.45.0) (2024-09-09)

### Features

* add Batcher#close(timeout) and Batcher#cancelOutstanding ([#3141](#3141)) ([b5a92e4](b5a92e4))
* add full RetrySettings sample code to Settings classes ([#3056](#3056)) ([8fe3a2d](8fe3a2d))
* add toString to futures returned by operations ([#3140](#3140)) ([afecb8c](afecb8c))
* bake gapic-generator-java into the hermetic build docker image ([#3067](#3067)) ([a372e82](a372e82))

### Bug Fixes

* **gax:** prevent truncation/overflow when converting time values ([#3095](#3095)) ([699074e](699074e))

### Dependencies

* add opentelemetry exporter-metrics and shared-resoucemapping to shared dependencies ([#3078](#3078)) ([fc8d80d](fc8d80d))
* update dependency certifi to v2024.8.30 ([#3150](#3150)) ([c18b705](c18b705))
* update dependency com.google.api-client:google-api-client-bom to v2.7.0 ([#3151](#3151)) ([5f43e43](5f43e43))
* update dependency com.google.errorprone:error_prone_annotations to v2.31.0 ([#3153](#3153)) ([3071509](3071509))
* update dependency com.google.errorprone:error_prone_annotations to v2.31.0 ([#3154](#3154)) ([335ee63](335ee63))
* update dependency com.google.guava:guava to v33.3.0-jre ([#3119](#3119)) ([41174b0](41174b0))
* update dependency dev.cel:cel to v0.7.1 ([#3155](#3155)) ([b1ddd16](b1ddd16))
* update dependency filelock to v3.16.0 ([#3175](#3175)) ([6681113](6681113))
* update dependency idna to v3.8 ([#3156](#3156)) ([82f5326](82f5326))
* update dependency io.netty:netty-tcnative-boringssl-static to v2.0.66.final ([#3148](#3148)) ([a7efaa8](a7efaa8))
* update dependency net.bytebuddy:byte-buddy to v1.15.1 ([#3115](#3115)) ([0e06c5f](0e06c5f))
* update dependency org.apache.commons:commons-lang3 to v3.17.0 ([#3157](#3157)) ([8d3b9fd](8d3b9fd))
* update dependency org.checkerframework:checker-qual to v3.47.0 ([#3166](#3166)) ([365674d](365674d))
* update dependency org.yaml:snakeyaml to v2.3 ([#3158](#3158)) ([e67ea9a](e67ea9a))
* update dependency platformdirs to v4.3.2 ([#3176](#3176)) ([4f2f9e0](4f2f9e0))
* update dependency virtualenv to v20.26.4 ([#3177](#3177)) ([080e078](080e078))
* update google api dependencies ([#3118](#3118)) ([67342ea](67342ea))
* update google auth library dependencies to v1.25.0 ([#3168](#3168)) ([715884a](715884a))
* update google http client dependencies to v1.45.0 ([#3159](#3159)) ([a3fe612](a3fe612))
* update googleapis/java-cloud-bom digest to 6626f91 ([#3147](#3147)) ([658e40e](658e40e))
* update junit5 monorepo to v5.11.0 ([#3111](#3111)) ([6bf84c8](6bf84c8))
* update netty dependencies to v4.1.113.final ([#3165](#3165)) ([9b5957d](9b5957d))
* update opentelemetry-java monorepo to v1.42.0 ([#3172](#3172)) ([413c44e](413c44e))

### Documentation

* Update DEVELOPMENT.md ([#3126](#3126)) ([92bdf4e](92bdf4e))

</details>

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
Co-authored-by: ldetmer <1771267+ldetmer@users.noreply.github.com>
There have been reports of batcher.close() hanging every once in a while. Currently it is impossible to debug because we don't expose any internal state to analyze.

This PR adds 2 additional methods that should help in diagnosing issues:

1. close(timeout) will try to close the batcher, but if any of the underlying batch operations fail, the exception message will contain a wealth of information describing the underlying state of operations as provided by #3140.
2. cancelOutstanding allows for remediation when close(timeout) throws an exception.

The intended use case is the Dataflow connector's FinishBundle:

```java
try {
  batcher.close(Duration.ofMinutes(1));
} catch (TimeoutException e) {
  // log details why the batch failed to close with the help of #3140
  logger.error(e);
  batcher.cancelOutstanding();
  batcher.close(Duration.ofMinutes(1));
}
```

Example exception message:

> Exception in thread "main" com.google.api.gax.batching.BatchingException: Timed out trying to close batcher after PT1S. Batch request prototype: com.google.cloud.bigtable.data.v2.models.BulkMutation@2bac9ba. Outstanding batches: Batch{operation=CallbackChainRetryingFuture{super=null, latestCompletedAttemptResult=ImmediateFailedFuture@6a9d5dff[status=FAILURE, cause=[com.google.cloud.bigtable.data.v2.models.MutateRowsException: Some mutations failed to apply]], attemptResult=null, attemptSettings=TimedAttemptSettings{globalSettings=RetrySettings{totalTimeout=PT10M, initialRetryDelay=PT0.01S, retryDelayMultiplier=2.0, maxRetryDelay=PT1M, maxAttempts=0, jittered=true, initialRpcTimeout=PT1M, rpcTimeoutMultiplier=1.0, maxRpcTimeout=PT1M}, retryDelay=PT1.28S, rpcTimeout=PT1M, randomizedRetryDelay=PT0.877S, attemptCount=8, overallAttemptCount=8, firstAttemptStartTimeNanos=646922035424541}}, elements=com.google.cloud.bigtable.data.v2.models.RowMutationEntry@7a344b65}

Co-authored-by: Blake Li <blakeli@google.com>
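The close-then-cancel remediation described in this PR can be tried out without a live backend. The sketch below is only an illustration under stated assumptions: `TimedCloseable`, `StuckBatcher`, and `closeOrCancel` are hypothetical names invented here, not gax classes, and the fake simply times out until `cancelOutstanding()` has been called. The only things taken from the PR are the two method names and the try/catch shape.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeoutException;

public class CloseWithTimeoutSketch {

  /** Hypothetical stand-in exposing only the two methods discussed in this PR. */
  interface TimedCloseable {
    void close(Duration timeout) throws InterruptedException, TimeoutException;

    void cancelOutstanding();
  }

  /** Fake batcher (assumption): close() times out until outstanding work is cancelled. */
  static final class StuckBatcher implements TimedCloseable {
    private boolean stuck = true;
    final List<String> calls = new ArrayList<>();

    @Override
    public void close(Duration timeout) throws TimeoutException {
      calls.add("close");
      if (stuck) {
        throw new TimeoutException("Timed out trying to close batcher after " + timeout);
      }
    }

    @Override
    public void cancelOutstanding() {
      calls.add("cancelOutstanding");
      stuck = false; // nothing is left in flight, so the next close() can finish
    }
  }

  /**
   * Tries a bounded close; on timeout, logs the diagnostic message, cancels
   * outstanding work, and closes again. Returns true only if the first close
   * succeeded without needing cancellation.
   */
  static boolean closeOrCancel(TimedCloseable batcher, Duration timeout)
      throws InterruptedException, TimeoutException {
    try {
      batcher.close(timeout);
      return true;
    } catch (TimeoutException e) {
      System.err.println("close timed out: " + e.getMessage());
      batcher.cancelOutstanding();
      batcher.close(timeout); // may still throw; the caller decides what to do then
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    StuckBatcher batcher = new StuckBatcher();
    boolean clean = closeOrCancel(batcher, Duration.ofSeconds(1));
    System.out.println(clean + " " + batcher.calls);
  }
}
```

Returning a boolean from the helper is a design choice of this sketch, so that a caller such as a bundle-finishing hook could record how often cancellation was needed; the PR itself does not prescribe that.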