Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HttpJson Implementation is recording Operation(Success|Fail) metric before retries are attempted #2503

Closed
lqiu96 opened this issue Feb 26, 2024 · 0 comments · Fixed by #3300 · May be fixed by #2546
Closed

HttpJson Implementation is recording Operation(Success|Fail) metric before retries are attempted #2503

lqiu96 opened this issue Feb 26, 2024 · 0 comments · Fixed by #3300 · May be fixed by #2546
Assignees
Labels
priority: p3 Desirable enhancement or fix. May not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@lqiu96
Copy link
Contributor

lqiu96 commented Feb 26, 2024

Discovered in the Otel Draft PR: #2500

Issue

For Otel, this is causing the number of attempts to always be recorded as 1 instead of the number of retries actually attempted. It does not impact retries being invoked, but will record incorrect metrics.

Steps for creating a UnaryCallable (1 is the inner most callable and 5 is the last callable invoked):

HttpJson Echo Unary:

  1. HttpJsonDirectCallable
  2. HttpJsonUnaryRequestParamCallable (optional -- created if params exist)
  3. TracedUnaryCallable
  4. HttpJsonExceptionCallable
  5. RetryingCallable

gRPC Echo Unary:

  1. GrpcDirectCallable
  2. GrpcUnaryRequestParamCallable (optional -- created if params exist)
  3. GrpcExceptionCallable
  4. RetryingCallable
  5. TracedUnaryCallable

TracedUnaryCallable

Inside the TracedUnaryCallable, the callback is invoked once in innerCallable has been invoked. The callback implementation is the TraceFinisher.

From the implementation, it'll record an operation success or failure, which is intended to invoked after all the retries are exhausted (as the retries are attempts).

Possible Solution

We should move creating the TracedUnaryCallable to the last step for HttpJson. This would require refactoring this file: https://github.com/googleapis/sdk-platform-java/blob/7902a41c87240d607179d07c28cce462ea135c5f/gax-java/gax-httpjson/src/main/java/com/google/api/gax/httpjson/HttpJsonCallableFactory.java

Concerns

Would this impact existing behavior? Do we need to do this for the other Callable types (StreamingCallables?)

@lqiu96 lqiu96 added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Feb 26, 2024
@lqiu96 lqiu96 changed the title HttpJson Implementation is recording OperationFailure metric on a retry attempt HttpJson Implementation is recording Operation(Success|Fail) metric on a retry attempt Feb 26, 2024
@lqiu96 lqiu96 changed the title HttpJson Implementation is recording Operation(Success|Fail) metric on a retry attempt HttpJson Implementation is recording Operation(Success|Fail) metric before retries are attempted Feb 27, 2024
blakeli0 added a commit that referenced this issue Mar 14, 2024
Builds off #2433,
based on the design in go/java-gapic-otel-metrics-design.

Discovered two issues via showcase tests:
1. #2502
2. #2503

These issues are not blocking for this PR.

---------

Co-authored-by: Blake Li <blakeli@google.com>
@lqiu96 lqiu96 added priority: p3 Desirable enhancement or fix. May not be included in next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels May 20, 2024
@diegomarquezp diegomarquezp self-assigned this Sep 8, 2024
diegomarquezp added a commit that referenced this issue Nov 13, 2024
Fixes #2503 

### Approach
This PR uses the approach suggested by the same bug.
> We should move creating the TracedUnaryCallable to the last step for
HttpJson. This would require refactoring this file:
https://github.com/googleapis/sdk-platform-java/blob/7902a41c87240d607179d07c28cce462ea135c5f/gax-java/gax-httpjson/src/main/java/com/google/api/gax/httpjson/HttpJsonCallableFactory.java

### Confirmation that it works
The confirmation that it works is in the modified tests [(e.g.
`retry()`)](https://github.com/googleapis/sdk-platform-java/blob/1edf55754d1f602a3bf70b3a44d1c42689ae961b/gax-java/gax-httpjson/src/test/java/com/google/api/gax/httpjson/RetryingTest.java#L158-L159).
When using the proposed test version against the version of
`HttpJsonCallableFactory` from the main branch, it will fail because it
will not record attempts started or failed whatsoever:


![image](https://github.com/user-attachments/assets/1cb89d0c-7f34-4b1c-93a5-25f2112040f7)

The image above shows all tests having failed when using the production
files from `main`, implying that no tracer attempts are recorded for any
of the tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: p3 Desirable enhancement or fix. May not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
2 participants