
OTLP exporter status codes to retry on are incorrect #1123

Closed
kbrockhoff opened this issue Jun 15, 2020 · 6 comments
Labels: bug (Something isn't working), help wanted (Good issue for contributors to OpenTelemetry Service to pick up), priority:p2 (Medium)

Comments

@kbrockhoff (Member)

Describe the bug
The OTLP exporter retries sending data when it receives a ResourceExhausted status code, even though the call will never succeed.
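
For context, a rough sketch (with a made-up helper name, not the collector's actual code) of the status-code classification that produces this behavior; the point is that `codes.ResourceExhausted` is treated as retryable unconditionally:

```go
package retrysketch

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// isRetryable is a hypothetical helper illustrating the current behavior:
// every "transient-looking" gRPC code, including ResourceExhausted, is
// considered retryable.
func isRetryable(err error) bool {
	st, ok := status.FromError(err)
	if !ok {
		return false
	}
	switch st.Code() {
	case codes.Unavailable, codes.DeadlineExceeded, codes.Aborted,
		codes.ResourceExhausted:
		// ResourceExhausted lands here too, so an export the server will
		// always reject (e.g. a payload it refuses to accept) is retried forever.
		return true
	default:
		return false
	}
}
```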

Steps to reproduce
Remove this line from correctness_test.go and then run the tests:
send_batch_size: 1024

What did you expect to see?
A log message reporting the problem, and the data being dropped.

What did you see instead?
Nothing. I had to put a breakpoint in exporter/otlpexporter/exporter.go at line 116 to identify the problem.

What version did you use?
Version: 6dcfa47

What config did you use?
Config: See testbed/correctness/correctness_test.go

Environment
OS: OS X
Compiler (if manually compiled): go1.14.3 darwin/amd64

Additional context

@kbrockhoff added the bug label on Jun 15, 2020
@flands added this to the GA 1.0 milestone on Jun 18, 2020
@bogdandrutu (Member)

@kbrockhoff is this resolved?

@tigrannajaryan added the help wanted label on Sep 2, 2020
@tigrannajaryan (Member)

@kbrockhoff I see this is assigned to you. Did you intend to work on this?

@kbrockhoff (Member, Author)

I won't have time to work on it for at least two weeks. If nobody else picks it up by then, I will work on it.

@tigrannajaryan (Member)

@kbrockhoff OK. From the description I think what you are saying is that we should treat ResourceExhausted as a fatal error and should not retry. Is ResourceExhausted fatal in the sense that retries will never succeed?

@kbrockhoff (Member, Author)

In my experience, if the message rate is high enough to trigger throttling, you will never be able to catch up. Dropping data is the only way to restore service.

@flands removed this from the GA 1.0 milestone on Oct 1, 2020
tigrannajaryan pushed a commit that referenced this issue Apr 19, 2022
…etryInfo (#5147)

This makes us retry on `ResourceExhausted` only if retry info is provided. It also makes the code a bit more explicit.

In my case this caused an issue in production where the upstream denied requests above `max_recv_msg_size` and we kept retrying.

**Link to tracking Issue:**
#1123
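
For illustration, a minimal sketch of the approach the commit describes, assuming the server uses the gRPC richer error model and attaches `errdetails.RetryInfo` when a retry can actually help; the helper name is hypothetical and this is not the code added in #5147:

```go
package retrysketch

import (
	"google.golang.org/genproto/googleapis/rpc/errdetails"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// shouldRetry reports whether a failed export is worth retrying.
func shouldRetry(err error) bool {
	st, ok := status.FromError(err)
	if !ok {
		// Not a gRPC status; treat as permanent.
		return false
	}
	switch st.Code() {
	case codes.Unavailable, codes.DeadlineExceeded, codes.Aborted:
		// Transient conditions: retrying is safe.
		return true
	case codes.ResourceExhausted:
		// Retry only if the server explicitly signals, via RetryInfo, that a
		// later attempt may succeed. Otherwise (e.g. the request exceeds
		// max_recv_msg_size or a hard quota) drop the data instead of
		// retrying forever.
		for _, detail := range st.Details() {
			if _, ok := detail.(*errdetails.RetryInfo); ok {
				return true
			}
		}
		return false
	default:
		return false
	}
}
```

Keying the decision on RetryInfo matches the earlier observation in this thread: plain throttling with no server guidance is effectively fatal, so dropping the data is preferable to retrying indefinitely.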
Nicholaswang pushed a commit to Nicholaswang/opentelemetry-collector that referenced this issue Jun 7, 2022
@bogdandrutu (Member)

I think this is fixed.

hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this issue Apr 27, 2023
Troels51 pushed a commit to Troels51/opentelemetry-collector that referenced this issue Jul 5, 2024