Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it possible to indicate partial success in an OTLP export response [2] #414

Conversation

joaopgrassi
Copy link
Member

@joaopgrassi joaopgrassi commented Jul 22, 2022

This PR is a second attempt in solving/addressing open-telemetry/opentelemetry-specification#2454. This uses the rejected semantics instead of accepted as discussed in the initial PR #390.

Related spec PR: open-telemetry/opentelemetry-specification#2696

The need for this new PR was that with accepted we were running into problems with the 0 value being the same as the default/not set, plus general intention to avoid having optional on the fields.

I did some tests with sample apps to confirm the cases for backwards compatibility. Those can be checked in: joaopgrassi/otlp-partial-success-experimental. The TL;DR: Since it's a new field no breaking change is introduced. New sender will always have to check if the partial_success is set because they can be working with servers using the old protos.

Some context on why this feature is desired:

This feature IS intended to help in the following scenarios:

  • Enable servers to provide information to users about a partial success without having to return a 400 - BadRequest (status quo today)
  • Enable senders to generate metrics about themselves (otlp.exporter.rejected.data_points=x)
  • Enable senders to log messages when a partial success is encountered (very useful to troubleshoot things quickly, without having to reach/search for your vendor's logs for ex)
  • Capture situations that are not necessarily client or server errors.
    • Change the type of metric instrument after an initial submission - Gauge > Counter
    • Customers reaching their quotas in the middle of an export request
    • Data not conforming with vendor's constraints - metric keys with not accepted characters coming from a library outside the application's owner control (with the errors they can rename them with a view for ex)
    • OTLP data arriving with timestamp in the past - no explicit contract for this in OTel. Might be tricky for users to debug what's going on

This feature IS NOT intended to help in the following scenarios:

  • Enable clients to retry a partial success request
    • Complicated to handle. Needs more structure, semantic convention on typical errors. Was discussed in the initial proto PR
    • This can always be added later on in a non-breaking manner. I started a separated discussion on this topic for metrics here
  • Identify which exact log line/data point/span was rejected. Similar to the point above, it needs more discussion on how this can be done and which structure we should have. Always can be done/improved later

cc @tigrannajaryan @jmacd @jsuereth @jack-berg @reyang

@joaopgrassi joaopgrassi requested a review from a team July 22, 2022 10:20
Copy link
Contributor

@jmacd jmacd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you!

opentelemetry/proto/collector/logs/v1/logs_service.proto Outdated Show resolved Hide resolved
@jmacd jmacd merged commit 200ccff into open-telemetry:main Aug 3, 2022
@joaopgrassi joaopgrassi deleted the otlp-partial-success-rejected-semantics branch August 3, 2022 16:36
bogdandrutu added a commit that referenced this pull request Aug 3, 2022
…#419)

* Use buf to detect for wire breaking changes (#415)

* Make it possible to indicate partial success in an OTLP export response [2] (#414)

* Make it possible to indicate partial success in an OTLP export response

* Update changelog with PR number

* PR suggestions

* Improve docs around using rejected as 0 to convey warnings

* Adapt comments to match recent spec wording changes

* Clarify behavior of partial success not set/empty

Co-authored-by: Aaron Abbott <aaronabbott@google.com>
Co-authored-by: Joao Grassi <joao@joaograssi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants