New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Manually report status for core exporters #8788

Closed

mwear wants to merge 4 commits into open-telemetry:main from mwear:core-exporters-manual-status

Member

mwear commented Nov 1, 2023

Description:
This is a PR is an alternative to #8684. It adds manual status reporting to the otlp, otlphttp, and debug exporters. Reviewers, please compare with #8684 and weigh in on what approach you prefer, and/or additional suggestions.

Link to tracking Issue: #7682

Testing: Units, manual.

mwear requested review from a team and codeboten

November 1, 2023 05:01

mwear mentioned this pull request

Automatic status reporting for exporterhelper and core exporters #8684

Closed

codecov bot commented Nov 1, 2023 •

edited

Loading

Codecov Report

Attention: Patch coverage is 94.25287% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 90.27%. Comparing base (946dc24) to head (9772f45).
Report is 292 commits behind head on main.

Files	Patch %	Lines
exporter/otlphttpexporter/otlp.go	87.50%	5 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #8788   +/-   ##
=======================================
  Coverage   90.27%   90.27%           
=======================================
  Files         343      343           
  Lines       18003    18069   +66     
=======================================
+ Hits        16252    16312   +60     
- Misses       1424     1429    +5     
- Partials      327      328    +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

codeboten reviewed

View reviewed changes

Contributor

codeboten left a comment

Looking at this in comparison with #8684, I like the approach in the other PR better, it feels like the opportunities for most of the cases is covered automatically for components, while still giving enough flexibility for edge cases.

With the approach in this PR, it feels like most of the code will end up being quite repetitive across components. I'm thinking of how much code this would mean for the contrib repo especially.

Member Author

mwear commented Nov 1, 2023

If I were only reporting status for exporters, I would choose #8684 because it introduces a flexible opt-in approach that reduces the overall amount of code that needs to be written. My only reservation is that that approach wouldn't apply fully to other component types. We could add similar WithStatusReporting options to the processorhelper and receiverhelper, but they would likely only allow you to wrap start. Receivers are not consumers. While processors are consumers, they are in the middle of a pipeline, and errors returned from consume do not necessarily belong to the processor and might be propagated from further down the pipeline. That is to say, status reporting for receivers and processors will be more manual, so it might make sense to do things manually for exporters so they aren't a special case.

mwear mentioned this pull request

Automate status reporting on start #8836

Merged

Contributor

github-actions bot commented Nov 16, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added the Stale label

codeboten pushed a commit that referenced this pull request


          Automate status reporting on start (#8836)

433f7ae

This is part of the continued component status reporting effort.
Currently we have automated status reporting for the following component
lifecycle events: `Starting`, `Stopping`, `Stopped` as well as
definitive errors that occur in the starting or stopping process (e.g.
as determined by an error return value). This leaves the responsibility
to the component to report runtime status after start and before stop.
We'd like to be able to extend the automatic status reporting to report
`StatusOK` if `Start` completes without an error. One complication with
this approach is that some components spawn async work (via goroutines)
that, depending on the Go scheduler, can report status before `Start`
returns. As such, we cannot assume a nil return value from `Start` means
the component has started properly. The solution is to detect if the
component has already reported status when start returns, if it has, we
will use the component-reported status and will not automatically report
status. If it hasn't, and `Start` returns without an error, we can
report `StatusOK`. Any subsequent reports from the component (async or
otherwise) will transition the component status accordingly.

The tl;dr is that we cannot control the execution of async code, that's
up to the Go scheduler, but we can handle the race, report the status
based on the execution, and not clobber status reported from within the
component during the startup process. That said, for components with
async starts, you may see a `StatusOK` before the component-reported
status, or just the component-reported status depending on the actual
execution of the code. In both cases, the end status will be same.

The work in this PR will allow us to simplify #8684 and #8788 and
ultimately choose which direction we want to go for runtime status
reporting.

**Link to tracking Issue:** #7682

**Testing:** units / manual

---------

Co-authored-by: Alex Boten <aboten@lightstep.com>

mwear force-pushed the core-exporters-manual-status branch 3 times, most recently from c465fe3 to 8f96171 Compare

November 29, 2023 02:00

github-actions bot removed the Stale label

pantuza pushed a commit to pantuza/opentelemetry-collector that referenced this pull request


          Automate status reporting on start (open-telemetry#8836)

fa319ca

This is part of the continued component status reporting effort.
Currently we have automated status reporting for the following component
lifecycle events: `Starting`, `Stopping`, `Stopped` as well as
definitive errors that occur in the starting or stopping process (e.g.
as determined by an error return value). This leaves the responsibility
to the component to report runtime status after start and before stop.
We'd like to be able to extend the automatic status reporting to report
`StatusOK` if `Start` completes without an error. One complication with
this approach is that some components spawn async work (via goroutines)
that, depending on the Go scheduler, can report status before `Start`
returns. As such, we cannot assume a nil return value from `Start` means
the component has started properly. The solution is to detect if the
component has already reported status when start returns, if it has, we
will use the component-reported status and will not automatically report
status. If it hasn't, and `Start` returns without an error, we can
report `StatusOK`. Any subsequent reports from the component (async or
otherwise) will transition the component status accordingly.

The tl;dr is that we cannot control the execution of async code, that's
up to the Go scheduler, but we can handle the race, report the status
based on the execution, and not clobber status reported from within the
component during the startup process. That said, for components with
async starts, you may see a `StatusOK` before the component-reported
status, or just the component-reported status depending on the actual
execution of the code. In both cases, the end status will be same.

The work in this PR will allow us to simplify open-telemetry#8684 and open-telemetry#8788 and
ultimately choose which direction we want to go for runtime status
reporting.

**Link to tracking Issue:** open-telemetry#7682

**Testing:** units / manual

---------

Co-authored-by: Alex Boten <aboten@lightstep.com>

Contributor

github-actions bot commented Dec 13, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added Stale and removed Stale labels

Contributor

github-actions bot commented Dec 31, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added Stale and removed Stale labels

This was referenced Jan 19, 2024

[healthcheckextensionv2] Introduce health check extension based on component status reporting open-telemetry/opentelemetry-collector-contrib#30673

Closed

add gRPC health check #8397

Closed

Contributor

github-actions bot commented Jan 20, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added the Stale label

mwear added 4 commits

January 23, 2024 14:56


          Manually report status for core exporters

a359acc


          Treat 413 as permament error

5ea8b45

It implies the batch size is misconfigured.


          Do not report status on start

a36f4c7


          Post-rebase fixes

9772f45

mwear force-pushed the core-exporters-manual-status branch from 8f96171 to 9772f45 Compare

January 23, 2024 22:58

github-actions bot removed the Stale label

Contributor

github-actions bot commented Feb 8, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added Stale and removed Stale labels

Contributor

github-actions bot commented Mar 1, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added Stale and removed Stale labels

Contributor

github-actions bot commented Mar 23, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added Stale and removed Stale labels

Contributor

github-actions bot commented Apr 8, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added the Stale label

mwear mentioned this pull request

[component] Runtime status reporting #9957

Open

Contributor

github-actions bot commented Apr 24, 2024

Closed as inactive. Feel free to reopen if this PR is still being worked on.

github-actions bot closed this

mwear mentioned this pull request

[exporter/otlp] Report runtime status #11366

Closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels