Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disabling and later reenabling a metric stream does not reset the start time when using cumulative temporality #5064

Open
alanwest opened this issue Nov 18, 2023 · 3 comments
Labels
bug Something isn't working
Milestone

Comments

@alanwest
Copy link
Member

With #4958, it is now possible to disable and later reenable a metric stream. A number of bugs were identified while in the process of working on #4958. Some were fixed, but this one remains.

Disabling a measurement can be thought of as introducing an intentional gap in a metric stream. When using cumulative temporality the current aggregated value depends on all prior measurements. The start time and current timestamp mark an unbroken sequence of measurements for that time period. When measurements stop - i.e., by disabling a metric stream - and then are later reenabled, the start time must be reset to when the measurements began again. This is important so that backends can accurately identify gaps in a metric stream.

For reference, see the documentation in the proto definition

@alanwest alanwest added the bug Something isn't working label Nov 18, 2023
@CodeBlanch
Copy link
Member

I left some "TODOs" in the code but here are some thoughts/questions I had working on this initially:

  • Should we set end time as soon as the instrument is deactivated? At the moment IIRC we set end time during the next collect/export. Let's say we stop listening to some counter and then five minutes later we do a collect. If the end time reflects five minutes of dead air would that cause issues?

  • Let's say we are collecting every five minutes and we had measurements for two minutes when some instrument is deactivated. Then a minute later it gets reactivated. We haven't shipped off the data we collected while the instrument was active. Do we throw away the data and set a new start time and reset state during that reactivation?

  • Edge case. If some library author fails on their instrumentation implementation and is repeatedly creating & disposing a meter this reset could lead to no data or really odd data. Should we have some threshold (time) between when we'll respect a reset to handle some jitter gracefully?

@CodeBlanch
Copy link
Member

CodeBlanch commented Nov 22, 2023

Update...

After this issue was opened we reverted some of the fixes/changes that were on #4958. At the moment the SDK is still removing and recreating Metric instances when something is completed/disposed and then reactivated. We also log a "duplicate instrument" warning anytime a metric is recreated. If a Metric is completed/disposed and then reactivated before a collect has fired, duplicate metrics will be exported (the one being removed and the new one which was created).

Copy link
Contributor

This issue was marked stale due to lack of activity and will be closed in 7 days. Commenting will instruct the bot to automatically remove the label. This bot runs once per day.

@github-actions github-actions bot added the Stale Issues and pull requests which have been flagged for closing due to inactivity label Oct 15, 2024
@CodeBlanch CodeBlanch added this to the Future milestone Oct 16, 2024
@CodeBlanch CodeBlanch removed the Stale Issues and pull requests which have been flagged for closing due to inactivity label Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants