Skip to content
This repository has been archived by the owner on Oct 3, 2023. It is now read-only.

Stop including artificial 0 bound when existing lowest bound is negative #262

Merged
merged 1 commit into from
Jun 9, 2020

Conversation

x13n
Copy link
Contributor

@x13n x13n commented Jun 4, 2020

@codecov-commenter
Copy link

Codecov Report

Merging #262 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #262   +/-   ##
=======================================
  Coverage   72.02%   72.02%           
=======================================
  Files          17       17           
  Lines        1691     1691           
=======================================
  Hits         1218     1218           
  Misses        397      397           
  Partials       76       76           
Impacted Files Coverage Δ
stats.go 78.27% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e191b7c...570d2d2. Read the comment docs.

Copy link
Contributor

@nilebox nilebox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change itself LGTM, as long as negative bounds are actually supported / allowed.
Please confirm with @rghetia if that's a legit change.

Also note that we will need to make a release and update the dependency in https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/master/exporter/stackdriverexporter to pick up this change in the OpenTelemetry Collector.

@@ -536,7 +536,7 @@ func newTypedValue(vd *view.View, r *view.Row) *monitoringpb.TypedValue {
}

func shouldInsertZeroBound(bounds ...float64) bool {
if len(bounds) > 0 && bounds[0] != 0.0 {
if len(bounds) > 0 && bounds[0] > 0.0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I digged a bit into the history behind this condition, and it seems like in OpenCensus the negative bounds were explicitly forbidden, see census-instrumentation/opencensus-go#963

You supposedly have encountered this issue in the OpenTelemetry Collector pipeline? Are negative bounds allowed there?

This specific condition was discussed in the original PR, see #89 (comment)

@rghetia @mayurkale22 do you happen to know more context on this, and possible difference between OpenCensus vs OpenTelemetry in that regard?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I came across this when trying to export Prometheus metrics (using OpenTelemetry Prometheus receiver) to Stackdriver (using Stackdriver exporter). The error I got came from Stackdriver Monitoring API, essentially saying that bounds[0] > bounds[1]. So that distribution came all the way through OpenTelemetry just fine:

rpc error: code = InvalidArgument desc = Field timeSeries[35].points[0].distributionValue had an invalid value: Distribution |explicit_buckets.bounds| entry 1 has a value of -4.94065645841247e-324 which is less than the value of entry 0 which is 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. @mayurkale22 claimed in census-instrumentation/opencensus-go#963 that both Prometheus and Stackdriver don't support negative bounds... Has that changed?

Either way, this seems like a harmless change.

If it turns out that Stackdriver indeed doesn't support negative values, we should explicitly validate the buckets in the exporter rather than relying on the OpenCensus / OpenTelemetry code being correct, i.e. fail earlier before invoking this function.

Copy link
Contributor Author

@x13n x13n Jun 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So from what I have seen I'm pretty sure Prometheus client libs have no problem exposing negative bounds. I'm not actually sure about Stackdriver, it is possible that this change will only turn one error into another. I'll check that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs only seem to require that bucket bounds are increasing, nothing about forbidding negative buckets: https://cloud.google.com/monitoring/api/ref_v3/rest/v3/TypedValue#explicit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually opentelemetry-proto also doesn't support negative bounds.
/cc @bogdandrutu

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. So, next culprit: Prometheus receiver. I saw negative bounds on Prometheus endpoint and that was ingested into OpenTelemetry Collector via Prometheus receiver just fine. Should the negative bound be dropped there instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the -ve bound is a legitimate use case then we have to address that in all relevant libraries. If it is not then it is okay to drop it OR with an optional flag allow -ve bounds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the right thing to do is to allow negative bounds, they are not really useful for things like request latencies, but may have other use cases (e.g. Prometheus docs give temperatures as an example).

I just created open-telemetry/opentelemetry-proto#156 to extend the histogram support to negative bucket bounds.

Can this change be merged though? If OT won't end up extending histogram support to cover negative values, this change should be a no-op. If it does, this change will make that compatible with the updated model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with merging this change. However it will still be incorrect in the following case.
"If the distribution has lowest bound >0 AND if it includes -ve values then this change distorts the distribution by adding additional '0' bound."
May be add this to the comment and link to the above issue.

Copy link
Contributor

@nilebox nilebox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I will wait for @rghetia and / or @mayurkale22 to approve this change before merging.

@@ -536,7 +536,7 @@ func newTypedValue(vd *view.View, r *view.Row) *monitoringpb.TypedValue {
}

func shouldInsertZeroBound(bounds ...float64) bool {
if len(bounds) > 0 && bounds[0] != 0.0 {
if len(bounds) > 0 && bounds[0] > 0.0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with merging this change. However it will still be incorrect in the following case.
"If the distribution has lowest bound >0 AND if it includes -ve values then this change distorts the distribution by adding additional '0' bound."
May be add this to the comment and link to the above issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants