sparse buckets: Fix handling of +Inf/-Inf/NaN observations #1144

beorn7 · 2022-10-05T13:48:46Z

This is only for the sparsehistogram branch.

Next attempt to solve ±Inf observations.

NaN observations now go to no bucket, but increment count (and effectively set sum to NaN, too).

±Inf observations now go to the bucket following the bucket that would have received math.MaxFloat64. That ±Inf bucket is now the last bucket that can be created.

getLe is modified to return math.MaxFloat64 for the penultimate possible bucket, i.e. the last bucket for regular (finite) observations.

Also add a test for getLe.

Signed-off-by: beorn7 <beorn@grafana.com>
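To make the rules in this description concrete, here is a minimal sketch of how an observation could be mapped to a bucket key. It is not the code in this PR: `bucketKey` is a hypothetical helper, it uses `math.Log2` instead of the library's lookup tables, and it ignores the zero bucket (observations within the zero threshold must be handled before calling it). Rounding at bucket boundaries that are not powers of two may differ from the real implementation.

```go
package main

import (
	"fmt"
	"math"
)

// bucketKey is a hypothetical helper, not the library's code. It returns the
// bucket key for the absolute value of v and whether v lands in any bucket.
// The sign of v decides whether the positive or negative buckets are used.
func bucketKey(v float64, schema int) (key int, bucketed bool) {
	if math.IsNaN(v) {
		// NaN goes to no bucket. The caller still increments the count
		// and adds v to the sum, which turns the sum into NaN.
		return 0, false
	}
	isInf := math.IsInf(v, 0)
	abs := math.Abs(v)
	if isInf {
		// Pretend the observation is MaxFloat64, then move one bucket up.
		abs = math.MaxFloat64
	}
	// Smallest key whose upper bound 2^(key / 2^schema) is >= abs.
	key = int(math.Ceil(math.Ldexp(math.Log2(abs), schema)))
	if isInf {
		key++
	}
	return key, true
}

func main() {
	for _, v := range []float64{1, 1.2, 2, math.MaxFloat64, math.Inf(+1), math.Inf(-1), math.NaN()} {
		key, ok := bucketKey(v, 2)
		fmt.Printf("v=%g key=%d bucketed=%t\n", v, key, ok)
	}
}
```

With schema 2, math.MaxFloat64 ends up in key 4096 and ±Inf in key 4097, which matches the keys used in the getLe test of this PR.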

beorn7 · 2022-10-05T13:51:53Z

Corresponding changes in prometheus/prometheus: prometheus/prometheus#11418

beorn7 · 2022-10-05T15:38:25Z

Will fix tests momentarily…

beorn7 · 2022-10-10T17:21:38Z

@fstab @codesome with prometheus/prometheus#11418 already reviewed, this one should be easy. Could you have a look?

codesome

Now that I am thinking more clearly, I just have a bunch of questions.

codesome · 2022-10-11T07:32:57Z

prometheus/histogram_test.go

@@ -548,13 +548,13 @@ func TestSparseHistogram(t *testing.T) {
 			name:         "+Inf observation",
 			observations: []float64{0, 1, 1.2, 1.4, 1.8, 2, math.Inf(+1)},
 			factor:       1.2,
-			want:         `sample_count:7 sample_sum:inf schema:2 zero_threshold:2.938735877055719e-39 zero_count:1 positive_span:<offset:0 length:5 > positive_span:<offset:2147483642 length:1 > positive_delta:1 positive_delta:-1 positive_delta:2 positive_delta:-2 positive_delta:2 positive_delta:-1 `,
+			want:         `sample_count:7 sample_sum:inf schema:2 zero_threshold:2.938735877055719e-39 zero_count:1 positive_span:<offset:0 length:5 > positive_span:<offset:4092 length:1 > positive_delta:1 positive_delta:-1 positive_delta:2 positive_delta:-2 positive_delta:2 positive_delta:-1 `,


IIUC, the last span corresponds to a bucket ID 4093 where le will be +Inf. And the last non-inf bucket is 4092 whose le is MaxFloat64?

Not quite.

The last regular bucket has ID 4096, and the +Inf bucket has ID 4097. Here we have two spans. The first span starts at ID 0 and has a length of 5, so it covers IDs 0 through 4. The second span's offset is 4092, hence the ID of the first bucket in the second span is 5+4092=4097 (i.e. the Inf bucket).

> has a length of 5

🤦 I totally missed this part when doing the math, my bad
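The span arithmetic from this thread can be sketched as follows. The `span` type and `bucketIDs` function are simplified stand-ins invented for illustration, not the protobuf types used in the test. The assumption, taken from the explanation above, is that the first offset is an absolute bucket ID and every later offset is the gap after the end of the previous span.

```go
package main

import "fmt"

// span mirrors the offset/length pairs in the expected test output.
// It is a simplified stand-in, not the protobuf type itself.
type span struct {
	offset int32
	length uint32
}

// bucketIDs expands spans into absolute bucket IDs. The first span's offset
// is the ID of its first bucket; each later offset is the gap between the
// end of the previous span and the start of the next one.
func bucketIDs(spans []span) []int32 {
	var ids []int32
	next := int32(0) // ID that would directly follow the previous span.
	for _, s := range spans {
		next += s.offset
		for i := uint32(0); i < s.length; i++ {
			ids = append(ids, next)
			next++
		}
	}
	return ids
}

func main() {
	// Positive spans from the "+Inf observation" case.
	fmt.Println(bucketIDs([]span{{0, 5}, {4092, 1}})) // [0 1 2 3 4 4097]
	// Negative span from the "-Inf observation" case.
	fmt.Println(bucketIDs([]span{{4097, 1}})) // [4097]
}
```

Both cases end at ID 4097. The offsets differ only because the positive side already has a span of five buckets in front of the Inf bucket.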

codesome · 2022-10-11T07:37:16Z

prometheus/histogram_test.go

 		},
 		{
 			name:         "-Inf observation",
 			observations: []float64{0, 1, 1.2, 1.4, 1.8, 2, math.Inf(-1)},
 			factor:       1.2,
-			want:         `sample_count:7 sample_sum:-inf schema:2 zero_threshold:2.938735877055719e-39 zero_count:1 negative_span:<offset:2147483647 length:1 > negative_delta:1 positive_span:<offset:0 length:5 > positive_delta:1 positive_delta:-1 positive_delta:2 positive_delta:-2 positive_delta:2 `,
+			want:         `sample_count:7 sample_sum:-inf schema:2 zero_threshold:2.938735877055719e-39 zero_count:1 negative_span:<offset:4097 length:1 > negative_delta:1 positive_span:<offset:0 length:5 > positive_delta:1 positive_delta:-1 positive_delta:2 positive_delta:-2 positive_delta:2 `,


Similarly here, bucket 4097 has an le (or ge here?) of -Inf, while 4096 will have -MaxFloat64? I don't know if I understand why the IDs differ here compared to +Inf above, some signed bit thing?

See explanation above. Here the Inf observation is negative, so it's the only negative observation, we get only one span there, and its offset is therefore 4097 directly.
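For reading the rest of the expected output in the -Inf case, the deltas can be decoded into per-bucket counts with a running sum. This is a minimal sketch with a hypothetical helper name, `countsFromDeltas`. It assumes each delta is the difference to the previous bucket's count and that the first delta is the absolute count of the first bucket.

```go
package main

import "fmt"

// countsFromDeltas turns delta-encoded bucket counts into absolute counts.
// The first delta is taken as the absolute count of the first bucket.
func countsFromDeltas(deltas []int64) []int64 {
	counts := make([]int64, len(deltas))
	var cur int64
	for i, d := range deltas {
		cur += d
		counts[i] = cur
	}
	return counts
}

func main() {
	// positive_delta values from the "-Inf observation" case (IDs 0 to 4).
	fmt.Println(countsFromDeltas([]int64{1, -1, 2, -2, 2})) // [1 0 2 0 2]
	// negative_delta value from the same case (ID 4097).
	fmt.Println(countsFromDeltas([]int64{1})) // [1]
}
```

The five positive observations plus the one negative observation and zero_count:1 add up to sample_count:7.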

codesome · 2022-10-11T07:39:52Z

prometheus/histogram_test.go

+		{
+			key:    4096,
+			schema: 2,
+			want:   math.MaxFloat64,
+		},
+		{
+			key:    4097,
+			schema: 2,
+			want:   math.Inf(+1),
+		},


I am a little confused about 4092 and 4097 from above. How would 4092 fit in this unit test?

codesome · 2022-10-11T07:42:06Z

prometheus/histogram.go

@@ -1346,13 +1357,55 @@ func findSmallestKey(m *sync.Map) int {
 }

 func getLe(key int, schema int32) float64 {


IIUC, we have this special case because the last bucket straddles the limit of float64? That is, its lower bound is less than MaxFloat64 but its upper bound exceeds MaxFloat64, hence the special casing? And since we cannot give a finite bound to any bucket beyond that, we just assign the next bucket to the Inf observations?

Yes, that's the idea.
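The special case confirmed above can be sketched like this. It is not the getLe from this PR: `upperBound` is a hypothetical function that only handles schemas >= 0 and computes the fraction with `math.Exp2` rather than a precomputed bounds table. The clamping mirrors the behaviour described in the PR: the last regular bucket reports math.MaxFloat64 because its mathematical upper bound, 2^1024, is not representable as a float64, and the next key is the Inf bucket.

```go
package main

import (
	"fmt"
	"math"
)

// upperBound is a hypothetical stand-in for getLe, limited to schema >= 0.
// The regular upper bound of a bucket is 2^(key / 2^schema).
func upperBound(key, schema int) float64 {
	last := 1024 << schema // Key of the last bucket for finite observations.
	switch {
	case key == last:
		// 2^1024 overflows float64, so report the largest finite value.
		return math.MaxFloat64
	case key > last:
		// Only last+1 is expected here: the bucket for ±Inf observations.
		return math.Inf(+1)
	}
	exp := key >> schema            // Whole powers of two (floor division).
	idx := key & (1<<schema - 1)    // Position within one power of two.
	frac := math.Exp2(float64(idx) / float64(int(1)<<schema))
	return math.Ldexp(frac, exp)
}

func main() {
	for _, key := range []int{0, 4, 4095, 4096, 4097} {
		fmt.Printf("key=%d le=%g\n", key, upperBound(key, 2))
	}
}
```

Keys 4096 and 4097 with schema 2 correspond to the two cases added to the getLe test in this PR.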

beorn7 force-pushed the beorn7/histogram2 branch from 11f983a to 3a8e28f Compare October 5, 2022 13:49

beorn7 mentioned this pull request Oct 5, 2022

histogram: Modify getBound to deal properly with infinity prometheus/prometheus#11418

Merged

beorn7 requested review from codesome and fstab October 5, 2022 13:52

beorn7 force-pushed the beorn7/histogram2 branch from 3a8e28f to 579eca4 Compare October 5, 2022 15:49

beorn7 force-pushed the beorn7/histogram2 branch from 579eca4 to 6942f9e Compare October 6, 2022 15:40

codesome approved these changes Oct 11, 2022

beorn7 merged commit 25bc188 into sparsehistogram Oct 11, 2022

beorn7 deleted the beorn7/histogram2 branch October 11, 2022 10:57

beorn7 mentioned this pull request Oct 11, 2022

histograms: Verify the handling of ±Inf observations #1131

Closed
