From f9a8a4d6b756339dabee4a97c5830b0bfad4d797 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Thu, 16 Sep 2021 16:00:08 -0700 Subject: [PATCH 01/22] Add ExponentialHistogram to Metrics data model --- specification/metrics/datamodel.md | 76 ++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index b2be93869b9..fe805ae9fa6 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -398,6 +398,82 @@ denotes Delta temporality where accumulated event counts are reset to zero after and a new aggregation occurs. Cumulative, on the other hand, continues to aggregate events, resetting with the use of a new start time. +### ExponentialHistogram + +[ExponentialHistogram](TBD_after_PR322_merges) data points are an +alternate representation, like the [Histogram](#histogram) that convey +a population of recorded measurements in a compressed format. +ExponentialHistogram compresses bucket boundaries using an exponential +formula, making it suitable for conveying high-resolution histogram +data with a large number of buckets. + +Statements about `Histogram` that refer to aggregation temporality, +attributes, timestamps, as well as the `sum`, `count`, `exemplars` +fields are identical for `ExponentialHistogram`. + +The resolution of the ExponentialHistogram is characterized by a +parameter known as `scale`, with larger values of `scale` offering +greater precision. Bucket boundaries of the ExponentialHistogram are +located at integer powers of the `base`, where: + +``` +base = 2**(2**(-scale)) +``` + +The symbol `**` in these formulas represents exponentiation, thus +`2**x` is read "Two to the power of X", typically computed by an +expression like `math.Pow(2.0, x)`. 
Calculated `base` values for +selected scales are shown below: + +| Scale | Base | Expression | +| -- | -- | -- | +| 10 | 1.000677130693066 | 2**(1/1024) | +| 9 | 1.001354719892108 | 2**(1/512) | +| 8 | 1.002711275050203 | 2**(1/256) | +| 7 | 1.005429901112803 | 2**(1/128) | +| 6 | 1.010889286051700 | 2**(1/64) | +| 5 | 1.021897148654117 | 2**(1/32) | +| 4 | 1.044273782427414 | 2**(1/16) | +| 3 | 1.090507732665258 | 2**(1/8) | +| 2 | 1.189207115002721 | 2**(1/4) | +| 1 | 1.414213562373095 | 2**(1/2) | +| 0 | 2 | 2**1 | +| -1 | 4 | 2**2 | +| -2 | 16 | 2**4 | +| -3 | 256 | 2**8 | +| -4 | 65536 | 2**16 | + +The ExponentialHistogram bucket identified by `index`, a signed +integer, represents values in the population that are greater than or +equal to `base**index` and less than `base**(index+1)`. + +The positive and negative ranges of the histogram are expressed +separately. Negative values are mapped by their absolute value +into the negative range using the same scale as the positive range. + +Each range of the ExponentialHistogram data point uses a dense +representation of the buckets, where a range of buckets is expressed +as a single `offset` value, a signed integer, and an array of count +values, where array element `i` represents the bucket count for bucket +index `offset+i`. + +For a given range, positive or negative: + +- The absolute value 1.0 has bucket index `0` +- Bucket index `0` counts measurements greater than or equal to 1.0 and less than `base` +- Negative indexes correspond with absolute values less than 1.0. + +The ExponentialHistogram contains a special `zero_count` field +containing is the count of values that are either exactly zero or +within the region considered zero by the instrumentation at the +tolerated degree of precision. This bucket stores values that cannot +be expressed using the standard exponential formula as well as values +that have been rounded to zero. + +#### Producers and consumer requirements + +TODO: Work-In-Progress. 
+ ### Summary (Legacy) [Summary](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L268) From e985ea5bd5ac41c19b2c55bcd76934134d7f0c16 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 20 Sep 2021 12:02:27 -0700 Subject: [PATCH 02/22] draft expectations --- specification/metrics/datamodel.md | 99 ++++++++++++++++++++++-------- 1 file changed, 75 insertions(+), 24 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index fe805ae9fa6..8fef8242a51 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -400,28 +400,35 @@ aggregate events, resetting with the use of a new start time. ### ExponentialHistogram -[ExponentialHistogram](TBD_after_PR322_merges) data points are an -alternate representation, like the [Histogram](#histogram) that convey -a population of recorded measurements in a compressed format. -ExponentialHistogram compresses bucket boundaries using an exponential -formula, making it suitable for conveying high-resolution histogram -data with a large number of buckets. +**Status**: [Experimental](../document-status.md) + +[ExponentialHistogram](https://github.com/open-telemetry/opentelemetry-proto/blob/cfbf9357c03bf4ac150a3ab3bcbe4cc4ed087362/opentelemetry/proto/metrics/v1/metrics.proto#L222) +data points are an alternate representation to the +[Histogram](#histogram), used to convey a population of recorded +measurements in a compressed format. ExponentialHistogram compresses +bucket boundaries using an exponential formula, making it suitable for +conveying high-resolution data using a large number of buckets. Statements about `Histogram` that refer to aggregation temporality, -attributes, timestamps, as well as the `sum`, `count`, `exemplars` -fields are identical for `ExponentialHistogram`. 
+attributes, timestamps, as well as the `sum`, `count`, and `exemplars` +fields are the same for `ExponentialHistogram`. These fields all have +identical interpretation with `Histogram`, only their bucket structure +differs. + +#### Exponential scale The resolution of the ExponentialHistogram is characterized by a parameter known as `scale`, with larger values of `scale` offering greater precision. Bucket boundaries of the ExponentialHistogram are -located at integer powers of the `base`, where: +located at integer powers of the `base`, which is the "growth factor", +where: ``` base = 2**(2**(-scale)) ``` The symbol `**` in these formulas represents exponentiation, thus -`2**x` is read "Two to the power of X", typically computed by an +`2**x` is read "Two to the power of `x`", typically computed by an expression like `math.Pow(2.0, x)`. Calculated `base` values for selected scales are shown below: @@ -443,6 +450,14 @@ selected scales are shown below: | -3 | 256 | 2**8 | | -4 | 65536 | 2**16 | +An important property of this design is described as "perfect +subsetting". Buckets of an exponential Histogram with a given scale +map directly into buckets of exponential Histograms with lesser +scales, which allows consumers to automatically lower the resolution +of a histogram (i.e., downscale) without introducing errors. + +#### Exponential buckets + The ExponentialHistogram bucket identified by `index`, a signed integer, represents values in the population that are greater than or equal to `base**index` and less than `base**(index+1)`. @@ -459,29 +474,65 @@ index `offset+i`. For a given range, positive or negative: -- The absolute value 1.0 has bucket index `0` -- Bucket index `0` counts measurements greater than or equal to 1.0 and less than `base` -- Negative indexes correspond with absolute values less than 1.0. 
+- Bucket index `0` counts measurements in the range `[1, base)` +- Positive indexes correspond with absoluve values greater or equal to `base` +- Negative indexes correspond with absolute values less than 1 +- There are `2**scale` buckets between successive powers of 2. + +For example, with `scale=3` there are `2**3` buckets between 1 and 2. +Note that the lower boundary for bucket index 4 in a `scale=3` +histogram maps into the lower boundary for bucket index 2 in a +`scale=2` histogram and maps into the lower boundary for bucket index +1 (i.e., the `base`) in a `scale=1` histogram. + +| `scale=3` bucket index | lower boundary | equation | +| -- | -- | -- | +| 0 | 1 | 2**(0/8) | +| 1 | 1.090507732665258 | 2**(1/8) | +| 2 | 1.189207115002721 | 2**(2/8), 2**(1/4)| +| 3 | 1.29683955465101 | 2**(3/8) | +| 4 | 1.414213562373095 | 2**(4/8), 2**(2/4), 2**(1/2) | +| 5 | 1.542210825407941 | 2**(5/8) | +| 6 | 1.681792830507429 | 2**(6/8) | +| 7 | 1.834008086409343) | 2**(7/8) | + +#### Exponential zero count The ExponentialHistogram contains a special `zero_count` field -containing is the count of values that are either exactly zero or -within the region considered zero by the instrumentation at the -tolerated degree of precision. This bucket stores values that cannot -be expressed using the standard exponential formula as well as values +containing the count of values that are either exactly zero or within +the region considered zero by the instrumentation at the tolerated +level of precision. This bucket stores values that cannot be +expressed using the standard exponential formula as well as values that have been rounded to zero. -#### Producers and consumer requirements +#### Producer and consumer expectations + +The ExponentialHistogram design makes it possible to express values +that are too large or small to be represented in computer hardware. +Certain values for `scale`, while meaningful, are not necessarily +useful. 
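To see why some scales stop being useful, one can bound the index range a producer would need: the index of the largest finite double grows in proportion to `2**scale`. A sketch using an illustrative helper, derived from the base-2 exponent rather than a logarithm:

```golang
package main

import (
	"fmt"
	"math"
)

// maxFiniteIndex returns the bucket index of the largest finite
// float64 at the given scale, derived from its base-2 exponent
// (1023). Illustrative helper, not part of the proposed text.
func maxFiniteIndex(scale uint) int64 {
	exp := int64(math.Ilogb(math.MaxFloat64)) // 1023
	return ((exp + 1) << scale) - 1
}

func main() {
	// Each unit increase of scale doubles the index range needed to
	// cover the same set of values.
	fmt.Println(maxFiniteIndex(0), maxFiniteIndex(10), maxFiniteIndex(20))
}
```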
+ +The range of data represented by an ExponentialHistogram determines +which scales can be usefully applied. Therefore, producers SHOULD +ensure that bucket indices are within the range of a signed 64-bit +integer by downscaling as necessary. -TODO: Work-In-Progress. +ExponentialHistogram buckets are expected to map into numbers can be +represented using normalized IEEE 754 double-width floating point +values (i.e., subnormal values are excluded). Consumers SHOULD reject +ExponentialHistogram data with `scale` and bucket indices that +overflow or underflow this representation. Consumers that reject such +data SHOULD warn the user through error logging that out-of-range data +was received. ### Summary (Legacy) [Summary](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L268) -metric data points convey quantile summaries, e.g. What is the 99-th percentile -latency of my HTTP server. Unlike other point types in OpenTelemetry, Summary -points cannot always be merged in a meaningful way. This point type is not -recommended for new applications and exists for compatibility with other -formats. +metric data points convey quantile summaries, e.g. What is the 99-th +percentile latency of my HTTP server. Unlike other point types in +OpenTelemetry, Summary points cannot always be merged in a meaningful +way. This point type is not recommended for new applications and +exists for compatibility with other formats. 
## Exemplars From 7308b5a842f76e8c3940355d66c48947c828c460 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 20 Sep 2021 14:00:17 -0700 Subject: [PATCH 03/22] toc --- specification/metrics/datamodel.md | 58 ++++++++++++++++-------------- 1 file changed, 32 insertions(+), 26 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 8fef8242a51..a75f444e486 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -8,7 +8,7 @@ - [Overview](#overview) -- [Events → Data Stream → Timeseries](#events--data-stream--timeseries) +- [Events → Data Stream → Timeseries](#events-%E2%86%92-data-stream-%E2%86%92-timeseries) * [Example Use-cases](#example-use-cases) * [Out of Scope Use-cases](#out-of-scope-use-cases) - [Model Details](#model-details) @@ -19,6 +19,11 @@ * [Sums](#sums) * [Gauge](#gauge) * [Histogram](#histogram) + * [ExponentialHistogram](#exponentialhistogram) + + [Exponential scale](#exponential-scale) + + [Exponential buckets](#exponential-buckets) + + [Exponential zero count](#exponential-zero-count) + + [Producer and consumer expectations](#producer-and-consumer-expectations) * [Summary (Legacy)](#summary-legacy) - [Exemplars](#exemplars) - [Single-Writer](#single-writer) @@ -404,24 +409,25 @@ aggregate events, resetting with the use of a new start time. [ExponentialHistogram](https://github.com/open-telemetry/opentelemetry-proto/blob/cfbf9357c03bf4ac150a3ab3bcbe4cc4ed087362/opentelemetry/proto/metrics/v1/metrics.proto#L222) data points are an alternate representation to the -[Histogram](#histogram), used to convey a population of recorded -measurements in a compressed format. ExponentialHistogram compresses -bucket boundaries using an exponential formula, making it suitable for -conveying high-resolution data using a large number of buckets. +[Histogram](#histogram) data point, used to convey a population of +recorded measurements in a compressed format. 
ExponentialHistogram +compresses bucket boundaries using an exponential formula, making it +suitable for conveying high-resolution data using a relatively large +number of buckets. Statements about `Histogram` that refer to aggregation temporality, -attributes, timestamps, as well as the `sum`, `count`, and `exemplars` -fields are the same for `ExponentialHistogram`. These fields all have -identical interpretation with `Histogram`, only their bucket structure -differs. +attributes, and timestamps, as well as the `sum`, `count`, and +`exemplars` fields, are the same for `ExponentialHistogram`. These +fields all share identical interpretation as for `Histogram`, only the +bucket structure differs between these two types. #### Exponential scale The resolution of the ExponentialHistogram is characterized by a parameter known as `scale`, with larger values of `scale` offering greater precision. Bucket boundaries of the ExponentialHistogram are -located at integer powers of the `base`, which is the "growth factor", -where: +located at integer powers of the `base`, also known as the "growth +factor", where: ``` base = 2**(2**(-scale)) @@ -452,9 +458,9 @@ selected scales are shown below: An important property of this design is described as "perfect subsetting". Buckets of an exponential Histogram with a given scale -map directly into buckets of exponential Histograms with lesser -scales, which allows consumers to automatically lower the resolution -of a histogram (i.e., downscale) without introducing errors. +map exactly into buckets of exponential Histograms with lesser scales, +which allows consumers to automatically lower the resolution of a +histogram (i.e., downscale) without introducing error. #### Exponential buckets @@ -475,7 +481,7 @@ index `offset+i`. 
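The offset-plus-array encoding described in this hunk can be sketched as a small accessor; `Buckets` and `CountAt` are illustrative names, not the protobuf message:

```golang
package main

import "fmt"

// Buckets sketches the dense bucket representation described above:
// array element i holds the count for bucket index Offset+i. Field
// names follow the prose, not necessarily the wire format.
type Buckets struct {
	Offset       int32
	BucketCounts []uint64
}

// CountAt returns the count of the bucket with the given index, or
// zero when the index lies outside the encoded range.
func (b Buckets) CountAt(index int32) uint64 {
	i := index - b.Offset
	if i < 0 || int(i) >= len(b.BucketCounts) {
		return 0
	}
	return b.BucketCounts[i]
}

func main() {
	b := Buckets{Offset: -2, BucketCounts: []uint64{3, 0, 7, 1}}
	fmt.Println(b.CountAt(-2), b.CountAt(0), b.CountAt(5)) // 3 7 0
}
```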
For a given range, positive or negative: - Bucket index `0` counts measurements in the range `[1, base)` -- Positive indexes correspond with absoluve values greater or equal to `base` +- Positive indexes correspond with absolute values greater or equal to `base` - Negative indexes correspond with absolute values less than 1 - There are `2**scale` buckets between successive powers of 2. @@ -485,16 +491,16 @@ histogram maps into the lower boundary for bucket index 2 in a `scale=2` histogram and maps into the lower boundary for bucket index 1 (i.e., the `base`) in a `scale=1` histogram. -| `scale=3` bucket index | lower boundary | equation | -| -- | -- | -- | -| 0 | 1 | 2**(0/8) | -| 1 | 1.090507732665258 | 2**(1/8) | -| 2 | 1.189207115002721 | 2**(2/8), 2**(1/4)| -| 3 | 1.29683955465101 | 2**(3/8) | -| 4 | 1.414213562373095 | 2**(4/8), 2**(2/4), 2**(1/2) | -| 5 | 1.542210825407941 | 2**(5/8) | -| 6 | 1.681792830507429 | 2**(6/8) | -| 7 | 1.834008086409343) | 2**(7/8) | +| `scale=3` bucket index | lower boundary | equation | +| -- | -- | -- | +| 0 | 1 | 2**(0/8) | +| 1 | 1.090507732665258 | 2**(1/8) | +| 2 | 1.189207115002721 | 2**(2/8), 2**(1/4) | +| 3 | 1.296839554651010 | 2**(3/8) | +| 4 | 1.414213562373095 | 2**(4/8), 2**(2/4), 2**(1/2) | +| 5 | 1.542210825407941 | 2**(5/8) | +| 6 | 1.681792830507429 | 2**(6/8) | +| 7 | 1.834008086409343 | 2**(7/8) | #### Exponential zero count @@ -515,7 +521,7 @@ useful. The range of data represented by an ExponentialHistogram determines which scales can be usefully applied. Therefore, producers SHOULD ensure that bucket indices are within the range of a signed 64-bit -integer by downscaling as necessary. +integer by changing scale. 
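The "changing scale" operation referenced above has a compact form: lowering the scale by `k` maps bucket index `i` to `i >> k`, which follows from the perfect-subsetting property. A sketch, with an illustrative helper name:

```golang
package main

import "fmt"

// Downscale maps a bucket index at scale `from` onto the index of
// the containing bucket at the lesser scale `to`, using an
// arithmetic (sign-extending) right shift. Illustrative helper,
// not part of the proposed text.
func Downscale(index int64, from, to uint) int64 {
	return index >> (from - to)
}

func main() {
	// Index 4 at scale=3 lands in index 2 at scale=2 and in index 1
	// at scale=1, matching the worked example in the text.
	fmt.Println(Downscale(4, 3, 2), Downscale(4, 3, 1))
	// Sign extension rounds negative indices toward the containing
	// bucket as well.
	fmt.Println(Downscale(-1, 3, 2))
}
```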
ExponentialHistogram buckets are expected to map into numbers can be represented using normalized IEEE 754 double-width floating point From 736638a2cce5a80638df90878b01dbe0e99cb89a Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 20 Sep 2021 14:00:55 -0700 Subject: [PATCH 04/22] remove one 'exponential' --- specification/metrics/datamodel.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index a75f444e486..66e46b59e53 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -22,7 +22,7 @@ * [ExponentialHistogram](#exponentialhistogram) + [Exponential scale](#exponential-scale) + [Exponential buckets](#exponential-buckets) - + [Exponential zero count](#exponential-zero-count) + + [Zero count](#zero-count) + [Producer and consumer expectations](#producer-and-consumer-expectations) * [Summary (Legacy)](#summary-legacy) - [Exemplars](#exemplars) @@ -502,7 +502,7 @@ histogram maps into the lower boundary for bucket index 2 in a | 6 | 1.681792830507429 | 2**(6/8) | | 7 | 1.834008086409343 | 2**(7/8) | -#### Exponential zero count +#### Zero count The ExponentialHistogram contains a special `zero_count` field containing the count of values that are either exactly zero or within From c1df75af7a969aca7786beb80782e13087a5d1ee Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 20 Sep 2021 14:42:29 -0700 Subject: [PATCH 05/22] mention the use of logarithm and inexact computation --- specification/metrics/datamodel.md | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 66e46b59e53..0fa8f58b4d2 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -23,7 +23,8 @@ + [Exponential scale](#exponential-scale) + [Exponential buckets](#exponential-buckets) + [Zero count](#zero-count) - + 
[Producer and consumer expectations](#producer-and-consumer-expectations) + + [Producer expectations](#producer-expectations) + + [Consumer expectations](#consumer-expectations) * [Summary (Legacy)](#summary-legacy) - [Exemplars](#exemplars) - [Single-Writer](#single-writer) @@ -511,7 +512,7 @@ level of precision. This bucket stores values that cannot be expressed using the standard exponential formula as well as values that have been rounded to zero. -#### Producer and consumer expectations +#### Producer expectations The ExponentialHistogram design makes it possible to express values that are too large or small to be represented in computer hardware. @@ -523,6 +524,17 @@ which scales can be usefully applied. Therefore, producers SHOULD ensure that bucket indices are within the range of a signed 64-bit integer by changing scale. +Producers MAY use a built-in logarithm function to calculate the +bucket index of a value. The use of a built-in logarithm function +could lead to results that differ from the bucket index that would be +computed using arbitrary precision or a lookup table, however +producers are not required to perform an exact computation. As a +result, ExponentialHistogram exemplars could map into buckets with +zero count. Instead, we expect to find such values counted in the +adjacent bucket. + +#### Consumer expectations + ExponentialHistogram buckets are expected to map into numbers can be represented using normalized IEEE 754 double-width floating point values (i.e., subnormal values are excluded). 
Consumers SHOULD reject From e6cde7af437da3ee704f0083db2d5e3f337c54a7 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 20 Sep 2021 14:43:29 -0700 Subject: [PATCH 06/22] manual edit TOC --- specification/metrics/datamodel.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 0fa8f58b4d2..b7f9d8649ef 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -8,7 +8,7 @@ - [Overview](#overview) -- [Events → Data Stream → Timeseries](#events-%E2%86%92-data-stream-%E2%86%92-timeseries) +- [Events → Data Stream → Timeseries](#events--data-stream--timeseries) * [Example Use-cases](#example-use-cases) * [Out of Scope Use-cases](#out-of-scope-use-cases) - [Model Details](#model-details) From b654dd0e026737cffe8dca95b7103195896d21a7 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 20 Sep 2021 15:38:20 -0700 Subject: [PATCH 07/22] typo --- specification/metrics/datamodel.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index b7f9d8649ef..0aab7168748 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -535,7 +535,7 @@ adjacent bucket. #### Consumer expectations -ExponentialHistogram buckets are expected to map into numbers can be +ExponentialHistogram buckets are expected to map into numbers that can be represented using normalized IEEE 754 double-width floating point values (i.e., subnormal values are excluded). 
Consumers SHOULD reject ExponentialHistogram data with `scale` and bucket indices that From 7927086ea1bf4c332853f2dfbb002e7d0c0c2de0 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Fri, 24 Sep 2021 10:06:10 -0700 Subject: [PATCH 08/22] reduce precision --- specification/metrics/datamodel.md | 54 +++++++++++++++--------------- 1 file changed, 27 insertions(+), 27 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 0aab7168748..4da35709aa2 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -439,23 +439,23 @@ The symbol `**` in these formulas represents exponentiation, thus expression like `math.Pow(2.0, x)`. Calculated `base` values for selected scales are shown below: -| Scale | Base | Expression | -| -- | -- | -- | -| 10 | 1.000677130693066 | 2**(1/1024) | -| 9 | 1.001354719892108 | 2**(1/512) | -| 8 | 1.002711275050203 | 2**(1/256) | -| 7 | 1.005429901112803 | 2**(1/128) | -| 6 | 1.010889286051700 | 2**(1/64) | -| 5 | 1.021897148654117 | 2**(1/32) | -| 4 | 1.044273782427414 | 2**(1/16) | -| 3 | 1.090507732665258 | 2**(1/8) | -| 2 | 1.189207115002721 | 2**(1/4) | -| 1 | 1.414213562373095 | 2**(1/2) | -| 0 | 2 | 2**1 | -| -1 | 4 | 2**2 | -| -2 | 16 | 2**4 | -| -3 | 256 | 2**8 | -| -4 | 65536 | 2**16 | +| Scale | Base | Expression | +| -- | -- | -- | +| 10 | 1.00068 | 2**(1/1024) | +| 9 | 1.00135 | 2**(1/512) | +| 8 | 1.00271 | 2**(1/256) | +| 7 | 1.00543 | 2**(1/128) | +| 6 | 1.01089 | 2**(1/64) | +| 5 | 1.02190 | 2**(1/32) | +| 4 | 1.04427 | 2**(1/16) | +| 3 | 1.09051 | 2**(1/8) | +| 2 | 1.18921 | 2**(1/4) | +| 1 | 1.41421 | 2**(1/2) | +| 0 | 2 | 2**1 | +| -1 | 4 | 2**2 | +| -2 | 16 | 2**4 | +| -3 | 256 | 2**8 | +| -4 | 65536 | 2**16 | An important property of this design is described as "perfect subsetting". 
Buckets of an exponential Histogram with a given scale @@ -492,16 +492,16 @@ histogram maps into the lower boundary for bucket index 2 in a `scale=2` histogram and maps into the lower boundary for bucket index 1 (i.e., the `base`) in a `scale=1` histogram. -| `scale=3` bucket index | lower boundary | equation | -| -- | -- | -- | -| 0 | 1 | 2**(0/8) | -| 1 | 1.090507732665258 | 2**(1/8) | -| 2 | 1.189207115002721 | 2**(2/8), 2**(1/4) | -| 3 | 1.296839554651010 | 2**(3/8) | -| 4 | 1.414213562373095 | 2**(4/8), 2**(2/4), 2**(1/2) | -| 5 | 1.542210825407941 | 2**(5/8) | -| 6 | 1.681792830507429 | 2**(6/8) | -| 7 | 1.834008086409343 | 2**(7/8) | +| `scale=3` bucket index | lower boundary | equation | +| -- | -- | -- | +| 0 | 1 | 2**(0/8) | +| 1 | 1.09051 | 2**(1/8) | +| 2 | 1.18921 | 2**(2/8), 2**(1/4) | +| 3 | 1.29684 | 2**(3/8) | +| 4 | 1.41421 | 2**(4/8), 2**(2/4), 2**(1/2) | +| 5 | 1.54221 | 2**(5/8) | +| 6 | 1.68179 | 2**(6/8) | +| 7 | 1.83401 | 2**(7/8) | #### Zero count From 91872944b1c3d0b42a8ba3ca34d6c97e9b181022 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Fri, 24 Sep 2021 11:06:24 -0700 Subject: [PATCH 09/22] from Yuke's feedback --- specification/metrics/datamodel.md | 47 +++++++++++++++++------------- 1 file changed, 26 insertions(+), 21 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 4da35709aa2..741985e7daf 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -413,8 +413,8 @@ data points are an alternate representation to the [Histogram](#histogram) data point, used to convey a population of recorded measurements in a compressed format. ExponentialHistogram compresses bucket boundaries using an exponential formula, making it -suitable for conveying high-resolution data using a relatively large -number of buckets. +suitable for conveying high dynamic range data with small relative +error, compared with alternative representations of similar size. 
Statements about `Histogram` that refer to aggregation temporality, attributes, and timestamps, as well as the `sum`, `count`, and @@ -460,14 +460,16 @@ selected scales are shown below: An important property of this design is described as "perfect subsetting". Buckets of an exponential Histogram with a given scale map exactly into buckets of exponential Histograms with lesser scales, -which allows consumers to automatically lower the resolution of a -histogram (i.e., downscale) without introducing error. +which allows consumers to lower the resolution of a histogram (i.e., +downscale) without introducing error. #### Exponential buckets The ExponentialHistogram bucket identified by `index`, a signed integer, represents values in the population that are greater than or -equal to `base**index` and less than `base**(index+1)`. +equal to `base**index` and less than `base**(index+1)`. Note that the +ExponentialHistogram specifies a lower-inclusive bound while the +explicit-boundary Histogram specifies an upper-inclusive bound. The positive and negative ranges of the histogram are expressed separately. Negative values are mapped by their absolute value @@ -490,7 +492,8 @@ For example, with `scale=3` there are `2**3` buckets between 1 and 2. Note that the lower boundary for bucket index 4 in a `scale=3` histogram maps into the lower boundary for bucket index 2 in a `scale=2` histogram and maps into the lower boundary for bucket index -1 (i.e., the `base`) in a `scale=1` histogram. +1 (i.e., the `base`) in a `scale=1` histogram—these are examples of +perfect subsetting. | `scale=3` bucket index | lower boundary | equation | | -- | -- | -- | @@ -515,14 +518,17 @@ that have been rounded to zero. #### Producer expectations The ExponentialHistogram design makes it possible to express values -that are too large or small to be represented in computer hardware. -Certain values for `scale`, while meaningful, are not necessarily -useful. 
+that are too large or small to be represented in the 64 bit "double" +floating point format. Certain values for `scale`, while meaningful, +are not necessarily useful. The range of data represented by an ExponentialHistogram determines -which scales can be usefully applied. Therefore, producers SHOULD -ensure that bucket indices are within the range of a signed 64-bit -integer by changing scale. +which scales can be usefully applied. Regardless of scale, producers +SHOULD ensure that the index of any encoded bucket falls within the +range of a signed 32-bit integer. This recommendation is applied to +limit the width of integers used in standard processing pipelines such +as the OpenTelemetry collector. The wire-level protocol could be +extended for 64-bit bucket indices in a future release. Producers MAY use a built-in logarithm function to calculate the bucket index of a value. The use of a built-in logarithm function @@ -530,18 +536,17 @@ could lead to results that differ from the bucket index that would be computed using arbitrary precision or a lookup table, however producers are not required to perform an exact computation. As a result, ExponentialHistogram exemplars could map into buckets with -zero count. Instead, we expect to find such values counted in the -adjacent bucket. +zero count. We expect to find such values counted in the adjacent +bucket. #### Consumer expectations -ExponentialHistogram buckets are expected to map into numbers that can be -represented using normalized IEEE 754 double-width floating point -values (i.e., subnormal values are excluded). Consumers SHOULD reject -ExponentialHistogram data with `scale` and bucket indices that -overflow or underflow this representation. Consumers that reject such -data SHOULD warn the user through error logging that out-of-range data -was received. +ExponentialHistogram buckets are expected to map into numbers that can +be represented using IEEE 754 double-width floating point values. 
+Consumers SHOULD reject ExponentialHistogram data with `scale` and +bucket indices that overflow or underflow this representation. +Consumers that reject such data SHOULD warn the user through error +logging that out-of-range data was received. ### Summary (Legacy) From 1d2579f6aaff392315881524b56dd9c95f15534f Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 4 Oct 2021 14:51:29 -0700 Subject: [PATCH 10/22] mapping methods --- specification/metrics/datamodel.md | 101 ++++++++++++++++++++++++++--- 1 file changed, 91 insertions(+), 10 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 9557b7a25cd..871a55dc9c0 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -545,19 +545,100 @@ limit the width of integers used in standard processing pipelines such as the OpenTelemetry collector. The wire-level protocol could be extended for 64-bit bucket indices in a future release. -Producers MAY use a built-in logarithm function to calculate the -bucket index of a value. The use of a built-in logarithm function -could lead to results that differ from the bucket index that would be -computed using arbitrary precision or a lookup table, however -producers are not required to perform an exact computation. As a -result, ExponentialHistogram exemplars could map into buckets with -zero count. We expect to find such values counted in the adjacent -bucket. +Producers use a mapping function to compute bucket indices. Producers +are presumed to support IEEE double-width floating-point numbers with +11-bit exponent and 52-bit significand. The pseudo-code below for +mapping values to exponents refers to the following constants: + +```golang +const ( + // SignificandWidth is the size of an IEEE 754 double-precision + // floating-point significand. + SignificandWidth = 52 + // ExponentWidth is the size of an IEEE 754 double-precision + // floating-point exponent. 
+  ExponentWidth = 11
+
+  // SignificandMask is the mask for the significand of an IEEE 754
+  // double-precision floating-point value: 0xFFFFFFFFFFFFF.
+  SignificandMask = 1<<SignificandWidth - 1
+
+  // ExponentBias is the exponent bias specified for encoding
+  // the IEEE 754 double-precision floating-point exponent: 1023.
+  ExponentBias = 1<<(ExponentWidth-1) - 1
+
+  // ExponentMask is the mask for the exponent of an IEEE 754
+  // double-precision floating-point value: 0x7FF0000000000000.
+  ExponentMask = ((1 << ExponentWidth) - 1) << SignificandWidth
+)
+```
+
+1. For scale zero, the index of a value equals its normalized base-2
+   exponent, which can be extracted from the IEEE 754 representation,
+   for example:
+
+```golang
+func GetExponent(value float64) int32 {
+  rawBits := math.Float64bits(value)
+  rawExponent := (int64(rawBits) & ExponentMask) >> SignificandWidth
+  rawSignificand := rawBits & SignificandMask
+  if rawExponent == 0 {
+    // Handle subnormal values: rawSignificand cannot be zero
+    // unless value is zero.
+    rawExponent -= int64(bits.LeadingZeros64(rawSignificand) - 12)
+  }
+  return int32(rawExponent - ExponentBias)
+}
+```
+
+2. For negative scales, the index of a value equals the normalized
+   base-2 exponent (as by `GetExponent()` above) shifted to the right
+   by `-scale`. Note that because of sign extension, this shift performs
+   correct rounding for the negative indices. This may be written as:
+
+```golang
+    return GetExponent(value) << -scale
+```
+
+3. For positive scales, use of the built-in natural logarithm
+   function. A multiplicative factor equal to `2**scale / ln(2)`
+   proves useful (where `ln()` is the natural logarithm), for example:
+
+```golang
+    scaleFactor := math.Log2E * math.Exp2(scale)
+    return int64(math.Floor(math.Log(value) * scaleFactor))
+```

From: Joshua MacDonald
Date: Mon, 4 Oct 2021 22:06:01 -0700
Subject: [PATCH 11/22] several fixes from yzhuge

---
 specification/metrics/datamodel.md | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md
index 871a55dc9c0..f8cdca7e7b9 100644
--- a/specification/metrics/datamodel.md
+++ b/specification/metrics/datamodel.md
@@ -606,25 +606,30 @@ func GetExponent(value float64) int32 {
    correct rounding for the negative indices. This may be written as:
 
 ```golang
-    return GetExponent(value) << -scale
+    return GetExponent(value) >> -scale
 ```
 
-3. For positive scales, use of the built-in natural logarithm
+3. For any scale, use of the built-in natural logarithm
    function. 
A multiplicative factor equal to `2**scale / ln(2)` proves useful (where `ln()` is the natural logarithm), for example: ```golang - scaleFactor := math.Log2E * math.Exp2(1< Date: Mon, 4 Oct 2021 22:31:55 -0700 Subject: [PATCH 12/22] lint --- specification/metrics/datamodel.md | 127 ++++++++++++++++------------- 1 file changed, 71 insertions(+), 56 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index f8cdca7e7b9..219c5c1f05a 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -8,7 +8,7 @@ - [Overview](#overview) -- [Events → Data Stream → Timeseries](#events--data-stream--timeseries) +- [Events → Data Stream → Timeseries](#events-%E2%86%92-data-stream-%E2%86%92-timeseries) * [Example Use-cases](#example-use-cases) * [Out of Scope Use-cases](#out-of-scope-use-cases) - [Model Details](#model-details) @@ -24,6 +24,11 @@ + [Exponential buckets](#exponential-buckets) + [Zero count](#zero-count) + [Producer expectations](#producer-expectations) + - [Scale zero: extract the exponent](#scale-zero-extract-the-exponent) + - [Negative scale: extract and shift the exponent](#negative-scale-extract-and-shift-the-exponent) + - [All scales: use the logarithm function](#all-scales-use-the-logarithm-function) + - [Positive scale: use a lookup table](#positive-scale-use-a-lookup-table) + - [Producer recommendations](#producer-recommendations) + [Consumer expectations](#consumer-expectations) * [Summary (Legacy)](#summary-legacy) - [Exemplars](#exemplars) @@ -552,84 +557,94 @@ mapping values to exponents refers to the following constants: ```golang const ( - // SignificandWidth is the size of an IEEE 754 double-precision - // floating-point significand. - SignificandWidth = 52 - // ExponentWidth is the size of an IEEE 754 double-precision - // floating-point exponent. 
- ExponentWidth = 11 - - // SignificandMask is the mask for the significand of an IEEE 754 - // double-precision floating-point value: 0xFFFFFFFFFFFFF. - SignificandMask = 1<> SignificandWidth - rawSignificand := rawBits & SignificandMask - if rawExponent == 0 { - // Handle subnormal values: rawSignificand cannot be zero - // unless value is zero. - rawExponent -= int64(bits.LeadingZeros64(rawSignificand) - 12) - } - return int32(rawExponent - ExponentBias) + rawBits := math.Float64bits(value) + rawExponent := (int64(rawBits) & ExponentMask) >> SignificandWidth + rawSignificand := rawBits & SignificandMask + if rawExponent == 0 { + // Handle subnormal values: rawSignificand cannot be zero + // unless value is zero. + rawExponent -= int64(bits.LeadingZeros64(rawSignificand) - 12) + } + return int32(rawExponent - ExponentBias) } ``` -2. For negative scales, the index of a value equals the normalized - base-2 exponent (as by `GetExponent()` above) shifted to the right - by `-scale`. Note that because of sign extension, this shift performs - correct rounding for the negative indices. This may be written as: - +##### Negative scale: extract and shift the exponent + +For negative scales, the index of a value equals the normalized +base-2 exponent (as by `GetExponent()` above) shifted to the right +by `-scale`. Note that because of sign extension, this shift performs +correct rounding for the negative indices. This may be written as: + ```golang return GetExponent(value) >> -scale ``` - -3. For any scale, use of the built-in natural logarithm - function. A multiplicative factor equal to `2**scale / ln(2)` - proves useful (where `ln()` is the natural logarithm), for example: - + +##### All scales: use the logarithm function + +For any scale, use of the built-in natural logarithm +function. 
A multiplicative factor equal to `2**scale / ln(2)` +proves useful (where `ln()` is the natural logarithm), for example: + ```golang scaleFactor := math.Log2E * math.Exp2(scale) - return int64(math.Floor(math.Log(value) * scaleFactor)) + return int64(math.Floor(math.Log(value) * scaleFactor)) ``` - Note that in the example Golang code above, the built-in `math.Log2E` - is defined as `1/ln(2)`. +Note that in the example Golang code above, the built-in `math.Log2E` +is defined as `1 / ln(2)`. + +##### Positive scale: use a lookup table + +For positive scales, lookup table methods have been demonstrated +that are able to exactly compute the index in constant time from a +lookup table with `O(2**scale)` entries. + +##### Producer recommendations -4. For positive scales, lookup table methods have been demonstrated - that are able to exactly compute the index in constant time from a - lookup table with `O(2**scale)` entries. - For positive scales, the logarithm method is preferred because it -requires very little code to validate and is nearly as fast and -accurate as the lookup table approach. For zero scale and negative -scales, directly calculating the index from the floating-point -representation is more efficient. +requires very little code, is easy to validate and is nearly as fast +and accurate as the lookup table approach. For zero scale and +negative scales, directly calculating the index from the +floating-point representation is more efficient. 
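The logarithm mapping described above can be checked with a short runnable sketch. This is a non-normative illustration; `mapToIndex` is a hypothetical helper name, not part of the specification:

```golang
package main

import (
	"fmt"
	"math"
)

// mapToIndex computes the exponential histogram bucket index of value
// at the given scale, using the built-in natural logarithm.  The
// multiplicative factor equals 2**scale / ln(2).
func mapToIndex(value float64, scale int) int64 {
	scaleFactor := math.Log2E * math.Exp2(float64(scale))
	return int64(math.Floor(math.Log(value) * scaleFactor))
}

func main() {
	// At scale 0 the base is 2: values in [1, 2) map to index 0 and
	// values in [4, 8) map to index 2.
	fmt.Println(mapToIndex(1.5, 0)) // 0
	fmt.Println(mapToIndex(7.0, 0)) // 2

	// At scale 2 the base is 2**(1/4): the value 1.3 falls in bucket
	// index 1, whose range is approximately [1.18921, 1.41421).
	fmt.Println(mapToIndex(1.3, 2)) // 1
}
```

As the surrounding text notes, values near a bucket boundary (in particular exact powers of the base) may land in an adjacent bucket due to inexact logarithm results; the sketch makes no attempt to correct for that.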
The use of a built-in logarithm function could lead to results that differ from the bucket index that would be computed using arbitrary From 285ebcc14a2d4cc6912951d884bcddb60285fad2 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 4 Oct 2021 22:39:01 -0700 Subject: [PATCH 13/22] update links --- specification/metrics/datamodel.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 219c5c1f05a..2fce92ecf12 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -235,11 +235,12 @@ consisting of several metadata properties: - Unit of measurement The primary data of each timeseries are ordered (timestamp, value) points, for -three value types: +four value types: 1. Counter (Monotonic, Cumulative) 2. Gauge 3. Histogram +4. Exponential Histogram This model may be viewed as an idealization of [Prometheus Remote Write](https://docs.google.com/document/d/1LPhVRSFkGNSuU1fBd81ulhsCPR4hkSZyyBj1SZ8fWOM/edit#heading=h.3p42p5s8n0ui). @@ -278,9 +279,10 @@ same kind. [1](#otlpdatapointfn) The basic point kinds are: -1. [Sum](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L230) -2. [Gauge](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L200) -3. [Histogram](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.9.0/opentelemetry/proto/metrics/v1/metrics.proto#L258) +1. [Sum](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.10.x/opentelemetry/proto/metrics/v1/metrics.proto#L198) +2. [Gauge](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.10.x/opentelemetry/proto/metrics/v1/metrics.proto#L192) +3. [Histogram](https://github.com/open-telemetry/opentelemetry-proto/blob/v0.10.x/opentelemetry/proto/metrics/v1/metrics.proto#L211) +4. 
[Exponential Histogram](https://github.com/open-telemetry/opentelemetry-proto/blob/27a10cd70f63afdbddf460881969f9ad7ae4af5d/opentelemetry/proto/metrics/v1/metrics.proto#L239) Comparing the OTLP Metric Data Stream and Timeseries data models, OTLP does not map 1:1 from its point types into timeseries points. In OTLP, a Sum point From b5314007a21123f21e9abd3337862f893f0dba4a Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 4 Oct 2021 22:41:20 -0700 Subject: [PATCH 14/22] Changelog --- CHANGELOG.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 52f1a762b1f..2f2d92cf84e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -15,6 +15,8 @@ release. - Add optional min / max fields to histogram data model. ([#1915](https://github.com/open-telemetry/opentelemetry-specification/pull/1915)) +- Add exponential histogram to the metrics data model. + ([#1935](https://github.com/open-telemetry/opentelemetry-specification/pull/1935)) ### Logs From 778fe38be1579659e0e5992bdc3890665bd335a4 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 4 Oct 2021 22:47:50 -0700 Subject: [PATCH 15/22] mention min/max --- specification/metrics/datamodel.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 4f014e94cac..eb95313c335 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -448,7 +448,7 @@ suitable for conveying high dynamic range data with small relative error, compared with alternative representations of similar size. Statements about `Histogram` that refer to aggregation temporality, -attributes, and timestamps, as well as the `sum`, `count`, and +attributes, and timestamps, as well as the `sum`, `count`, `min`, `max` and `exemplars` fields, are the same for `ExponentialHistogram`. These fields all share identical interpretation as for `Histogram`, only the bucket structure differs between these two types. 
From cb9984ea007ad06af151413b7392a1d3484b579d Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Tue, 5 Oct 2021 11:55:51 -0700 Subject: [PATCH 16/22] let consumers deal with overflow and underflow --- specification/metrics/datamodel.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index eb95313c335..e4a31a98ff5 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -651,6 +651,12 @@ lookup table with `O(2**scale)` entries. ##### Producer recommendations +Regardless of scale or mapping technique, it can be difficult to +correctly map values to indices at the extremes of the floating-point +range. Some mapping functions may correctly compute an index whose +upper- or lower-boundary cannot be represented. This is considered a +normal condition which consumers are expected to handle. + For positive scales, the logarithm method is preferred because it requires very little code, is easy to validate and is nearly as fast and accurate as the lookup table approach. For zero scale and From 83145a1a9a1f59483d2378288dc3def597a484fa Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Tue, 5 Oct 2021 13:23:51 -0700 Subject: [PATCH 17/22] yzhuge's remarks --- specification/metrics/datamodel.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index e4a31a98ff5..21d0f17c1e3 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -651,11 +651,11 @@ lookup table with `O(2**scale)` entries. ##### Producer recommendations -Regardless of scale or mapping technique, it can be difficult to -correctly map values to indices at the extremes of the floating-point -range. Some mapping functions may correctly compute an index whose -upper- or lower-boundary cannot be represented. 
This is considered a -normal condition which consumers are expected to handle. +At the lowest or highest end of the 64 bit IEEE floating point, a +bucket's range may only be partially representable by the floating +point number format. When mapping a number in these buckets, a +producer may correctly return the index of such a partially +representable bucket. This is considered a normal condition. For positive scales, the logarithm method is preferred because it requires very little code, is easy to validate and is nearly as fast @@ -674,7 +674,9 @@ such values counted in the adjacent buckets. ExponentialHistogram bucket indices are expected to map into buckets where both the uppwer and lower boundaries that can be represented -using IEEE 754 double-width floating point values. +using IEEE 754 double-width floating point values. Consumers MAY +round the unrepresentable boundary of a partially representable bucket +index to the nearest representable value. Consumers SHOULD reject ExponentialHistogram data with `scale` and bucket indices that overflow or underflow this representation. From 64f891d3f67d9999bdc09dda5158438b9b5600e9 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Tue, 5 Oct 2021 13:24:09 -0700 Subject: [PATCH 18/22] whitespace --- specification/metrics/datamodel.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 21d0f17c1e3..1825ff178bf 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -655,7 +655,7 @@ At the lowest or highest end of the 64 bit IEEE floating point, a bucket's range may only be partially representable by the floating point number format. When mapping a number in these buckets, a producer may correctly return the index of such a partially -representable bucket. This is considered a normal condition. +representable bucket. This is considered a normal condition. 
For positive scales, the logarithm method is preferred because it requires very little code, is easy to validate and is nearly as fast From d628dac49a4ace0cbacd7497c26fc067d79bf253 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Wed, 6 Oct 2021 09:54:05 -0700 Subject: [PATCH 19/22] Apply suggestions from code review Co-authored-by: Reiley Yang --- specification/metrics/datamodel.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 1825ff178bf..e434bcac074 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -577,11 +577,11 @@ const ( // SignificandMask is the mask for the significand of an IEEE 754 // double-precision floating-point value: 0xFFFFFFFFFFFFF. - SignificandMask = 1< Date: Wed, 6 Oct 2021 10:00:29 -0700 Subject: [PATCH 20/22] revert TOC trouble etc --- specification/metrics/datamodel.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index e434bcac074..99beb22b35e 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -8,7 +8,7 @@ - [Overview](#overview) -- [Events → Data Stream → Timeseries](#events-%E2%86%92-data-stream-%E2%86%92-timeseries) +- [Events → Data Stream → Timeseries](#events--data-stream--timeseries) * [Example Use-cases](#example-use-cases) * [Out of Scope Use-cases](#out-of-scope-use-cases) - [Model Details](#model-details) @@ -234,8 +234,8 @@ consisting of several metadata properties: - Kind of point (integer, floating point, etc) - Unit of measurement -The primary data of each timeseries are ordered (timestamp, value) points, for -four value types: +The primary data of each timeseries are ordered (timestamp, value) points, with +one of the following value types: 1. Counter (Monotonic, Cumulative) 2. 
Gauge From d597a740a38ca4b53cdb643f7a5d31a9d3acf31e Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Wed, 6 Oct 2021 10:06:39 -0700 Subject: [PATCH 21/22] upcase --- specification/metrics/datamodel.md | 40 +++++++++++++++--------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index 99beb22b35e..c7041408874 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -20,16 +20,16 @@ * [Gauge](#gauge) * [Histogram](#histogram) * [ExponentialHistogram](#exponentialhistogram) - + [Exponential scale](#exponential-scale) - + [Exponential buckets](#exponential-buckets) - + [Zero count](#zero-count) - + [Producer expectations](#producer-expectations) - - [Scale zero: extract the exponent](#scale-zero-extract-the-exponent) - - [Negative scale: extract and shift the exponent](#negative-scale-extract-and-shift-the-exponent) - - [All scales: use the logarithm function](#all-scales-use-the-logarithm-function) - - [Positive scale: use a lookup table](#positive-scale-use-a-lookup-table) - - [Producer recommendations](#producer-recommendations) - + [Consumer expectations](#consumer-expectations) + + [Exponential Scale](#exponential-scale) + + [Exponential Buckets](#exponential-buckets) + + [Zero Count](#zero-count) + + [Producer Expectations](#producer-expectations) + - [Scale Zero: Extract the Exponent](#scale-zero-extract-the-exponent) + - [Negative Scale: Extract and Shift the Exponent](#negative-scale-extract-and-shift-the-exponent) + - [All Scales: Use the Logarithm Function](#all-scales-use-the-logarithm-function) + - [Positive Scale: Use a Lookup Table](#positive-scale-use-a-lookup-table) + - [Producer Recommendations](#producer-recommendations) + + [Consumer Expectations](#consumer-expectations) * [Summary (Legacy)](#summary-legacy) - [Exemplars](#exemplars) - [Single-Writer](#single-writer) @@ -453,7 +453,7 @@ attributes, and timestamps, as 
well as the `sum`, `count`, `min`, `max` and fields all share identical interpretation as for `Histogram`, only the bucket structure differs between these two types. -#### Exponential scale +#### Exponential Scale The resolution of the ExponentialHistogram is characterized by a parameter known as `scale`, with larger values of `scale` offering @@ -494,7 +494,7 @@ map exactly into buckets of exponential Histograms with lesser scales, which allows consumers to lower the resolution of a histogram (i.e., downscale) without introducing error. -#### Exponential buckets +#### Exponential Buckets The ExponentialHistogram bucket identified by `index`, a signed integer, represents values in the population that are greater than or @@ -537,7 +537,7 @@ perfect subsetting. | 6 | 1.68179 | 2**(6/8) | | 7 | 1.83401 | 2**(7/8) | -#### Zero count +#### Zero Count The ExponentialHistogram contains a special `zero_count` field containing the count of values that are either exactly zero or within @@ -546,7 +546,7 @@ level of precision. This bucket stores values that cannot be expressed using the standard exponential formula as well as values that have been rounded to zero. -#### Producer expectations +#### Producer Expectations The ExponentialHistogram design makes it possible to express values that are too large or small to be represented in the 64 bit "double" @@ -592,7 +592,7 @@ const ( The following choices of mapping function have been validated through reference implementations. 
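The perfect-subsetting property described above implies that a histogram can be downscaled without error: each bucket at scale `s-1` is exactly the union of two adjacent buckets at scale `s`. A non-normative sketch, in which `downscale` is an illustrative helper name:

```golang
package main

import "fmt"

// downscale converts a bucket index at one scale into the index of
// the containing bucket at a coarser scale (change > 0).  Because the
// bases satisfy perfect subsetting, a sign-extending right shift
// merges 2**change adjacent fine-grained buckets into one coarse
// bucket.
func downscale(index int32, change int) int32 {
	return index >> change
}

func main() {
	// Scale 1 (base 2**(1/2)) down to scale 0 (base 2): buckets 2 and
	// 3 of scale 1 together cover [2, 4), which is bucket 1 of scale 0.
	fmt.Println(downscale(2, 1), downscale(3, 1)) // 1 1

	// Sign extension rounds negative indices toward more-negative
	// values, which selects the correct bucket: index -1 at scale 1
	// covers [2**(-1/2), 1), inside scale-0 bucket -1, i.e. [1/2, 1).
	fmt.Println(downscale(-1, 1)) // -1
}
```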
-##### Scale zero: extract the exponent +##### Scale Zero: Extract the Exponent For scale zero, the index of a value equals its normalized base-2 exponent, meaning the value of _exponent_ in the base-2 fractional @@ -618,7 +618,7 @@ func GetExponent(value float64) int32 { } ``` -##### Negative scale: extract and shift the exponent +##### Negative Scale: Extract and Shift the Exponent For negative scales, the index of a value equals the normalized base-2 exponent (as by `GetExponent()` above) shifted to the right @@ -629,7 +629,7 @@ correct rounding for the negative indices. This may be written as: return GetExponent(value) >> -scale ``` -##### All scales: use the logarithm function +##### All Scales: Use the Logarithm Function For any scale, use of the built-in natural logarithm function. A multiplicative factor equal to `2**scale / ln(2)` @@ -643,13 +643,13 @@ proves useful (where `ln()` is the natural logarithm), for example: Note that in the example Golang code above, the built-in `math.Log2E` is defined as `1 / ln(2)`. -##### Positive scale: use a lookup table +##### Positive Scale: Use a Lookup Table For positive scales, lookup table methods have been demonstrated that are able to exactly compute the index in constant time from a lookup table with `O(2**scale)` entries. -##### Producer recommendations +##### Producer Recommendations At the lowest or highest end of the 64 bit IEEE floating point, a bucket's range may only be partially representable by the floating @@ -670,7 +670,7 @@ perform an exact computation. As a result, ExponentialHistogram exemplars could map into buckets with zero count. We expect to find such values counted in the adjacent buckets. 
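For reference, the scale-zero and negative-scale methods covered in these sections can be assembled into a self-contained program. This is a sketch for illustration; the constants and `GetExponent()` follow the pseudo-code in this document:

```golang
package main

import (
	"fmt"
	"math"
	"math/bits"
)

const (
	// IEEE 754 double-precision layout, as in the specification text.
	SignificandWidth = 52
	ExponentWidth    = 11
	SignificandMask  = 1<<SignificandWidth - 1
	ExponentBias     = 1<<(ExponentWidth-1) - 1
	ExponentMask     = ((1 << ExponentWidth) - 1) << SignificandWidth
)

// GetExponent extracts the normalized base-2 exponent of value, which
// is the scale-zero bucket index.  Subnormal values are handled by
// counting leading zeros of the significand.
func GetExponent(value float64) int32 {
	rawBits := math.Float64bits(value)
	rawExponent := (int64(rawBits) & ExponentMask) >> SignificandWidth
	rawSignificand := rawBits & SignificandMask
	if rawExponent == 0 {
		// Handle subnormal values: rawSignificand cannot be zero
		// unless value is zero.
		rawExponent -= int64(bits.LeadingZeros64(rawSignificand) - 12)
	}
	return int32(rawExponent - ExponentBias)
}

func main() {
	fmt.Println(GetExponent(1.0))  // 0: scale-zero bucket [1, 2)
	fmt.Println(GetExponent(6.0))  // 2: scale-zero bucket [4, 8)
	fmt.Println(GetExponent(0.25)) // -2

	// Negative scale -2 (base 16): shift right by 2 with sign
	// extension, so 6.0 falls in bucket 0, i.e. [1, 16).
	fmt.Println(GetExponent(6.0) >> 2) // 0
}
```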
-#### Consumer expectations +#### Consumer Expectations ExponentialHistogram bucket indices are expected to map into buckets where both the uppwer and lower boundaries that can be represented From 45cd5bab5f3d2a3e60cb2ec3df7eeabd7f2407c3 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Wed, 6 Oct 2021 11:46:52 -0700 Subject: [PATCH 22/22] Update specification/metrics/datamodel.md Co-authored-by: Aaron Abbott --- specification/metrics/datamodel.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index c7041408874..d2cd77a472d 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -673,7 +673,7 @@ such values counted in the adjacent buckets. #### Consumer Expectations ExponentialHistogram bucket indices are expected to map into buckets -where both the uppwer and lower boundaries that can be represented +where both the upper and lower boundaries can be represented using IEEE 754 double-width floating point values. Consumers MAY round the unrepresentable boundary of a partially representable bucket index to the nearest representable value.