docs: Document the ability to use prefix in dynamic sampler FieldList (#1396)

## Short description of the changes

- This PR documents the feature from #1275, showing our users how to
specify the `root.` prefix in FieldLists everywhere
- It also sprinkles the use of this feature around the
`rules_complete.yaml` example

Signed-off-by: Irving Popovetsky <irving@honeycomb.io>
irvingpop authored Oct 23, 2024
1 parent 8829dec commit ce34790
Showing 4 changed files with 42 additions and 3 deletions.
8 changes: 8 additions & 0 deletions config/metadata/rulesMeta.yaml
@@ -102,6 +102,14 @@ groups:
all endpoints under normal traffic and call out when there is
failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the
+field value to that of the root span. For example,
+`root.http.response.status_code` will only consider the
+`http.response.status_code` field from the root span rather than a
+combination of all the spans in the trace. This is useful when you
+want to sample based on the root span's properties rather than the
+entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a
combination of `HTTP endpoint`, `status code`, and `pod id`, since it
would result in keys that are all unique, and therefore result in
15 changes: 15 additions & 0 deletions refinery_rules.md
@@ -97,6 +97,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -199,6 +202,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -313,6 +319,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -398,6 +407,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -608,6 +620,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
17 changes: 16 additions & 1 deletion rules.md
@@ -1,7 +1,7 @@
# Honeycomb Refinery Rules Documentation

This is the documentation for the rules configuration for Honeycomb's Refinery.
-It was automatically generated on 2024-10-11 at 16:33:02 UTC.
+It was automatically generated on 2024-10-22 at 22:51:47 UTC.

## The Rules file

@@ -118,6 +118,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -223,6 +226,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -340,6 +346,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -428,6 +437,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
@@ -651,6 +663,9 @@ Using fields with very high cardinality, like `k8s.pod.id`, is a bad choice.
If the combination of fields essentially makes each trace unique, then the Dynamic Sampler will sample everything.
If the combination of fields is not unique enough, then you will not be guaranteed samples of the most interesting traces.
As an example, consider as a good set of fields: the combination of `HTTP endpoint` (high-frequency and boring), `HTTP method`, and `status code` (normally boring but can become interesting when indicating an error) since it will allow proper sampling of all endpoints under normal traffic and call out when there is failing traffic to any endpoint.
+As of Refinery 2.8.0, the `root.` prefix can be used to limit the field value to that of the root span.
+For example, `root.http.response.status_code` will only consider the `http.response.status_code` field from the root span rather than a combination of all the spans in the trace.
+This is useful when you want to sample based on the root span's properties rather than the entire trace, and helps to reduce the cardinality of the sampler key.
In contrast, for example, consider as a bad set of fields: a combination of `HTTP endpoint`, `status code`, and `pod id`, since it would result in keys that are all unique, and therefore result in sampling 100% of traces.
For example, rather than a set of fields, using only the `HTTP endpoint` field is a **bad** choice, as it is not unique enough, and therefore interesting traces, like traces that experienced a `500`, might not be sampled.
Field names may come from any span in the trace; if they occur on multiple spans, then all unique values will be included in the key.
5 changes: 3 additions & 2 deletions rules_complete.yaml
@@ -34,7 +34,7 @@ Samplers:
       ClearFrequency: 1m0s
       FieldList:
         - request.method
-        - http.target
+        - root.http.target
         - response.status_code
       UseTraceLength: true
   env2:
env2:
@@ -47,7 +47,7 @@ Samplers:
       BurstDetectionDelay: 3
       FieldList:
         - request.method
-        - http.target
+        - root.http.target
         - response.status_code
       UseTraceLength: true
   env3:
@@ -134,3 +134,4 @@ Samplers:
       GoalThroughputPerSec: 100
       FieldList:
         - request.method
+        - root.http.target
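
The effect of the `root.` prefix described in the documentation changes above can be illustrated with a small Python sketch. This is not Refinery's implementation: the function name `build_sampler_key`, the `|` and `,` separators, and the rule that the root span is the one without a `trace.parent_id` field are all assumptions made for illustration. It only mirrors the documented behavior: an unprefixed field contributes every unique value found across the trace's spans, while a `root.`-prefixed field contributes only the root span's value.

```python
from typing import Any

ROOT_PREFIX = "root."


def build_sampler_key(spans: list[dict[str, Any]], field_list: list[str]) -> str:
    """Build an illustrative sampler key for one trace (hypothetical helper).

    Assumption: the root span is the first span with no "trace.parent_id" field.
    """
    root = next((s for s in spans if "trace.parent_id" not in s), None)
    parts = []
    for field in field_list:
        if field.startswith(ROOT_PREFIX):
            # Prefixed field: read the value from the root span only.
            name = field[len(ROOT_PREFIX):]
            values = {str(root[name])} if root is not None and name in root else set()
        else:
            # Unprefixed field: collect every unique value across all spans.
            values = {str(s[field]) for s in spans if field in s}
        parts.append(",".join(sorted(values)))
    return "|".join(parts)


# A made-up trace: one root span and two child spans.
trace = [
    {"http.target": "/checkout", "request.method": "POST", "response.status_code": 200},
    {"trace.parent_id": "a1", "http.target": "/inventory/42",
     "request.method": "GET", "response.status_code": 200},
    {"trace.parent_id": "a1", "http.target": "/payments/9f3",
     "request.method": "POST", "response.status_code": 500},
]

without_root = build_sampler_key(
    trace, ["request.method", "http.target", "response.status_code"])
with_root = build_sampler_key(
    trace, ["request.method", "root.http.target", "response.status_code"])

print(without_root)  # GET,POST|/checkout,/inventory/42,/payments/9f3|200,500
print(with_root)     # GET,POST|/checkout|200,500
```

In this sketch the child spans' targets contain IDs, so the unprefixed key would differ for nearly every trace. Restricting `http.target` to the root span keeps the key stable per endpoint, which is the cardinality reduction the documentation refers to.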
