fix(throttle transform)!: make events_discarded_total internal metric with key tag opt-in #19083

Merged (10 commits) on Nov 13, 2023
31 changes: 17 additions & 14 deletions src/internal_events/throttle.rs
@@ -4,27 +4,30 @@ use vector_lib::internal_event::{ComponentEventsDropped, InternalEvent, INTENTIO
#[derive(Debug)]
pub(crate) struct ThrottleEventDiscarded {
pub key: String,
pub emit_events_discarded_per_key: bool,
}

impl InternalEvent for ThrottleEventDiscarded {
fn emit(self) {
// TODO: Technically, the Component Specification states that the discarded events metric
@pront (Member) commented on Nov 8, 2023:
Deleting TODOs sparks joy.

// must _only_ have the `intentional` tag, in addition to the core tags like
// `component_kind`, etc, and nothing else.
//
// That doesn't give us the leeway to specify which throttle bucket the events are being
// discarded for... but including the key/bucket as a tag does seem useful and so I wonder
// if we should change the specification wording? Sort of a similar situation to the
// `error_code` tag for the component errors metric, where it's meant to be optional and
// only specified when relevant.
counter!(
"events_discarded_total", 1,
"key" => self.key,
); // Deprecated.
let message = "Rate limit exceeded.";

debug!(message, key = self.key, internal_log_rate_limit = true);
if self.emit_events_discarded_per_key {
// TODO: Technically, the Component Specification states that the discarded events metric
// must _only_ have the `intentional` tag, in addition to the core tags like
// `component_kind`, etc, and nothing else.
//
// That doesn't give us the leeway to specify which throttle bucket the events are being
// discarded for... but including the key/bucket as a tag does seem useful and so I wonder
// if we should change the specification wording? Sort of a similar situation to the
// `error_code` tag for the component errors metric, where it's meant to be optional and
// only specified when relevant.
counter!("events_discarded_total", 1, "key" => self.key); // Deprecated.
}

emit!(ComponentEventsDropped::<INTENTIONAL> {
count: 1,
reason: "Rate limit exceeded."
reason: message
})
}
}
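
The gating introduced in this file can be sketched in isolation. The following is a minimal, std-only illustration and not Vector's implementation: the `Counters` struct and the `simulate` helper are hypothetical stand-ins for the metrics backend and the transform loop, while the `ThrottleEventDiscarded` fields and the `"None"` fallback key follow the diff.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the metrics backend (not part of Vector).
#[derive(Debug, Default)]
struct Counters {
    /// Models `events_discarded_total`, one series per `key` tag (opt-in).
    discarded_per_key: HashMap<String, u64>,
    /// Models `component_discarded_events_total` (always incremented).
    component_discarded: u64,
}

/// Same fields as the internal event in the diff.
struct ThrottleEventDiscarded {
    key: String,
    emit_events_discarded_per_key: bool,
}

impl ThrottleEventDiscarded {
    fn emit(self, counters: &mut Counters) {
        // The per-key counter is only touched when the user opted in,
        // so the number of series stays bounded by default.
        if self.emit_events_discarded_per_key {
            *counters.discarded_per_key.entry(self.key).or_insert(0) += 1;
        }
        // The component-level counter is incremented regardless.
        counters.component_discarded += 1;
    }
}

/// Hypothetical helper: discards one event per entry in `keys`.
fn simulate(opt_in: bool, keys: &[Option<&str>]) -> Counters {
    let mut counters = Counters::default();
    for key in keys {
        ThrottleEventDiscarded {
            // Events without a key fall back to the literal "None".
            key: key.map(|k| k.to_string()).unwrap_or_else(|| "None".to_string()),
            emit_events_discarded_per_key: opt_in,
        }
        .emit(&mut counters);
    }
    counters
}

fn main() {
    let keys = [Some("a"), Some("b"), Some("a"), None];
    for opt_in in [false, true] {
        let c = simulate(opt_in, &keys);
        println!(
            "opt_in={opt_in}: per-key series={}, component total={}",
            c.discarded_per_key.len(),
            c.component_discarded
        );
    }
}
```

With the flag off, only the component-level total grows. With it on, each distinct key adds a series, which is the cardinality concern that motivates the opt-in default.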
34 changes: 29 additions & 5 deletions src/transforms/throttle.rs
@@ -18,6 +18,24 @@ use crate::{
transforms::{TaskTransform, Transform},
};

/// Configuration of internal metrics for the Throttle transform.
#[configurable_component]
#[derive(Clone, Debug, PartialEq, Eq, Default)]
#[serde(deny_unknown_fields)]
pub struct ThrottleInternalMetricsConfig {
/// Whether or not to emit the `events_discarded_total` internal metric with the `key` tag.
///
/// If true, the counter will be incremented for each discarded event, including the key value
/// associated with the discarded event. If false, the counter will not be emitted. Instead, the
/// number of discarded events can be seen through the `component_discarded_events_total` internal
/// metric.
///
/// Note that this defaults to false because the `key` tag has potentially unbounded cardinality.
/// Only set this to true if you know that the number of unique keys is bounded.
#[serde(default)]
pub emit_events_discarded_per_key: bool,
}

/// Configuration for the `throttle` transform.
#[serde_as]
#[configurable_component(transform("throttle", "Rate limit logs passing through a topology."))]
@@ -43,6 +61,10 @@ pub struct ThrottleConfig {

/// A logical condition used to exclude events from sampling.
exclude: Option<AnyCondition>,

#[configurable(derived)]
#[serde(default)]
internal_metrics: ThrottleInternalMetricsConfig,
}

impl_generate_config_from_default!(ThrottleConfig);
@@ -79,6 +101,7 @@ pub struct Throttle<C: clock::Clock<Instant = I>, I: clock::Reference> {
key_field: Option<Template>,
exclude: Option<Condition>,
clock: C,
internal_metrics: ThrottleInternalMetricsConfig,
}

impl<C, I> Throttle<C, I>
@@ -116,6 +139,7 @@ where
flush_keys_interval,
key_field: config.key_field.clone(),
exclude,
internal_metrics: config.internal_metrics.clone(),
})
}
}
@@ -170,11 +194,10 @@ where
Some(event)
}
_ => {
if let Some(key) = key {
emit!(ThrottleEventDiscarded{key})
} else {
emit!(ThrottleEventDiscarded{key: "None".to_string()})
}
emit!(ThrottleEventDiscarded{
key: key.unwrap_or_else(|| "None".to_string()),
emit_events_discarded_per_key: self.internal_metrics.emit_events_discarded_per_key
});
None
}
}
@@ -421,6 +444,7 @@ key_field = "{{ bucket }}"
window_secs: Duration::from_secs_f64(1.0),
key_field: None,
exclude: None,
internal_metrics: Default::default(),
};
let (tx, rx) = mpsc::channel(1);
let (topology, mut out) = create_topology(ReceiverStream::new(rx), config).await;
@@ -9,14 +9,29 @@ badges:
type: breaking change
---

Vector's 0.35.0 release includes **deprecations**:
Vector's 0.35.0 release includes **breaking changes**:

1. [The Throttle transform's `events_discarded_total` internal metric is now opt-in](#events-discarded-total-opt-in)

and **deprecations**:

1. [Deprecation of `file` internal metric tag for file-based components](#deprecate-file-tag)

We cover them below to help you upgrade quickly:

## Upgrade guide

### Breaking Changes

#### The Throttle transform's `events_discarded_total` internal metric is now opt-in {#events-discarded-total-opt-in}

The Throttle transform's `events_discarded_total` internal metric, which includes the `key` tag, is now only emitted on
an opt-in basis. Users can opt-in to emit this metric by setting `internal_metrics.emit_events_discarded_per_key` to `true`
in the corresponding Throttle transform component config. This change is motivated by the fact that the `key` metric tag has
potentially unbounded cardinality.

To view events discarded without the `key` tag, use the `component_discarded_events_total` internal metric.
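
As a rough illustration of the opt-in, a Throttle transform config might look like the fragment below. The component IDs `my_throttle` and `my_source` are hypothetical, and the `threshold`, `window_secs`, and `key_field` values are only placeholders; the `internal_metrics.emit_events_discarded_per_key` option is the one described above.

```toml
# Hypothetical component IDs and illustrative values.
[transforms.my_throttle]
type = "throttle"
inputs = ["my_source"]
threshold = 100
window_secs = 1
key_field = "{{ bucket }}"

# Opt in to `events_discarded_total` with the `key` tag (defaults to false).
# Only enable this if the number of unique keys is known to be bounded.
internal_metrics.emit_events_discarded_per_key = true
```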

### Deprecations

#### Deprecation of `file` internal metric tag for file-based components {#deprecate-file-tag}
19 changes: 19 additions & 0 deletions website/cue/reference/components/transforms/base/throttle.cue
@@ -6,6 +6,25 @@ base: components: transforms: throttle: configuration: {
required: false
type: condition: {}
}
internal_metrics: {
description: "Configuration of internal metrics for the Throttle transform."
required: false
type: object: options: emit_events_discarded_per_key: {
description: """
Whether or not to emit the `events_discarded_total` internal metric with the `key` tag.
If true, the counter will be incremented for each discarded event, including the key value
associated with the discarded event. If false, the counter will not be emitted. Instead, the
number of discarded events can be seen through the `component_discarded_events_total` internal
metric.
Note that this defaults to false because the `key` tag has potentially unbounded cardinality.
Only set this to true if you know that the number of unique keys is bounded.
"""
required: false
type: bool: default: false
}
}
key_field: {
description: """
The value to group events into separate buckets to be rate limited independently.