feat: Update Honeycomb logger to use EMAThroughput sampler #1328

Merged
merged 6 commits into from
Sep 16, 2024

Conversation

MikeGoldsmith
Contributor

@MikeGoldsmith MikeGoldsmith commented Sep 13, 2024

Which problem is this PR solving?

Updates the Honeycomb logger to use the EMAThroughput sampler. This sampler has built-in support for burst protection that the PerKeyThroughput sampler does not.

Burst protection would be useful when a cluster node goes down unexpectedly, because the other nodes in the cluster will fail to make peer requests until it is removed from the cluster. This can result in a very high number of "failed to send" log messages in a very small window, faster than the PerKeyThroughput sampler can adjust its sample rate, so all of the log messages get sent. The burst protection in the EMAThroughput sampler would help here, as it schedules an update to the sample rates when a high number of events is received very quickly, allowing it to react sooner.
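To make the burst behaviour concrete, here is a minimal standalone sketch. It is not the dynsampler-go implementation: `burstSampler`, `simulate`, `goal` and `burstMultiple` are hypothetical names, the EMA smoothing is left out, and the timer-driven adjustment is assumed never to fire during the burst. It only illustrates the idea that a sampler which recomputes its rate as soon as traffic crosses a burst threshold keeps fewer events than one that waits for the next scheduled adjustment.

```go
package main

import "fmt"

// burstSampler is a hypothetical, heavily simplified throughput sampler.
// It is NOT the dynsampler-go implementation; it only illustrates the idea
// of recomputing the sample rate early when a burst is detected.
type burstSampler struct {
	goal          int     // events we aim to keep per adjustment interval
	burstMultiple float64 // burst threshold factor; 0 disables burst protection
	rate          int     // current sample rate (keep 1 in rate events)
	seen          int     // events observed since the last adjustment
	threshold     float64 // event count that triggers an early adjustment
}

func newBurstSampler(goal int, burstMultiple float64) *burstSampler {
	return &burstSampler{
		goal:          goal,
		burstMultiple: burstMultiple,
		rate:          1,
		threshold:     float64(goal) * burstMultiple,
	}
}

// adjust recomputes the sample rate from the traffic seen since the last
// adjustment. A real sampler would also run this on a timer.
func (s *burstSampler) adjust() {
	s.rate = s.seen / s.goal
	if s.rate < 1 {
		s.rate = 1
	}
	base := s.seen
	if base < s.goal {
		base = s.goal
	}
	s.threshold = float64(base) * s.burstMultiple
	s.seen = 0
}

// observe records one event and reports whether it should be kept.
func (s *burstSampler) observe() bool {
	s.seen++
	if s.burstMultiple > 0 && float64(s.seen) > s.threshold {
		s.adjust() // burst detected: do not wait for the timer
	}
	return s.seen%s.rate == 0
}

// simulate feeds events into a sampler within a single adjustment interval
// (the timer never fires) and returns how many events were kept.
func simulate(events int, burstMultiple float64) int {
	s := newBurstSampler(10, burstMultiple)
	kept := 0
	for i := 0; i < events; i++ {
		if s.observe() {
			kept++
		}
	}
	return kept
}

func main() {
	fmt.Println("kept without burst protection:", simulate(1000, 0))
	fmt.Println("kept with burst protection:   ", simulate(1000, 2))
}
```

With burst protection disabled the sketch keeps every event in the burst, which mirrors the "all log messages being sent" case described above. With it enabled, the rate ramps up during the burst instead of after it.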

This is a like-for-like replacement, and as such I haven't added any of the additional configuration options the new sampler supports. We can add more later if desired.

Short description of the changes

  • Replace the Honeycomb logger sampler with the EMAThroughput sampler.

@MikeGoldsmith MikeGoldsmith added the type: enhancement New feature or request label Sep 13, 2024
@MikeGoldsmith MikeGoldsmith self-assigned this Sep 13, 2024
@MikeGoldsmith MikeGoldsmith requested a review from a team as a code owner September 13, 2024 11:18
@MikeGoldsmith MikeGoldsmith added this to the v2.9 milestone Sep 13, 2024
@cartermp
Member

Will this also aid #1222?

@MikeGoldsmith
Contributor Author

Will this also aid #1222?

Yes, it probably will 👍🏻

Contributor

@kentquirk kentquirk left a comment

Sounds good, but given what we've been learning about EMA, we should extend the interval.

PerKeyThroughputPerSec: loggerConfig.SamplerThroughput,
MaxKeys: 1000,
h.sampler = &dynsampler.EMAThroughput{
AdjustmentInterval: 10 * time.Second,
Contributor

Let's lengthen this interval -- maybe to 30s, please?

Contributor

@VinozzZ VinozzZ left a comment

Given all the quirks that we recently discovered in EMA samplers, I'm worried that it's going to be harder to reason about why a log line doesn't get sampled.

@MikeGoldsmith
Contributor Author

We already sample refinery log messages; this change just gives us burst protection when a sudden increase happens. The throughput samplers ensure any unique log message is sent at least once per window.

@MikeGoldsmith MikeGoldsmith merged commit ac3f6de into main Sep 16, 2024
5 checks passed
@MikeGoldsmith MikeGoldsmith deleted the mike/logger-sampler branch September 16, 2024 20:01
TylerHelmuth pushed a commit that referenced this pull request Oct 16, 2024
## Which problem is this PR solving?

Updates the Honeycomb logger to use the
[EMAThroughput](https://github.com/honeycombio/dynsampler-go/blob/main/emathroughput.go#L77)
sampler. This sampler has built-in support for burst protection that the
[PerKeyThroughput](https://github.com/honeycombio/dynsampler-go/blob/main/perkeythroughput.go#L17)
sampler does not.

Burst protection would be useful when a cluster node goes down
unexpectedly, because the other nodes in the cluster will fail to make
peer requests until it is removed from the cluster. This can result in a
very high number of "failed to send" log messages in a very small
window, faster than the PerKeyThroughput sampler can adjust its sample
rate, so all of the log messages get sent. The burst protection in the
EMAThroughput sampler would help here, as it schedules an update to the
sample rates when a high number of events is received very quickly,
allowing it to react sooner.

This is a like-for-like replacement, and as such I haven't added any of
the additional configuration options the new sampler supports. We can add
more later if desired.

## Short description of the changes
- Replace the Honeycomb logger sampler with the EMAThroughput sampler.