Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Put a limit on the size of sampler keys #1364

Merged
merged 4 commits into from
Oct 7, 2024

Conversation

kentquirk
Copy link
Contributor

Which problem is this PR solving?

  • If someone sends a big trace where one of the sampler keys is a high-cardinality field, Refinery could generate a huge sampler key value. This causes problems both for refinery but also the downstream telemetry. Let's not do that.

Short description of the changes

  • Put a limit of 100 unique values to make up any one sampler key. Even that is big and probably useless but it should avoid any real use cases.
  • Add a test to show it works.

Fixes #1363

@kentquirk kentquirk requested a review from a team as a code owner October 4, 2024 15:28
Copy link
Contributor

@MikeGoldsmith MikeGoldsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense in general, but I think a generic set to hold the keys is a bit too much. Can we simplify to an int counter?

sample/trace_key.go Show resolved Hide resolved
sample/trace_key.go Show resolved Hide resolved
sample/trace_key.go Outdated Show resolved Hide resolved
@MikeGoldsmith MikeGoldsmith merged commit e5fd6d8 into main Oct 7, 2024
5 checks passed
@MikeGoldsmith MikeGoldsmith deleted the kent.limit_sampler_keys branch October 7, 2024 12:05
MikeGoldsmith added a commit that referenced this pull request Oct 7, 2024
## Which problem is this PR solving?

- If someone sends a big trace where one of the sampler keys is a
high-cardinality field, Refinery could generate a huge sampler key
value. This causes problems both for refinery but also the downstream
telemetry. Let's not do that.

## Short description of the changes

- Put a limit of 100 unique values to make up any one sampler key. Even
that is big and probably useless but it should avoid any real use cases.
- Add a test to show it works.

Fixes #1363

---------

Co-authored-by: Mike Goldsmth <goldsmith.mike@gmail.com>
TylerHelmuth pushed a commit that referenced this pull request Oct 16, 2024
## Which problem is this PR solving?

- If someone sends a big trace where one of the sampler keys is a
high-cardinality field, Refinery could generate a huge sampler key
value. This causes problems both for refinery but also the downstream
telemetry. Let's not do that.

## Short description of the changes

- Put a limit of 100 unique values to make up any one sampler key. Even
that is big and probably useless but it should avoid any real use cases.
- Add a test to show it works.

Fixes #1363

---------

Co-authored-by: Mike Goldsmth <goldsmith.mike@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Protect against giant sample keys
2 participants