Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tail-based sampling #526

Merged
merged 14 commits into from
Apr 16, 2021
Merged

Tail-based sampling #526

merged 14 commits into from
Apr 16, 2021

Conversation

mapno
Copy link
Member

@mapno mapno commented Apr 9, 2021

PR Description

Implement tail-based sampling. This is done by layering two pipelines: one to load balance spans across agent instances and another that does the actual tail sampling.

Which issue(s) this PR fixes

None

Notes to the Reviewer

PR is still WIP

PR Checklist

  • CHANGELOG updated
  • Documentation added
  • Tests updated

@rfratto
Copy link
Member

rfratto commented Apr 9, 2021

Our example Tempo Kubernetes manifests deploy it as a DaemonSet, and then uses a service to LB across the pods. Will this break that workflow?

@mapno
Copy link
Member Author

mapno commented Apr 12, 2021

Our example Tempo Kubernetes manifests deploy it as a DaemonSet, and then uses a service to LB across the pods. Will this break that workflow?

To be honest, I haven't tested it yet with the k8s examples, so I can't say for sure. I think it will. ATM I'm working on that, plus updating documentation and docker-compose examples.

@mapno mapno force-pushed the tail-sampling branch 3 times, most recently from 2275a23 to 9d541d5 Compare April 12, 2021 19:52
Copy link
Member

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall love this work and I think we should merge it shortly and begin experimenting with it.

Are the policies and load balancing separately configurable? So you can have one layer that does the load balancing and one that applies the policies?

docs/configuration-reference.md Show resolved Hide resolved
defaultDecisionWait = time.Second * 5

// defaultLoadBalancingPort is the default port the agent uses for internal load balancing
defaultLoadBalancingPort = "9999"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should choose otlp default + 1? would give it a bit more meaning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, 9999 was just a placeholder. Changed in 6ca6e3e

@mapno
Copy link
Member Author

mapno commented Apr 14, 2021

All right, my last commit 8d11011 adds tail-based sampling to the k3d example. It also introduces the synthetic-load-generator and the otel-collector as dependencies in the example.

@mapno mapno changed the title WIP: Tail-based sampling Tail-based sampling Apr 14, 2021
Copy link
Member

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. couple of clean up things

},
]),

pvc:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pvc necessary for the load generator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in c25abf3, thanks

'--config=/etc/collector/config.yaml',
),

pvc:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no pvc necessary for otelcol

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in c25abf3

@mapno mapno merged commit 1ab3d31 into grafana:main Apr 16, 2021
@mapno mapno deleted the tail-sampling branch April 16, 2021 10:08
@mattdurham mattdurham mentioned this pull request Sep 7, 2021
3 tasks
mattdurham pushed a commit that referenced this pull request Nov 11, 2021
* Add tail sampling in tempo pipelines

* Add load balancing for traces when tail sampling

Implements load balancing of spans by trace ID between agent instances for tail sampling

* Lint things

* Config recevier listening port

* Add sampling in scraping service compose example

* Don't load balance when it's not needed

Many times users will have single instance deployements that do not require
to load balance spans.

Load balancing block is optional

* Config reference for tail sampling

* Update CHANGELOG

* Fix panic

* Move exporter to its own block

* Make tail-sampling work in k3d example

* Fix image name

* Lower collector log level

* Remove unnecessary pvcs
@github-actions github-actions bot added the frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed. label Apr 13, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants