Update example config for handling tail-sampling and span metric generation when horizontally scaling collectors #6260
base: main
Conversation
Force-pushed from dd7bf67 to eb5c3ab
…ing exporter, tail-sampling processor, and span metrics connector together when scaled to multiple collector instances. It also removes language and configuration suggesting that load balancing should be a separate deployment of collectors from the collectors doing the tail sampling and span metric generation. It's easier to maintain a single deployment responsible for both load balancing and processing of the load-balanced data, but the pattern for doing this may not be obvious at first.
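The full config added by the PR isn't reproduced in this thread, but a minimal sketch of the single-deployment pattern the commit describes might look like the following. All component names, ports, endpoints, and the sampling policy here are illustrative assumptions, not values taken from the PR:

```yaml
receivers:
  otlp:  # public endpoint: spans arriving from instrumented services
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  otlp/for_tail_sampling:  # internal endpoint: spans already load-balanced by trace ID
    protocols:
      grpc:
        endpoint: 0.0.0.0:4417

processors:
  tail_sampling:
    policies:
      - name: keep-errors  # illustrative policy
        type: status_code
        status_code:
          status_codes: [ERROR]

connectors:
  spanmetrics:  # generates metrics from spans, using the connector's defaults

exporters:
  loadbalancing:  # fans spans back out to this same deployment's instances
    routing_key: traceID
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: collector-headless.example.svc.cluster.local  # assumed headless service
        port: 4417
  otlp/backend:
    endpoint: traces-backend.example:4317  # assumed trace backend
    tls:
      insecure: true
  prometheusremotewrite:
    endpoint: https://metrics-backend.example/api/v1/write  # assumed metrics backend

service:
  pipelines:
    traces:  # pipeline 1: load-balance every incoming span by trace ID
      receivers: [otlp]
      exporters: [loadbalancing]
    traces/tail_sampling:  # pipeline 2: all spans of a trace land on one instance
      receivers: [otlp/for_tail_sampling]
      processors: [tail_sampling]
      exporters: [otlp/backend]
    traces/span_metrics:  # pipeline 3: generate span metrics from the same stream
      receivers: [otlp/for_tail_sampling]
      exporters: [spanmetrics]
    metrics/span_metrics:
      receivers: [spanmetrics]
      exporters: [prometheusremotewrite]
```

The key trick is the second OTLP receiver on its own port: the load-balancing pipeline exports to the deployment's own headless service on that port, so a single set of collectors does both the fan-out and the processing.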
Force-pushed from eb5c3ab to b184128
thanks @swar8080! @open-telemetry/collector-approvers PTAL
I like the change, but I'd really prefer to have three collector instances (three config files) instead of one with three pipelines. The reason is that each pipeline (load balancer, tail sampler, span metrics) has a different load profile and would scale differently.
You absolutely CAN do it as depicted here, but without understanding the nuances, I'd prefer the official documentation to have each pipeline be its own deployment.
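For illustration, the three-deployment split described here could be sketched roughly as below. This is an assumption about how the suggestion might be realized, not the PR's actual config; hostnames, ports, and policies are made up:

```yaml
# load-balancer.yaml — scales with ingest throughput
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  loadbalancing/tail_sampling:
    routing_key: traceID  # keep all spans of a trace together
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: tail-sampler-headless.example.svc  # assumed headless service
  loadbalancing/span_metrics:
    routing_key: service  # keep each service's spans on one instance
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: span-metrics-headless.example.svc  # assumed headless service
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing/tail_sampling, loadbalancing/span_metrics]
---
# tail-sampler.yaml — scales with the number of in-flight traces
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  tail_sampling:
    policies:
      - name: keep-errors  # illustrative policy
        type: status_code
        status_code:
          status_codes: [ERROR]
exporters:
  otlp/backend:
    endpoint: traces-backend.example:4317  # assumed trace backend
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp/backend]
---
# span-metrics.yaml — scales with metric cardinality
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
connectors:
  spanmetrics:
exporters:
  prometheusremotewrite:
    endpoint: https://metrics-backend.example/api/v1/write  # assumed metrics backend
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheusremotewrite]
```

Each file can then be deployed and autoscaled independently, matching the different load profiles the comment points out.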
    traces/span_metrics:
      receivers:
        - otlp/for_tail_sampling
Suggested change:

    -   - otlp/for_tail_sampling
    +   - otlp/for_span_metrics
Hi @jpkrohling, thanks for reviewing!

Separating span metrics and tail sampling into different collectors is a good callout. A spike in span volume or a span-metric cardinality explosion would likely only cause problems for one deployment but not the other. I'll go ahead and change this into three collector configurations, or two if the below persuades you otherwise :).

Maybe we missed the benefits of load balancing being its own deployment. Both setups have to receive incoming spans, and load balancing to the same deployment means twice as many receives and exports. But it seems like the extra memory for load balancing is dwarfed by the memory needed for tail sampling and span metrics, which also grows with span volume. For CPU, the load-balancing exporter used a lot before this optimization, but now our pprof shows it as a small percentage of total CPU time.

So for us it didn't seem worth the effort of maintaining another deployment, which is why we ended up consolidating. That saved us a deployment to monitor, and it's one less file to jump between when working on our collector config, since we do some filtering/edits before load balancing.

So maybe two separate deployments, each load balancing to themselves, would be a good setup for a lot of users? A rough sketch of what that could look like follows.
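Here is one of those two self-load-balancing deployments (the tail-sampling one); the span-metrics deployment would mirror it with a spanmetrics connector in place of the tail_sampling processor and, typically, routing_key: service. Again, names, ports, and endpoints are illustrative assumptions, not config from the PR:

```yaml
# tail-sampler.yaml — one deployment that both load-balances and tail-samples
receivers:
  otlp:  # public endpoint receiving spans from services
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  otlp/internal:  # second endpoint receiving spans this deployment load-balanced to itself
    protocols:
      grpc:
        endpoint: 0.0.0.0:4417
processors:
  tail_sampling:
    policies:
      - name: keep-errors  # illustrative policy
        type: status_code
        status_code:
          status_codes: [ERROR]
exporters:
  loadbalancing:
    routing_key: traceID
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        hostname: tail-sampler-headless.example.svc  # resolves to this deployment's own pods
        port: 4417
  otlp/backend:
    endpoint: traces-backend.example:4317  # assumed trace backend
service:
  pipelines:
    traces:  # hop 1: fan out by trace ID across this deployment's pods
      receivers: [otlp]
      exporters: [loadbalancing]
    traces/sampled:  # hop 2: sample once all spans of a trace are co-located
      receivers: [otlp/internal]
      processors: [tail_sampling]
      exporters: [otlp/backend]
```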
Hello, we used this page's documentation when setting up tail sampling and span metric generation. The guide was helpful, but there were a couple of things it could have mentioned to make our implementation smoother.
This documentation suggestion shows how all the components work together in a single collector deployment.