Include the Tail Sampling Processor #1229
@seh any interest in contributing a PR for this?
Yes, though it would help to have a precedent to start from, both to show how to get going and to gauge how large such a patch is likely to become. Do any other components introduced in a similar way come to mind?
With the tail sampling processor, spans sharing the same trace ID are expected to arrive at the same collector instance, because the sampling decision is made at the end of the trace over all of its spans. When spans from the same trace go to different collector instances, the tail sampling processor will not work properly, and partial traces are likely because only some of a trace's spans get sampled. In our case, since different Lambda functions and instances can be part of the same trace (flow), exporting their spans to different Lambda collectors for tail sampling will be problematic. So I am not sure how we can expect the tail sampling processor to work usefully in a Lambda collector instance. Am I missing anything here?
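For context, outside of Lambda this constraint is usually addressed by fronting a tier of tail-sampling collectors with the load-balancing exporter, which routes spans to a backend chosen by trace ID so that a whole trace lands on one instance. A minimal sketch (the hostnames here are hypothetical):

```yaml
exporters:
  loadbalancing:
    # Route by trace ID so all spans of one trace reach the same backend.
    routing_key: "traceID"
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      static:
        hostnames:
          - sampling-collector-1:4317
          - sampling-collector-2:4317
```

In a pure Lambda deployment there is no stable tier of collectors to list in the resolver, which is exactly the coordination gap this comment describes.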
I expect that you're correct here, as, by analogy—or maybe prior art—Honeycomb's Refinery servers coordinate "ownership" of traces by tracking the peer count and identity via Redis. Without coordination like that, lots of separate collectors each making independent decisions wouldn't work very well. For my case, though, I expect the "entire" trace—at least the spans pertinent to this part of our system—to come from a given Lambda function instance, such that I could still make simple local sampling decisions usefully, such as "keep every trace with an error but keep no more than one of twenty of the rest".
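The local policy described here ("keep every trace with an error, plus one in twenty of the rest") maps naturally onto the tail sampling processor's policy list, where a trace is kept if any policy matches. A sketch, with illustrative names and values:

```yaml
processors:
  tail_sampling:
    # How long to buffer spans before deciding on a trace.
    decision_wait: 10s
    policies:
      # Keep every trace containing an error span.
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      # Keep roughly 1 in 20 (5%) of all other traces.
      - name: keep-1-in-20
        type: probabilistic
        probabilistic:
          sampling_percentage: 5
```

Because each Lambda collector sees the whole trace in this scenario, no cross-instance coordination is needed for these two policies.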
Reasonable argument. Going to close this, but feel free to reopen if you have another idea for how to proceed.
Motivating Problem
When running the OpenTelemetry Collector alongside a Lambda function, it is difficult to coordinate running separate tail sampling proxies like Honeycomb Refinery, because no suitable container orchestration system is available. It would be beneficial to be able to perform tail sampling within the OpenTelemetry Collector running alongside each Lambda function instead.
Proposed Solution
Make the Tail Sampling Processor available as a processor in the OpenTelemetry Collector Lambda layer's build.
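If the Lambda layer's collector is assembled with the OpenTelemetry Collector Builder, one way this could look is adding the contrib module to the builder manifest. A sketch (the version is a placeholder and would need to match the layer's collector version):

```yaml
processors:
  # Hypothetical manifest addition; pin the version to the layer's collector release.
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/tailsamplingprocessor v0.0.0
```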
Alternatives Considered
Set up a Kubernetes or ECS cluster to run Honeycomb Refinery, and export traces from each OpenTelemetry Collector to Refinery for tail sampling. However, for a system that relies solely on AWS Lambda functions, establishing that separate environment within which to run Refinery is difficult.
Additional Context
I don't know how much "weight" adding this new processor would contribute to the size of the built executable file.