Collector constantly breaking down #6420
Comments
Even though I am using the contrib docker image, the panicking code comes from this repo:
@tigrannajaryan Any idea why this might be breaking?
To me this seems to be caused by the "spanmetrics" processor, since I have a feeling that it mutates the data that it also sends to the exporter.
This is a good hunch. The failure shows we run out of buffer while marshalling. One way this can happen is if data is mutated after the buffer size is calculated. A component is likely misbehaving and mutating data when it shouldn't.
@ambition-consulting are you sure about the version you are using? I see something like helm chart version 30, which is very old.
Yes. I don't have permissions to run the operator on k8s, so I've been using the helm chart template, but replaced the version inside:
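For context, overriding the image version in the collector helm chart is typically done through the chart values. The snippet below is a hypothetical sketch (the repository and tag shown are illustrative), not the values file actually used here:

image:
  repository: otel/opentelemetry-collector-contrib
  tag: 0.62.1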
@ambition-consulting still investigating; is this the first version where you saw this? Have you run 0.61 successfully without any error? Update: do you see this with v0.63.0 as well, or just with v0.62.1?
Also, can you run the collector with this configuration for the pipelines, to isolate the problem:

pipelines:
  logs:
    receivers:
      - otlp
    processors:
      - batch
    exporters:
      - logging
  traces:
    receivers:
      - otlp
    processors:
      - batch
    exporters:
      - otlphttp
  traces/spanmetrics:
    receivers:
      - otlp
    processors:
      - spanmetrics
    exporters:
      - logging
  metrics:
    receivers:
      - otlp
    processors:
      - batch
    exporters:
      - logging
      - prometheus
  metrics/spanmetrics:
    receivers:
      - otlp/spanmetrics
    exporters:
      - otlp/spanmetrics
Yes, with both; those are also the only versions I have tested.
@ambition-consulting let me know when you have updates from the new config proposal.
Hello everyone! As an extra data point, I'm using the Attributes Span Processor feature. The only thing I could notice prior to the error is that it is trying to process a trace with at least 26 spans, and then it crashes.
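For reference, the attributes processor is configured with a list of actions applied to span attributes. The snippet below is a generic sketch with placeholder keys and values, not the configuration from this report:

processors:
  attributes:
    actions:
      # hypothetical action: add an attribute to every span
      - key: deployment.environment
        value: production
        action: insert
      # hypothetical action: drop a sensitive attribute
      - key: http.request.header.authorization
        action: delete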
@andretong can you share your config as well?
@bogdandrutu
I'm receiving the same error as @andretong above. My config is pretty similar but with the addition of the transform processor:
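For reference, a transform processor stanza looks roughly like the one below; the exact syntax has changed between collector versions, so treat this as an illustrative sketch rather than the configuration used here:

processors:
  transform:
    trace_statements:
      - context: span
        statements:
          # hypothetical statement: set an attribute on every span
          - set(attributes["deployment.environment"], "production")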
@bogdandrutu Hi! Here I attach another stack trace that is crashing the collector, in version 0.63.1:
Do any of you (@andretong, @Edition-X) happen to have run a previous version where this did not happen, and can you help me identify in which version it started?
Also, can any of you (@andretong @Edition-X @ambition-consulting) deploy the collector without the "batch" processor? It seems to be the common element across all of your pipelines.
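As a sketch of that test, using the diagnostic pipeline layout from above as a stand-in for the real configuration, removing batch simply means an empty (or omitted) processors list:

pipelines:
  traces:
    receivers:
      - otlp
    processors: []   # batch removed for this test
    exporters:
      - otlphttp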
Hi @bogdandrutu, I made the test without the batch processor:
I ran the same config file with the above versions: 0.60.0 -> OK
@ambition-consulting @andretong @Edition-X found the bug; will submit a fix soon. In the meantime, if you want a quick fix, remove the logging exporter from the pipelines, or do not configure
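As a sketch of the first workaround, again using the diagnostic pipeline layout above as a stand-in for the real configuration, dropping logging from the exporter lists looks like this:

pipelines:
  traces:
    receivers:
      - otlp
    processors:
      - batch
    exporters:
      - otlphttp    # logging exporter removed as a temporary workaround
  metrics:
    receivers:
      - otlp
    processors:
      - batch
    exporters:
      - prometheus  # logging exporter removed as a temporary workaround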
Thanks @bogdandrutu
@bogdandrutu thank you very much!
@Edition-X @andretong v0.64.1 is ready to be tested :)
Removes the first two points from the bugfix release criteria. I think the remaining points give a more accurate picture of the decision-making process we have taken so far (e.g. for #6420, where the first two points were not fulfilled). We can revisit this in the future if there are disagreements on when to do a bugfix release.
Describe the bug
After a while, the pod on which the collector runs stops with this error:
Steps to reproduce
Using the otel/opentelemetry-collector-contrib:0.62.1 docker image and this config:
Environment
OS: Linux (Docker on Kubernetes)