[BUG] Tracing index is not re-created in opensearch. Dataprepper needs restart? #4951
Comments
This might be related to #3342 and maybe #3506. The index setup used for spans is a little complicated. It usually uses a write alias that points to a concrete span index. @AdaptiveStep can you elaborate on your setup? Do you use the default index configuration or do you provide a custom config? When you delete the current index, do you keep the write alias if you have one? Can you provide the DataPrepper error log that contains the "index missing" message?
About Alias: Log message: My config:
gRPC is sent from the "OpenTelemetry collector pod" to the "dataprepper pod". Just normal basic otel stuff. Basically everything is default, on the latest version at the time of writing. Everything works, and if you go into the OpenSearch GUI you will see the "otel-v1-apm-span-000001" index. Delete this index and it will never be recreated. Only restarting dataprepper recreates it. The servicemaps index seems buggy too if the "otel-v1-apm-span-000001" index gets removed. If both are removed, neither of them comes back. This might explain why the rollover for that other person didn't work. If you remove the metrics indexes, they get recreated. My investigation so far: Summary:
The difference between OTEL logs/metrics and traces comes from the index setup as mentioned by @KarstenSchnitter.
The question is why you are deleting the current write index ( There is ongoing work to move towards the index alias/rollover approach for logs/metrics as well with #3929.
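For readers unfamiliar with the alias setup mentioned above, here is a minimal sketch of what bootstrapping a write alias looks like as an OpenSearch create-index request. The alias name `otel-v1-apm-span` is an assumption inferred from the index name `otel-v1-apm-span-000001` reported in this issue, and `bootstrap_request` is a hypothetical helper, not DataPrepper code. The sketch only builds the request, it does not send it.

```python
import json


def bootstrap_request(alias: str, suffix: str = "000001"):
    """Build (method, path, body) for creating the first concrete index
    behind a write alias. Hypothetical helper for illustration only."""
    index_name = f"{alias}-{suffix}"
    # The create-index API accepts an "aliases" section; "is_write_index"
    # marks this concrete index as the target for writes sent to the alias.
    body = {"aliases": {alias: {"is_write_index": True}}}
    return "PUT", f"/{index_name}", json.dumps(body)


# Assumed alias name, derived from the index name seen in this issue.
method, path, body = bootstrap_request("otel-v1-apm-span")
print(method, path, body)
```

Unlike a plain index name, which OpenSearch can auto-create on the first write, an alias like this only exists once a request of this kind has been made, and it goes away when its only backing index is deleted.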
Describe the bug
When events are sent to opensearch, the index is usually created if it doesn't exist. This happens for all data except traces. When dataprepper starts up, it creates the necessary tracing indexes for spans and servicemaps once, and never again unless it is restarted.
If the index is removed while dataprepper is running, an error saying "index is missing" shows up extremely often, possibly filling up the buffer and eventually causing dropped data.
To Reproduce
Send traces to dataprepper as usual. You will see the trace index on the management/index page in the "opensearch dashboards gui".
However, if you delete the index, it never gets recreated, even if new traces are being sent to dataprepper. Only restarting dataprepper seems to recreate the index. This can probably be fixed easily so that indexes are recreated if they don't exist in opensearch.
Expected behavior
1: The span index needs to be re-created if it doesn't exist when new events arrive at dataprepper and are sent to opensearch.
2: The serviceMap index needs to be re-created if it doesn't exist.
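To make the expected behavior concrete, here is a rough sketch of a "recreate on missing index" write path. It runs against a small in-memory stand-in, not a real cluster. `FakeCluster` and `write_with_recreate` are hypothetical names invented for this illustration, and the alias name `otel-v1-apm-span` is an assumption. This is not how DataPrepper's sink is implemented.

```python
class FakeCluster:
    """In-memory stand-in for an OpenSearch cluster (illustration only)."""

    def __init__(self):
        self.indices = {}  # index name -> list of documents
        self.aliases = {}  # alias name -> concrete index name

    def create_index_with_alias(self, index, alias):
        self.indices.setdefault(index, [])
        self.aliases[alias] = index

    def delete_index(self, index):
        self.indices.pop(index, None)
        # Aliases that pointed at the deleted index disappear with it.
        self.aliases = {a: i for a, i in self.aliases.items() if i != index}

    def write(self, alias, doc):
        if alias not in self.aliases:
            raise LookupError(f"index missing for alias {alias!r}")
        self.indices[self.aliases[alias]].append(doc)


def write_with_recreate(cluster, alias, first_index, doc):
    """Write a document; if the target is missing, recreate it and retry once."""
    try:
        cluster.write(alias, doc)
    except LookupError:
        cluster.create_index_with_alias(first_index, alias)
        cluster.write(alias, doc)


ALIAS = "otel-v1-apm-span"          # assumed alias name
INDEX = "otel-v1-apm-span-000001"   # index name reported in this issue

cluster = FakeCluster()
cluster.create_index_with_alias(INDEX, ALIAS)  # what happens at startup
cluster.delete_index(INDEX)                    # user deletes the index
write_with_recreate(cluster, ALIAS, INDEX, {"traceId": "abc"})
print(sorted(cluster.indices), cluster.aliases)
```

The retry is deliberately limited to one attempt so that a persistent failure still surfaces as an error instead of looping.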
Environment (please complete the following information):
I tried this in dataprepper on kubernetes using the dataprepper helmchart.
Additional context
I tried this using the otel demo apps. It seems pretty consistent with all their traces. If the index is removed, it never gets re-created again unless dataprepper is restarted. Neither the "service-map index" nor the "span index" get recreated.