kubernetes_logs missing labels after upgrading to 0.30.0 #17487
Comments
The issue doesn't seem limited to the kubernetes_logs source. We also get similar errors logged for file source messages.
Can you confirm you are using the exact same config for both Vector versions? It looks like some of the components that are used as inputs into the sink may not add the fields referenced by the labels. For example, none of the components in the input path appears to add the missing field.
Yes, I am using the exact same config in 0.29.1 and 0.30.0.
Yes, for some reason the fields are no longer being propagated to the sinks.
The problem may be that the (Vector) label values are actually not set and that the Loki sink rejects messages whose labels reference undefined Vector fields. This used to work in 0.29.1. Could it be #17052?
We discussed this briefly. The behavior change in 0.30.0 was to add the log message when a rendered template uses a field that doesn't exist, but the sink functionally still works the same as in 0.29.1: it still sends the data without the label.

The conclusion we came to is that the error should be a warning and, potentially, that users should be able to suppress the warning to restore the old behavior from 0.29.x. To be consistent with the rest of Vector's template handling in sinks, rendering errors should cause events to be dropped; however, it does seem somewhat useful for Loki labels to be added conditionally depending on whether the templated field exists. We need to reflect a bit more on what a good configuration UX for that would be.
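For context, a minimal sketch of the kind of config that triggers the message, assuming the default field names added by the `kubernetes_logs` source (the endpoint and label names here are illustrative):

```yaml
sinks:
  loki:
    type: loki
    inputs: ["kubernetes_logs"]
    endpoint: "http://loki.example:3100"   # illustrative endpoint
    encoding:
      codec: json
    labels:
      namespace: "{{ kubernetes.pod_namespace }}"
      # Not every pod carries this label; when the field is absent the
      # label is skipped, and 0.30.0 additionally logs a template error.
      app: "{{ kubernetes.pod_labels.app }}"
```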
If anyone else is affected by this: I changed our config to use dynamic labels in the loki sink, and everything works as expected again.
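For reference, a rough sketch of that approach, assuming the loki sink's wildcard label expansion (the `pod_labels_` prefix is illustrative). Labels are only emitted for fields that actually exist on an event, so nothing references a missing field:

```yaml
sinks:
  loki:
    type: loki
    inputs: ["kubernetes_logs"]
    endpoint: "http://loki.example:3100"   # illustrative endpoint
    encoding:
      codec: json
    labels:
      # Expand whatever pod labels are present into Loki labels, rather
      # than templating individual fields that may not exist.
      "pod_labels_*": "{{ kubernetes.pod_labels }}"
      namespace: "{{ kubernetes.pod_namespace }}"
```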
Just want to note for others: upgrading to 0.30.0 is not only a logging change; the internal Prometheus metrics are affected too. In our case we're sending logs through a config that hits the same missing-field template errors described above.
However, we scrape Vector's own Prometheus metrics and (maybe naively) have a custom alerting rule on its internal error counters as a way to try to detect any issues sending to our sinks.
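A minimal sketch of such a rule, assuming Vector's `internal_metrics` source is exported through a `prometheus_exporter` sink with the default `vector` namespace (the alert name and thresholds are illustrative):

```yaml
groups:
  - name: vector
    rules:
      - alert: VectorComponentErrors
        # Fires when any component reports errors; in 0.30.0 the Loki
        # template render failures increment this same counter.
        expr: sum by (component_id) (rate(vector_component_errors_total[5m])) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Vector component {{ $labels.component_id }} is reporting errors"
```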
So we saw alerts firing because each Vector was accumulating template render errors in that counter. This all sounds expected, but I guess after reading the thread I still have a couple of questions.
Thanks for the input and the awesome project. We've had great success recently adopting Vector within our logging stack and are experimenting with it as a Prometheus metric receiver/forwarder as well 📈
…to warnings

If a render error doesn't result in a dropped event, it seems more like a warning than an error.

For the places that currently emit template errors with `drop_event: false`:

* `loki` sink: skips inserting the label if the key or value fails to render; falls back to `None` for partitioning using `tenant_id`
* `throttle` transform: falls back to `None` for the throttle key
* `log_to_metric` transform: skips tag addition
* `papertrail` sink: falls back to `vector` for the `process` field
* `splunk_hec_logs` sink: falls back to `None` for partition keys (source, sourcetype, index)
* `splunk_hec_metrics` sink: falls back to `None` for source, sourcetype, index

Fixes: #17487

Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com>
Thanks for the discussion all! I opened #17746 to drop this back down to a warning, at least. That should be an improvement in that it won't publish the error metric. |
Problem
After upgrading from Vector 0.29.1 to 0.30.0, Vector logs errors about failing to render templated label values.
Switching back to 0.29.1 solves the issue. We are using the official Helm charts to deploy. Our cluster is AKS, running Kubernetes 1.24.9.
Configuration
Version
docker.io/timberio/vector:0.30.0-debian
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response