filebeat Duplicated logs at beginning with kubernetes autodiscovery #24208
Comments
Pinging @elastic/integrations (Team:Integrations)
The issue is still present in Filebeat 7.16.
@marqc do you have a workaround for the behaviour? Or do you just live with the duplicates? :-/
I have been trying to replicate this issue, but I didn't manage to. @marqc, @stephan-erb-by, are you still facing this issue? Have you tried the latest versions?
hey @belimawr! I have played around with this a bit and could still see this (or a related) issue in 7.17.2. I don't have a minimal reproducing example, but could observe the problem with a config such as this one:
We are launching PODs with multiple containers (
I would assume this should not really happen, as we'd expect each container to be discovered only once and to get only one filestream launched. The issue reminds me a bit of #29015, which was also causing pain for us back in the day. Maybe there is a problem in autodiscovery itself. Could it be that autodiscovery discovers a single container multiple times?
@stephan-erb-by thanks a lot for the quick reply! I'm not sure if it's related to #29015, but it could be. That issue seems to have been solved, so it's hard to tell without reproducing this one. At least this confirms that the issue is related to duplicated IDs in filestream; the question now is why there are duplicated IDs if the ID is the container ID (which is unique). Could you send me the hints you're setting on those pods/containers? If you could share the whole manifest, that would be even better; redact any sensitive information. I'm mostly interested in how the annotations/hints are being set and how the multiple containers are configured on your pods.
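For reference, hints are normally set as annotations on the pod; below is a minimal, hypothetical sketch of a multi-container pod (the pod name, container names, and images are placeholders, not taken from this thread):

apiVersion: v1
kind: Pod
metadata:
  name: example-pod                      # placeholder name
  annotations:
    # hint applied to every container in the pod
    co.elastic.logs/enabled: "true"
    # hint scoped to one container by its name
    co.elastic.logs.app/multiline.pattern: '^\['
spec:
  containers:
    - name: app                          # placeholder container
      image: example/app:latest
    - name: sidecar                      # placeholder container
      image: example/sidecar:latest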
Hey @belimawr! I'm hitting the same problem. What can I send you to help debug this?
The whole Kubernetes manifest (don't forget to redact any sensitive information) and debug logs for the following selectors: If you can consistently reproduce it, a simple step-by-step on how to reproduce it would be amazing. Preferably using Kind or Minikube. |
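For completeness, the logging settings that enable those debug selectors look like this (a minimal sketch using the same selectors that appear in the config shared later in this thread):

logging.level: debug
logging.selectors: [ input, input.filestream, harvester, autodiscover ]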
Hey @belimawr, this is my Kubernetes manifest: https://pastebin.com/raw/ZqD2BNJ8 Logs: https://pastebin.com/raw/NXzFYxb8 Thanks for your help.
@mgfnv9 Thanks for the files! I cannot access your logs; I keep getting a 403:
Did you notice any data duplication, or just the error in the logs? The good news is that I managed to reproduce the error log bug! I reproduced it both using Kind and using a VM running Minikube with the
I had to modify the config a bit to get it working with my Kubernetes setup; here is my config:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      kube_config: "/root/.kube/config"
      hints.enabled: true
      # default input launched for containers that don't override it via hints
      hints.default_config:
        type: filestream
        id: ${data.kubernetes.container.id}
        prospector.scanner.symlinks: true
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log

processors:
  - add_cloud_metadata:
  - add_host_metadata:
  - add_tags:
      tags: ["something"]

cloud.id: "some cloud ID"
cloud.auth: "elastic:very secret password"

# debug logging limited to the relevant selectors
logging.level: debug
logging.selectors: [ input, input.filestream, harvester, autodiscover ]
@belimawr Sorry, pastebin doesn't show these logs.
@belimawr May I ask, what's the difference between type: filestream and type: container, and which one should I prefer?
They're different inputs in Filebeat. The container input is meant to read log files from containers; it does some transformation/parsing of the data to extract the actual log entry from the lines written by the container runtime. I'm hoping to have time to look into this issue this week.
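To illustrate the difference, here is a rough sketch of the two options (the paths and the id are placeholders; a filestream input needs the container-runtime parsing enabled explicitly via its container parser, while the container input applies it by default):

filebeat.inputs:
  # Option 1: container input, parses the container runtime log format out of the box
  - type: container
    paths:
      - /var/log/containers/*.log

  # Option 2: filestream with an explicit container parser (the id is a placeholder)
  - type: filestream
    id: my-container-logs
    prospector.scanner.symlinks: true
    paths:
      - /var/log/containers/*.log
    parsers:
      - container:
          stream: all
          format: auto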
Hello everyone, same problem with Filebeat 7.17.3 in a Kubernetes 1.23.8 cluster, using Filebeat autodiscover with input type docker. See the following logs (same time -> same directory):

2023-06-16T11:19:16.577Z INFO [input] log/input.go:171 Configured paths: [/var/lib/docker/containers/2ac82fdbf6fb151562c65810425ae6ac68e1b63f3a2bfebe029a81b333d826ff/*.log] {"input_id": "e3001f5b-b0e6-4760-8484-ff6f7967c844"}
2023-06-16T11:19:16.581Z INFO [input] log/input.go:171 Configured paths: [/var/lib/docker/containers/2ac82fdbf6fb151562c65810425ae6ac68e1b63f3a2bfebe029a81b333d826ff/*.log] {"input_id": "8e080260-e047-4ad2-a0ac-70b2be0b03f2"}

We see that input.type is different: the first one is docker and the second one is container. In our Filebeat configuration we are not using the container input type. Any solution?
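For context, an autodiscover setup that launches the docker input per container typically looks something like the sketch below (this is an assumption for illustration; the commenter's actual configuration is not shown in the thread):

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      templates:
        - config:
            # deprecated docker input, scoped to the discovered container
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"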
Aside from getting two log entries for the same container, are you consistently seeing data duplication? Without more logs it's hard to tell exactly what is happening. We do have a code path that instantiates an input with the sole objective of validating its configuration, but that input is never started. That seems to be the case here (if you're not experiencing any other issues).
Hi! We're labeling this issue as |
Sometimes, when Filebeat discovers a new pod/container, it opens the log file multiple times and the collected logs are duplicated.
Version:
7.11.0 and 7.11.1
Operating System:
Filebeat is deployed as a DaemonSet in a Kubernetes cluster.
Host machines run CentOS 7.9.2009.
cpu: AMD EPYC 7402P 24-Core Processor (48 logical cores) or Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz (32 logical cores)
Configuration:
Filebeat logs:
Debug logs are attached as a file; search for container cd1336aa6.
filtered-debug.log.gz
There are also error messages: