filestream input logs an error when an existing input is reloaded with the same ID #31767
Comments
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

That is fixed by: #32309

I see that the fix has been backported to 7.17. Are there plans already for the next release date?

There is a planned 7.17.6 release in the near future.

According to the release notes, the fix is included in 7.17.6, but I still see the error happening with a single filestream input in a file in the inputs.d directory.
Hi @abraxxa, could you share some more details about how you're experiencing this issue? A minimal config and steps to reproduce would be ideal. I've just checked the […]
Hi @belimawr, thanks for your response! Exactly as reported above. I started yesterday changing […]
Sorry if I'm asking something obvious, but are you sure you didn't have any filestream input configured in your filebeat.yml? Did you touch/edit the file(s) in inputs.d? I'm just trying to understand what triggered it. Starting a Filebeat with a single filestream input in inputs.d […]
filebeat.yml is untouched from the deb package. It has one disabled input of […]. That's the exact error message: […] And here is the file in the inputs.d directory: […]
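As an illustrative sketch only (placeholder ID and path, not the reporter's actual file), a single-input inputs.d file of the kind described here has this shape:

```yaml
# inputs.d/some-input.yml — hypothetical example
- type: filestream
  id: my-unique-id          # placeholder; must be unique and stable
  paths:
    - /var/log/myapp/*.log  # placeholder path
```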
Thanks! I'll investigate it.

Ok, I can confirm that this is still happening on […]

Great, thanks!

Yes, for some reason Filebeat is reading the config twice on startup. I believe it reads the […]
**The problem**

I believe I got to the root cause: […]

**Why does it not lead to two inputs running at the same time?**

The harvester is smart enough not to run two harvesters for the same file from the same input, so only one of the inputs actually runs. This leads to no issues while running Filebeat.

**Workaround**

There are two possible workarounds, both with the downside that inputs are no longer reloaded automatically (see the sketch below). Only use […]
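A minimal sketch of a workaround along these lines, assuming the documented filebeat.config.inputs options (path and reload.enabled are real Filebeat settings; whether this matches the truncated text above is an assumption):

```yaml
# Load external input files once at startup but do not watch them for
# changes, so the reloader never re-registers the same input ID.
filebeat.config.inputs:
  enabled: true
  path: inputs.d/*.yml
  reload.enabled: false
```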
**Some extra information**

Here are the stack traces from when the filestream input is first loaded by the input manager, and from the second time it's loaded by the input manager (via the reloader), which triggers the log message. The first time, it comes from the "normal" startup that started on […]
@pierrehilbert @cmacknz we should look into the priority of this, when using the […]

Thanks for debugging this so quickly! 👍🏻 Can […]
Yes, you can add: […] to your filebeat.yml. This option is enabled by default, hence it's not explicitly listed in […]. Or just not add anything into inputs.d.
But then the file in the inputs.d directory won't be loaded.
Yes, that's the downside of the workaround: you'd have to add all input configuration to filebeat.yml. You can add your input configurations in any part of filebeat.yml:

```yaml
filebeat.config.inputs:
  enabled: false
  path: inputs.d/*.yml
  reload.period: 60s

output.file:
  enabled: true
  path: "/tmp/filebeat-out"

filebeat.inputs:
  - type: filestream
    id: my-unique-id
    paths: [/tmp/flog.log]
```

That might make it a bit easier to manipulate this file with Ansible (e.g. concatenate a default config file with the input configuration). If you don't need to watch the […]
I just read up on this feature in the docs. What if I disable the inputs.d file reloading using […]?

Sorry @germain05, I didn't get what you meant. Could you elaborate a bit more?
@belimawr I have this issue: #36379
I am talking about that. To be more precise, I have this issue here: #36379 and I want to understand why I see data duplication even though the ID is unique.
@germain05 I'm reading/responding to it right now.

@germain05 I replied here: #36379 (comment) TL;DR: The issue should have been fixed, and so far I haven't managed to reproduce it, so I'll need some help understanding what is different in your environment.

@belimawr I thought so too when going through the issues. I added more information there: #36379 (comment)
I can easily reproduce it on Kubernetes with Filebeat 8.9.2 (the image version used in the manifest below). It's mostly the manifest from https://www.elastic.co/guide/en/beats/filebeat/current/running-on-kubernetes.html but with small modifications in Filebeat's configuration. Here is the Kubernetes manifest I used:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
    - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
    - jobs
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat
  # should be the namespace where filebeat is running
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: Role
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: Role
  name: filebeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          node: ${NODE_NAME}
          hints.enabled: true
          hints.default_config:
            type: filestream
            prospector.scanner.symlinks: true
            id: filestream-kubernetes-pod-${data.kubernetes.container.id}
            take_over: true
            paths:
              - /var/log/containers/*${data.kubernetes.container.id}.log
            parsers:
              - container: ~

    processors:
      - add_cloud_metadata:
      - add_host_metadata:

    output.elasticsearch:
      hosts: ["https://localhost:9200"] # add real credentials
      protocol: "https"
      username: "elastic"
      password: "changeme"
      allow_older_versions: true
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:8.9.2
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: changeme
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlog
        hostPath:
          path: /var/log
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
```
We are seeing the same issue as @belimawr described, with Filebeat 8.11.1. Any workaround to resolve this? Maybe fingerprint?

Fingerprint won't help here: it is a file-identity setting and has no influence on this log message. The filestream input is working as expected; that log message just gets incorrectly logged in some cases.
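For context, fingerprint is configured per input as a file-identity mechanism, separate from the input ID check this issue is about. A sketch based on the documented filestream options (placeholder paths):

```yaml
- type: filestream
  id: my-unique-id                # the input ID this issue is about
  file_identity.fingerprint: ~    # identify files by a content hash
  prospector.scanner.fingerprint.enabled: true
  paths:
    - /var/log/myapp/*.log        # placeholder path
```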
@belimawr it's more than an incorrect log, it causes the pod to crash continuously.
Ok, then it's a bigger problem.
I am sorry, I am mixing up my issues, lol. We do not have pod restarts with this problem, just a flood of log messages that the customer is not happy to see.

@belimawr Is there a workaround for this issue?

The incorrect logs? No, at the moment there isn't one.

Yes

Just to add to this, I've upgraded to Filebeat […]

Same here, running Filebeat 8.14.0 in AWS EKS. I am getting "filestream input with ID '…' already exists, this will lead to data duplication…"

Filebeat in version […]
When loading inputs from external files is enabled and there is only one filestream input enabled, the following error is returned after the first reload: "filestream input with ID '…' already exists, this will lead to data duplication…"

The problem is that the input ID stays in the bookkeeper of the input manager. On reload we should remove all IDs from the manager, so the check does not interfere with previous configurations.
**Example configuration** (filebeat.yml and inputs.d/my-input.yml)
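Sketched below, under the assumption that filebeat.yml enables external config loading with reload and that my-input.yml holds a single filestream input (placeholder ID and paths; reload.enabled and reload.period are documented settings, but the exact values in the original report are not preserved here):

```yaml
# filebeat.yml — load inputs from external files with live reload
filebeat.config.inputs:
  enabled: true
  path: inputs.d/*.yml
  reload.enabled: true
  reload.period: 10s
```

```yaml
# inputs.d/my-input.yml — a single filestream input
- type: filestream
  id: my-unique-id            # placeholder ID
  paths:
    - /var/log/myapp/*.log    # placeholder path
```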
There is no workaround. At the moment it is not possible to configure a filestream input from external configuration files.

Reported on discuss: https://discuss.elastic.co/t/filebeat-raise-error-filestream-input-id-already-exist
CC @belimawr