
filestream input logs an error when an existing input is reloaded with the same ID #31767

Closed
kvch opened this issue May 26, 2022 · 69 comments · Fixed by #35134 or #41585
Labels
bug Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Comments

@kvch
Contributor

kvch commented May 26, 2022

When loading inputs from external files is enabled and there is only one filestream input configured, the following error is logged after the first reload:

filestream input with ID 'my-unique-id' already exists, this will lead to data duplication, please use a different ID

The problem is that the input ID stays in the bookkeeper of the input manager. On reload we should remove all IDs from the manager, so the check does not interfere with previous configurations.
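The bookkeeping and the proposed fix can be sketched like this. This is a simplified, hypothetical model (not Beats code): a registry that records IDs and, crucially, drops an ID again when its input is stopped, so reloading the same configuration does not trip the duplicate check.

```go
package main

import (
	"fmt"
	"sync"
)

// idRegistry is a hypothetical, simplified version of the input manager's
// bookkeeping: it records filestream input IDs and reports duplicates.
type idRegistry struct {
	mu  sync.Mutex
	ids map[string]struct{}
}

func newIDRegistry() *idRegistry {
	return &idRegistry{ids: map[string]struct{}{}}
}

// Register returns false if the ID is already present (a duplicate).
func (r *idRegistry) Register(id string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.ids[id]; exists {
		return false
	}
	r.ids[id] = struct{}{}
	return true
}

// Remove deletes the ID, illustrating the proposed fix: once an input is
// stopped on reload, its ID must leave the registry.
func (r *idRegistry) Remove(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.ids, id)
}

func main() {
	reg := newIDRegistry()
	fmt.Println(reg.Register("my-unique-id")) // true: first registration
	fmt.Println(reg.Register("my-unique-id")) // false: duplicate, would log the error
	reg.Remove("my-unique-id")                // input stopped during reload
	fmt.Println(reg.Register("my-unique-id")) // true again: reload accepted
}
```

Without the Remove step, the second Register on reload always reports a duplicate, which is exactly the spurious error described above.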

Example configuration

filebeat.yml

filebeat.config.inputs:
  enabled: true
  path: inputs.d/*.yml

output.file:
  enabled: true

inputs.d/my-input.yml

- type: filestream
  id: my-unique-id
  paths: [/var/log/messages]

There is no workaround. At the moment it is not possible to configure filestream input from external configuration files.

Reported on discuss: https://discuss.elastic.co/t/filebeat-raise-error-filestream-input-id-already-exist

CC @belimawr

@kvch kvch added bug Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team labels May 26, 2022
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@belimawr
Contributor

That is fixed by: #32309

@andreycha
Contributor

I see that the fix has been backported to 7.17. Is there already a planned date for the next release?

@cmacknz
Member

cmacknz commented Jul 27, 2022

There is a planned 7.17.6 release in the near future.

@abraxxa

abraxxa commented Sep 5, 2022

According to the release notes, the fix is included in 7.17.6 but I still see the error happening with a single filestream input in a file in the inputs.d directory.

@belimawr
Contributor

belimawr commented Sep 6, 2022

According to the release notes, the fix is included in 7.17.6 but I still see the error happening with a single filestream input in a file in the inputs.d directory.

Hi @abraxxa, could you share some more details about how you're experiencing this issue? A minimal config and steps to reproduce would be ideal.

I've just checked the v7.17.6 tag and the fix is there. If you can help me to reproduce it, I'll investigate it further.

@abraxxa

abraxxa commented Sep 6, 2022

Hi @belimawr, thanks for your response!

Exactly as reported above.

I started yesterday changing type: log to filestream, and it happened on the very first server, where I have only a single file in the inputs.d directory which collects logs from three files (no globs involved).
Filebeat is version 7.17.6 and the VM runs Debian 10.

@belimawr
Contributor

belimawr commented Sep 6, 2022

Sorry if I'm asking something obvious, but are you sure you didn't have any filestream input configured on your filebeat.yml?

Did you touch/edit the file(s) on inputs.d/?

I'm just trying to understand what triggered it. Starting Filebeat with a single filestream input in inputs.d, without any other filestream configured in filebeat.yml, should never trigger this message.

@abraxxa

abraxxa commented Sep 6, 2022

filebeat.yml is untouched from the deb package. It has one disabled input of type: filestream for /var/log/*.log.
But as the error message included the ID I assigned to the filestream input in the inputs.d directory, I didn't check it before.

That's the exact error message:
Sep 05 17:40:16 tsa-tc-dot1x-3 filebeat[23435]: 2022-09-05T17:40:16.552+0200 ERROR [input] input-logfile/manager.go:183 filestream input with ID 'dot1x-services' already exists, this will lead to data duplication, please use a different ID

And here is the file in the inputs.d directory:

---
- type: filestream
  id: dot1x-services
  paths:
      - /var/log/radiator-dot1x/central-logging-radiator.json
      - /var/log/radiator-dot1x/central-logging-authentication.json
      - /var/log/radiator-dot1x/central-logging-accounting.json
  encoding: utf-8
  parsers:
      - ndjson:
          keys_under_root: true
          overwrite_keys: true
          expand_keys: true
  prospector.scanner.check_interval: 1s
  processors:
    - add_fields:
        target: "data_stream"
        fields:
            dataset: "radiator"
            type: "logs"
    - add_fields:
        target: "event"
        fields:
            module: "radiator-dot1x"

@belimawr
Contributor

belimawr commented Sep 6, 2022

Thanks! I'll investigate it.

@belimawr
Contributor

belimawr commented Sep 6, 2022

Ok, I can confirm that this is still happening on v7.17.6 as well as on `main`.

@belimawr belimawr reopened this Sep 6, 2022
@abraxxa

abraxxa commented Sep 6, 2022

Great, thanks!
One more piece of information: it happened both when first restarting Filebeat with the input converted from log to filestream, and on reload and restart now.

@belimawr
Contributor

belimawr commented Sep 6, 2022

Yes, for some reason Filebeat reads the config twice on startup. I believe it reads inputs.d as part of the normal startup sequence, then starts watching inputs.d, which triggers a re-read of the files and hence loads the same filestream input again. But it does not stop the current one before starting the new one.

@belimawr
Contributor

belimawr commented Sep 6, 2022

The problem

I believe I got to the root cause:

  • The normal startup process loads all inputs, including the ones in inputs.d.
  • Because things happen concurrently, the input is loaded (its ID is added to the inputManager's map), but the harvester is not started yet.
  • The inputReloader is started in another goroutine.
  • The inputReloader reads inputs.d again and starts the inputs there.
  • Because the harvester has not started yet (along with the other things that lead to the harvester running), the inputReloader does not know it needs to stop an input, so it only starts the filestream input again.

Why does it not lead to two inputs running at the same time?

Well, the harvester is smart enough not to run two harvesters for the same file from the same input, so only one of the inputs actually runs. This causes no issues while running Filebeat.
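The per-file deduplication described above can be sketched as follows. This is a hypothetical simplification with made-up names, not Filebeat's actual harvester API: a group that keys running harvesters by file path, so a second start for the same file is a no-op.

```go
package main

import (
	"fmt"
	"sync"
)

// harvesterGroup is a hypothetical sketch: at most one harvester may run
// per source file, so starting the same input twice duplicates no data.
type harvesterGroup struct {
	mu      sync.Mutex
	running map[string]bool // keyed by file path
}

// Start reports whether a new harvester was actually started.
func (g *harvesterGroup) Start(path string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.running[path] {
		return false // a harvester for this file is already running; skip
	}
	g.running[path] = true
	return true
}

func main() {
	g := &harvesterGroup{running: map[string]bool{}}
	fmt.Println(g.Start("/var/log/messages")) // true: first input starts the harvester
	fmt.Println(g.Start("/var/log/messages")) // false: duplicate input is a no-op
}
```

This is why the duplicate-ID error is misleading here: the second load of the input is absorbed at the harvester level, so no data is actually read twice.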

Workaround

There are two possible workarounds, both with the downside of not having the reload of inputs happening automatically.

Only use filebeat.yml

One easy way is to add all your input configurations to filebeat.yml and completely ignore the inputs.d directory.

Disable the reload of inputs

After more testing we found out that disabling the reload of inputs does not solve the problem.

@belimawr
Contributor

belimawr commented Sep 6, 2022

Some extra information

Here are the stack traces from when the filestream input is first loaded by the input manager and from the second time it's loaded (via the reloader), which triggers the log message:

The first time it comes from the "normal" startup sequence that starts in main.go:

(dlv) bt
 0  0x00000000028c3bc0 in github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create
    at ./input/filestream/internal/input-logfile/manager.go:162
 1  0x00000000028aed86 in github.com/elastic/beats/v7/filebeat/input/v2.(*Loader).Configure
    at ./input/v2/loader.go:110
 2  0x00000000028afda5 in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*factory).CheckConfig
    at ./input/v2/compat/compat.go:73
 3  0x00000000028b0927 in github.com/elastic/beats/v7/filebeat/input/v2/compat.composeFactory.CheckConfig
    at ./input/v2/compat/composed.go:48
 4  0x00000000028b0c25 in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*composeFactory).CheckConfig
    at <autogenerated>:1
 5  0x000000000281fcc2 in github.com/elastic/beats/v7/filebeat/channel.(*onCreateFactory).CheckConfig
    at ./channel/runner.go:64
 6  0x00000000016dc967 in github.com/elastic/beats/v7/libbeat/cfgfile.(*Reloader).Check
    at /home/tiago/devel/beats/libbeat/cfgfile/reload.go:155
 7  0x00000000028b6655 in github.com/elastic/beats/v7/filebeat/beater.(*crawler).Start
    at ./beater/crawler.go:83
 8  0x00000000028b9431 in github.com/elastic/beats/v7/filebeat/beater.(*Filebeat).Run
    at ./beater/filebeat.go:338
 9  0x00000000027fb7d3 in github.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).launch
    at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:475
10  0x00000000027f9c05 in github.com/elastic/beats/v7/libbeat/cmd/instance.Run.func1
    at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:180
11  0x00000000027f9a85 in github.com/elastic/beats/v7/libbeat/cmd/instance.Run
    at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:181
12  0x000000000280af58 in github.com/elastic/beats/v7/libbeat/cmd.genRunCmd.func1
    at /home/tiago/devel/beats/libbeat/cmd/run.go:36
13  0x00000000011a52c3 in github.com/spf13/cobra.(*Command).execute
    at /home/tiago/go/pkg/mod/github.com/spf13/cobra@v1.3.0/command.go:860
14  0x00000000011a59dd in github.com/spf13/cobra.(*Command).ExecuteC
    at /home/tiago/go/pkg/mod/github.com/spf13/cobra@v1.3.0/command.go:974
15  0x00000000028e13c6 in github.com/spf13/cobra.(*Command).Execute
    at /home/tiago/go/pkg/mod/github.com/spf13/cobra@v1.3.0/command.go:902
16  0x00000000028e13c6 in main.main
    at ./main.go:36
17  0x0000000000ff5db2 in runtime.main
    at /usr/local/go/src/runtime/proc.go:250
18  0x0000000001028d81 in runtime.goexit
    at /usr/local/go/src/runtime/asm_amd64.s:1594

The second time, the harvester is not running, but the input manager has already registered the ID. It runs in a different goroutine:

(dlv) bt
 0  0x00000000028c3bc0 in github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create
    at ./input/filestream/internal/input-logfile/manager.go:162
 1  0x00000000028aed86 in github.com/elastic/beats/v7/filebeat/input/v2.(*Loader).Configure
    at ./input/v2/loader.go:110
 2  0x00000000028afe5a in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*factory).Create
    at ./input/v2/compat/compat.go:84
 3  0x00000000028b0a22 in github.com/elastic/beats/v7/filebeat/input/v2/compat.composeFactory.Create
    at ./input/v2/compat/composed.go:62
 4  0x00000000028b0cd5 in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*composeFactory).Create
    at <autogenerated>:1
 5  0x000000000281ffa9 in github.com/elastic/beats/v7/filebeat/channel.RunnerFactoryWithCommonInputSettings.func1
    at ./channel/runner.go:99
 6  0x000000000281fd3e in github.com/elastic/beats/v7/filebeat/channel.(*onCreateFactory).Create
    at ./channel/runner.go:68
 7  0x00000000016dc418 in github.com/elastic/beats/v7/libbeat/cfgfile.createRunner
    at /home/tiago/devel/beats/libbeat/cfgfile/list.go:188
 8  0x00000000016db15f in github.com/elastic/beats/v7/libbeat/cfgfile.(*RunnerList).Reload
    at /home/tiago/devel/beats/libbeat/cfgfile/list.go:103
 9  0x00000000016dcd35 in github.com/elastic/beats/v7/libbeat/cfgfile.(*Reloader).Run
    at /home/tiago/devel/beats/libbeat/cfgfile/reload.go:215
10  0x00000000028b6949 in github.com/elastic/beats/v7/filebeat/beater.(*crawler).Start.func1
    at ./beater/crawler.go:97
11  0x0000000001028d81 in runtime.goexit
    at /usr/local/go/src/runtime/asm_amd64.s:1594
(dlv) 

@belimawr
Contributor

belimawr commented Sep 6, 2022

@pierrehilbert @cmacknz we should look into the priority of this: when using inputs.d it emits misleading errors about duplicated filestream IDs. Aside from that, it seems to have no effect on Filebeat's execution.

@abraxxa

abraxxa commented Sep 6, 2022

Thanks for debugging this so quickly! 👍🏻

Can inputReloader be disabled?

@belimawr
Contributor

belimawr commented Sep 6, 2022

Can inputReloader be disabled?

Yes, you can add:

filebeat.config:
  inputs:
    enabled: false
    path: inputs.d/*.yml

to your filebeat.yml.

This option is enabled by default, hence it's not explicitly listed in filebeat.yml (/etc/filebeat/filebeat.yml in the deb package).

Or just don't add anything to the inputs.d folder and keep all your inputs in filebeat.yml.

@abraxxa

abraxxa commented Sep 6, 2022

But then the file in the inputs.d directory won't be loaded.
My goal was to load it but not discover any changes to it as I handle deployment and restarting/reloading with Ansible.

@belimawr
Contributor

belimawr commented Sep 6, 2022

Yes, that's the downside of the workaround: you'd have to add all input configuration to filebeat.yml.

You can add your input configurations in any part of filebeat.yml, so you could add them at the very end, e.g.:

filebeat.config.inputs:
  enabled: false
  path: inputs.d/*.yml
  reload.period: 60s

output.file:
  enabled: true
  path: "/tmp/filebeat-out"
  
filebeat.inputs:
  - type: filestream
    id: my-unique-id
    paths: [/tmp/flog.log]

That might make it a bit easier to manipulate this file with Ansible (e.g. concatenate a default config file with the input configuration). If you don't need to watch the inputs.d directory for changes, that might be an option for you.

@abraxxa

abraxxa commented Sep 6, 2022

I just read up on this feature in the docs. What if I disable the inputs.d file reloading using reload.enabled: false?
Sadly the docs don't include its default value.

@belimawr
Contributor

@belimawr so what you mean is that if everything is properly configured and we still get data duplication message, it is not an issue if the data is not actually duplicated?

Sorry @germain05, I didn't get what you meant. Could you elaborate?

@germain05

germain05 commented Aug 22, 2023

@belimawr I have this issue: #36379

I did some more investigation and this error message also appears when using Kubernetes autodiscovery. Here is a simple example config:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      include_annotations: true
      include_labels: true
      hints.default_config:
        prospector.scanner.symlinks: true
        id: filestream-kubernetes-pod-${data.kubernetes.container.id}
        type: filestream
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log
        parsers:
          - container: ~

This configuration is correct, and all IDs are unique per container/file harvested.

The log error comes from the fact that the autodiscovery calls CheckConfig on a RunnerFactory; below are some bits of code involved in this process:

err = a.factory.CheckConfig(config)

func (f multiplexedFactory) CheckConfig(c *config.C) error {
	factory, err := f.findFactory(c)
	if err == nil {
		err = factory.CheckConfig(c)
	}
	return err
}

func (f *factory) CheckConfig(cfg *conf.C) error {
	_, err := f.loader.Configure(cfg)
	if err != nil {
		return err
	}
	return nil
}

_, err := f.loader.Configure(cfg) will instantiate an input, which in the filestream case calls the Create method of the InputManager; that ends up adding the ID to its bookkeeping even though the input will never be started. Here is the code responsible for that:

func (cim *InputManager) Create(config *conf.C) (v2.Input, error) {
	if err := cim.init(); err != nil {
		return nil, err
	}
	settings := struct {
		ID             string        `config:"id"`
		CleanTimeout   time.Duration `config:"clean_timeout"`
		HarvesterLimit uint64        `config:"harvester_limit"`
	}{CleanTimeout: cim.DefaultCleanTimeout}
	if err := config.Unpack(&settings); err != nil {
		return nil, err
	}
	if settings.ID == "" {
		cim.Logger.Error("filestream input ID without ID might lead to data" +
			" duplication, please add an ID and restart Filebeat")
	}
	cim.idsMux.Lock()
	if _, exists := cim.ids[settings.ID]; exists {
		cim.Logger.Errorf("filestream input with ID '%s' already exists, this "+
			"will lead to data duplication, please use a different ID", settings.ID)
	}
	cim.ids[settings.ID] = struct{}{}
	cim.idsMux.Unlock()

I am talking about that.

To be more precise, I have this issue here: #36379 and I want to understand why I have data duplication even though the ID is unique.
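For contrast with the Create path quoted above, here is a hedged sketch (assumed names, not actual Beats code) of a check step that validates a configuration without mutating any shared bookkeeping, which is what would let autodiscovery call it repeatedly without spurious duplicate-ID errors:

```go
package main

import "fmt"

// settings mirrors, in simplified form, the fields unpacked in the quoted
// Create method.
type settings struct {
	ID string
}

// checkConfig is a hypothetical validation-only step: unlike the quoted
// Create path, it inspects the configuration but registers nothing in
// shared state, so calling it many times has no side effects.
func checkConfig(s settings) error {
	if s.ID == "" {
		return fmt.Errorf("filestream input without ID might lead to data duplication")
	}
	return nil
}

func main() {
	fmt.Println(checkConfig(settings{ID: "filestream-kubernetes-pod-abc"})) // <nil>
	fmt.Println(checkConfig(settings{}))                                    // non-nil error
}
```

The design point is that validation and creation are separate concerns: only the path that actually starts an input should touch the ID registry.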

@germain05

@belimawr, Please can you help here: #36379?

@belimawr
Contributor

@germain05 I'm reading/responding to it right now.

@belimawr
Contributor

@germain05 I replied here: #36379 (comment)

TL;DR: The issue should have been fixed, and so far I haven't managed to reproduce it, so I'll need some help understanding what is different about your environment.

@germain05

@belimawr I thought so too when going through the issues. Added more information there: #36379 (comment)

@belimawr
Contributor

I can easily reproduce it on Kubernetes with Filebeat v8.9.2, so I'm re-opening the issue.

It's mostly the manifest from https://www.elastic.co/guide/en/beats/filebeat/current/running-on-kubernetes.html but with small modifications to Filebeat's configuration.

Here is the Kubernetes manifest I used:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
    - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
    - jobs
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat
  # should be the namespace where filebeat is running
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: Role
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: Role
  name: filebeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          node: ${NODE_NAME}
          hints.enabled: true
          hints.default_config:
            type: filestream
            prospector.scanner.symlinks: true
            id: filestream-kubernetes-pod-${data.kubernetes.container.id}
            take_over: true
            paths:
              - /var/log/containers/*${data.kubernetes.container.id}.log
            parsers:
            - container: ~ 
    processors:
      - add_cloud_metadata:
      - add_host_metadata:

    output.elasticsearch:
      hosts: ["https://localhost:9200"] # add real credentials
      protocol: "https"
      username: "elastic"
      password: "changeme"
      allow_older_versions: true
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:8.9.2
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: changeme
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlog
        hostPath:
          path: /var/log
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---

@kbujold

kbujold commented Jan 30, 2024

We are seeing the same issue as @belimawr with filebeat 8.11.1. Any workaround to resolve this? Maybe fingerprint?

@belimawr
Contributor

belimawr commented Feb 1, 2024

We are seeing the same issue as @belimawr with filebeat 8.11.1. Any workaround to resolve this? Maybe fingerprint?

The fingerprint won't help here: it is a file identity setting and has no influence on this log message. The filestream input is working as expected; that log message just gets incorrectly emitted in some cases.

@kbujold

kbujold commented Feb 1, 2024

The fingerprint won't help here as it is the file identity and has no influence on this log message. The filestream input is working as expected, however that log message gets incorrectly logged in some cases.

@belimawr it's more than an incorrect log, it causes the pod to crash continuously.

@belimawr
Contributor

belimawr commented Feb 2, 2024

@belimawr it's more than an incorrect log, it causes the pod to crash continuously.

Ok, then it's a bigger problem.

  1. Have you managed to reproduce it on a controlled environment (Kind, Minikube, etc)? If so, could you share the configuration and steps to reproduce?
  2. Can you share some debug logs that capture the crash?

@kbujold

kbujold commented Feb 2, 2024

I am sorry I am mixing my issues lol. We do not have pod restarts with this problem. Just flooding of logs the customer is not happy seeing.

@kbujold

kbujold commented Feb 15, 2024

@belimawr Is there a work around for this issue?

@belimawr
Contributor

@belimawr Is there a work around for this issue?

The incorrect logs? No, at the moment there isn't one.

@kbujold

kbujold commented Feb 15, 2024

The incorrect logs? No, at the moment there isn't one.

Yes

@PBarnard

Just to add to this: I've upgraded to Filebeat 8.12.1 on AWS EKS Kubernetes and I'm still seeing this error. It's not causing any problems as far as I can tell; Filebeat still runs without crashing and all logs are ingested into Elastic. Just thought I'd mention it :)

@barrowkwan

Same here, running Filebeat 8.14.0 in AWS EKS. I am getting "filestream input with ID '...' already exists, this will lead to data duplication...".
Filebeat didn't crash and I don't think I got duplicated data.

@GEownt

GEownt commented Sep 4, 2024

Filebeat version 8.15.0 still has this issue.
