
[Filebeat] filestream input resends whole log files after restart #36541

Closed
oshmyrko opened this issue Sep 8, 2023 · 27 comments
Assignees: belimawr
Labels: Team:Elastic-Agent (Label for the Agent team)

Comments

@oshmyrko

oshmyrko commented Sep 8, 2023

The Filebeat filestream input resends whole log files after a restart, but only when several log files have been rotated.
Not all log files are necessarily resent; most often it resends the files with a second or third index.

Details

Configurations

/etc/filebeat/filebeat.yml:

filebeat.registry.file: /var/lib/filebeat/registry

http:
  enabled: true
  host: localhost
  port: 5066

filebeat.config.inputs:
  enabled: true
  path: inputs.d/*

setup:
  ilm.enabled: false
  template:
    enabled: false
    overwrite: false

logging.metrics.enabled: false

processors:
  - add_fields:
      target: ''
      fields:
        env: stage
  - add_cloud_metadata: ~
  - drop_fields:
      fields:
      - agent.ephemeral_id
      - agent.hostname
      - agent.id
      - agent.name
      - agent.type
      - cloud.account
      - cloud.image
      - cloud.machine
      - cloud.provider
      - cloud.region
      - cloud.service
      - ecs
      - input
      - input.type
      - log.file.path
      - log.offset
      - stream

output.elasticsearch:
  hosts: ["es.my-domain:9200"]
  bulk_max_size: 2048
  compression_level: 1
  indices:
    - index: "syslog-%{[agent.version]}"
      when.and:
        - has_fields: ['type']
        - equals.type: 'syslog'
    - index: "kernel-%{[agent.version]}"
      when.and:
        - has_fields: [ 'type' ]
        - equals.type: 'kernel'
    - index: "cloud-init-%{[agent.version]}"
      when.and:
        - has_fields: ['type']
        - equals.type: 'cloud-init'
    - index: "my-svc-%{[agent.version]}"
      when.not.and:
        - has_fields: ['type']
        - or:
          - equals.type: 'syslog'
          - equals.type: 'kernel'
          - equals.type: 'cloud-init'

/etc/filebeat/inputs.d/app-log.yml:

- type: filestream
  id: app-log
  paths:
    - /home/ubuntu/my-app/logs/*-json.log*
  parsers:
    - ndjson:
        ignore_decoding_error: true
  processors:
    - timestamp:
        field: timestamp
        layouts:
          - '2006-01-02T15:04:05Z'
          - '2006-01-02T15:04:05.999Z'
          - '2006-01-02T15:04:05.999-07:00'
        test:
          - '2023-04-16T17:45:35.999Z'
        ignore_missing: true
    - drop_fields:
        fields: [timestamp]
        ignore_missing: true

Logs

Log files before Filebeat restart:

# ls -il /home/ubuntu/my-app/logs/my-svc-json.log*
525442 -rw-r--r-- 1 root root  21222497 Sep  8 15:03 /home/ubuntu/my-app/logs/my-svc-json.log
525439 -rw-r--r-- 1 root root 209853249 Sep  8 15:00 /home/ubuntu/my-app/logs/my-svc-json.log.1
520698 -rw-r--r-- 1 root root 210390180 Sep  8 14:34 /home/ubuntu/my-app/logs/my-svc-json.log.2
525441 -rw-r--r-- 1 root root 210838250 Sep  8 14:09 /home/ubuntu/my-app/logs/my-svc-json.log.3

# service filebeat restart

Log files after Filebeat restart:

# ls -il /home/ubuntu/my-app/logs/my-svc-json.log*
525442 -rw-r--r-- 1 root root  30283599 Sep  8 15:04 /home/ubuntu/my-app/logs/my-svc-json.log
525439 -rw-r--r-- 1 root root 209853249 Sep  8 15:00 /home/ubuntu/my-app/logs/my-svc-json.log.1
520698 -rw-r--r-- 1 root root 210390180 Sep  8 14:34 /home/ubuntu/my-app/logs/my-svc-json.log.2
525441 -rw-r--r-- 1 root root 210838250 Sep  8 14:09 /home/ubuntu/my-app/logs/my-svc-json.log.3

Filebeat logs; pay attention to the File was truncated. Reading file from offset 0 messages for my-svc-json.log.1 and my-svc-json.log.2:

# journalctl -u filebeat.service | grep my-svc-json.log
{"log.level":"info","@timestamp":"2023-09-08T15:00:29.630Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525440-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.3","state-id":"native::525440-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:03:43.672Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525441-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.2","state-id":"native::525441-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:21.915Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525439-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log","state-id":"native::525439-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:21.915Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::520698-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.1","state-id":"native::520698-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:21.915Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525442-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log","state-id":"native::525442-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:22.366Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":267},"message":"File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.2","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::520698-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.2","state-id":"native::520698-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:22.370Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":267},"message":"File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.1","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525439-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.1","state-id":"native::525439-66305","ecs.version":"1.6.0"}

After a few minutes, readers for files with no activity were closed:

{"log.level":"info","@timestamp":"2023-09-08T15:09:26.431Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525441-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.3","state-id":"native::525441-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:11:01.290Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525439-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.1","state-id":"native::525439-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:11:01.327Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::520698-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.2","state-id":"native::520698-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:33:20.536Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525442-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log","state-id":"native::525442-66305","ecs.version":"1.6.0"}

One more restart:

# ls -il /home/ubuntu/my-app/logs/my-svc-json.log*
525440 -rw-r--r-- 1 root root 157838538 Sep  8 15:48 /home/ubuntu/my-app/logs/my-svc-json.log
525442 -rw-r--r-- 1 root root 210404533 Sep  8 15:28 /home/ubuntu/my-app/logs/my-svc-json.log.1
525439 -rw-r--r-- 1 root root 209853249 Sep  8 15:00 /home/ubuntu/my-app/logs/my-svc-json.log.2
520698 -rw-r--r-- 1 root root 210390180 Sep  8 14:34 /home/ubuntu/my-app/logs/my-svc-json.log.3

# service filebeat restart
# journalctl -u filebeat.service | grep my-svc-json.log
{"log.level":"info","@timestamp":"2023-09-08T15:48:59.340Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525440-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log","state-id":"native::525440-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:48:59.798Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":267},"message":"File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.3","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::520698-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.3","state-id":"native::520698-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:48:59.801Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":267},"message":"File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.2","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525439-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.2","state-id":"native::525439-66305","ecs.version":"1.6.0"}
@botelastic bot added the needs_team label (Indicates that the issue/PR needs a Team:* label) on Sep 8, 2023
@sandervandegeijn

I think we are running into the same problem with a Suricata setup producing roughly 100 eve.*.json files. After a restart Filebeat doesn't seem to know where it left off and starts mostly all over again, leading to duplicate data.

@sandervandegeijn

We ran into this again, and this time it was at the end of the day, so the whole 400 GB gets reindexed again, leading to duplicate data and a SIEM that's lagging many hours behind.

@carlosrochap

carlosrochap commented Feb 7, 2024

We are experiencing this issue too; we are using version 8.11.1. Is there any ETA for a fix?

@sandervandegeijn

Haven't seen anyone respond in months, so no clue :)

@SadeGili

+1 on this one - on every restart, for some reason, Filebeat re-reads the rotated files (not the current file, though) - we are using version 8.11.2

@belimawr
Contributor

The logs and information @oshmyrko provided show something quite weird happening.

By default the Filestream input uses file_identity: native, which relies on inode + device ID to identify files. If either of those changes, Filebeat will detect it as a new file and re-ingest it.

The state-id in the logs has the format native::<inode>-<device-ID> and the source_file has the format filestream::<input ID>::native::<inode>-<device-ID>. That is what Filebeat will use to identify the files, not the path.

Filebeat considers a file (identified by inode + device ID) truncated when the size of the file on disk is smaller than the size/offset Filebeat has stored in memory. If a file was just moved, its size stays the same, so Filebeat detects the change in filename and moves on without re-ingesting anything.
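
As a rough way to see that comparison yourself, you can put the size of the file on disk next to the offset recorded in the registry for the same state-id. This is only a sketch; the registry location below is an assumption for illustration, not a value taken from this report:

# Sketch only: adjust the registry path and state-id for your setup.
REGISTRY=/var/lib/filebeat/registry/filebeat/log.json   # assumed location
LOGFILE=/home/ubuntu/my-app/logs/my-svc-json.log.2
STATE_ID='native::520698-66305'

# Last offset recorded for this state-id in the registry log.
offset=$(grep "$STATE_ID" "$REGISTRY" | tail -n 1 | grep -o '"offset":[0-9]*' | cut -d: -f2)
size=$(stat -c %s "$LOGFILE")
echo "registry offset: ${offset:-unknown}  file size on disk: $size"
# If the recorded offset is larger than the current file size, Filebeat treats the file as truncated.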

From the ls outputs provided before and after restarting Filebeat, we can see that all rotated files keep their inode, name and size; the only difference is that /home/ubuntu/my-app/logs/my-svc-json.log has grown, which is expected in a live environment.

However, based on the logs, Filebeat perceived something different.
According to the ls output, inode 520698 belongs to /home/ubuntu/my-app/logs/my-svc-json.log.2.
However, the "Reader was closed" logs show something different.

{
  "log.level": "info",
  "@timestamp": "2023-09-08T15:04:21.915Z",
  "log.logger": "input.filestream",
  "log.origin": {
    "file.name": "filestream/input.go",
    "file.line": 321
  },
  "message": "Reader was closed. Closing.",
  "service.name": "filebeat",
  "id": "app-log",
  "source_file": "filestream::app-log::native::520698-66305",
  "path": "/home/ubuntu/my-app/logs/my-svc-json.log.1",
  "state-id": "native::520698-66305",
  "ecs.version": "1.6.0"
}

The inode is 520698 and the file is /home/ubuntu/my-app/logs/my-svc-json.log.1.

Then Filebeat detects a file truncation, but this time the path shows /home/ubuntu/my-app/logs/my-svc-json.log.2:

{
  "log.level": "info",
  "@timestamp": "2023-09-08T15:04:22.366Z",
  "log.logger": "input.filestream",
  "log.origin": {
    "file.name": "filestream/input.go",
    "file.line": 267
  },
  "message": "File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.2",
  "service.name": "filebeat",
  "id": "app-log",
  "source_file": "filestream::app-log::native::520698-66305",
  "path": "/home/ubuntu/my-app/logs/my-svc-json.log.2",
  "state-id": "native::520698-66305",
  "ecs.version": "1.6.0"
}

/home/ubuntu/my-app/logs/my-svc-json.log.2 is smaller than /home/ubuntu/my-app/logs/my-svc-json.log.1, which characterises a file truncation.

I'll investigate further and try to reproduce it in my dev environment. Once I have updates I'll post them here.

@belimawr
Contributor

For folks still experiencing this issue, please make sure all of your filestream inputs have a unique ID. If an ID is re-used, or more than one input does not have the id key defined, the files will be re-ingested on every restart.

Our documentation is very clear about the need for unique IDs:

Each filestream input must have a unique ID. Omitting or changing the filestream ID may cause data duplication. Without a unique ID, filestream is unable to correctly track the state of files.

Also look at your logs (level error) for entries like this:

filestream input with ID 'my-NOT-unique-ID' already exists, this will lead to data duplication, please use a different ID.

If it is there, then that's likely the reason for the data duplication. Setting unique IDs for the filestream inputs will fix the issue. Bear in mind that all files harvested by inputs with new/changed IDs will be re-ingested from the beginning once. After that Filebeat should stop duplicating data.
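
For illustration only, a minimal configuration with two filestream inputs and distinct IDs would look roughly like this (the IDs and paths are just examples):

filebeat.inputs:
  - type: filestream
    id: app-log        # must be unique within this Filebeat process
    paths:
      - /home/ubuntu/my-app/logs/*-json.log*
  - type: filestream
    id: syslog         # every additional input needs its own, different ID
    paths:
      - /var/log/syslog*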

@oshmyrko
Author

@belimawr, all my filestream inputs have unique IDs (you can see them below), and I've just reproduced this issue with Filebeat 8.11.3:

# cat /etc/filebeat/inputs.d/app-log.yml 
- type: filestream
  id: app-log
  paths:
    - /home/ubuntu/my-app/logs/*-json.log*
  parsers:
    - ndjson:
        ignore_decoding_error: true
  processors:
    - timestamp:
        field: timestamp
        layouts:
          - '2006-01-02T15:04:05Z'
          - '2006-01-02T15:04:05.999Z'
          - '2006-01-02T15:04:05.999-07:00'
        test:
          - '2023-04-16T17:45:35.999Z'
        ignore_missing: true
    - drop_fields:
        fields: [timestamp]
        ignore_missing: true

---
# cat /etc/filebeat/inputs.d/docker.yml 
- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log'
  json:
    keys_under_root: true
    ignore_decoding_error: true

---
# cat /etc/filebeat/inputs.d/syslog.yml 
- type: filestream
  id: syslog
  paths:
    - '/var/log/syslog*'
  processors:
    - add_fields:
        target: ''
        fields:
          type: 'syslog'

---
# cat /etc/filebeat/inputs.d/kernel.yml 
- type: filestream
  id: kernel
  paths:
    - '/var/log/kern.log'
  processors:
    - add_fields:
        target: ''
        fields:
          type: 'kernel'

---
# cat /etc/filebeat/inputs.d/cloud-init.yml 
- type: filestream
  id: cloud-init
  paths:
    - '/var/log/cloud-init-output.log'
  ignore_older: '2h'
  close_inactive: '10m'
  exclude_lines: ['^\|', '^\+']
  parsers:
    - multiline:
        type: pattern
        pattern: '^ci-info.*'
        negate: false
        match: after
  processors:
    - add_fields:
        target: ''
        fields:
          type: 'cloud-init'

@belimawr
Contributor

Thanks @oshmyrko! I'll look into reproducing this issue and further investigating

@oshmyrko
Author

@belimawr, just want to add some details.
I was able to reproduce this on the system with a high number of log messages per second. Also, I had to wait a few complete rotations before Filebeat started to resend logs from rotated files after the restart.
I couldn't reproduce this on a system with a low rate of logs, probably because those log files are not rotated often enough.

Here are rough numbers: 500 MB of logs every ~3 min and about 1.2M log lines per file:

# ls -lh
-rw-r--r-- 1 root root 278M Feb 15 18:44 my-svc-json.log
-rw-r--r-- 1 root root 528M Feb 15 18:43 my-svc-json.log.1
-rw-r--r-- 1 root root 501M Feb 15 18:40 my-svc-json.log.2
-rw-r--r-- 1 root root 508M Feb 15 18:37 my-svc-json.log.3

# wc -l my-svc-json.log.1
1201099 my-svc-json.log.1

@redbaron4

redbaron4 commented Feb 16, 2024

I have been running into this repeatedly. Have documented my findings on discuss

https://discuss.elastic.co/t/on-restarting-filebeat-every-filestream-paths-input-is-deemed-to-be-truncated/353337

My understanding is that when Filebeat detects truncation on a file, it re-reads it from offset 0 but does not reset the offset in the registry.

@SadeGili

What solved it for us was to give a unique ID to each prospector we have within the same Filebeat agent - it is documented here:

https://www.elastic.co/guide/en/beats/filebeat/current/_step_1_set_an_identifier_for_each_filestream_input.html

I have to say - this is a poor choice for the product. Maintaining a UID for each prospector we have (over 400 servers, 100+ files we read) is such an overhead. I get that it is easier to debug issues this way, but the fact that changing this ID, or changing ID conventions in the future, results in duplicated events means we are bound to this ID.

Can't Filebeat generate a UID for each prospector that does not have one, just for the sole purpose of not re-reading the same events after every restart?

@belimawr
Contributor

Thanks for the new information @oshmyrko!

I couldn't reproduce this on a system with a low rate of logs, probably because those log files are not rotated often enough.

That sounds like inode reuse could be the root cause for you. Anyway, it's on my list to investigate this further.

@redbaron4 Thanks a lot for the detailed report. I wrote a more detailed answer on Discuss. Here is a summary:

  1. You are facing inode reuse; the fingerprint file identity can help you.
  2. There seems to be a bug in how the offset is persisted/updated when a file truncation happens. I'll investigate that further.

@SadeGili, the IDs need to be unique within a Filebeat process/configuration. So if you have multiple servers (400+ as you mentioned), they can all share the exact same Filebeat configuration. As long as each Filebeat process has unique IDs in its configuration file(s)/filestream inputs, you will have no issue.

One of the features of the Filestream input is to allow different instances of the input to read from the same file and keep independent states. For that to work we need a way to tie a file state/identity to a Filestream input instance, and we use the input ID defined in the configuration for that.

Cant filebeat generate a UID for each prospector that does not have one?

Unfortunately that is not a simple task. If we generate an ID for a filestream input that does not have one, we have to ensure this ID is unique, stable, persisted and linked to that specific input configuration. The problem is that an input configuration can be edited by the user (adding processors, fields, more file paths, etc.), moved to different parts of the configuration file (like changing the ordering of the input definitions), moved to an external file, etc. That is not an easy problem to solve, and it might not even be solvable given how many things can change in an input configuration.

@belimawr self-assigned this on Feb 20, 2024
@belimawr added the Team:Elastic-Agent label (Label for the Agent team) on Feb 20, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@botelastic bot removed the needs_team label (Indicates that the issue/PR needs a Team:* label) on Feb 20, 2024
@belimawr
Contributor

belimawr commented Feb 20, 2024

@redbaron4 I believe I managed to reproduce the case you described in your discuss post. There are two pre-conditions for the file being re-ingested as you described:

  1. File identity needs to be reused: either inode reuse or different files with the same path. The perceivable outcome for Filebeat is a state in its registry for the same key with a bigger offset than the real file has.
  2. close.on_state_change.removed: false and clean_removed: false. This causes the file's state never to be removed from the registry, so when a new file reusing the identity is discovered, it uses the old state, which contains an offset that is bigger than the file size.

With those two conditions in place, when Filebeat restarts the file is detected as truncated and re-ingested from the beginning. When close.on_state_change.removed and clean_removed are disabled (set to false), the state is not removed from the registry once the deletion is detected, which causes it to be reused by the newly discovered file.
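
For reference, an input configuration that satisfies the second pre-condition would look roughly like this (a sketch; both options default to true, so they have to be set explicitly):

- type: filestream
  id: some-unique-id
  paths:
    - /var/log/my-app/*.log*
  # Non-default settings: with these, a deleted file keeps its state in the registry.
  close.on_state_change.removed: false
  clean_removed: false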

The real bug here is that when Filebeat detects the file truncation at startup (I'm still investigating exactly where and why), it sets the reader offset to 0, starts ingesting the file from the beginning and keeps incrementing the offset in the registry. However, because the offset in the registry has not been reset, it only ever increases. This causes the file to be re-ingested every time Filebeat starts up, and the problem keeps getting worse.

There are legitimate uses of file truncation that also trigger this registry-not-being-updated issue. I'm still investigating the exact pre-conditions for that and looking for a mitigation.

@oshmyrko your case seems to have the same root cause (the registry not being correctly updated), but with a different pre-condition. You do not seem to be facing identity re-use. It seems that during log rotation Filebeat is detecting a partial state of the filesystem that makes it handle the moved files incorrectly.

@belimawr
Contributor

I created an issue with a step-by-step to reproduce the truncation offset-tracking bug: #38070

@belimawr
Contributor

Setting close.on_state_change.removed: false and clean_removed: false and having any sort of identity reuse (inode reuse for the native file identity, or filepath/name reuse for the path file identity) will trigger this issue. If the states can be removed from the registry, the reused identity is treated as a completely new file and Filebeat should work as expected.

@belimawr
Contributor

@oshmyrko, I've been trying to reproduce what you described, but no matter how I try I just can't reproduce it. I believe we might be missing something: some intermediate state in the filesystem that we're not seeing but Filebeat is, and that is what leads it to perceive files as truncated.

You posted this set of logs:

# journalctl -u filebeat.service | grep my-svc-json.log
{"log.level":"info","@timestamp":"2023-09-08T15:00:29.630Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525440-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.3","state-id":"native::525440-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:03:43.672Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525441-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.2","state-id":"native::525441-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:21.915Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525439-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log","state-id":"native::525439-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:21.915Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::520698-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.1","state-id":"native::520698-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:21.915Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":321},"message":"Reader was closed. Closing.","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525442-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log","state-id":"native::525442-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:22.366Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":267},"message":"File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.2","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::520698-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.2","state-id":"native::520698-66305","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-09-08T15:04:22.370Z","log.logger":"input.filestream","log.origin":{"file.name":"filestream/input.go","file.line":267},"message":"File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.1","service.name":"filebeat","id":"app-log","source_file":"filestream::app-log::native::525439-66305","path":"/home/ubuntu/my-app/logs/my-svc-json.log.1","state-id":"native::525439-66305","ecs.version":"1.6.0"}

Because of the filter there (grep my-svc-json.log) it's hard to truly understand what is happening. I believe the first 5 lines, the "Reader was closed. Closing." ones, were logged during Filebeat's shutdown, and the last two ("File was truncated. Reading file from offset 0.") were logged when Filebeat was started again. Is that correct?

According to those logs, the last offset in the registry for filestream::app-log::native::520698-66305 when Filebeat shut down was larger than the size of the same file Filebeat found when starting again.

If we look at the two truncated files from that set of logs (I removed some unnecessary fields):

{
  "@timestamp": "2023-09-08T15:04:21.915Z",
  "log.origin": {
    "file.name": "filestream/input.go",
    "file.line": 321
  },
  "message": "Reader was closed. Closing.",
  "source_file": "filestream::app-log::native::520698-66305",
  "path": "/home/ubuntu/my-app/logs/my-svc-json.log.1",
  "state-id": "native::520698-66305"
}

{
  "@timestamp": "2023-09-08T15:03:43.672Z",
  "log.origin": {
    "file.name": "filestream/input.go",
    "file.line": 321
  },
  "message": "Reader was closed. Closing.",
  "source_file": "filestream::app-log::native::525441-66305",
  "path": "/home/ubuntu/my-app/logs/my-svc-json.log.2",
  "state-id": "native::525441-66305"
}
  • /home/ubuntu/my-app/logs/my-svc-json.log.1 has got the inode 520698
  • /home/ubuntu/my-app/logs/my-svc-json.log.2 has got the inode 525441

When Filebeat starts again we have:

{
  "@timestamp": "2023-09-08T15:04:22.366Z",
  "log.origin": {
    "file.name": "filestream/input.go",
    "file.line": 267
  },
  "message": "File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.2",
  "source_file": "filestream::app-log::native::520698-66305",
  "path": "/home/ubuntu/my-app/logs/my-svc-json.log.2",
  "state-id": "native::520698-66305"
}

{
  "@timestamp": "2023-09-08T15:04:22.370Z",
  "log.origin": {
    "file.name": "filestream/input.go",
    "file.line": 267
  },
  "message": "File was truncated. Reading file from offset 0. Path=/home/ubuntu/my-app/logs/my-svc-json.log.1",
  "source_file": "filestream::app-log::native::525439-66305",
  "path": "/home/ubuntu/my-app/logs/my-svc-json.log.1",
  "state-id": "native::525439-66305"
}
  • /home/ubuntu/my-app/logs/my-svc-json.log.1 has got the inode 525439
  • /home/ubuntu/my-app/logs/my-svc-json.log.2 has got the inode 520698

This looks normal for a file rotation strategy that moves files: inode 520698 pointed to /home/ubuntu/my-app/logs/my-svc-json.log.1 and then to /home/ubuntu/my-app/logs/my-svc-json.log.2.

Because I do not have access to Filebeat's registry or the complete logs, I can only suppose what might be happening.

If you're experiencing inode reuse, then you might be falling into one of the cases described in #38070. The files are not actually truncated, but the re-use of inodes combined with the states not being removed from the registry can be perceived as a truncation by Filebeat. This can lead to a situation where the offset in the registry is always bigger than the real file, making Filebeat consider it truncated on every restart.

In normal operation with file rotation, Filebeat will remove the state of old files from its registry:

{"log.level":"debug","@timestamp":"2024-02-21T12:38:55.580+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":120},"message":"Start next scan","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:38:55.580+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":224},"message":"File scan complete","service.name":"filebeat","total":4,"written":1,"truncated":0,"renamed":0,"removed":0,"created":0,"ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:38:55.580+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":187},"message":"File /tmp/logs/foo.log has been updated","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"write","source_name":"native::85729-34","os_id":"85729-34","new_path":"/tmp/logs/foo.log","old_path":"/tmp/logs/foo.log","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:38:55.580+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*defaultHarvesterGroup).Start","file.name":"input-logfile/harvester.go","file.line":139},"message":"Starting harvester for file","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::85729-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:38:55.581+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.startHarvester.func1","file.name":"input-logfile/harvester.go","file.line":205},"message":"Stopped harvester for file","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::85729-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:00.582+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*logFile).shouldBeClosed","file.name":"filestream/filestream.go","file.line":215},"message":"close.on_state_change.removed is enabled and file /tmp/logs/foo.log has been removed","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::85589-34","path":"/tmp/logs/foo.log","state-id":"native::85589-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:05.581+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":120},"message":"Start next scan","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:05.581+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":215},"message":"File /tmp/logs/foo.log has been renamed to /tmp/logs/foo.log.1","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"rename","source_name":"native::85729-34","os_id":"85729-34","new_path":"/tmp/logs/foo.log.1","old_path":"/tmp/logs/foo.log","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:05.581+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":215},"message":"File /tmp/logs/foo.log.1 has been renamed to /tmp/logs/foo.log.2","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"rename","source_name":"native::85674-34","os_id":"85674-34","new_path":"/tmp/logs/foo.log.2","old_path":"/tmp/logs/foo.log.1","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:05.581+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":215},"message":"File /tmp/logs/foo.log.2 has been renamed to /tmp/logs/foo.log.3","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"rename","source_name":"native::85625-34","os_id":"85625-34","new_path":"/tmp/logs/foo.log.3","old_path":"/tmp/logs/foo.log.2","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:05.581+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":210},"message":"File /tmp/logs/foo.log.3 has been removed","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"delete","source_name":"native::85589-34","os_id":"85589-34","old_path":"/tmp/logs/foo.log.3","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:05.581+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onRemove","file.name":"filestream/prospector.go","file.line":241},"message":"Stopping harvester as file native::85589-34 has been removed and close.on_state_change.removed is enabled.","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"delete","source_name":"native::85589-34","os_id":"85589-34","old_path":"/tmp/logs/foo.log.3","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:05.581+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onRemove","file.name":"filestream/prospector.go","file.line":246},"message":"Remove state for file as file removed: /tmp/logs/foo.log.3","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"delete","source_name":"native::85589-34","os_id":"85589-34","old_path":"/tmp/logs/foo.log.3","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-21T12:39:05.581+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).Run.func1","file.name":"filestream/input.go","file.line":151},"message":"Closing reader of filestream","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::85589-34","path":"/tmp/logs/foo.log","state-id":"native::85589-34","ecs.version":"1.6.0"}

close.on_state_change.removed is enabled by default; if you run Filebeat with debug logging, you can see those log entries. I recommend using the selectors below so it's not too verbose.

To experiment with the file rotation I've been using this script:

#!/bin/bash
cd /tmp/logs

rm foo.log.3
mv foo.log.2 foo.log.3
mv foo.log.1 foo.log.2
mv foo.log foo.log.1
for i in {1..1000000}
do
    # Each log line is 50 bytes
    printf "log line with some padding %10d ===========\n" $i >> /tmp/logs/foo.log
done

And my filebeat.yml is:

filebeat.inputs:
  - type: filestream
    id: filestream-input-id
    enabled: true
    paths:
      - /tmp/logs/foo.log*

output:
  file:
    enabled: true
    codec.json:
      pretty: false
    path: ${path.home}
    filename: "output"
    rotate_on_startup: false

queue.mem:
  flush:
    timeout: 1s
    min_events: 32

filebeat.registry.flush: 1s

logging:
  level: debug
  selectors:
    - crawler
    - file_watcher
    - input
    - input.filestream
    - input.harvester
    - registrar
    - scanner
    - service
  metrics:
    enabled: false

I'm running a build from our main branch, commit b9a75c9.

@oshmyrko
Author

oshmyrko commented Feb 22, 2024

@belimawr, thank you for the effort you've put into trying to reproduce the issue.
If you want, I can try to collect more logs and details for this issue (e.g. the full log from Filebeat, with debug enabled, etc.).

BTW, some of our logs are 100-500K bytes in size.

@belimawr
Contributor

That would be great, @oshmyrko! If you could provide me with debug logs and the registry, that would be perfect.

I believe the main issue you're facing is inode reuse. I cannot confirm that with the information I have, but there is evidence. If you look at this:

525440 -rw-r--r-- 1 root root 157838538 Sep  8 15:48 /home/ubuntu/my-app/logs/my-svc-json.log
525442 -rw-r--r-- 1 root root 210404533 Sep  8 15:28 /home/ubuntu/my-app/logs/my-svc-json.log.1

The newer file (well, I assume it's newer given the usual file rotation) has a smaller inode, so inodes seem to be reused rather quickly by your system.

I'm already working on #38070, so a fix should be coming up soon. If you're really facing inode reuse it will help, but not fix all problems.
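
If you want to double-check whether inodes are being reused on your system (just a suggestion for gathering evidence), snapshot the inode-to-name mapping across a few rotations and look for inode numbers that come back attached to newly created files:

# Run before and after a couple of rotations and compare.
stat -c '%i %s %n' /home/ubuntu/my-app/logs/my-svc-json.log*
# An inode number that reappears on a freshly written file indicates reuse.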

@belimawr
Contributor

A possible workaround is to use the fingerprint file identity, e.g.:

filebeat.inputs:
  - type: filestream
    id: id-unique-id
    enabled: true
    paths:
      - /tmp/logs/foo.log*
    file_identity.fingerprint: ~
    prospector.scanner.fingerprint:
      enabled: true

WARNING: By changing the file identity, any file that is not currently ignored will be re-ingested from the beginning!

Another thing to keep in mind: because the identity of a file now changes with its content, once a file is truncated Filebeat will log messages stating it was removed, e.g.:

{"log.level":"debug","@timestamp":"2024-02-23T15:26:20.140+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*logFile).handleEOF","file.name":"filestream/filestream.go","file.line":259},"message":"File was truncated as offset (2111) > size (0): /tmp/logs/foo.log","service.name":"filebeat","id":"id","source_file":"filestream::id::fingerprint::413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","path":"/tmp/logs/foo.log","state-id":"fingerprint::413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-02-23T15:26:20.140+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).readFromSource","file.name":"filestream/input.go","file.line":334},"message":"File was truncated, nothing to read. Path='/tmp/logs/foo.log'","service.name":"filebeat","id":"id","source_file":"filestream::id::fingerprint::413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","path":"/tmp/logs/foo.log","state-id":"fingerprint::413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-23T15:26:20.140+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.startHarvester.func1","file.name":"input-logfile/harvester.go","file.line":247},"message":"Stopped harvester for file","service.name":"filebeat","id":"id","source_file":"filestream::id::fingerprint::413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-23T15:26:20.140+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).Run.func1","file.name":"filestream/input.go","file.line":151},"message":"Closing reader of filestream","service.name":"filebeat","id":"id","source_file":"filestream::id::fingerprint::413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","path":"/tmp/logs/foo.log","state-id":"fingerprint::413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-23T15:26:22.116+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":120},"message":"Start next scan","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2024-02-23T15:26:22.116+0100","log.logger":"scanner","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileScanner).GetFiles","file.name":"filestream/fswatch.go","file.line":389},"message":"cannot create a file descriptor for an ingest target \"/tmp/logs/foo.log\": filesize of \"/tmp/logs/foo.log\" is 0 bytes, expected at least 1024 bytes for fingerprinting","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-23T15:26:22.116+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":224},"message":"File scan complete","service.name":"filebeat","total":0,"written":0,"truncated":0,"renamed":0,"removed":1,"created":0,"ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-23T15:26:22.116+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":210},"message":"File /tmp/logs/foo.log has been removed","service.name":"filebeat","id":"id","prospector":"file_prospector","operation":"delete","source_name":"fingerprint::413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","fingerprint":"413983b17a792f86ea64f2119474677eae2c1389c8ed8ebc08a841edacbd2c98","os_id":"50761-34","old_path":"/tmp/logs/foo.log","ecs.version":"1.6.0"}

Once enough content has been added to calculate a new fingerprint, the file is detected as a new file, e.g.:

{"log.level":"debug","@timestamp":"2024-02-23T15:39:22.118+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":179},"message":"A new file /tmp/logs/foo.log has been found","service.name":"filebeat","id":"id","prospector":"file_prospector","operation":"create","source_name":"fingerprint::d154c77df40cb626acaf9b2f3ea03af24a092e4eecef61c9a05ef683c34f4720","fingerprint":"d154c77df40cb626acaf9b2f3ea03af24a092e4eecef61c9a05ef683c34f4720","os_id":"50761-34","new_path":"/tmp/logs/foo.log","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-23T15:39:22.118+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*defaultHarvesterGroup).Start","file.name":"input-logfile/harvester.go","file.line":139},"message":"Starting harvester for file","service.name":"filebeat","id":"id","source_file":"filestream::id::fingerprint::d154c77df40cb626acaf9b2f3ea03af24a092e4eecef61c9a05ef683c34f4720","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-23T15:39:22.118+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).open","file.name":"filestream/input.go","file.line":185},"message":"newLogFileReader with config.MaxBytes:10485760","service.name":"filebeat","id":"id","source_file":"filestream::id::fingerprint::d154c77df40cb626acaf9b2f3ea03af24a092e4eecef61c9a05ef683c34f4720","path":"/tmp/logs/foo.log","state-id":"fingerprint::d154c77df40cb626acaf9b2f3ea03af24a092e4eecef61c9a05ef683c34f4720","ecs.version":"1.6.0"}

@belimawr
Contributor

@oshmyrko, I'm closing this issue. My understanding is that the problems have been solved by #38070 (which will be in the next release) and by the fingerprint file identity.

If you're still facing any problems after updating and moving to fingerprint, feel free to re-open this issue.

@oshmyrko
Author

Hi @belimawr,
I apologize; I couldn't collect the logs because I was switched to other issues.

Thank you for these details. I hope your fix solves this issue.
I'll post a comment here once I can test the new version with the fix.

@rlevytskyi

We have the same issue with Filebeat 8.15.0
(filebeat version 8.15.0 (amd64), libbeat 8.15.0 [76f45fe built 2024-08-02 17:15:22 +0000 UTC])

Here is the config excerpt:

filebeat.inputs:
...
- type: filestream
  id: freeradius
  paths:
    - /var/log/radius/radius.log
  tags: ["freeradius"]
  encoding: plain

Here is the logfile:

# ls -l /var/log/radius/radius.log
-rw-r----- 1 radiusd radiusd 259224 Sep  9 12:42 /var/log/radius/radius.log

Just in case, the free space:

# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       10G  6.1G  4.0G  61% /

So every time Filebeat is restarted or, more annoyingly, reconnects to Logstash, it resends the whole file.

What other data should I collect in order to help fix this issue?

@belimawr
Contributor

belimawr commented Sep 9, 2024

Hi @rlevytskyi, to take a look at this I'd need debug logs, Filebeat's registry and your full Filebeat configuration (please redact any sensitive information like output credentials, IPs, etc.).

For the debug logs, here is a good logging configuration that keeps the verbosity under control:

logging:
  level: debug
  selectors:
    - crawler
    - file_watcher
    - input
    - input.filestream
    - input.harvester
    - registrar
    - scanner
    - service

If you can consistently reproduce this issue, a step-by-step would be very helpful as well.

Does a restart at any point make Filebeat re-ingest the logs, or is there also a time component, like "after Filebeat has been running for 1h, restarting it makes the logs get re-ingested"?

@rlevytskyi

Thank you for your prompt reply, @belimawr!

I've applied the debug settings you suggested.
However, I am unable to reproduce the issue anymore.

As far as I remember, a few months ago we had the same issue with one of our instances and it behaved the same way - logs were repeated ~20 times over several days, then the repetition stopped by itself.

So for now I see no other way than to close the issue.
Should it appear again, I'll enable debug logging to collect the necessary information and will post it here.

Have a nice day!

@belimawr
Contributor

Thanks for the update @rlevytskyi. If you face this issue again, please first try using the fingerprint file identity I suggested in #36541 (comment) to rule out issues with inode reuse.

If the issue still persists after migrating to the fingerprint file identity, don't hesitate to post here/contact me again.
