GA support for reading from journald #37086

cmacknz · 2023-11-10T21:29:37Z

Relates to: Datasets system.auth and system.syslog are not available for Debian 12 under Data streams tab. elastic-agent#3650

As of Debian 12 system logs are exclusively available via journald by default. Today we support reading journald logs via the Filebeat journald input, which is still in technical preview and has several major bugs filed against it. See https://github.com/elastic/beats/issues?q=is%3Aissue+is%3Aopen+journald notably:

[Filebeat] Journald causes Filebeat to crash #34077
Filbeat stops shipping journald logs when encountered "failed to read message field: bad message" error #32782
[Filebeat] Journald input doesn't work in container when host systemd is too recent #30398

We need to provide a GA way to read journald logs. There are two paths to this:

1. Fix the major issues in the journald input and GA it as is. All integrations that previously read syslog files by default will need a conditional to specify that journald should be used instead of one of the log files on Linux (see example). Possibly this conditional will need to be on the Linux distribution and not just Linux as a platform.
2. Fold the existing journald functionality into filestream, so that there is only one way to read log files and all existing uses of filestream to read system logs continue to work with no or minimal modification. In the ideal case we detect we are reading journald logs based on a .journal extension or well-known file paths, but we may need a configuration flag for this. If we do end up with a configuration flag, we could consider implementing journald support as a type of parser: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-filestream.html#_parsers

Edit:
Option 1 is the path forward, we'll keep the separate journald input.

To close this issue we'll need to:

Must have

[Filebeat] Journald causes Filebeat to crash #34077

Team:Elastic-Agent Team:Elastic-Agent-Data-Plane bug
[Filebeat] Journald input doesn't work in container when host systemd is too recent #30398

Filebeat Team:Elastic-Agent Team:Elastic-Agent-Data-Plane bug
[journald] Add configuration example to the default configurations #37877

Team:Elastic-Agent docs
[journald] ECS conflicts with host.ip and source.ip fields integrations#9690

Integration:journald Team:Elastic-Agent-Data-Plane
[Journald] input crashes with "failed to read message field: cannot allocate memory" #39352

Team:Elastic-Agent-Data-Plane bug
Journald input only ingests events from the current boot #41083

Team:Elastic-Agent-Data-Plane bug
Filebeat's journald input might return an empty message that breaks multiline parser #41331

Team:Elastic-Agent-Data-Plane bug
[Filebeat - System module - Journald] Some fields are missing #41353

Team:Elastic-Agent-Data-Plane bug
[System module] Only one instance of Journald runs when both syslog and auth filesets are enabled #41378

Team:Elastic-Agent-Data-Plane bug
Investigate the best way to decide when to read system logs from files or journald integrations#10797

Integration:system

Nice to have

Filbeat stops shipping journald logs when encountered "failed to read message field: bad message" error #32782

Team:Elastic-Agent-Data-Plane bug
[journald] Review and add missing tests #37876

Team:Elastic-Agent enhancement
[Journald] Implement status reporting for Elastic-Agent #39791

Team:Elastic-Agent-Data-Plane
[Journald input] Update include_matches to reach feature parity with what is exposed by journalctl #40185

Team:Elastic-Agent-Data-Plane
[Journald] Document the parsers #40478

Team:Elastic-Agent-Data-Plane
[Journald] Better support binary blob entries #40479

Team:Elastic-Agent-Data-Plane
Investigate the best way to decide when to read system logs from files or journald #40526

Team:Elastic-Agent-Data-Plane
[iptables] [journald] Errors when testing with Elastic Agent wolfi images integrations#10998

Integration:iptables Team:Elastic-Agent

cmacknz · 2023-11-10T21:30:06Z

@rdner I am interested in your opinion on this, given the amount of time you are spending trying to migrate and drive consistency between the log, filestream, and container inputs that already exist.

leehinman · 2023-11-10T22:55:52Z

For Option 1, do we have to provide a conditional? I think both inputs could be enabled at the same time; it would just have to be non-fatal for the source not to be present. For example, you can enable journald, logfile and udp in the iptables integration all at the same time (and UDP and journald are on by default).

cmacknz · 2023-11-14T20:26:22Z

If we don't have a conditional we risk duplicated logs. I think if we defaulted to always using both inputs we'd get a small amount of duplicated logs today on Debian 11. It looks like the kernel boot logs go to both journald and /var/log/:

craig_mackenzie@cmackenzie-debian11-test:~$ journalctl
-- Journal begins at Tue 2023-11-14 20:15:17 UTC, ends at Tue 2023-11-14 20:19:35 UTC. --
Nov 14 20:15:17 debian kernel: Linux version 5.10.0-26-cloud-amd64 (debian-kernel@lists.debian.org) (gcc-10 (D>
Nov 14 20:15:17 debian kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-26-cloud-amd64 root=UUID=62c0943b>
Nov 14 20:15:17 debian kernel: BIOS-provided physical RAM map:
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x0000000000001000-0x0000000000054fff] usable
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x0000000000055000-0x000000000005ffff] reserved
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x0000000000060000-0x0000000000097fff] usable
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x0000000000098000-0x000000000009ffff] reserved
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000bf8ecfff] usable
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x00000000bf8ed000-0x00000000bf9ecfff] reserved
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x00000000bf9ed000-0x00000000bfaecfff] type 20
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x00000000bfaed000-0x00000000bfb6cfff] reserved
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x00000000bfb6d000-0x00000000bfb7efff] ACPI data
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x00000000bfb7f000-0x00000000bfbfefff] ACPI NVS
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x00000000bfbff000-0x00000000bffdffff] usable
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved
Nov 14 20:15:17 debian kernel: BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable
Nov 14 20:15:17 debian kernel: printk: bootconsole [earlyser0] enabled
Nov 14 20:15:17 debian kernel: NX (Execute Disable) protection: active
Nov 14 20:15:17 debian kernel: efi: EFI v2.70 by EDK II
Nov 14 20:15:17 debian kernel: efi: TPMFinalLog=0xbfbf7000 ACPI=0xbfb7e000 ACPI 2.0=0xbfb7e014 SMBIOS=0xbf9ca0>
Nov 14 20:15:17 debian kernel: secureboot: Secure boot disabled
Nov 14 20:15:17 debian kernel: SMBIOS 2.4 present.
Nov 14 20:15:17 debian kernel: DMI: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
Nov 14 20:15:17 debian kernel: Hypervisor detected: KVM
Nov 14 20:15:17 debian kernel: kvm-clock: Using msrs 4b564d01 and 4b564d00
Nov 14 20:15:17 debian kernel: kvm-clock: cpu 0, msr 78801001, primary cpu clock
Nov 14 20:15:17 debian kernel: kvm-clock: using sched offset of 7655756989 cycles
Nov 14 20:15:17 debian kernel: clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max>
Nov 14 20:15:17 debian kernel: tsc: Detected 2200.158 MHz processor
Nov 14 20:15:17 debian kernel: e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
Nov 14 20:15:17 debian kernel: e820: remove [mem 0x000a0000-0x000fffff] usable
Nov 14 20:15:17 debian kernel: last_pfn = 0x140000 max_arch_pfn = 0x400000000
Nov 14 20:15:17 debian kernel: MTRR default type: write-back
Nov 14 20:15:17 debian kernel: MTRR fixed ranges enabled:
Nov 14 20:15:17 debian kernel:   00000-9FFFF write-back
Nov 14 20:15:17 debian kernel:   A0000-FFFFF uncachable

craig_mackenzie@cmackenzie-debian11-test:~$ grep -rn 'kvm-clock: cpu 0, msr 78801001, primary cpu cloc' /var/log
grep: /var/log/journal/3465bc73197d954b92a16251605729f5/system.journal: binary file matches
grep: /var/log/private: Permission denied
grep: /var/log/btmp: Permission denied
/var/log/syslog:125:Nov 14 20:15:18 debian kernel: [    0.000000] kvm-clock: cpu 0, msr 78801001, primary cpu clock
/var/log/messages:27:Nov 14 20:15:18 debian kernel: [    0.000000] kvm-clock: cpu 0, msr 78801001, primary cpu clock
grep: /var/log/chrony: Permission denied
/var/log/kern.log:27:Nov 14 20:15:18 debian kernel: [    0.000000] kvm-clock: cpu 0, msr 78801001, primary cpu clock

Granted, if someone set their logs path to /var/log/*.log they'd pick up these logs from both syslog.log and kern.log today anyway.

cmacknz · 2023-11-14T20:29:27Z

It also looks like the journald input is using go-systemd/sdjournal/ which is just wrapping the systemd journal C API:

https://github.com/coreos/go-systemd/blob/7d375ecc2b092916968b5601f74cca28a8de45dd/sdjournal/journal.go#L424-L434

func NewJournal() (j *Journal, err error) {
	j = &Journal{}

	sd_journal_open, err := getFunction("sd_journal_open")
	if err != nil {
		return nil, err
	}

	r := C.my_sd_journal_open(sd_journal_open, &j.cjournal, C.SD_JOURNAL_LOCAL_ONLY)

This wouldn't fit with the idea of just using a filestream parser for journald. At best we could hide the entire journald input inside filestream so there's a single log input, but we'd probably still need dedicated configuration specific to reading journald files.

cmacknz · 2023-11-14T20:34:30Z

The default journald configuration that reads everything is only two lines so I think at this point I'm convinced that keeping the journald input and improving it is the best path:

# Read all journald logs
- type: journald
  id: everything

I don't think folding this into filestream will make filestream easier to use, or be easier to maintain.

rdner · 2023-11-15T14:05:49Z

To summarise what we discussed with @cmacknz on a call:

I think we should detect whether the OS has journald in the agent and add a new variable in the host object for the integration templates to use it like we use the condition here https://github.com/elastic/integrations/blob/f1b08ddd00724eaf3b8d9eb9ef2221f8fc7eefc4/packages/system/data_stream/system/agent/stream/winlog.yml.hbs#L2C1-L2C41
I think the users who use the agent are not interested in deep configuration, so the integrations should deal with logs in both files and journald seamlessly for the user.
Users who run standalone Filebeat are used to manual configurations and should be able to take care of configuring the right input for the right OS/distribution – journald or filestream.
journald should remain separate, it's not really compatible with the filestream architecture and consumes logs via special syscalls.
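
A rough illustration of the detection idea from the first point, written as a sketch rather than as how Elastic Agent does or will do it. The function names are hypothetical, and the check assumes the systemd convention that the native journald socket lives at /run/systemd/journal/socket and that journal files are stored under /var/log/journal (persistent) or /run/log/journal (volatile).

```go
package main

import (
	"fmt"
	"os"
)

// journalPaths are locations that, by systemd convention, exist when
// journald is in use. Treating them as a detection signal is an assumption
// of this sketch, not something Elastic Agent is known to do.
var journalPaths = []string{
	"/run/systemd/journal/socket", // native journald socket
	"/var/log/journal",            // persistent journal storage
	"/run/log/journal",            // volatile journal storage
}

// hasJournald reports whether any known journald path exists.
// The exists callback is injected so the check can be exercised
// without touching the real filesystem.
func hasJournald(exists func(path string) bool) bool {
	for _, p := range journalPaths {
		if exists(p) {
			return true
		}
	}
	return false
}

// pathExists is the filesystem-backed implementation of the callback.
func pathExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	fmt.Println("journald detected:", hasJournald(pathExists))
}
```

The result could then be exposed as a host variable for integration templates to branch on. A host can have both journald and traditional log files, so a real implementation would still need a rule for which source wins.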

andrewkroh · 2024-01-10T22:39:26Z

A few things that come to mind related to journald:

The input produces large events with lots of metadata. This could have an impact on storage usage. It also might make sense to drop some of the fields.
The input is not optimized for producing ECS fields. IIRC it does populate ECS fields but it also duplicates the same data into non-ECS fields. It would be much better to optimize the events coming out of the input before turning this on for users by default. All journald input users would benefit IMO.
To make reading data most efficient for each data stream (system.syslog and system.auth), ideally we would use journalctl filtering (e.g. system.auth might use _TRANSPORT=syslog). So we need to figure out whether that associated data is available in journald and which filters can be used to select it. Implementing the filter in the Beat using processors would be less than ideal for efficiency. (Viewing the data with sudo journalctl -o export is a great way to determine what filtering might work.)
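
To make the last point concrete, here is a small sketch of pushing a filter down to journald by passing FIELD=value matches on the journalctl command line. The helper name is hypothetical, and _TRANSPORT=syslog is only the candidate filter mentioned above; whether it actually selects the right events for system.auth still has to be verified.

```go
package main

import (
	"fmt"
	"strings"
)

// journalctlArgs builds an argument list for journalctl that streams
// entries as JSON and lets journald apply the given FIELD=value matches,
// so unwanted entries never reach the Beat.
func journalctlArgs(matches ...string) []string {
	args := []string{"--no-pager", "--output=json", "--follow"}
	return append(args, matches...)
}

func main() {
	args := journalctlArgs("_TRANSPORT=syslog")
	fmt.Println("journalctl " + strings.Join(args, " "))
}
```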

cmacknz · 2024-01-11T18:23:41Z

Thanks, I think it would make sense to compare the events collected without the journald inputs to those collected with it for the sources needed by the system integration. If the event content is significantly different it will cause problems for dashboards and queries.

andrewkroh · 2024-02-09T14:24:58Z

This is alluded to in some linked issues, but I wanted to explicitly mention that the journald library version in our container images is v245 (from Mar 6, 2020). When deploying this image on Ubuntu 22.04 nodes, which use v249, you can't collect logs from the host (no crashes, just no logs). My workaround has been to repackage the Filebeat binaries with a more recent base image. We might want to consider bumping our base image as part of making this GA.

belimawr · 2024-05-01T20:49:49Z

I found another bug, probably another blocker: #39352

It seems that if Filebeat falls too far behind with the Journal the input will crash shortly after starting.

belimawr · 2024-05-01T22:34:08Z

#32782 and #39352 happen intermittently in my test environments. So far I have not managed to isolate them, but both come from a call to github.com/coreos/go-systemd/v22/sdjournal:

beats/filebeat/input/journald/pkg/journalread/reader.go

Line 185 in ffcd181

entry, err := r.journal.GetEntry()

I only managed to reproduce #39352 with systemd 252 (252.16-1.amzn2023.0.2).

elasticmachine · 2024-05-05T15:50:37Z

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

belimawr · 2024-06-03T20:44:13Z

Now that we support reporting status per input when running under Elastic-Agent and given that Journald can cause Filebeat to crash (as detailed by #34077), I added a task to ensure we have input status reporting for Journald before we GA so we can better report unsupported Systemd versions.

belimawr · 2024-06-06T15:57:57Z

After talking with @cmacknz, the best course of action for those crashes is to use journalctl to read the journald logs instead of the current github.com/coreos/go-systemd/v22/sdjournal library.

I created an issue to track it: #39820

cmacknz · 2024-06-06T16:00:38Z

Direct use of journalctl is the approach currently used by the OTel collector, related discussion in open-telemetry/opentelemetry-collector-contrib#2332

cmacknz · 2024-06-18T21:01:29Z

Following the comment in #37086 (comment), which implies that the documents the journald input produces are unlikely to be optimal today, we need to add a task to compare the structure and metadata of the logs collected by journald to those collected today from syslog with the log input, and ensure we only see expected differences.

belimawr · 2024-06-27T17:08:35Z

@cmacknz, @pierrehilbert I've been working on using journalctl to read journald logs and comparing the events generated by my POC implementation with our current implementation. With pretty little work, the generated events are the same, which is good news.

However, the filtering we have (include_matches) is pretty broken. The example in the documentation and the reference config file differ in syntax, and neither works as expected with our current implementation 😢

So far I could assess that the problem is caused by two issues:

The way we represent and parse the config keys
The way we call go-systemd to add the filters

Bear in mind that include_matches is an advanced filtering feature. Without using it we can already filter by unit, syslog identifier, facility and priority, match messages with grep-like patterns, etc.

Which brings me some product related questions:

What is the minimum feature set to GA journald?
Do we need to keep any of the current syntax for the advanced filtering?
Given Journald input is not GA and is not working as expected, how far can we go with breaking changes?

Just to give an example, the following YAML:

include_matches:
  - and:
    - match:
      - FOO=foo
    - match:
      - BAR=bar
  - or:
    - match:
      - FOO_BAR=foo_bar

generates the following logic expression:

FOO_BAR=foo_bar OR FOO=foo AND BAR=bar AND

Yes, there is an AND without one of its operands. For some reason it does not break the input, but it does not produce the expected result.

I would expect the above YAML to produce the following logic expression:

(FOO=foo AND BAR=bar) OR (FOO_BAR=foo_bar)

Honestly, I don't think we should accept the YAML above. It is confusing how the AND and OR relate to each other, which will lead users to make all sorts of mistakes.
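
For comparison, a sketch of how the expected expression, (FOO=foo AND BAR=bar) OR (FOO_BAR=foo_bar), could be expressed as journalctl match arguments. journalctl combines matches on different fields with AND and treats a standalone "+" argument as a logical OR between groups. The type and function names are hypothetical and this is not the input's actual implementation. Note that journalctl ORs repeated matches on the same field within a group, which this sketch does not try to prevent.

```go
package main

import "fmt"

// matchGroup is a set of FIELD=value matches that must all hold (AND),
// assuming each field appears at most once in the group.
type matchGroup []string

// toJournalctlMatches flattens groups into journalctl match arguments,
// separating groups with "+", which journalctl treats as a logical OR.
func toJournalctlMatches(groups []matchGroup) []string {
	var args []string
	for _, g := range groups {
		if len(g) == 0 {
			continue // skip empty groups to avoid a dangling operator
		}
		if len(args) > 0 {
			args = append(args, "+")
		}
		args = append(args, g...)
	}
	return args
}

func main() {
	groups := []matchGroup{
		{"FOO=foo", "BAR=bar"},
		{"FOO_BAR=foo_bar"},
	}
	fmt.Println(toJournalctlMatches(groups))
}
```

Modelling the configuration as a flat list of AND groups joined by OR leaves only one way to read it, which avoids the ambiguity described above.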

cmacknz · 2024-06-27T17:16:12Z

The journald input is in technical preview, so you can make whatever changes are needed, breaking or not. That said, people are using it as is, so don't make breaking changes that don't actually help.

Our end goal is not just to GA journald, it is to make Debian 12 based systems (and other distros that default to journald) work in our system integration. This means we want users to be able to get the same information out of journald that they would get out of syslog, without breaking any queries, dashboards, or alerts that already exist. So breaking changes need to be focused on the shape of the output data, and if it can be filtered in the same ways, and not on the input configuration syntax used to obtain it.

If we can make the journald input a drop-in replacement for the log input in the system integration with no configuration changes, that is even better, but I'm not sure this will be possible.

belimawr · 2024-06-27T18:24:18Z

Thanks @cmacknz, I'll take a look at the system integration on a Debian 12 host and what kind of filtering it uses for logs so I can draft the minimal requirements.

belimawr · 2024-06-27T20:56:44Z

TL;DR: The standard journald input, with no filters, will collect all the data we need. To correctly set the dataset, we can rely on the syslog facility code that journald already adds to the events and that the current version of the journald input already publishes.

So for #39820 I'll focus on getting the core of the journald input working with journalctl with a good set of tests and leave advanced filtering to another task.

Long version:
I've been doing some testing with Debian 12 and research, here are the key findings:
1. rsyslogd ships the data to the traditional log files
The Debian documentation says this about system messages:

Under systemd, the system logging utility rsyslogd(8) may be uninstalled. If it is installed, it changes its behavior to read the volatile binary log data (instead of pre-systemd default "/dev/log") and to create traditional permanent ASCII system log data. This can be customized by "/etc/default/rsyslog" and "/etc/rsyslog.conf" for both the log file and on-screen display. See rsyslogd(8) and rsyslog.conf(5).

The Debian 12 Vagrant box I've been using comes with rsyslogd installed and configured to forward the messages to the traditional log files. Here is a snippet from /etc/rsyslog.conf where we can see the messages being directed to the files based on the log facility.

/etc/rsyslog.conf

#
# First some standard log files.  Log by facility.
#
auth,authpriv.*                 /var/log/auth.log
*.*;auth,authpriv.none          -/var/log/syslog
#cron.*                         /var/log/cron.log
daemon.*                        -/var/log/daemon.log
kern.*                          -/var/log/kern.log
lpr.*                           -/var/log/lpr.log
mail.*                          -/var/log/mail.log
user.*                          -/var/log/user.log

#
# Logging for the mail system.  Split it up so that
# it is easy to write scripts to parse these files.
#
mail.info                       -/var/log/mail.info
mail.warn                       -/var/log/mail.warn
mail.err                        /var/log/mail.err

#
# Some "catch-all" log files.
#
*.=debug;\
        auth,authpriv.none;\
        mail.none               -/var/log/debug
*.=info;*.=notice;*.=warn;\
        auth,authpriv.none;\
        cron,daemon.none;\
        mail.none               -/var/log/messages

Wikipedia lists the facilities and their names.

With that information it should be easy to "migrate" the system integration to use the journald input and test all ingest pipelines/dashboards.
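
As a sketch of what relying on the facility code could look like, the function below mirrors the first two rsyslog rules above: auth and authpriv go to the auth dataset, everything else to syslog. The facility numbers (auth = 4, authpriv = 10) are the standard syslog codes from RFC 5424. The function name and the exact dataset split are assumptions for illustration, not the integration's actual logic.

```go
package main

import "fmt"

// Syslog facility codes as defined in RFC 5424.
const (
	facilityAuth     = 4
	facilityAuthPriv = 10
)

// datasetForFacility mirrors the rsyslog rules quoted above: auth and
// authpriv messages belong to system.auth, everything else to
// system.syslog. The mapping is an assumption of this sketch.
func datasetForFacility(code int) string {
	switch code {
	case facilityAuth, facilityAuthPriv:
		return "system.auth"
	default:
		return "system.syslog"
	}
}

func main() {
	for _, code := range []int{0, 4, 10} {
		fmt.Printf("facility %d -> %s\n", code, datasetForFacility(code))
	}
}
```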

nerijus · 2024-06-27T21:14:15Z

A lot of systems do not have (and do not want) rsyslog installed, so please don't make it a requirement for reading from journald.

pierrehilbert · 2024-06-28T07:28:13Z

I agree with @nerijus here. This is the main reason we need to support journald itself: to avoid forcing our users to install rsyslog.

belimawr · 2024-06-28T13:56:42Z

> A lot of systems do not have (and do not want) rsyslog installed, so please don't make it a requirement for reading from journald.

We won't ;). I just mentioned rsyslog as an easy means of testing and comparing traditional log file harvesting with journald on the same system. We won't make rsyslog a dependency for the journald input.

belimawr · 2024-08-09T16:43:08Z

I'm not sure if that's critical to GA the journald input, but we should better support binary fields. So far I found them in two situations:

Messages with control characters:

beats/filebeat/input/journald/pkg/journalctl/testdata/corner-cases.json

Line 64 in 63fc18c

    
           "MESSAGE": [85,110,97,98,108,101,32,116,111,32,112,97,114,115,101,32,105,110,100,105,99,97,116,111,114,32,111,102,32,65,84,43,66,73,65,32,99,111,109,109,97,110,100,58,32,65,84,43,66,73,65,61,49,44,49,44,48,44,48,44,48,13,65,84,43,66,67,83,61,50],

Actual binary blobs:

beats/filebeat/input/journald/pkg/journalctl/testdata/corner-cases.json

Line 14 in 63fc18c

    
           "COREDUMP_PROC_AUXV": [33,0,0,0,0,0,0,0,0,80,245,145,254,127,0,0,51,0,0,0,0,0,0,0,48,14,0,0,0,0,0,0,16,0,0,0,0,0,0,0,255,251,235,191,0,0,0,0,6,0,0,0,0,0,0,0,0,16,0,0,0,0,0,0,17,0,0,0,0,0,0,0,100,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,64,240,154,102,56,86,0,0,4,0,0,0,0,0,0,0,56,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,13,0,0,0,0,0,0,0,7,0,0,0,0,0,0,0,0,128,235,198,39,127,0,0,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,0,0,0,0,0,0,0,96,43,155,102,56,86,0,0,11,0,0,0,0,0,0,0,232,3,0,0,0,0,0,0,12,0,0,0,0,0,0,0,232,3,0,0,0,0,0,0,13,0,0,0,0,0,0,0,232,3,0,0,0,0,0,0,14,0,0,0,0,0,0,0,232,3,0,0,0,0,0,0,23,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,25,0,0,0,0,0,0,0,137,165,239,145,254,127,0,0,26,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,31,0,0,0,0,0,0,0,227,191,239,145,254,127,0,0,15,0,0,0,0,0,0,0,153,165,239,145,254,127,0,0,27,0,0,0,0,0,0,0,28,0,0,0,0,0,0,0,28,0,0,0,0,0,0,0,32,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],

Anyway I created an issue for that: #40479
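
To illustrate the two shapes a field can take in the test data above, here is a minimal sketch that decodes a field that is either a JSON string or an array of byte values. The helper name is hypothetical and this is not the input's actual code. It also glosses over what to do with real binary blobs such as COREDUMP_PROC_AUXV, which are not valid UTF-8 and would probably need to be encoded or dropped instead.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fieldToString decodes a journal field that is either a JSON string or
// an array of byte values, as in the corner-cases test data.
func fieldToString(raw json.RawMessage) (string, error) {
	var s string
	if err := json.Unmarshal(raw, &s); err == nil {
		return s, nil
	}

	// Decode into []int rather than []byte: encoding/json expects a
	// base64 string for []byte, not an array of numbers.
	var nums []int
	if err := json.Unmarshal(raw, &nums); err != nil {
		return "", fmt.Errorf("field is neither a string nor a byte array: %w", err)
	}

	buf := make([]byte, len(nums))
	for i, n := range nums {
		if n < 0 || n > 255 {
			return "", fmt.Errorf("value %d at index %d is not a byte", n, i)
		}
		buf[i] = byte(n)
	}
	return string(buf), nil
}

func main() {
	entry := []byte(`{"MESSAGE": [85,110,97,98,108,101], "PRIORITY": "6"}`)

	var fields map[string]json.RawMessage
	if err := json.Unmarshal(entry, &fields); err != nil {
		panic(err)
	}

	msg, err := fieldToString(fields["MESSAGE"])
	fmt.Printf("%q %v\n", msg, err)
}
```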

andrewkroh · 2024-08-26T15:04:50Z

> The input is not optimized for producing ECS fields. IIRC it does populate ECS fields but it also duplicates the same data into non-ECS fields. It would be much better to optimize the events coming out of the input before turning this on for users by default. All journald input users would benefit IMO.

I took a look at the current state of the fields produced by the input. My suggestions for aligning closer to ECS are in this table. Once we settle on mappings we should add a table to the input's documentation.

Current Name	Proposed ECS Field	Notes
message_id	log.syslog.msgid	https://www.elastic.co/guide/en/ecs/current/ecs-log.html#field-log-syslog-msgid
syslog.priority	log.syslog.priority	https://www.elastic.co/guide/en/ecs/current/ecs-log.html#field-log-syslog-priority
syslog.facility	log.syslog.facility.code
syslog.identifier	log.syslog.appname
syslog.pid	log.syslog.procid
systemd.cgroup		Leave as is?
systemd.invocation_id		Leave as is?
systemd.owner_uid		Leave as is?
systemd.session		Leave as is?
systemd.slice		Leave as is?
systemd.unit		Leave as is?
systemd.user_slice		Leave as is?
systemd.user_unit		Leave as is?
systemd.transport		Leave as is?
container.id_truncated		Already have container.id, no need to also have id_truncated. Drop this field.
container.log.tag		Not needed because it is duplicated into SYSLOG_IDENTIFIER (https://docs.docker.com/engine/logging/drivers/journald/). Drop this field.
container.partial		Means that the message was truncated. The value is a boolean "true". The input could append("tags", "partial_message") if CONTAINER_PARTIAL_MESSAGE == "true". We should not be adding fields into the container namespace that are not part of ECS.
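
A sketch of how the renames and drops proposed in the table could be applied to an event. It assumes a flat map keyed by dotted field names, which is a simplification, and the function and variable names are hypothetical. The mappings are only the proposals above, not settled behaviour. The "Leave as is?" rows and the container.partial handling are intentionally left out.

```go
package main

import "fmt"

// ecsRenames holds the renames proposed in the table above.
var ecsRenames = map[string]string{
	"message_id":        "log.syslog.msgid",
	"syslog.priority":   "log.syslog.priority",
	"syslog.facility":   "log.syslog.facility.code",
	"syslog.identifier": "log.syslog.appname",
	"syslog.pid":        "log.syslog.procid",
}

// droppedFields are the fields the table proposes to remove.
var droppedFields = map[string]bool{
	"container.id_truncated": true,
	"container.log.tag":      true,
}

// toECS returns a copy of a flat event with the proposed renames and
// drops applied. Fields not mentioned in either map pass through as is.
func toECS(event map[string]any) map[string]any {
	out := make(map[string]any, len(event))
	for k, v := range event {
		if droppedFields[k] {
			continue
		}
		if ecs, ok := ecsRenames[k]; ok {
			k = ecs
		}
		out[k] = v
	}
	return out
}

func main() {
	event := map[string]any{
		"syslog.pid":             1234,
		"syslog.identifier":      "sshd",
		"container.id_truncated": "abc123",
		"systemd.unit":           "ssh.service",
	}
	fmt.Println(toECS(event))
}
```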

belimawr · 2024-10-02T19:15:33Z

I found an issue with our current (main, 8.x) journald input implementation: #41083

I've already added it to the must have list for GA. The previous implementation did not have this problem. I already have an idea of how to fix it and how to test it.

belimawr · 2024-11-12T17:51:57Z

I came across a case where host.hostname coming from the journal event is overwritten by the hostname of the machine running Filebeat. That's not directly a Filebeat issue, because a standalone Filebeat can easily circumvent it; for Elastic-Agent, however, it can be an issue.

I don't think it's a blocker, but we should at least document it.

GitHub issue: elastic/integrations#11717
Discuss thread: https://discuss.elastic.co/t/cant-get-host-name-from-filebeat-input-by-journad-mode/370296/

cmacknz · 2024-11-12T19:59:09Z

> I came across a case where host.hostname coming from the journal event is overwritten by the hostname of the machine running Filebeat. That's not directly a Filebeat issue, because a standalone Filebeat can easily circumvent it; for Elastic-Agent, however, it can be an issue.

We ideally would not do this and should track this in a separate issue. This breaks use cases where we receive journald logs from a remote host.

I think this was handled in other places by defining a forwarded tag that excludes the event from the add_host_metadata default processor. See here and here for agent.

Relates to: Disable host fields for "cloud", panw, cef modules #18223

belimawr · 2024-11-13T19:30:33Z

> I think this was handled in other places by defining a forwarded tag that excludes the event from the add_host_metadata default processor. See here and here for agent.

I saw that, and it's also what I recommended in the discuss thread. However, I'm not sure what the best approach to "solve" this issue is: do we add the forwarded tag by default in the integration? Do we just add a toggle for it and document the behaviour?

My gut feeling tells me we should use different fields, at least in this context of one Filebeat ingesting logs from multiple hosts.

I'll centralise this discussion on a single issue because it's affecting multiple projects in different ways.

cmacknz added the Team:Elastic-Agent label Nov 10, 2023

cmacknz mentioned this issue Nov 10, 2023

Datasets system.auth and system.syslog are not available for Debian 12 under Data streams tab. elastic/elastic-agent#3650

Open

This was referenced Feb 6, 2024

[journald] Review and add missing tests #37876

Open

[journald] Add configuration example to the default configurations #37877

Open

[journald] Elastic Agent System Integration elastic/integrations#9067

Open

pierrehilbert assigned belimawr Feb 17, 2024

pierrehilbert mentioned this issue Mar 12, 2024

Datasets system.auth and system.syslog are not available for AL2023 under Data streams tab. elastic/elastic-agent#4250

Open

pierrehilbert mentioned this issue Mar 28, 2024

[SLES 15]: No "system.auth" logs for system integration under Data Streams tab for SLES 15 linux agent. elastic/elastic-agent#4495

Open

cmacknz mentioned this issue Apr 18, 2024

Filbeat stops shipping journald logs when encountered "failed to read message field: bad message" error #32782

Closed

pierrehilbert mentioned this issue Apr 24, 2024

[journald] ECS conflicts with host.ip and source.ip fields elastic/integrations#9690

Open

pierrehilbert added the Team:Elastic-Agent-Data-Plane label May 5, 2024

belimawr mentioned this issue Jul 8, 2024

Replace github.com/coreos/go-systemd/v22/sdjournal by journalctl #40061

Merged

10 tasks

cmacknz mentioned this issue Oct 30, 2024

Add official support for Debian 12 elastic/elastic-agent#5893

Open
