Introduce auto detection of format (#18095) (#18555)
* Introduce auto detection of format

* Update docs

* Auto detect format for slowlogs

* Exclude JSON logs from multiline matching

* Adding CHANGELOG entry

* Fix typo

* Parsing everything as JSON first

* Going back to old processor definitions

* Adding Known Issues section in doc

* Completing regex pattern

* Updating regex pattern

* Generating docs
ycombinator authored May 15, 2020
1 parent 462a2b4 commit 28284da
Showing 15 changed files with 109 additions and 58 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -405,6 +405,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Improve ECS categorization field mappings in osquery module. {issue}16176[16176] {pull}17881[17881]
- Add support for v10, v11 and v12 logs on Postgres {issue}13810[13810] {pull}17732[17732]
- Add dashboard for Google Cloud Audit and AWS CloudTrail. {pull}17379[17379]
+- The `logstash` module can now automatically detect the log file format (JSON or plaintext) and process it accordingly. {issue}9964[9964] {pull}18095[18095]

*Heartbeat*

23 changes: 8 additions & 15 deletions filebeat/docs/modules/logstash.asciidoc
@@ -8,8 +8,8 @@ This file is generated! See scripts/docs_collector.py

== Logstash module

-The +{modulename}+ module parse logstash regular logs and the slow log, it will support the plain text format
-and the JSON format (--log.format json). The default is the plain text format.
+The +{modulename}+ module parses Logstash regular logs and the slow log. It supports both the plain text format
+and the JSON format.

include::../include/what-happens.asciidoc[]

@@ -34,19 +34,17 @@ The Logstash `slowlog` fileset was tested with logs from Logstash 5.6 and 6.0
include::../include/configuring-intro.asciidoc[]

The following example shows how to set paths in the +modules.d/{modulename}.yml+
-file to override the default paths for Logstash logs and set the format to json
+file to override the default paths for Logstash logs.

["source","yaml",subs="attributes"]
-----
- module: logstash
  log:
    enabled: true
    var.paths: ["/path/to/log/logstash.log*"]
-    var.format: json
  slowlog:
    enabled: true
    var.paths: ["/path/to/log/logstash-slowlog.log*"]
-    var.format: json
-----

To specify the same settings at the command line, you use:
@@ -68,21 +66,11 @@ include::../include/config-option-intro.asciidoc[]

include::../include/var-paths.asciidoc[]

-*`var.format`*::
-
-The configured Logstash log format. Possible values are: `json` or `plain`. The
-default is `plain`.

[float]
==== `slowlog` fileset settings

include::../include/var-paths.asciidoc[]

-*`var.format`*::
-
-The configured Logstash log format. Possible values are: `json` or `plain`. The
-default is `plain`.

include::../include/timezone-support.asciidoc[]

[float]
@@ -96,6 +84,11 @@ image::./images/kibana-logstash-log.png[]
[role="screenshot"]
image::./images/kibana-logstash-slowlog.png[]

+[float]
+=== Known issues
+When using the `log` fileset to parse plaintext logs, if a multiline plaintext log contains an embedded JSON object such that
+the JSON object starts on a new line, the fileset may not parse the multiline plaintext log event correctly.

:has-dashboards!:

:fileset_ex!:
23 changes: 8 additions & 15 deletions filebeat/module/logstash/_meta/docs.asciidoc
@@ -3,8 +3,8 @@

== Logstash module

-The +{modulename}+ module parse logstash regular logs and the slow log, it will support the plain text format
-and the JSON format (--log.format json). The default is the plain text format.
+The +{modulename}+ module parses Logstash regular logs and the slow log. It supports both the plain text format
+and the JSON format.

include::../include/what-happens.asciidoc[]

@@ -29,19 +29,17 @@ The Logstash `slowlog` fileset was tested with logs from Logstash 5.6 and 6.0
include::../include/configuring-intro.asciidoc[]

The following example shows how to set paths in the +modules.d/{modulename}.yml+
-file to override the default paths for Logstash logs and set the format to json
+file to override the default paths for Logstash logs.

["source","yaml",subs="attributes"]
-----
- module: logstash
  log:
    enabled: true
    var.paths: ["/path/to/log/logstash.log*"]
-    var.format: json
  slowlog:
    enabled: true
    var.paths: ["/path/to/log/logstash-slowlog.log*"]
-    var.format: json
-----

To specify the same settings at the command line, you use:
@@ -63,21 +61,11 @@ include::../include/config-option-intro.asciidoc[]

include::../include/var-paths.asciidoc[]

-*`var.format`*::
-
-The configured Logstash log format. Possible values are: `json` or `plain`. The
-default is `plain`.

[float]
==== `slowlog` fileset settings

include::../include/var-paths.asciidoc[]

-*`var.format`*::
-
-The configured Logstash log format. Possible values are: `json` or `plain`. The
-default is `plain`.

include::../include/timezone-support.asciidoc[]

[float]
@@ -91,6 +79,11 @@ image::./images/kibana-logstash-log.png[]
[role="screenshot"]
image::./images/kibana-logstash-slowlog.png[]

+[float]
+=== Known issues
+When using the `log` fileset to parse plaintext logs, if a multiline plaintext log contains an embedded JSON object such that
+the JSON object starts on a new line, the fileset may not parse the multiline plaintext log event correctly.

:has-dashboards!:

:fileset_ex!:
4 changes: 1 addition & 3 deletions filebeat/module/logstash/log/config/log.yml
@@ -5,12 +5,10 @@ paths:
{{ end }}
exclude_files: [".gz$"]

-{{ if eq .format "plain" }}
multiline:
-  pattern: ^\[[0-9]{4}-[0-9]{2}-[0-9]{2}
+  pattern: ^((\[[0-9]{4}-[0-9]{2}-[0-9]{2}[^\]]+\])|({.+}))
  negate: true
  match: after
-{{ end }}

processors:
# Locale for time zone is only needed in non-json logs
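The multiline pattern above treats a line as the start of a new event when it begins with a `[YYYY-MM-DD...]` timestamp header or looks like a single-line JSON object; with `negate: true` and `match: after`, any other line is appended to the previous event. The following Go sketch is not part of this commit and the JSON sample lines are hypothetical; it simply runs the same regular expression over a few sample lines to show which ones would start a new event. Filebeat itself is written in Go, so Go's regexp behavior is a close match here.

["source","go"]
-----
package main

import (
    "fmt"
    "regexp"
)

func main() {
    // Multiline pattern introduced in config/log.yml: a new event starts at a
    // timestamped plaintext header or at a line that looks like a one-line JSON object.
    newEvent := regexp.MustCompile(`^((\[[0-9]{4}-[0-9]{2}-[0-9]{2}[^\]]+\])|({.+}))`)

    lines := []string{
        // Plaintext header from the test log: matches, so it starts a new event.
        `[2017-11-20T03:55:00,318][INFO ][logstash.inputs.jdbc ] (0.058950s) Select Name as [person.name]`,
        // Continuation line: no match, so it is appended to the previous event.
        `, Address as [person.address]`,
        // Hypothetical JSON-format log line: matches, so it is no longer glued
        // onto a preceding plaintext event.
        `{"level":"INFO","loggerName":"logstash.agent","message":"..."}`,
        // A lone opening brace (as in the pretty-printed JSON added to
        // logstash-plain.log): no closing brace on this line, so no match and it
        // stays in the current event.
        `{`,
        // Known issue from the docs: a complete JSON object starting on its own
        // line inside a plaintext event also matches, so it would split that event.
        `{"bar": "baz"}`,
    }

    for _, l := range lines {
        fmt.Printf("starts new event: %-5v | %s\n", newEvent.MatchString(l), l)
    }
}
-----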
3 changes: 0 additions & 3 deletions filebeat/module/logstash/log/ingest/pipeline-json.yml
@@ -7,9 +7,6 @@ processors:
- json:
field: message
target_field: logstash.log
-- rename:
-    field: '@timestamp'
-    target_field: event.created
- convert:
field: logstash.log.timeMillis
type: string
filebeat/module/logstash/log/ingest/pipeline-plaintext.yml
@@ -17,9 +17,6 @@ processors:
%{GREEDYMULTILINE:message}
- \[%{TIMESTAMP_ISO8601:logstash.log.timestamp}\]\[%{LOGSTASH_LOGLEVEL:log.level}\s?\]\[%{LOGSTASH_CLASS_MODULE:logstash.log.module}\s*\]
%{GREEDYMULTILINE:message}
-- rename:
-    field: '@timestamp'
-    target_field: event.created
- date:
if: ctx.event.timezone == null
field: logstash.log.timestamp
24 changes: 24 additions & 0 deletions filebeat/module/logstash/log/ingest/pipeline.yml
@@ -0,0 +1,24 @@
description: Pipeline for parsing logstash node logs
processors:
- rename:
    field: '@timestamp'
    target_field: event.created
- grok:
    field: message
    patterns:
    - ^%{CHAR:first_char}
    pattern_definitions:
      CHAR: .
- pipeline:
    if: ctx.first_char != '{'
    name: '{< IngestPipeline "pipeline-plaintext" >}'
- pipeline:
    if: ctx.first_char == '{'
    name: '{< IngestPipeline "pipeline-json" >}'
- remove:
    field:
    - first_char
on_failure:
- set:
    field: error.message
    value: '{{ _ingest.on_failure_message }}'
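The new entry pipeline above performs the format auto-detection: a grok capture grabs the first character of the message, and two conditional `pipeline` processors hand the event to either the plaintext or the JSON sub-pipeline. Purely as an illustration (the helper function and the sample messages below are invented; the real parsing happens in the Elasticsearch ingest pipelines), the routing decision amounts to this:

["source","go"]
-----
package main

import (
    "encoding/json"
    "fmt"
)

// routeFormat mirrors the entry pipeline: the grok pattern ^%{CHAR:first_char}
// captures exactly one character, and the conditional pipeline processors branch
// on whether that character is '{'.
func routeFormat(message string) string {
    if len(message) > 0 && message[0] == '{' {
        return "pipeline-json"
    }
    return "pipeline-plaintext"
}

func main() {
    samples := []string{
        // Plaintext header, as in logstash-plain*.log.
        `[2020-05-13T11:00:26,431][INFO ][logstash.inputs.json ] (0.158950s) {`,
        // Hypothetical JSON-format log line, as in logstash-json*.log.
        `{"level":"INFO","loggerName":"logstash.agent","timeMillis":1589364026431}`,
    }

    for _, msg := range samples {
        target := routeFormat(msg)
        fmt.Printf("%-18s <- %s\n", target, msg)

        if target == "pipeline-json" {
            // The JSON sub-pipeline then decodes the message into fields,
            // roughly like this.
            var doc map[string]interface{}
            if err := json.Unmarshal([]byte(msg), &doc); err == nil {
                fmt.Printf("  decoded %d fields from the JSON message\n", len(doc))
            }
        }
    }
}
-----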
13 changes: 8 additions & 5 deletions filebeat/module/logstash/log/manifest.yml
@@ -1,13 +1,16 @@
module_version: 1.0

var:
-  - name: format
-    default: plain
  - name: paths
    default:
-      - /var/log/logstash/logstash-{{.format}}*.log
+      - /var/log/logstash/logstash-plain*.log
+      - /var/log/logstash/logstash-json*.log
    os.windows:
-      - c:/programdata/logstash/logs/logstash-{{.format}}*.log
+      - c:/programdata/logstash/logs/logstash-plain*.log
+      - c:/programdata/logstash/logs/logstash-json*.log

-ingest_pipeline: ingest/pipeline-{{.format}}.yml
+ingest_pipeline:
+  - ingest/pipeline.yml
+  - ingest/pipeline-plaintext.yml
+  - ingest/pipeline-json.yml
input: config/log.yml
Original file line number Diff line number Diff line change
@@ -31,4 +31,4 @@
"message": "Encountered a retryable error. Will Retry with exponential backoff...",
"service.type": "logstash"
}
-]
+]
8 changes: 7 additions & 1 deletion filebeat/module/logstash/log/test/logstash-plain.log
@@ -2,4 +2,10 @@
[2017-11-20T03:55:00,318][INFO ][logstash.inputs.jdbc ] (0.058950s) Select Name as [person.name]
, Address as [person.address]
from people

[2020-05-13T11:00:26,431][INFO ][logstash.inputs.json ] (0.158950s) {
"foo": [
{
"bar": "baz"
}
]
}
filebeat/module/logstash/log/test/logstash-plain.log-expected.json
@@ -29,7 +29,25 @@
"log.level": "INFO",
"log.offset": 175,
"logstash.log.module": "logstash.inputs.jdbc",
"message": "(0.058950s) Select Name as [person.name]\n, Address as [person.address]\nfrom people\n",
"message": "(0.058950s) Select Name as [person.name]\n, Address as [person.address]\nfrom people",
"service.type": "logstash"
},
{
"@timestamp": "2020-05-13T11:00:26.431-02:00",
"event.dataset": "logstash.log",
"event.kind": "event",
"event.module": "logstash",
"event.timezone": "-02:00",
"event.type": "info",
"fileset.name": "log",
"input.type": "log",
"log.flags": [
"multiline"
],
"log.level": "INFO",
"log.offset": 318,
"logstash.log.module": "logstash.inputs.json",
"message": "(0.158950s) {\n\"foo\": [\n{\n \"bar\": \"baz\"\n}\n]\n}",
"service.type": "logstash"
}
]
3 changes: 0 additions & 3 deletions filebeat/module/logstash/slowlog/ingest/pipeline-json.yml
@@ -7,9 +7,6 @@ processors:
- json:
field: message
target_field: logstash.slowlog
-- rename:
-    field: '@timestamp'
-    target_field: event.created
- convert:
field: logstash.slowlog.timeMillis
type: string
filebeat/module/logstash/slowlog/ingest/pipeline-plaintext.yml
@@ -21,9 +21,6 @@ processors:
patterns:
- '{:plugin_params=>%{GREEDYDATA:logstash.slowlog.plugin_params}, :took_in_nanos=>%{NUMBER:event.duration},
:took_in_millis=>%{NUMBER:logstash.slowlog.took_in_millis}, :event=>%{GREEDYDATA:logstash.slowlog.event}}'
-- rename:
-    field: '@timestamp'
-    target_field: event.created
- date:
if: ctx.event.timezone == null
field: logstash.slowlog.timestamp
24 changes: 24 additions & 0 deletions filebeat/module/logstash/slowlog/ingest/pipeline.yml
@@ -0,0 +1,24 @@
description: Pipeline for parsing logstash slow logs
processors:
- rename:
    field: '@timestamp'
    target_field: event.created
- grok:
    field: message
    patterns:
    - ^%{CHAR:first_char}
    pattern_definitions:
      CHAR: .
- pipeline:
    if: ctx.first_char != '{'
    name: '{< IngestPipeline "pipeline-plaintext" >}'
- pipeline:
    if: ctx.first_char == '{'
    name: '{< IngestPipeline "pipeline-json" >}'
- remove:
    field:
    - first_char
on_failure:
- set:
    field: error.message
    value: '{{ _ingest.on_failure_message }}'
13 changes: 8 additions & 5 deletions filebeat/module/logstash/slowlog/manifest.yml
@@ -1,13 +1,16 @@
module_version: 1.0

var:
-  - name: format
-    default: plain
  - name: paths
    default:
-      - /var/log/logstash/logstash-slowlog-{{.format}}*.log
+      - /var/log/logstash/logstash-slowlog-plain*.log
+      - /var/log/logstash/logstash-slowlog-json*.log
    os.windows:
-      - c:/programdata/logstash/logs/logstash-slowlog-{{.format}}*.log
+      - c:/programdata/logstash/logs/logstash-slowlog-plain*.log
+      - c:/programdata/logstash/logs/logstash-slowlog-json*.log

-ingest_pipeline: ingest/pipeline-{{.format}}.yml
+ingest_pipeline:
+  - ingest/pipeline.yml
+  - ingest/pipeline-plaintext.yml
+  - ingest/pipeline-json.yml
input: config/slowlog.yml
