[libbeat] Add Syslog parser and processor #30541
Conversation
- Add Syslog parser
- Add Syslog processor
- Add unit tests and benchmarks
- Add processor documentation
This pull request does not have a backport label. Could you fix it @taylor-swanson? 🙏
NOTE: Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
Leaving some high-level comments. I did not dive into any of the parser.
- Improve timestamp comments
- Append error.message if field key already exists
- Fix failure and missing counters with respect to ignore options
- Remap syslog fields to ECS fields
- Add tag option to processor for debug annotation
- Add parser documentation
This pull request is now in conflicts. Could you fix it? 🙏
- Fix tabs in YAML in docs
- Add link to HTTP endpoint documentation
- Use errors.Is
- Don't set event.original if data was empty string
- Remove whitespace trim when setting message field
```go
<165>1 2003-08-24T05:14:30.000003-07:00 192.0.2.1 myproc 8710 - - at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:77)
<165>1 2003-08-24T05:14:30.000003-07:00 192.0.2.1 myproc 8710 - - at org.elasticsearch.action.admin.indices.delete.TransportDeleteIndexAction.checkBlock(TransportDeleteIndexAction.java:75)
`,
expectedMessages: []string{
```
👍
- These fields will be set by other processors if desired by the user
LGTM. I recommend running golangci-lint before merging.
It might be worthwhile to explain the behavior of dates without years in the docs. Sometimes people ingesting logs from the previous year are surprised when those logs appear to be in the future (e.g. logs from Dec 31 ingested on Jan 1 get the wrong year).
Didn't do a thorough review but LGTM.
What are your thoughts on something like this:

```go
func (m *message) setTimestampBSD(v string, ref time.Time) {
	if parsed, err := time.ParseInLocation(time.Stamp, v, ref.Location()); err == nil {
		year := ref.Year()
		if parsed.Month() == time.December && ref.Month() == time.January {
			year--
		}
		m.timestamp = parsed.AddDate(year, 0, 0)
	}
}
```

If the parsed month is December and the current month is now January, we decrement the year by 1. This gives a month on either side of the year boundary to make sure the year is correct. If we're outside of this window, all bets are off. Another benefit is that the function now accepts a time.Time, so we can write unit tests to explicitly exercise the year boundary and any other cases we can think of. Either way, I'll document this situation and suggest that if logs are meant to be stored longer term and ingested at a later date, it is critical to switch to a timestamp format that provides more information (ISO 8601/RFC 3339). Heck, the RFC even says this:
I would avoid the heuristic-based solution (or at least make it optional behavior). There are a lot of edge cases. Logstash has some discussion on the topic in logstash-plugins/logstash-filter-date#51 (comment).
That's fair. In that case, I think this is something that should be handled in a separate issue. The current implementation behaves just like the existing input, so behavior hasn't changed. Having configuration for this that could handle several different scenarios would provide better options for customers. What I proposed would not help a customer ingest logs older than a year, for example. It was only meant to handle year boundaries, and even at that there are still edge cases.
* [libbeat] Add Syslog parser and processor
  - Add Syslog parser
  - Add Syslog processor
  - Add unit tests and benchmarks
  - Add processor documentation
  - Append error.message if field key already exists
  - Fix failure and missing counters with respect to ignore options
  - Remap syslog fields to ECS fields
  - Add tag option to processor for debug annotation
  - Add parser documentation
  - Add docs for time zone and year enrichment
What does this PR do?
Why is it important?
This change allows us to detach syslog message parsing from a specific filebeat input. A new processor and parser have been added to libbeat, each providing the ability to parse RFC 3164 and RFC 5424-formatted syslog messages.
Checklist
- [ ] I have made corresponding changes to the default configuration files
- [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc

How to test this PR locally
Run unit tests in these packages:
Benchmark tests are also available.
Example config for a filebeat processor:
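The PR body does not inline the config here; a minimal sketch of what a processor configuration might look like, based on options discussed in this conversation (the `field` and `format` option names and values are assumptions, not confirmed by this thread):

```yaml
processors:
  - syslog:
      field: message        # assumed: field holding the raw syslog line
      format: auto          # assumed: rfc3164, rfc5424, or auto-detection
      ignore_missing: true  # the "ignore options" discussed above
      ignore_failure: false
      tag: syslog-debug     # tag option added for debug annotation
```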
Example config for a filebeat parser:
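Likewise, a hedged sketch of a parser configuration attached to an input (the input type, host, and `format` option name are assumptions for illustration):

```yaml
filebeat.inputs:
  - type: tcp               # assumed: any input that supports parsers
    host: "localhost:9000"
    parsers:
      - syslog:
          format: rfc5424   # assumed option name
```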
Related issues