Add awscloudwatch filebeat input #19025
Pinging @elastic/integrations-platforms (Team:Platforms)
Great to see this progressing! I did a first pass. I think we can better leverage nextToken when reading from the API, and avoid using timestamps for filtering.
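To make the nextToken suggestion concrete, here is a minimal sketch of token-based pagination. It is not the code from this PR: the `page` and `fetcher` types and the `fakeFetch` helper are hypothetical stand-ins for the FilterLogEvents request and response, and the real AWS SDK client has a different signature.

```go
package main

import "fmt"

// page is a hypothetical stand-in for one FilterLogEvents response.
type page struct {
	Events    []string
	NextToken *string // nil when there are no more pages
}

// fetcher is a hypothetical abstraction over a single API call;
// the real SDK client takes a richer input struct.
type fetcher func(nextToken *string) (page, error)

// collectAll follows NextToken until the API reports no further pages,
// so no timestamp filtering is needed to walk one result set.
func collectAll(fetch fetcher) ([]string, error) {
	var events []string
	var token *string
	for {
		p, err := fetch(token)
		if err != nil {
			return events, err
		}
		events = append(events, p.Events...)
		if p.NextToken == nil {
			return events, nil
		}
		token = p.NextToken
	}
}

// fakeFetch serves three fixed pages keyed by token, for illustration only.
func fakeFetch(token *string) (page, error) {
	t1, t2 := "t1", "t2"
	switch {
	case token == nil:
		return page{Events: []string{"a", "b"}, NextToken: &t1}, nil
	case *token == "t1":
		return page{Events: []string{"c"}, NextToken: &t2}, nil
	case *token == "t2":
		return page{Events: []string{"d"}}, nil
	}
	return page{}, fmt.Errorf("unknown token %q", *token)
}

func main() {
	events, err := collectAll(fakeFetch)
	fmt.Println(events, err)
}
```

In a real input, each iteration would also be the natural place to apply api_sleep between calls.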
ScanFrequency time.Duration `config:"scan_frequency" validate:"min=0,nonzero"`
APITimeout time.Duration `config:"api_timeout" validate:"min=0,nonzero"`
APISleep time.Duration `config:"api_sleep" validate:"min=0,nonzero"`
api_sleep and scan_frequency are very similar concepts with very different names here. Would it make sense to unify these a little bit?
scan_frequency defines how long this Filebeat input sleeps before rechecking for new logs.
api_sleep defines the sleep time between FilterLogEvents API calls within the same Filebeat collection cycle.
How about scan_frequency and api_frequency?
I see your point, let's keep api_sleep and make sure this is well documented
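For documentation purposes, the difference between the two settings can be sketched as two nested loops. This is an illustration under assumptions, not the input's actual scheduling code: the `config` struct, `runCycles`, and `totalSleep` are hypothetical, and the sleep function is injected so the schedule can be inspected without waiting.

```go
package main

import (
	"fmt"
	"time"
)

// config mirrors the two settings discussed above; the struct is hypothetical.
type config struct {
	ScanFrequency time.Duration // pause between collection cycles
	APISleep      time.Duration // pause between API calls within one cycle
}

// runCycles runs a fixed number of collection cycles, each making
// callsPerCycle API calls, and pauses via the injected sleep function.
func runCycles(cfg config, cycles, callsPerCycle int, call func(), sleep func(time.Duration)) {
	for c := 0; c < cycles; c++ {
		for i := 0; i < callsPerCycle; i++ {
			call()
			if i < callsPerCycle-1 {
				sleep(cfg.APISleep) // throttle calls within a cycle
			}
		}
		if c < cycles-1 {
			sleep(cfg.ScanFrequency) // wait before rechecking for new logs
		}
	}
}

// totalSleep reports how long runCycles would pause in total.
func totalSleep(cfg config, cycles, callsPerCycle int) time.Duration {
	var total time.Duration
	runCycles(cfg, cycles, callsPerCycle, func() {}, func(d time.Duration) { total += d })
	return total
}

func main() {
	cfg := config{ScanFrequency: time.Minute, APISleep: 200 * time.Millisecond}
	// Two cycles of three calls: four api_sleep pauses plus one scan_frequency pause.
	fmt.Println(totalSleep(cfg, 2, 3))
}
```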
"awscloudwatch": common.MapStr{
"log_group": logGroup,
"log_stream": *logEvent.LogStreamName,
"ingestion_time": time.Unix(*logEvent.IngestionTime/1000, 0),
I wonder if this can be mapped to event.ingested
https://www.elastic.co/guide/en/ecs/current/ecs-event.html
event.ingested in ECS is the timestamp when an event arrived in the central data store. My understanding is that this is the timestamp when the event gets to Elasticsearch, but this ingestion_time is the time the event was ingested into AWS CloudWatch.
Maybe I understand event.ingested in ECS wrong? 😬
you may be right here, let's keep current naming for now
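A side note on the conversion in the snippet above: IngestionTime is in milliseconds since the epoch, so `time.Unix(ms/1000, 0)` keeps whole seconds only. If sub-second precision matters, a sketch like the following keeps it. The function names and the sample value are hypothetical, and this is a suggestion rather than what the PR does.

```go
package main

import (
	"fmt"
	"time"
)

// truncatedSeconds mirrors the conversion in the diff: whole seconds only.
func truncatedSeconds(ms int64) time.Time {
	return time.Unix(ms/1000, 0)
}

// withMillis keeps the sub-second part by passing the remainder as nanoseconds.
// On Go 1.17 and later, time.UnixMilli(ms) does the same thing.
func withMillis(ms int64) time.Time {
	return time.Unix(ms/1000, (ms%1000)*int64(time.Millisecond))
}

func main() {
	const ms int64 = 1590000000123 // hypothetical IngestionTime value
	fmt.Println(truncatedSeconds(ms).UTC().Format(time.RFC3339Nano))
	fmt.Println(withMillis(ms).UTC().Format(time.RFC3339Nano))
}
```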
* Add awscloudwatch filebeat input
* Use log group ARN instead of log group name and region name
* Add api_sleep, log_group_name and region_name config
What does this PR do?
This PR adds an awscloudwatch input to Filebeat. The FilterLogEvents AWS API is used to get all log events from a given log group.
The config for using the awscloudwatch input looks like below:
With this config, Filebeat will enable the awscloudwatch input to collect all logs from log group test in region us-east-1 (this info is parsed from the log group ARN), starting from the beginning of the log group, and then check for new log events every minute (defined by scan_frequency).
If users only want to collect new log events/messages from now on, start_position can be set to end.
Users can also specify a list of log streams under the log group to collect logs from, or a log_stream_prefix to collect events only from log streams whose names start with this prefix.
Sample output document:
Why is it important?
This input allows users to collect logs from CloudWatch without sending them to an S3 bucket with SQS set up for notifications.
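The description mentions that the region and log group name are parsed from the log group ARN. A minimal sketch of that parsing, assuming the usual `arn:partition:logs:region:account-id:log-group:name[:*]` layout, could look like this. `parseLogGroupARN` is a hypothetical helper, not the function used in this PR, and the account ID is a made-up placeholder.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLogGroupARN extracts the region and log group name from an ARN of the
// form arn:partition:logs:region:account-id:log-group:name[:*].
func parseLogGroupARN(arn string) (region, name string, err error) {
	parts := strings.SplitN(arn, ":", 7)
	if len(parts) < 7 || parts[0] != "arn" || parts[2] != "logs" || parts[5] != "log-group" {
		return "", "", fmt.Errorf("not a log group ARN: %q", arn)
	}
	// Log group ARNs often end in ":*"; drop it to get the bare name.
	name = strings.TrimSuffix(parts[6], ":*")
	if parts[3] == "" || name == "" {
		return "", "", fmt.Errorf("missing region or log group name in %q", arn)
	}
	return parts[3], name, nil
}

func main() {
	region, name, err := parseLogGroupARN("arn:aws:logs:us-east-1:123456789012:log-group:test:*")
	fmt.Println(region, name, err) // prints: us-east-1 test <nil>
}
```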
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
How to test this PR locally
Related issues
closes #17292
Screenshots
When setting start_position to beginning, all existing logs will be collected and then Filebeat will scan every 30 seconds (based on scan_frequency):