Add SQS metricset into AWS metricbeat module #10684
Left some early comments. Hope it's ok as it's still in progress.
init = false
output, err := aws.GetMetricDataPerRegion(metricDataQueries, getMetricDataOutput.NextToken, svcCloudwatch, startTime, endTime)
if err != nil {
	err = errors.Wrap(err, "getMetricDataPerRegion failed, skipping region "+regionName)
I see this pattern across all our metricsets; we should come up with a one-liner to do this. @jsoriano FYI
I think @ycombinator already did something about this in the ES module.
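For illustration, such a one-liner could look roughly like the sketch below. Everything in it is an assumption made for the example: `reportWrapped`, `errorReporter`, `errorLogger` and the in-memory recorder types are hypothetical and are not existing Metricbeat helpers, and the standard library's `fmt.Errorf` with `%w` stands in for `errors.Wrap`.

```go
package main

import (
	"errors"
	"fmt"
)

// errorReporter is a hypothetical stand-in for the reporter passed to Fetch.
type errorReporter interface {
	Error(err error) bool
}

// errorLogger is a hypothetical stand-in for the metricset logger.
type errorLogger interface {
	Error(args ...interface{})
}

// errTimeout is sample data used only by this sketch.
var errTimeout = errors.New("request timed out")

// reportWrapped (hypothetical name) adds context to err, logs it, and hands
// it to the reporter. The wrapped error is returned for further inspection.
func reportWrapped(log errorLogger, r errorReporter, err error, msg string) error {
	wrapped := fmt.Errorf("%s: %w", msg, err)
	log.Error(wrapped.Error())
	r.Error(wrapped)
	return wrapped
}

// memLogger records log lines in memory so the sketch can run standalone.
type memLogger struct{ lines []string }

func (l *memLogger) Error(args ...interface{}) {
	l.lines = append(l.lines, fmt.Sprint(args...))
}

// memReporter records reported errors in memory.
type memReporter struct{ errs []error }

func (r *memReporter) Error(err error) bool {
	r.errs = append(r.errs, err)
	return true
}

func main() {
	log, rep := &memLogger{}, &memReporter{}
	// The three-step wrap/log/report pattern collapses into a single call.
	reportWrapped(log, rep, errTimeout, "getMetricDataPerRegion failed, skipping region us-east-1")
	fmt.Println(log.lines[0])
}
```

A call site would then shrink to the `if err != nil` check plus one line, at the cost of hiding whether the error is logged, reported, or both.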
if err != nil {
	m.logger.Error(err.Error())
	event.Error = err
	report.Event(event)
Why not use report.Error(...) here?
I'm not sure... report.Error is what I want here... Thanks!
I think it is correct as is if you want to report an event with whatever data it has AND an error.
In this case it seems that this would be an error with some metadata (`service.name`, `cloud.region`) and without any metrics, I think this could be acceptable.
If this is the case and this is the only thing blocking this PR I think this would be ready to go.
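To make the distinction concrete, here is a minimal sketch of the two options. The `Event` and `Reporter` types are simplified stand-ins written for this example, not the real Metricbeat types, and the field values are made up.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a simplified stand-in; the real event type carries more fields.
type Event struct {
	RootFields map[string]interface{}
	Error      error
}

// Reporter is a simplified stand-in that collects events in memory.
type Reporter struct {
	events []Event
}

// Event publishes the event as given, including any metadata and error.
func (r *Reporter) Event(e Event) { r.events = append(r.events, e) }

// Error publishes an event that carries only the error.
func (r *Reporter) Error(err error) { r.Event(Event{Error: err}) }

// errFetch is sample data used only by this sketch.
var errFetch = errors.New("getMetricDataPerRegion failed")

func main() {
	r := &Reporter{}

	// Option A: keep service.name and cloud.region alongside the error.
	r.Event(Event{
		RootFields: map[string]interface{}{
			"service.name": "sqs",
			"cloud.region": "us-east-1",
		},
		Error: errFetch,
	})

	// Option B: report the error alone, with no metadata attached.
	r.Error(errFetch)

	for _, e := range r.events {
		fmt.Printf("fields=%d error=%v\n", len(e.RootFields), e.Error)
	}
}
```

Under these assumptions, option A is what the code under review does: the error event still says which service and region it came from.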
"count": c.Int("ApproximateNumberOfMessagesNotVisible"), | ||
}, | ||
"visible": s.Object{ | ||
"count": c.Int("ApproximateNumberOfMessagesVisible"), |
It's such a mess to have approximated values. I'm worried that users might rely on them as a source of truth later and will open issues wondering why their maths doesn't match what they are expecting (from the field names).
These approximate metrics come directly from CloudWatch, so if users check in CloudWatch, the data should match what we have here.
metricSet, err := aws.NewMetricSet(base)
if err != nil {
	return nil, errors.Wrap(err, "error creating aws metricset")
❤️ to add error to context before returning.
Overall looks good apart from the comments that @ruflin already left.
@@ -2,7 +2,7 @@ This module periodically fetches monitoring metrics from AWS Cloudwatch using
 https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricData.html[GetMetricData API] for running
 EC2 instances. Note: extra AWS charges on GetMetricData API requests will be generated by this module.

-The default metricset is `ec2`.
+The default metricset is `ec2` and `sqs`.
-The default metricset is `ec2` and `sqs`.
+The default metricsets are `ec2` and `sqs`.
"oldest_message_age": { | ||
"sec": 86404 | ||
}, | ||
"sent_message_size": {} |
If we don't have bytes inside, this should not be here I assume?
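One way to avoid emitting empty objects like this would be to prune them before the event is reported. The sketch below assumes plain `map[string]interface{}` values rather than the module's own map type, and `pruneEmpty` is a hypothetical helper, not something that exists in the codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pruneEmpty (hypothetical helper) removes keys whose value is an empty
// nested map, recursing so that parents emptied by pruning are dropped too.
func pruneEmpty(m map[string]interface{}) {
	for k, v := range m {
		child, ok := v.(map[string]interface{})
		if !ok {
			continue // scalars and other types are left untouched
		}
		pruneEmpty(child)
		if len(child) == 0 {
			delete(m, k) // deleting during range is safe in Go
		}
	}
}

func main() {
	fields := map[string]interface{}{
		"oldest_message_age": map[string]interface{}{"sec": 86404},
		"sent_message_size":  map[string]interface{}{},
	}
	pruneEmpty(fields)

	out, err := json.Marshal(fields)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // prints {"oldest_message_age":{"sec":86404}}
}
```

Whether pruning belongs in the metricset or in the schema conversion is a separate question; the sketch only shows the effect on the sample document.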
},
"empty_receives": c.Float("NumberOfEmptyReceives"),
"sent_message_size": s.Object{
	"bytes": c.Float("SentMessageSize"),
I'm still struggling with having `bytes` as a float. I understand it's because we get the average but should we just round it up or down?
The main issue I see is that we use `scaled_floats` which can become inaccurate for large numbers if I remember correctly. So using long would scale better.
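If rounding is the way to go, a minimal sketch could look like the following. `roundBytes` is a hypothetical helper, and it assumes the averaged value is finite and fits in an `int64`; it rounds to the nearest integer instead of always rounding up or down.

```go
package main

import (
	"fmt"
	"math"
)

// roundBytes (hypothetical helper) converts an averaged byte value to the
// nearest whole number so it can be stored as an integer instead of a float.
// math.Round rounds halves away from zero.
func roundBytes(avg float64) int64 {
	return int64(math.Round(avg))
}

func main() {
	for _, avg := range []float64{1023.4, 1023.5, 0} {
		fmt.Println(avg, "->", roundBytes(avg))
	}
}
```

Rounding to nearest keeps the error below half a byte, which seems acceptable for an average message size.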
This PR adds the SQS metricset to the AWS Metricbeat module. CloudWatch metrics for Amazon SQS queues are automatically collected and pushed to CloudWatch every five minutes, so the sqs metricset can share the same config as ec2.