-
Notifications
You must be signed in to change notification settings - Fork 641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes and Improvements to botocore instrumentation #150
Fixes and Improvements to botocore instrumentation #150
Conversation
e566384
to
f098c5c
Compare
27fbdaa
to
2473eff
Compare
) | ||
self.assertEqual(spans[1].attributes["params.Key"], str(params["Key"])) | ||
self.assertTrue("params.Body" not in spans[1].attributes.keys()) | ||
get_object_attributes = dict(spans[2].attributes) | ||
self.assertEqual( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why RequestId
doesn't return here, but I raised an issue to ask if my setup is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got a reply confirming that the RequestId
is missing because its simply not added to the S3 "mock" of the service in moto
, so this is the best we can do right now :)
cb661e3
to
b1ac40f
Compare
b1ac40f
to
b423b0c
Compare
{ | ||
"aws.operation": "DescribeInstances", | ||
"aws.region": "us-west-2", | ||
"aws.request_id": "fdcdcab1-ae5c-489e-9c33-4637c5dda355", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why the IDs are sometimes consistent and sometimes not with moto
.
Sometimes the ID is consistent like this EC2 call:
{
'Reservations': [],
'ResponseMetadata': {'RequestId': 'fdcdcab1-ae5c-489e-9c33-4637c5dda355',
'HTTPStatusCode': 200,
'HTTPHeaders': {'server': 'amazon.com'},
'RetryAttempts': 0}
}
and other times it's randomly generated like in this SQS call:
{
'ResponseMetadata': {
'RequestId': 'JWH6CYKFXYT2KXQSAYY50T6FK558XMQ7U9SOWLJ5YEQZOKULQ012',
'HTTPStatusCode': 200,
'HTTPHeaders': {'server': 'amazon.com',
'x-amzn-requestid': 'JWH6CYKFXYT2KXQSAYY50T6FK558XMQ7U9SOWLJ5YEQZOKULQ012',
'x-amz-crc32': '1416772655'},
'RetryAttempts': 0
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any insights on why RequestId
looks different on SQS ? my guess would be they are not generated using same lib or similar standards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! I looked into it, and found this code used to generate the request ID in the moto
repository:
It was explained to me that this function is used by some of the resources to attach a request ID.
Notably the amzn_request_id()
decorator is used on SQS and DynamoDB and that's why their request IDs look like this 🙂
And some services have hardcoded requests like EC2 DescribeInstances
(https://github.com/spulec/moto/blob/50d71eccbee12ce5cf2c1ecf61bfcbd60317976d/moto/ec2/responses/instances.py#L479)
I'm not sure why it doesn't format the request IDs in the expected way (lower case joined with hyphens) because it used to be correct.
In that case, should I remove the regex check and just make it a check for the key? Let me know what you think!
{ | ||
"aws.operation": "DescribeInstances", | ||
"aws.region": "us-west-2", | ||
"aws.request_id": "fdcdcab1-ae5c-489e-9c33-4637c5dda355", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any insights on why RequestId
looks different on SQS ? my guess would be they are not generated using same lib or similar standards.
("action", "params", "path", "verb"), | ||
{"params", "path", "verb"}, | ||
) | ||
error = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
declare these values at the top ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure can do. I guess I thought it would be better to leave them down here though since they don't need to be in scope at the top?
span.attributes, | ||
{ | ||
"aws.operation": "ListObjects", | ||
"aws.region": "us-west-2", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason on why don't we validate status code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this is a ParamValidationError
the botocore
library will never send the HTTP request and so there's no http.status_code
to check!
But that's a good point! I can add a check to make sure this span was recorded as an error.
self.assertIs(
span.status.status_code, trace_api.status.StatusCode.ERROR,
)
instrumentation/opentelemetry-instrumentation-botocore/tests/test_botocore_instrumentation.py
Show resolved
Hide resolved
b423b0c
to
9b90b01
Compare
...tation/opentelemetry-instrumentation-boto/src/opentelemetry/instrumentation/boto/__init__.py
Show resolved
Hide resolved
} | ||
if isinstance(dict_, dict) | ||
else {prefix: dict_} | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: looks like these functions don't depend on anything from the outer context and take everything they need as arguments so do we really need them to be defined in line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be moved into opentelemetry.instrumentation.utils
anyways. I'll keep the Stack Overflow comment and add a comment about it being better suited as a util function for now.
if not span.is_recording(): | ||
return | ||
|
||
if endpoint_name not in {"kms", "sts"}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment here explaining why this check exists would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I just copied this code because I didn't want to change the boto
functionality while updating botocore
. I can only guess that it's because these are the Key Management Service
and Secure Token Service
services so maybe their API trace calls should not be traced?
I added a comment mentioning that but I can't guess pass that. It doesn't look like I the commit history says who added these in the first place either.
for key, value in { | ||
k: truncate_arg_value(v) | ||
for k, v in tags.items() | ||
if k not in {"s3": ["params.Body"]}.get(endpoint_name, []) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit cryptic and intention might not be very clear. Mind breaking it down or simplifying? Looks we really want to check something like if endpoint_name == 's3' and k == 'params.Body'
? Personally, I'd write it out as a regular loop clearly self-documenting the intent and separating logic to multiple lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really agree with you, thanks for the tip! I've updated it to look cleaner, please let me know what you think!
...pentelemetry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/__init__.py
Show resolved
Hide resolved
0144715
to
1acf758
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :)
c9251c3
to
9fe220e
Compare
9fe220e
to
467e513
Compare
Description
This PR aims to update the
botocore
instrumentation by doing the following:Fixes:
span.resource
since resource should be immutableSpanKind
fromCONSUMER
toCLIENT
becausebotocore
api calls can be considered remote outgoing and synchronous requests requests as laid out in the SpanKind specifications - Important so that AWS services are treated as separate services in Tracing Backends. (See picture below!)instrumentation.utils.unwrap
instead of creating the function againSimplifications:
deep_getattr
and instead pulls values from properties of thebotocore.BaseClient
class since they should always be thereImprovements:
request_id
if it can be found in the headers instead of the usual location. Just like .NET botocore instrumentationaws.table_name
andaws.queue_url
for DynamoDB and SQS specifically as traced properties (this will be added in the specifications)Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
SpanKind.CLIENT
, S3 (and other services) are considered their own servicesBefore:
After:
Checklist: