-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support patching PynamoDB calls to DynamoDB #13
Conversation
Good call. Patching For current patching you only get 1. But patching the botocore vendored requests you get all three benefits. Though it means a little bit more work on parameter capture since you go one level deeper on the code path. The X-Ray SDK support requests only if your application has direct dependency on it. It leaves botocore vendored requests untouched. So when you patch PynamoDB you can directly dive into the vendored requests and do the necessary patch there. You just need to make sure when you have for example 2 retries, it creates 3 sibling subsegments (1 original call + 2 retries) where for each subsegment it represents a DynamoDB call. This way you gain more value on tracing the DynamoDB calls from your contribution. The AWS Go SDK has hooks that provide insights on internal retries and TLS info for each retry. Check what you can get from X-Ray Go SDK https://github.com/aws/aws-xray-sdk-go. Although boto doesn't provide as much as Go you can still get more insights on why a call is taking long. |
Is there a way to mark retries as retries or is it just about creating a separate subsegment per retry? |
It will just create a separate subsegment per retry. This make sure you get the timing and status code for every single retry for more insights. A retry count is not good enough IMO. And as a general library support it'd be better to give more insights to fit more use cases. |
I just pushed a new commit, now directly patching I'd appreciate if you could take another look at it. |
docs/thirdparty.rst
Outdated
Patching boto3 and botocore are equivalent since boto3 depends on botocore | ||
Patching boto3 and botocore are equivalent since boto3 depends on botocore. | ||
|
||
Patching pynamodb patches botocore as well, as it depends partially on botocore. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest put more details on what is patched for pynamodb
so other users utilizing your work have more details and know the implication of patching pynamodb
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extended the paragraph a bit. I hope that's sufficient.
tests/ext/pynamodb/test_pynamodb.py
Outdated
|
||
subsegment = xray_recorder.current_segment().subsegments[0] | ||
assert subsegment.error | ||
assert subsegment.http['response']['status'] == 400 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exception are you trying to produce? Is the code always guaranteed to be 400 or it could be different depend on users and machines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests currently do real requests to DynamoDB, as I don't know how to avoid them. Using the botocore stubber, as in the botocore tests, doesn't work, because we're patching requests here directly. The requests made by the tests currently return a access denied unless there are is access to resources called "mytable" and "mybucket" available.
|
||
def pynamodb_meta_processor(wrapped, instance, args, kwargs, return_value, | ||
exception, subsegment, stack): | ||
operation_name = args[0].headers['X-Amz-Target'].decode('utf-8').split('.')[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to capture request_id
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out. Turned out I searched for it in the request instead of the response. Added it now.
xray_recorder.clear_trace_entities() | ||
|
||
|
||
def test_exception(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test to guard the behavior where vendored requests only emit subsegments when talking to DynamoDB? In other words after patching pynamodb
other AWS services are still captured only through original patched botocore code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it pretty difficult to write proper tests for the whole stuff, thanks to all the patching. I added a test now to check that only a single subsegment gets created when the pynamodb patch is enabled, but another AWS service is called. That single subsegment gets created, because the botocore patch is active in this case too.
I fixed some formatting issue on docs/thirdparty.rst when doing SDK 0.95 release and looks like that caused some conflicts. Sorry for the inconvenience. Could you pull the latest commits, resolve the conflicts and merge your commits into one single commit? Thanks. |
All remarks should be addressed now. Waiting for the next round of feedback. 😄 |
docs/thirdparty.rst
Outdated
|
||
Patching pynamodb applies the botocore patch as well, as it uses the logic from the botocore | ||
patch to apply the trace header. As capturing the pynamodb calls relies on intercepting all | ||
calls made via botocore separately from the botocore patch, there is a small performance penalty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not mention the performance penalty
here. An extra if condition is really neglectable IMO considering how much code is actually executed in botocore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
tests/ext/pynamodb/test_pynamodb.py
Outdated
subsegment = subsegments[0] | ||
assert len(subsegment.subsegments) == 0 | ||
assert subsegment.error | ||
assert subsegment.http['response']['status'] in [400, 403] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest remove status code assertion. Any 4xx could be returned depend on setup of the developer's machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
session = botocore.session.get_session() | ||
s3 = session.create_client('s3', region_name='us-west-2') | ||
try: | ||
s3.get_bucket_location(Bucket='mybucket') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're making external http calls anyway on the dynamo test case so I would suggest to not cause a Client error here. Exception raised from remote endpoint makes sure it reach further down to the botocore code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I don't get your point. I am doing an external call in this test as well and as the remote end returns an error (because I'm not authorized to access that bucket) I get an error back. I have to catch this error to be able to properly do the assertions afterwards. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. I got confused by ClientError
. I thought it was raised by library due to invalid parameters. In that case there is not any http related code executed.
tests/ext/pynamodb/test_pynamodb.py
Outdated
|
||
subsegments = xray_recorder.current_segment().subsegments | ||
assert len(subsegments) == 1 | ||
assert len(subsegments[0].subsegments) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little bit more assertion to make sure this subsegment is not some random subsegment generated anywhere else. e.g. name assertion should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
While [PynamoDB][1] uses `botocore` under the hood it doesn't call `botocore`'s `BaseClient._make_api_call`. Instead it implements its own `Connection._make_api_call`, making it necessary to separately patch PynamoDB, even though it uses `botocore`. As PynamoDB's `Connection._make_api_call` doesn't return information about the response as `botocore` does, it's necessary directly patch the `send`-method used by `requests`. For adding the trace header patching `botocore` is sufficient, as PynamoDB does use `botocore`'s prepared requests. To make that work out of the box the overall patching logic got adjusted to automatically patch `botocore` as well when pynamodb patching is selected. [1]: https://github.com/pynamodb/PynamoDB
While PynamoDB uses
botocore
under the hood it doesn't callbotocore
'sBaseClient._make_api_call
. Instead it implements its ownConnection._make_api_call
, making it necessary to separately patchPynamoDB, even though it uses
botocore
.As PynamoDB's
Connection._make_api_call
doesn't return informationabout the response as
botocore
does, there is no response informationto pass to AWS X-Ray.
For adding the trace header patching
botocore
is sufficient, asPynamoDB does use
botocore
's prepared requests. To make that work outof the box the overall patching logic got adjusted to automatically
patch
botocore
as well when pynamodb patching is selected.For tests I tried to do something useful, but only came up with the
single test which calls real DynamoDB.
I'm not sure if patching
_make_api_call
is the right approach at allor if it would be better to patch the
requests.session.Session.send
-call in PynamoDB. That would giveaccess to the response data, while causing a separate subsegment per
retry in case of throttling.
So I'm looking forward to get some feedback.
All in all it's pretty annoying that PynamoDB needs additional logic
while it uses
botocore
andrequests
which are both alreadysupported.