Support patching PynamoDB calls to DynamoDB #13

Dunedan · 2017-12-08T09:57:40Z

While PynamoDB uses botocore under the hood it doesn't call
botocore's BaseClient._make_api_call. Instead it implements its own
Connection._make_api_call, making it necessary to separately patch
PynamoDB, even though it uses botocore.

As PynamoDB's Connection._make_api_call doesn't return information
about the response as botocore does, there is no response information
to pass to AWS X-Ray.

For adding the trace header patching botocore is sufficient, as
PynamoDB does use botocore's prepared requests. To make that work out
of the box the overall patching logic got adjusted to automatically
patch botocore as well when pynamodb patching is selected.

For tests I tried to do something useful, but only came up with the
single test which calls real DynamoDB.

I'm not sure if patching _make_api_call is the right approach at all
or if it would be better to patch the
requests.session.Session.send-call in PynamoDB. That would give
access to the response data, while causing a separate subsegment per
retry in case of throttling.

So I'm looking forward to get some feedback.

All in all it's pretty annoying that PynamoDB needs additional logic
while it uses botocore and requests which are both already
supported.

haotianw465 · 2017-12-08T22:59:58Z

Good call. Patching requests.session.Session.send works better in this case. There are three things you probably most care about regarding talking to DynamoDB: 1: network roundtrip time, 2: return status code 3: internal retries.

For current patching you only get 1. But patching the botocore vendored requests you get all three benefits. Though it means a little bit more work on parameter capture since you go one level deeper on the code path.

The X-Ray SDK support requests only if your application has direct dependency on it. It leaves botocore vendored requests untouched. So when you patch PynamoDB you can directly dive into the vendored requests and do the necessary patch there. You just need to make sure when you have for example 2 retries, it creates 3 sibling subsegments (1 original call + 2 retries) where for each subsegment it represents a DynamoDB call. This way you gain more value on tracing the DynamoDB calls from your contribution.

The AWS Go SDK has hooks that provide insights on internal retries and TLS info for each retry. Check what you can get from X-Ray Go SDK https://github.com/aws/aws-xray-sdk-go. Although boto doesn't provide as much as Go you can still get more insights on why a call is taking long.

Dunedan · 2017-12-11T10:07:50Z

You just need to make sure when you have for example 2 retries, it creates 3 sibling subsegments (1 original call + 2 retries) where for each subsegment it represents a DynamoDB call.

Is there a way to mark retries as retries or is it just about creating a separate subsegment per retry?

haotianw465 · 2017-12-11T20:00:41Z

It will just create a separate subsegment per retry. This make sure you get the timing and status code for every single retry for more insights. A retry count is not good enough IMO. And as a general library support it'd be better to give more insights to fit more use cases.

Dunedan · 2017-12-12T13:40:26Z

I just pushed a new commit, now directly patching botocore.vendored.requests.sessions.Session.send. That does now also log response status and parameters and should create one subsegment per retry as well.

I'd appreciate if you could take another look at it.

haotianw465 · 2017-12-12T21:03:59Z

docs/thirdparty.rst

-Patching boto3 and botocore are equivalent since boto3 depends on botocore
+Patching boto3 and botocore are equivalent since boto3 depends on botocore.
+
+Patching pynamodb patches botocore as well, as it depends partially on botocore.


I would suggest put more details on what is patched for pynamodb so other users utilizing your work have more details and know the implication of patching pynamodb.

I extended the paragraph a bit. I hope that's sufficient.

haotianw465 · 2017-12-12T21:09:34Z

tests/ext/pynamodb/test_pynamodb.py

+
+    subsegment = xray_recorder.current_segment().subsegments[0]
+    assert subsegment.error
+    assert subsegment.http['response']['status'] == 400


What exception are you trying to produce? Is the code always guaranteed to be 400 or it could be different depend on users and machines?

These tests currently do real requests to DynamoDB, as I don't know how to avoid them. Using the botocore stubber, as in the botocore tests, doesn't work, because we're patching requests here directly. The requests made by the tests currently return a access denied unless there are is access to resources called "mytable" and "mybucket" available.

haotianw465 · 2017-12-12T21:10:18Z

aws_xray_sdk/ext/pynamodb/patch.py

+
+def pynamodb_meta_processor(wrapped, instance, args, kwargs, return_value,
+                            exception, subsegment, stack):
+    operation_name = args[0].headers['X-Amz-Target'].decode('utf-8').split('.')[1]


Is it possible to capture request_id here?

Thanks for pointing out. Turned out I searched for it in the request instead of the response. Added it now.

haotianw465 · 2017-12-12T21:14:57Z

tests/ext/pynamodb/test_pynamodb.py

+    xray_recorder.clear_trace_entities()
+
+
+def test_exception():


Could you add a test to guard the behavior where vendored requests only emit subsegments when talking to DynamoDB? In other words after patching pynamodb other AWS services are still captured only through original patched botocore code.

I find it pretty difficult to write proper tests for the whole stuff, thanks to all the patching. I added a test now to check that only a single subsegment gets created when the pynamodb patch is enabled, but another AWS service is called. That single subsegment gets created, because the botocore patch is active in this case too.

haotianw465 · 2017-12-12T21:19:47Z

I fixed some formatting issue on docs/thirdparty.rst when doing SDK 0.95 release and looks like that caused some conflicts. Sorry for the inconvenience. Could you pull the latest commits, resolve the conflicts and merge your commits into one single commit? Thanks.

Dunedan · 2017-12-15T14:33:08Z

All remarks should be addressed now. Waiting for the next round of feedback. 😄

haotianw465 · 2017-12-18T20:36:04Z

docs/thirdparty.rst

+
+Patching pynamodb applies the botocore patch as well, as it uses the logic from the botocore
+patch to apply the trace header. As capturing the pynamodb calls relies on intercepting all
+calls made via botocore separately from the botocore patch, there is a small performance penalty


I would not mention the performance penalty here. An extra if condition is really neglectable IMO considering how much code is actually executed in botocore.

haotianw465 · 2017-12-18T20:39:43Z

tests/ext/pynamodb/test_pynamodb.py

+    subsegment = subsegments[0]
+    assert len(subsegment.subsegments) == 0
+    assert subsegment.error
+    assert subsegment.http['response']['status'] in [400, 403]


I would suggest remove status code assertion. Any 4xx could be returned depend on setup of the developer's machine.

haotianw465 · 2017-12-18T20:45:38Z

tests/ext/pynamodb/test_pynamodb.py

+    session = botocore.session.get_session()
+    s3 = session.create_client('s3', region_name='us-west-2')
+    try:
+        s3.get_bucket_location(Bucket='mybucket')


You're making external http calls anyway on the dynamo test case so I would suggest to not cause a Client error here. Exception raised from remote endpoint makes sure it reach further down to the botocore code path.

I'm afraid I don't get your point. I am doing an external call in this test as well and as the remote end returns an error (because I'm not authorized to access that bucket) I get an error back. I have to catch this error to be able to properly do the assertions afterwards. Am I missing something?

My bad. I got confused by ClientError. I thought it was raised by library due to invalid parameters. In that case there is not any http related code executed.

haotianw465 · 2017-12-18T20:46:35Z

tests/ext/pynamodb/test_pynamodb.py

+
+    subsegments = xray_recorder.current_segment().subsegments
+    assert len(subsegments) == 1
+    assert len(subsegments[0].subsegments) == 0


A little bit more assertion to make sure this subsegment is not some random subsegment generated anywhere else. e.g. name assertion should be enough.

While [PynamoDB][1] uses `botocore` under the hood it doesn't call `botocore`'s `BaseClient._make_api_call`. Instead it implements its own `Connection._make_api_call`, making it necessary to separately patch PynamoDB, even though it uses `botocore`. As PynamoDB's `Connection._make_api_call` doesn't return information about the response as `botocore` does, it's necessary directly patch the `send`-method used by `requests`. For adding the trace header patching `botocore` is sufficient, as PynamoDB does use `botocore`'s prepared requests. To make that work out of the box the overall patching logic got adjusted to automatically patch `botocore` as well when pynamodb patching is selected. [1]: https://github.com/pynamodb/PynamoDB

haotianw465 suggested changes Dec 12, 2017

View reviewed changes

haotianw465 suggested changes Dec 18, 2017

View reviewed changes

haotianw465 merged commit 9584a25 into aws:master Dec 19, 2017

Dunedan deleted the support-patching-pynamodb branch March 8, 2018 10:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support patching PynamoDB calls to DynamoDB #13

Support patching PynamoDB calls to DynamoDB #13

Dunedan commented Dec 8, 2017

haotianw465 commented Dec 8, 2017

Dunedan commented Dec 11, 2017

haotianw465 commented Dec 11, 2017

Dunedan commented Dec 12, 2017

haotianw465 Dec 12, 2017

Dunedan Dec 15, 2017

haotianw465 Dec 12, 2017

Dunedan Dec 15, 2017

haotianw465 Dec 12, 2017

Dunedan Dec 15, 2017

haotianw465 Dec 12, 2017

Dunedan Dec 15, 2017

haotianw465 commented Dec 12, 2017

Dunedan commented Dec 15, 2017

haotianw465 Dec 18, 2017

Dunedan Dec 19, 2017

haotianw465 Dec 18, 2017

Dunedan Dec 19, 2017

haotianw465 Dec 18, 2017

Dunedan Dec 19, 2017

haotianw465 Dec 19, 2017

haotianw465 Dec 18, 2017

Dunedan Dec 19, 2017

Support patching PynamoDB calls to DynamoDB #13

Support patching PynamoDB calls to DynamoDB #13

Conversation

Dunedan commented Dec 8, 2017

haotianw465 commented Dec 8, 2017

Dunedan commented Dec 11, 2017

haotianw465 commented Dec 11, 2017

Dunedan commented Dec 12, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

haotianw465 commented Dec 12, 2017

Dunedan commented Dec 15, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment