Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use boto to prepare request only #72

Merged
merged 3 commits into from
Sep 23, 2015
Merged

use boto to prepare request only #72

merged 3 commits into from
Sep 23, 2015

Conversation

danielhochman
Copy link
Contributor

I'm opening this pull request primarily to start a discussion.

This is a hack I did to circumvent boto's poor performance when parsing request bodies. It's unfortunate because it requires accessing underscore "protected" members of boto. It does improve performance significantly.

There is also behavior around error handling and exceptions from boto that need to be replicated. Maybe checking the status code and throwing on non-200s would be enough. I haven't explored PynamoDB's behavior in error scenarios yet.

TODO:

  • fix tests

@joshowen
Copy link
Contributor

+1. Openfolio is using PynamoDB more and more, and I've ended up forking the codebase to add in things like request retries with exponential backoff, use ujson for parsing. I'm happy to contribute our changes back as well. Perhaps we should create an issue to track performance improvements as a whole?

I guess the alternative is to speed up the botocore parsers with cython.

@jlafon
Copy link
Contributor

jlafon commented Sep 18, 2015

I think this is great, and I'd like to pull it in. I suspect that the build is failing because of the way the test framework monkey patches botocore to capture requests. If I get time today I'll look into it and see if we can get the tests passing again.

@jlafon jlafon merged commit c1cd220 into pynamodb:devel Sep 23, 2015
@dylanlingelbach
Copy link

Just wanted to chime in here.

First, I wanted to say this library has been a huge help to us, so thank you very much!

Second, this change caused a lot of problems for us. For context we use PynamoDB in celery tasks that run with eventlet concurrency and process a moderate volume of messages through our system (~30/sec). After deploying this change we started seeing substantially higher error rates from our system, with the majority of them being network errors.

Additionally this change prevents upgrading to newer versions of the requests library as later versions check the type of request object sent in. botocore creates an AWSPreparedRequest object, and requests expects a PreparedRequest object only.

botocore has some very handy features (like retrying after a network error) that get lost with this change. I think that is why PR #83 was created.

Is there a reason this approach was taken instead of hooking into botocore's response handling functionality? If I understand this change correctly the performance issue is related to botocore's response parsing. It seems like hooking in and just changing the response handling behavior would be safer and less code than routing the request through requests.

Let me know if I am missing something. We can try to put together a PR if people think this is a reasonable approach.

cc @gtowne

@jlafon
Copy link
Contributor

jlafon commented Dec 18, 2015

Hi @dylanlingelbach

I think hooking into botocore's response handling is a great idea and I'd be happy to collaborate on a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants