requests with pyOpenSSL is drastically slower than with stdlib's ssl module #625
Comments
requests will use pyOpenSSL by default when it is present in an environment, so that's why you're seeing this. You can tell requests not to do that with `requests.packages.urllib3.contrib.pyopenssl.extract_from_urllib3()`. However, a drop in performance like that is very unexpected anyway. What version of pyOpenSSL is this occurring under?
Thanks for the quick reply @reaperhulk! We tried with versions 17.0.0, 16.2.0, and 16.1.0. Is there something specific that I should be looking for that could be the culprit? For now, I can update our code to not rely on pyOpenSSL, but I assume that, overall, it would be beneficial over basic SSL, so we would like to be able to use it without the negative impact :). The only custom thing we do with requests is this:
Well, those are all the latest versions (which is really all I was curious about). It would definitely be good to improve pyOpenSSL's performance here, but I'm not sure if anyone has the time to take on a project like this right now. @Lukasa do you have an opinion about this?
@reaperhulk one other question, since I am still fairly new to Python: how/where would I use `extract_from_urllib3`?
If you import that and call it after importing requests it should be fine.
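Putting the two comments above together, the workaround can be sketched as follows (a minimal sketch, assuming a requests build that still vendors urllib3 with its pyopenssl contrib module; newer requests versions may not expose this path):

```python
# Minimal sketch: undo requests' pyOpenSSL monkey-patch, if it was applied.
# Assumes requests still vendors urllib3 with the pyopenssl contrib module;
# if requests or pyOpenSSL is absent, there is simply nothing to undo.
try:
    import requests
    from requests.packages.urllib3.contrib import pyopenssl
    pyopenssl.extract_from_urllib3()  # restore stdlib ssl inside urllib3
    undone = True
except ImportError:
    undone = False

print("pyOpenSSL patch undone:", undone)
```

The try/except keeps the call harmless in environments where pyOpenSSL (or the contrib module) was never injected in the first place.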
Issue: pyca/pyopenssl#625 causes a massive perf regression in our SDK when pyOpenSSL is installed. This change ensures that we do not use pyOpenSSL even when it is installed.
Is this a CFFI thing?
Yeah, my question is the same as @hynek's: is this a CFFI thing? PyOpenSSL is likely, in general, to do more data copying than the standard library does. It's possible that this gets a bit worse if the client is sending extremely large chunks of data, as a bunch of time might be spent spinning around in PyOpenSSL's write path.

@begoldsm It'd be really interesting to see if you can get a cProfile run of this.
@Lukasa sounds good! I have an initial profile run through the new Visual Studio profiler, but that didn't yield much obvious information (just that the code spent roughly the same amount of exclusive time in the two different SSL stacks, which is expected). I will get a cProfile run done and attach the output once I have it.
@Lukasa I have uploaded my two cProfile logs. I took a look and didn't see anything really obvious that popped out at me, but you can see the time difference between the two (same script run): 190 seconds to upload 100 GB when pyOpenSSL is not installed, and over 500 seconds when it is installed.
Cool, I'll chase this tomorrow morning. If anyone wants to take a swing before me, go for it.
@Lukasa and @reaperhulk, one more question, to try and make the workaround as complete as I can: on Ubuntu/Debian systems, if someone chooses to install our package outside of a venv (I know this is bad practice, but just in case :) ), do you have the equivalent "undo monkey-patch" code for the distro-packaged requests module? I believe it is called python-requests, and it separates urllib3 out from requests. Thanks again for all the support, guys!
@begoldsm
So the first thing to note is that, in both cases, the bulk of the time is spent in
Note that the vast, vast majority of your runtime is spent in
So I'd be interested to know what is sleeping: when we enter
@Lukasa good questions. The time.sleep calls are mostly happening in a polling function that is checking on the current state of the operation. The code in multithread.py is "parallelizing" requests to the web server and transfer.py is keeping track of the status of those requests in a polling function. The basic algorithm is this:
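The algorithm described above can be sketched with only the standard library (all names and sizes here are illustrative, not the SDK's actual API): worker threads "upload" fixed-size chunks while the main thread polls their status, which is exactly where a profiler would attribute the time.sleep calls.

```python
# Hypothetical sketch of the transfer algorithm described above:
# worker threads upload chunks, the main thread polls in a sleep loop.
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, matching the chunk size in the thread


def upload_chunk(chunk):
    # Placeholder for the HTTP PUT of one chunk.
    return len(chunk)


def transfer(data, workers=4):
    # Split the payload into fixed-size chunks.
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(upload_chunk, c) for c in chunks]
        # Polling loop: this is where the profiler sees time.sleep.
        while not all(f.done() for f in futures):
            time.sleep(0.01)
    return sum(f.result() for f in futures)


print(transfer(b"x" * (10 * 1024 * 1024)))  # prints 10485760
```

With this shape, main-thread wall-clock time accumulates in `time.sleep` even though the real work happens in the pool, which matches what the profiles above show.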
For reference, here is where the time.sleep is being called the most (we also have back-off retry logic if a request fails, but we would see more time spent in core.py if that were the case). Ultimately, multithread.py just defines the logic that transfer.py will use to execute the transfer (whether it is an upload or a download, and any specific customization of how the transfer will take place). The core request logic that sends data ultimately calls the PUT HTTP method for each small 4 MB piece of data in the chunk; there are a series of helper functions along the way, but they ultimately call this one with a PUT operation and a 4 MB buffer of data to send.
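That send path amounts to slicing each chunk into 4 MB pieces and issuing one PUT per piece. A hypothetical sketch (`put` stands in for the SDK's HTTP helper, which is not shown in this thread):

```python
# Hypothetical sketch of the core send path described above: each 4 MB
# slice of a chunk goes out as one HTTP PUT. `put` is a stand-in for the
# SDK's request helper; here it just records what would hit the wire.
CHUNK = 4 * 1024 * 1024  # 4 MB pieces, as described in the thread


def send_chunk(put, path, data):
    sent = 0
    for offset in range(0, len(data), CHUNK):
        piece = data[offset:offset + CHUNK]
        put(path, piece, offset=offset)  # one PUT per 4 MB piece
        sent += len(piece)
    return sent


calls = []
send_chunk(lambda path, body, offset: calls.append((offset, len(body))),
           "/file", b"a" * (9 * 1024 * 1024))
print(calls)  # three PUTs: 4 MB, 4 MB, then the 1 MB remainder
```

The point of the sketch is that every 4 MB piece crosses the TLS layer separately, so any per-write overhead in the SSL stack is multiplied by the number of pieces.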
@Lukasa I am kind of bombarding you guys with a lot of solution-specific information. Is there any additional debug information (perhaps from cProfile or another tool) that I can gather for you? I am admittedly fairly new to Python, so I am not sure what information is best for debugging this, or how to go about obtaining it. I am definitely interested in helping get to the bottom of this, though, so anything you need, just let me know! Thanks again!
This update addresses a performance issue and works around GitHub issue pyca/pyopenssl#625.
@begoldsm So one real possibility about where this slowdown is coming from is contention for the GIL. Substantially more of the stdlib

There isn't much else going on. The PyOpenSSL code spends about half as much time as the C code acquiring Python locks, so I don't think your queues are a concern. Both spend about the same amount of time dealing with

Want to try removing the progress monitor to see if the problem persists?
@Lukasa I just tried commenting out all the time.sleep calls and all of the _wait() method, with no luck. I am generating a new set of cProfiles now to see if there is a difference.
@Lukasa I have data from the runs without the sleep, which is very strange. In both cases it claims it only executed for < 10 seconds, even though the actual run time (wall-clock) was about the same as it was previously (190 seconds without pySSL and 560 with pySSL). I have attached the raw data, which can be manipulated with pstats, but I am confused as to why it would show such a small run time. command: output (of time and pstats)
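For anyone reproducing these measurements, a self-contained cProfile/pstats sketch (the profiled function is an arbitrary stand-in for the transfer code; note that when the real time is spent inside C extension calls, these numbers can badly under-report wall-clock time, which is consistent with the < 10 second readings above):

```python
# Sketch: collect a profile in-process and print the hottest functions,
# the same kind of data as the attached cProfile logs.
import cProfile
import io
import pstats


def work():
    # Stand-in for the transfer code being profiled.
    return sum(i * i for i in range(100_000))


pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

out = io.StringIO()
# Sort by cumulative time and show the top 5 entries.
pstats.Stats(pr, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

The same stats can be written to a file with `pr.dump_stats("out.prof")` and inspected later with `pstats.Stats("out.prof")`, which matches the raw-data-plus-pstats workflow described above.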
This is usually an indication that the time is being spent inside C functions that don't have useful Python antecedents in their stack. It might be useful to get a profile at the C level. I'm not 100% up to speed on how you'd do this on Linux: probably using
Thanks @Lukasa! I am actually not too familiar with profiling on Linux myself (or profiling in general; this is all brand new to me :) ). For now, I am going to move forward with this specific module ensuring pySSL isn't used for the rapid request data-transfer logic, while we continue to dive into what is going on at the deeper levels. Thank you all again for helping me understand what is going on here and giving me a quick workaround!
I'm having this problem as well. Profiling indicates a huge amount of time spent in the SSL libs, specifically in SSL_read. This is on Ubuntu 14.04 with both Python 2.7.6 and 2.7.14. Using requests.packages.urllib3.contrib.pyopenssl.extract_from_urllib3 as a workaround fixes the speed problem, but then I instead get errors like
The original slowness looks like this
Is this slowdown present on the latest versions of pyOpenSSL?
Hello pyOpenSSL gurus!
We have a file transfer client (https://github.com/Azure/azure-data-lake-store-python) that relies on:
cffi
oauthlib
requests
requests-oauthlib
When we run performance benchmarks on this client with just those packages installed, we see upload throughput around 7-8 Gbps.
When pyOpenSSL is installed (by a different package that depends on it), our upload throughput drops to no higher than 2 Gbps.
Based on the above info, I have a couple questions:
Please let me know if you need any other information from me or if you have any quick fixes that I can try out and thank you all so much for your time!
Ben