diff --git a/CHANGELOG.md b/CHANGELOG.md index 8e19835d3..c4bccd4ee 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 * Require `typing_extensions` on Python 3.11 (already required on earlier versinons) for better compatibility with pydantic v2 * Fix `RawSimulator` handling of `cache_control` parameter during tests. +### Changed +* Uploading files now makes use of `Expect: 100-continue` header + ## [1.22.1] - 2023-07-24 ### Fixed diff --git a/b2sdk/_botocore/LICENSE b/b2sdk/_botocore/LICENSE new file mode 100644 index 000000000..4947287f7 --- /dev/null +++ b/b2sdk/_botocore/LICENSE @@ -0,0 +1,177 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS \ No newline at end of file diff --git a/b2sdk/_botocore/NOTICE b/b2sdk/_botocore/NOTICE new file mode 100644 index 000000000..57420250c --- /dev/null +++ b/b2sdk/_botocore/NOTICE @@ -0,0 +1,74 @@ +b2sdk +Copyright 2023 Backblaze Inc. + +b2sdk includes vendorized parts of the botocore python library for 100-continue functionality. + +Copyright 2023 Backblaze Inc. 
+Changes made to the original source: +* Updated botocore.awsrequest.request method to work with str/byte header values (urllib 1.x vs 2.x) +* Updated botocore.awsrequest._handle_expect_response method to work with b2 responses +* Updated botocore.awsrequest._send_output method to change 100 response timeout +* Add a new test test_handles_expect_100_with_no_reason_phrase to TestAWSHTTPConnection test class + +--- + +Botocore +Copyright 2012-2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. + +---- + +Botocore includes vendorized parts of the requests python library for backwards compatibility. + +Requests License +================ + +Copyright 2013 Kenneth Reitz + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +Botocore includes vendorized parts of the urllib3 library for backwards compatibility. + +Urllib3 License +=============== + +This is the MIT license: http://www.opensource.org/licenses/mit-license.php + +Copyright 2008-2011 Andrey Petrov and contributors (see CONTRIBUTORS.txt), +Modifications copyright 2012 Kenneth Reitz. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy of this +software and associated documentation files (the "Software"), to deal in the Software +without restriction, including without limitation the rights to use, copy, modify, merge, +publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons +to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or +substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR +PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE +FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + +Bundle of CA Root Certificates +============================== + +***** BEGIN LICENSE BLOCK ***** +This Source Code Form is subject to the terms of the +Mozilla Public License, v. 2.0. If a copy of the MPL +was not distributed with this file, You can obtain +one at http://mozilla.org/MPL/2.0/. + +***** END LICENSE BLOCK ***** diff --git a/b2sdk/_botocore/README.md b/b2sdk/_botocore/README.md new file mode 100644 index 000000000..cb77d81bc --- /dev/null +++ b/b2sdk/_botocore/README.md @@ -0,0 +1,3 @@ +This module contains modified parts of the botocore module (https://github.com/boto/botocore). +The module's original license is included in LICENSE. +Changes made to the original source are listed in NOTICE, along with the original NOTICE. 
\ No newline at end of file diff --git a/b2sdk/_botocore/__init__.py b/b2sdk/_botocore/__init__.py new file mode 100644 index 000000000..d00a42864 --- /dev/null +++ b/b2sdk/_botocore/__init__.py @@ -0,0 +1,10 @@ +###################################################################### +# +# File: b2sdk/_botocore/__init__.py +# +# Copyright 2023 Backblaze Inc. All Rights Reserved. +# +# License https://www.backblaze.com/using_b2_code.html +# License Apache License 2.0 (http://www.apache.org/licenses/ and LICENSE file in this directory) +# +###################################################################### \ No newline at end of file diff --git a/b2sdk/_botocore/awsrequest.py b/b2sdk/_botocore/awsrequest.py new file mode 100644 index 000000000..75477a146 --- /dev/null +++ b/b2sdk/_botocore/awsrequest.py @@ -0,0 +1,229 @@ +###################################################################### +# +# File: b2sdk/_botocore/awsrequest.py +# +# Copyright 2023 Backblaze Inc. All Rights Reserved. +# Copyright (c) 2012-2013 Mitch Garnaat http://garnaat.org/ +# Copyright 2012-2014 Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# License https://www.backblaze.com/using_b2_code.html +# License Apache License 2.0 (http://www.apache.org/licenses/ and LICENSE file in this directory) +# +###################################################################### +"""\ +This module contains modified parts of the botocore module (https://github.com/boto/botocore). +The module's original license is included in LICENSE. +Changes made to the original source are listed in NOTICE, along with the original NOTICE. 
+""" + +import functools +import logging +from http.client import HTTPResponse + +import urllib3.util +from urllib3.connection import HTTPConnection, VerifiedHTTPSConnection +from urllib3.connectionpool import HTTPConnectionPool, HTTPSConnectionPool + +logger = logging.getLogger(__name__) + + +class AWSHTTPResponse(HTTPResponse): + # The *args, **kwargs is used because the args are slightly + # different in py2.6 than in py2.7/py3. + def __init__(self, *args, **kwargs): + self._status_tuple = kwargs.pop('status_tuple') + HTTPResponse.__init__(self, *args, **kwargs) + + def _read_status(self): + if self._status_tuple is not None: + status_tuple = self._status_tuple + self._status_tuple = None + return status_tuple + else: + return HTTPResponse._read_status(self) + + +class AWSConnection: + """Mixin for HTTPConnection that supports Expect 100-continue. + + This when mixed with a subclass of httplib.HTTPConnection (though + technically we subclass from urllib3, which subclasses + httplib.HTTPConnection) and we only override this class to support Expect + 100-continue, which we need for S3. As far as I can tell, this is + general purpose enough to not be specific to S3, but I'm being + tentative and keeping it in botocore because I've only tested + this against AWS services. + + """ + + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + self._original_response_cls = self.response_class + # This variable is set when we receive an early response from the + # server. If this value is set to True, any calls to send() are noops. + # This value is reset to false every time _send_request is called. + # This is to workaround changes in urllib3 2.0 which uses separate + # send() calls in request() instead of delegating to endheaders(), + # which is where the body is sent in CPython's HTTPConnection. 
+ self._response_received = False + self._expect_header_set = False + self._send_called = False + self._continue_timeout = 10.0 + + def close(self): + super().close() + # Reset all of our instance state we were tracking. + self._response_received = False + self._expect_header_set = False + self._send_called = False + self.response_class = self._original_response_cls + + def request(self, method, url, body=None, headers=None, *args, **kwargs): + if headers is None: + headers = {} + self._response_received = False + if headers.get('Expect', b'') in [b'100-continue', '100-continue']: + self._expect_header_set = True + timeout = headers.pop('X-Expect-100-Timeout', self._continue_timeout) + self._continue_timeout = float(timeout) + else: + self._expect_header_set = False + self.response_class = self._original_response_cls + rval = super().request(method, url, body, headers, *args, **kwargs) + self._expect_header_set = False + return rval + + def _convert_to_bytes(self, mixed_buffer): + # Take a list of mixed str/bytes and convert it + # all into a single bytestring. + # Any str will be encoded as utf-8. + bytes_buffer = [] + for chunk in mixed_buffer: + if isinstance(chunk, str): + bytes_buffer.append(chunk.encode('utf-8')) + else: + bytes_buffer.append(chunk) + msg = b"\r\n".join(bytes_buffer) + return msg + + def _send_output(self, message_body=None, *args, **kwargs): + self._buffer.extend((b"", b"")) + msg = b"\r\n".join(self._buffer) + del self._buffer[:] + # If msg and message_body are sent in a single send() call, + # it will avoid performance problems caused by the interaction + # between delayed ack and the Nagle algorithm. + if isinstance(message_body, bytes): + msg += message_body + message_body = None + self.send(msg) + if self._expect_header_set: + # This is our custom behavior. If the Expect header was + # set, it will trigger this custom behavior. 
+ logger.debug("Waiting for 100 Continue response.") + if urllib3.util.wait_for_read(self.sock, self._continue_timeout): + self._handle_expect_response(message_body) + return + else: + # From the RFC: + # Because of the presence of older implementations, the + # protocol allows ambiguous situations in which a client may + # send "Expect: 100-continue" without receiving either a 417 + # (Expectation Failed) status or a 100 (Continue) status. + # Therefore, when a client sends this header field to an origin + # server (possibly via a proxy) from which it has never seen a + # 100 (Continue) status, the client SHOULD NOT wait for an + # indefinite period before sending the request body. + logger.debug( + "No response seen from server, continuing to " + "send the request body." + ) + if message_body is not None: + # message_body was not a bytes object (i.e. it is a file), and + # we must run the risk of Nagle. + self.send(message_body) + + def _consume_headers(self, fp): + # Most servers (including S3) will just return + # the CRLF after the 100 continue response. However, + # some servers (I've specifically seen this for squid when + # used as a straight HTTP proxy) will also inject a + # Connection: keep-alive header. To account for this + # we'll read until we read '\r\n', and ignore any headers + # that come immediately after the 100 continue response. + current = None + while current != b'\r\n': + current = fp.readline() + + def _handle_expect_response(self, message_body): + # This is called when we sent the request headers containing + # an Expect: 100-continue header and received a response. + # We now need to figure out what to do. 
+ fp = self.sock.makefile('rb', 0) + try: + maybe_status_line = fp.readline() + parts = maybe_status_line.split(None, 2) + + # Check for 'HTTP/<version> 100 Continue\r\n' or, 'HTTP/<version> 100\r\n' + if len(parts) >= 2 and parts[0].startswith(b'HTTP/') and parts[1] == b'100': + self._consume_headers(fp) + logger.debug("100 Continue response seen, now sending request body.") + self._send_message_body(message_body) + elif len(parts) >= 2 and parts[0].startswith(b'HTTP/'): + # From the RFC: + # Requirements for HTTP/1.1 origin servers: + # + # - Upon receiving a request which includes an Expect + # request-header field with the "100-continue" + # expectation, an origin server MUST either respond with + # 100 (Continue) status and continue to read from the + # input stream, or respond with a final status code. + # + # So if we don't get a 100 Continue response, then + # whatever the server has sent back is the final response + # and don't send the message_body. + logger.debug( + "Received a non 100 Continue response " + "from the server, NOT sending request body." + ) + status_tuple = ( + parts[0].decode('ascii'), + int(parts[1]), + parts[2].decode('ascii') if len(parts) > 2 else '', + ) + response_class = functools.partial(AWSHTTPResponse, status_tuple=status_tuple) + self.response_class = response_class + self._response_received = True + finally: + fp.close() + + def _send_message_body(self, message_body): + if message_body is not None: + self.send(message_body) + + def send(self, str): + if self._response_received: + if not self._send_called: + # urllib3 2.0 chunks and calls send potentially + # thousands of times inside `request` unlike the + # standard library. Only log this once for sanity. + logger.debug("send() called, but response already received. 
" "Not sending data.") + self._send_called = True + return + return super().send(str) + + +class AWSHTTPConnection(AWSConnection, HTTPConnection): + """An HTTPConnection that supports 100 Continue behavior.""" + + +class AWSHTTPSConnection(AWSConnection, VerifiedHTTPSConnection): + """An HTTPSConnection that supports 100 Continue behavior.""" + + +class AWSHTTPConnectionPool(HTTPConnectionPool): + ConnectionCls = AWSHTTPConnection + + +class AWSHTTPSConnectionPool(HTTPSConnectionPool): + ConnectionCls = AWSHTTPSConnection diff --git a/b2sdk/_botocore/included_source_meta.py b/b2sdk/_botocore/included_source_meta.py new file mode 100644 index 000000000..dae8e4573 --- /dev/null +++ b/b2sdk/_botocore/included_source_meta.py @@ -0,0 +1,92 @@ +###################################################################### +# +# File: b2sdk/_botocore/included_source_meta.py +# +# Copyright 2023 Backblaze Inc. All Rights Reserved. +# +# License https://www.backblaze.com/using_b2_code.html +# +###################################################################### +from b2sdk.included_sources import IncludedSourceMeta, add_included_source + +included_source_meta = IncludedSourceMeta( + 'botocore', 'Included in a revised form', { + 'NOTICE': + """b2sdk +Copyright 2023 Backblaze Inc. + +b2sdk includes vendorized parts of the botocore python library for 100-continue functionality. + +Copyright 2023 Backblaze Inc. +Changes made to the original source: +* Updated botocore.awsrequest.request method to work with str/byte header values (urllib 1.x vs 2.x) +* Updated botocore.awsrequest._handle_expect_response method to work with b2 responses +* Updated botocore.awsrequest._send_output method to change 100 response timeout +* Add a new test test_handles_expect_100_with_no_reason_phrase to TestAWSHTTPConnection test class + +--- + +Botocore +Copyright 2012-2022 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+ +---- + +Botocore includes vendorized parts of the requests python library for backwards compatibility. + +Requests License +================ + +Copyright 2013 Kenneth Reitz + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +Botocore includes vendorized parts of the urllib3 library for backwards compatibility. + +Urllib3 License +=============== + +This is the MIT license: http://www.opensource.org/licenses/mit-license.php + +Copyright 2008-2011 Andrey Petrov and contributors (see CONTRIBUTORS.txt), +Modifications copyright 2012 Kenneth Reitz. + +Permission is hereby granted, free of charge, to any person obtaining a copy of this +software and associated documentation files (the "Software"), to deal in the Software +without restriction, including without limitation the rights to use, copy, modify, merge, +publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons +to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or +substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR +PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE +FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + +Bundle of CA Root Certificates +============================== + +***** BEGIN LICENSE BLOCK ***** +This Source Code Form is subject to the terms of the +Mozilla Public License, v. 2.0. If a copy of the MPL +was not distributed with this file, You can obtain +one at http://mozilla.org/MPL/2.0/. + +***** END LICENSE BLOCK ***** +""" + } +) +add_included_source(included_source_meta) diff --git a/b2sdk/api_config.py b/b2sdk/api_config.py index 1dd10d314..0ff62d608 100644 --- a/b2sdk/api_config.py +++ b/b2sdk/api_config.py @@ -26,7 +26,9 @@ def __init__( install_clock_skew_hook: bool = True, user_agent_append: str | None = None, _raw_api_class: type[AbstractRawApi] | None = None, - decode_content: bool = False + decode_content: bool = False, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ): """ A structure with params to be passed to low level API. 
@@ -43,6 +45,8 @@ def __init__( self.user_agent_append = user_agent_append self.raw_api_class = _raw_api_class or self.DEFAULT_RAW_API_CLASS self.decode_content = decode_content + self.expect_100_continue = expect_100_continue + self.expect_100_timeout = expect_100_timeout DEFAULT_HTTP_API_CONFIG = B2HttpApiConfig() diff --git a/b2sdk/b2http.py b/b2sdk/b2http.py index e497c7195..b4396db16 100644 --- a/b2sdk/b2http.py +++ b/b2sdk/b2http.py @@ -22,7 +22,6 @@ from typing import Any import requests -from requests.adapters import HTTPAdapter from .api_config import DEFAULT_HTTP_API_CONFIG, B2HttpApiConfig from .exception import ( @@ -40,7 +39,7 @@ UnknownHost, interpret_b2_error, ) -from .requests import NotDecompressingResponse +from .requests import HTTPAdapterWithContinue, NotDecompressingResponse from .version import USER_AGENT LOCALE_LOCK = threading.Lock() @@ -534,7 +533,7 @@ def _translate_and_retry(cls, fcn, try_count, post_params=None): return cls._translate_errors(fcn, post_params) -class NotDecompressingHTTPAdapter(HTTPAdapter): +class NotDecompressingHTTPAdapter(HTTPAdapterWithContinue): """ HTTP adapter that uses :class:`b2sdk.requests.NotDecompressingResponse` instead of the default :code:`requests.Response` class. 
diff --git a/b2sdk/file_version.py b/b2sdk/file_version.py index 0c3b5524d..99ec74c49 100644 --- a/b2sdk/file_version.py +++ b/b2sdk/file_version.py @@ -351,6 +351,7 @@ def _get_upload_headers(self) -> bytes: file_retention=self.file_retention, legal_hold=self.legal_hold, cache_control=self.cache_control, + expect_100_continue=False, ) headers_str = ''.join( diff --git a/b2sdk/raw_api.py b/b2sdk/raw_api.py index cd6302d1d..2ac52c911 100644 --- a/b2sdk/raw_api.py +++ b/b2sdk/raw_api.py @@ -351,6 +351,8 @@ def get_upload_file_headers( legal_hold: LegalHold | None, custom_upload_timestamp: int | None = None, cache_control: str | None = None, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ) -> dict: headers = { 'Authorization': upload_auth_token, @@ -379,6 +381,10 @@ def get_upload_file_headers( if custom_upload_timestamp is not None: headers['X-Bz-Custom-Upload-Timestamp'] = str(custom_upload_timestamp) + if expect_100_continue: + headers['Expect'] = '100-continue' + headers['X-Expect-100-Timeout'] = str(expect_100_timeout) + return headers @abstractmethod @@ -397,6 +403,8 @@ def upload_file( legal_hold: LegalHold | None = None, custom_upload_timestamp: int | None = None, cache_control: str | None = None, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ): pass @@ -932,6 +940,8 @@ def upload_file( legal_hold: LegalHold | None = None, custom_upload_timestamp: int | None = None, cache_control: str | None = None, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ): """ Upload one, small file to b2. @@ -949,6 +959,8 @@ def upload_file( :param legal_hold: legal hold setting for the file :param custom_upload_timestamp: custom upload timestamp for the file :param cache_control: an optional cache control setting. Syntax based on the section 14.9 of RFC 2616. Example string value: 'public, max-age=86400, s-maxage=3600, no-transform'. 
+ :param expect_100_continue: whether to use 'Expect: 100-continue' header + :param expect_100_timeout: timeout of 100 response when expect_100_continue is True :return: """ # Raise UnusableFileName if the file_name doesn't meet the rules. @@ -965,6 +977,8 @@ def upload_file( legal_hold=legal_hold, custom_upload_timestamp=custom_upload_timestamp, cache_control=cache_control, + expect_100_continue=expect_100_continue, + expect_100_timeout=expect_100_timeout, ) return self.b2_http.post_content_return_json(upload_url, headers, data_stream) @@ -977,6 +991,8 @@ def upload_part( content_sha1, data_stream, server_side_encryption: EncryptionSetting | None = None, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ): headers = { 'Authorization': upload_auth_token, @@ -989,6 +1005,9 @@ def upload_part( EncryptionMode.NONE, EncryptionMode.SSE_B2, EncryptionMode.SSE_C ) server_side_encryption.add_to_upload_headers(headers) + if expect_100_continue: + headers['Expect'] = '100-continue' + headers['X-Expect-100-Timeout'] = str(expect_100_timeout) return self.b2_http.post_content_return_json(upload_url, headers, data_stream) diff --git a/b2sdk/raw_simulator.py b/b2sdk/raw_simulator.py index 9d01c7b4f..d9fc54c43 100644 --- a/b2sdk/raw_simulator.py +++ b/b2sdk/raw_simulator.py @@ -1014,6 +1014,8 @@ def upload_file( legal_hold: LegalHold | None = None, custom_upload_timestamp: int | None = None, cache_control: str | None = None, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ): data_bytes = self._simulate_chunked_post(data_stream, content_length) assert len(data_bytes) == content_length @@ -1800,6 +1802,8 @@ def get_upload_file_headers( legal_hold: LegalHold | None, custom_upload_timestamp: int | None = None, cache_control: str | None = None, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ) -> dict: # fix to allow calculating headers on unknown key - only for simulation @@ -1820,6 +1824,8 @@ def 
get_upload_file_headers( legal_hold=legal_hold, custom_upload_timestamp=custom_upload_timestamp, cache_control=cache_control, + expect_100_continue=expect_100_continue, + expect_100_timeout=expect_100_timeout, ) def upload_file( @@ -1837,6 +1843,8 @@ def upload_file( legal_hold: LegalHold | None = None, custom_upload_timestamp: int | None = None, cache_control: str | None = None, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ): with ConcurrentUsedAuthTokenGuard( self.currently_used_auth_tokens[upload_auth_token], upload_auth_token @@ -1869,6 +1877,8 @@ def upload_file( legal_hold=legal_hold, custom_upload_timestamp=custom_upload_timestamp, cache_control=cache_control, + expect_100_continue=expect_100_continue, + expect_100_timeout=expect_100_timeout, ) response = bucket.upload_file( @@ -1900,6 +1910,8 @@ def upload_part( sha1_sum, input_stream, server_side_encryption: EncryptionSetting | None = None, + expect_100_continue: bool = True, + expect_100_timeout: float = 10.0, ): with ConcurrentUsedAuthTokenGuard( self.currently_used_auth_tokens[upload_auth_token], upload_auth_token diff --git a/b2sdk/requests/NOTICE b/b2sdk/requests/NOTICE index 12e0576bb..c90dc5c78 100644 --- a/b2sdk/requests/NOTICE +++ b/b2sdk/requests/NOTICE @@ -3,5 +3,6 @@ Copyright 2019 Kenneth Reitz Copyright 2021 Backblaze Inc. 
Changes made to the original source: -requests.models.Response.iter_content has been overridden to pass `decode_content=False` argument to `self.raw.stream` -in order to NOT decompress data based on Content-Encoding header \ No newline at end of file +* requests.models.Response.iter_content has been overridden to pass `decode_content=False` argument to `self.raw.stream` + in order to NOT decompress data based on Content-Encoding header +* requests.adapters.HTTPAdapter has been overridden to use patched Urllib3 connection pools \ No newline at end of file diff --git a/b2sdk/requests/__init__.py b/b2sdk/requests/__init__.py index d1b657bd4..60adfbae3 100644 --- a/b2sdk/requests/__init__.py +++ b/b2sdk/requests/__init__.py @@ -16,11 +16,13 @@ """ from requests import Response, ConnectionError +from requests.adapters import HTTPAdapter, DEFAULT_POOLBLOCK from requests.exceptions import ChunkedEncodingError, ContentDecodingError, StreamConsumedError from requests.utils import iter_slices, stream_decode_response_unicode from urllib3.exceptions import ProtocolError, DecodeError, ReadTimeoutError from . 
import included_source_meta +from .._botocore.awsrequest import AWSHTTPConnectionPool, AWSHTTPSConnectionPool class NotDecompressingResponse(Response): @@ -77,3 +79,12 @@ def from_builtin_response(cls, response: Response): setattr(new_response, attr_name, getattr(response, attr_name)) new_response.raw = response.raw return new_response + + +class HTTPAdapterWithContinue(HTTPAdapter): + def init_poolmanager(self, connections, maxsize, block=DEFAULT_POOLBLOCK, **pool_kwargs): + super().init_poolmanager(connections, maxsize, block, **pool_kwargs) + self.poolmanager.pool_classes_by_scheme = { + "http": AWSHTTPConnectionPool, + "https": AWSHTTPSConnectionPool, + } diff --git a/b2sdk/requests/included_source_meta.py b/b2sdk/requests/included_source_meta.py index 40bfcbe2d..c789d4741 100644 --- a/b2sdk/requests/included_source_meta.py +++ b/b2sdk/requests/included_source_meta.py @@ -17,8 +17,9 @@ Copyright 2021 Backblaze Inc. Changes made to the original source: -requests.models.Response.iter_content has been overridden to pass `decode_content=False` argument to `self.raw.stream` -in order to NOT decompress data based on Content-Encoding header""" +* requests.models.Response.iter_content has been overridden to pass `decode_content=False` argument to `self.raw.stream` + in order to NOT decompress data based on Content-Encoding header +* requests.adapters.HTTPAdapter has been overridden to use patched Urllib3 connection pools""" } ) add_included_source(included_source_meta) diff --git a/b2sdk/session.py b/b2sdk/session.py index f3214dd24..66876ddf1 100644 --- a/b2sdk/session.py +++ b/b2sdk/session.py @@ -87,6 +87,9 @@ def __init__( TokenType.UPLOAD_PART: self._upload_part, } + self.expect_100_continue = api_config.expect_100_continue + self.expect_100_timeout = api_config.expect_100_timeout + def authorize_automatically(self): """ Perform automatic account authorization, retrieving all account data @@ -370,6 +373,8 @@ def upload_file( legal_hold=legal_hold, 
custom_upload_timestamp=custom_upload_timestamp, cache_control=cache_control, + expect_100_continue=self.expect_100_continue, + expect_100_timeout=self.expect_100_timeout, ) def upload_part( @@ -390,6 +395,8 @@ def upload_part( sha1_sum, input_stream, server_side_encryption, + expect_100_continue=self.expect_100_continue, + expect_100_timeout=self.expect_100_timeout, ) def get_download_url_by_id(self, file_id): diff --git a/requirements.txt b/requirements.txt index e99bd83ef..208309ab7 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,5 +1,6 @@ importlib-metadata>=3.3.0; python_version < '3.8' logfury>=1.0.1,<2.0.0 requests>=2.9.1,<3.0.0 +urllib3>=1.21.1,<3 tqdm>=4.5.0,<5.0.0 typing-extensions>=4.7.1; python_version < '3.12' diff --git a/test/integration/conftest.py b/test/integration/conftest.py index 210cbaab4..e7f3d7bf2 100644 --- a/test/integration/conftest.py +++ b/test/integration/conftest.py @@ -9,8 +9,18 @@ ###################################################################### from __future__ import annotations +import http.client +import os +import random +import time + import pytest +from b2sdk.b2http import B2Http +from b2sdk.raw_api import REALM_URLS, B2RawHTTPApi + +from . 
import get_b2_auth_data + def pytest_addoption(parser): """Add a flag for not cleaning up old buckets""" @@ -24,3 +34,61 @@ def pytest_addoption(parser): @pytest.fixture def dont_cleanup_old_buckets(request): return request.config.getoption("--dont-cleanup-old-buckets") + + +@pytest.fixture(scope="session") +def raw_api(): + return B2RawHTTPApi(B2Http()) + + +@pytest.fixture(scope="session") +def auth_dict(raw_api): + try: + application_key_id, application_key = get_b2_auth_data() + except ValueError as ex: + pytest.fail(ex.args[0]) + else: + realm = os.environ.get('B2_TEST_ENVIRONMENT', 'production') + realm_url = REALM_URLS.get(realm, realm) + return raw_api.authorize_account(realm_url, application_key_id, application_key) + + +@pytest.fixture(scope="session") +def bucket_dict(raw_api, auth_dict): + bucket_name = 'test-raw-api-%s-%d-%d' % ( + auth_dict['accountId'], int(time.time()), random.randint(1000, 9999) + ) + + return raw_api.create_bucket( + auth_dict['apiUrl'], + auth_dict['authorizationToken'], + auth_dict['accountId'], + bucket_name, + 'allPublic', + is_file_lock_enabled=True, + ) + + +@pytest.fixture(scope="session") +def upload_url_dict(raw_api, auth_dict, bucket_dict): + return raw_api.get_upload_url( + auth_dict['apiUrl'], auth_dict['authorizationToken'], bucket_dict['bucketId'] + ) + + +@pytest.fixture +def http_sent_data(monkeypatch): + orig_send = http.client.HTTPConnection.send + sent_data = bytearray() + + def patched_send(self, data): + sent_data.extend(data) + return orig_send(self, data) + + monkeypatch.setattr( + http.client.HTTPConnection, + "send", + patched_send, + ) + + return sent_data diff --git a/test/integration/test_raw_expect_100.py b/test/integration/test_raw_expect_100.py new file mode 100644 index 000000000..c5e751b0f --- /dev/null +++ b/test/integration/test_raw_expect_100.py @@ -0,0 +1,136 @@ +###################################################################### +# +# File: test/integration/test_raw_expect_100.py +# +# 
Copyright 2023 Backblaze Inc. All Rights Reserved. +# +# License https://www.backblaze.com/using_b2_code.html +# +###################################################################### +import io +import secrets +from unittest import mock + +import pytest +from urllib3.util import wait_for_read + +from b2sdk.encryption.setting import EncryptionSetting +from b2sdk.encryption.types import EncryptionAlgorithm, EncryptionMode +from b2sdk.exception import InvalidAuthToken +from b2sdk.utils import hex_sha1_of_stream + + +def test_expect_100_non_100_response(raw_api, upload_url_dict, http_sent_data): + file_name = 'test-100-continue.txt' + file_contents = secrets.token_bytes() + file_length = len(file_contents) + file_sha1 = hex_sha1_of_stream(io.BytesIO(file_contents), file_length) + data = io.BytesIO(file_contents) + + with pytest.raises(InvalidAuthToken), mock.patch( + "urllib3.util.wait_for_read", side_effect=wait_for_read + ) as wait_mock: + raw_api.upload_file( + upload_url_dict['uploadUrl'], + upload_url_dict['authorizationToken'] + 'wrong token', + file_name, + file_length, + 'text/plain', + file_sha1, + {'color': 'blue'}, + data, + server_side_encryption=EncryptionSetting( + mode=EncryptionMode.SSE_B2, + algorithm=EncryptionAlgorithm.AES256, + ), + ) + assert file_contents not in http_sent_data + assert wait_mock.call_count == 1 + + +def test_expect_100_timeout(raw_api, upload_url_dict, http_sent_data): + file_name = 'test-100-continue.txt' + file_contents = secrets.token_bytes() + file_length = len(file_contents) + file_sha1 = hex_sha1_of_stream(io.BytesIO(file_contents), file_length) + data = io.BytesIO(file_contents) + timeout = 0 + + with mock.patch("urllib3.util.wait_for_read", side_effect=wait_for_read) as wait_mock: + raw_api.upload_file( + upload_url_dict['uploadUrl'], + upload_url_dict['authorizationToken'], + file_name, + file_length, + 'text/plain', + file_sha1, + {'color': 'blue'}, + data, + server_side_encryption=EncryptionSetting( +
mode=EncryptionMode.SSE_B2, + algorithm=EncryptionAlgorithm.AES256, + ), + expect_100_timeout=timeout, + ) + assert file_contents in http_sent_data + assert wait_mock.call_count == 1 + args, _ = wait_mock.call_args + assert args[1] == timeout + + +def test_expect_100_disabled(raw_api, upload_url_dict, http_sent_data): + file_name = 'test-100-continue.txt' + file_contents = secrets.token_bytes() + file_length = len(file_contents) + file_sha1 = hex_sha1_of_stream(io.BytesIO(file_contents), file_length) + data = io.BytesIO(file_contents) + + with mock.patch("urllib3.util.wait_for_read", side_effect=wait_for_read) as wait_mock: + raw_api.upload_file( + upload_url_dict['uploadUrl'], + upload_url_dict['authorizationToken'], + file_name, + file_length, + 'text/plain', + file_sha1, + {'color': 'blue'}, + data, + server_side_encryption=EncryptionSetting( + mode=EncryptionMode.SSE_B2, + algorithm=EncryptionAlgorithm.AES256, + ), + expect_100_continue=False, + ) + assert file_contents in http_sent_data + assert wait_mock.call_count == 0 + + +def test_expect_100_data_sent_after_wait(raw_api, upload_url_dict, http_sent_data): + file_name = 'test-100-continue.txt' + file_contents = secrets.token_bytes() + file_length = len(file_contents) + file_sha1 = hex_sha1_of_stream(io.BytesIO(file_contents), file_length) + data = io.BytesIO(file_contents) + + def patched_wait(*args, **kwargs): + # verify that data is not sent before waiting + assert file_contents not in http_sent_data, "data sent before waiting for 100 Continue" + return wait_for_read(*args, **kwargs) + + with mock.patch("urllib3.util.wait_for_read", side_effect=patched_wait) as wait_mock: + raw_api.upload_file( + upload_url_dict['uploadUrl'], + upload_url_dict['authorizationToken'], + file_name, + file_length, + 'text/plain', + file_sha1, + {'color': 'blue'}, + data, + server_side_encryption=EncryptionSetting( + mode=EncryptionMode.SSE_B2, + algorithm=EncryptionAlgorithm.AES256, + ), + ) + assert file_contents in
http_sent_data + assert wait_mock.call_count == 1 diff --git a/test/unit/botocore/__init__.py b/test/unit/botocore/__init__.py new file mode 100644 index 000000000..ca334aed4 --- /dev/null +++ b/test/unit/botocore/__init__.py @@ -0,0 +1,10 @@ +###################################################################### +# +# File: test/unit/botocore/__init__.py +# +# Copyright 2023 Backblaze Inc. All Rights Reserved. +# +# License https://www.backblaze.com/using_b2_code.html +# +###################################################################### +from __future__ import annotations diff --git a/test/unit/botocore/test_awsrequest.py b/test/unit/botocore/test_awsrequest.py new file mode 100644 index 000000000..a80990f6a --- /dev/null +++ b/test/unit/botocore/test_awsrequest.py @@ -0,0 +1,332 @@ +###################################################################### +# +# File: test/unit/botocore/test_awsrequest.py +# +# Copyright 2023 Backblaze Inc. All Rights Reserved. +# Copyright (c) 2012-2013 Mitch Garnaat http://garnaat.org/ +# Copyright 2012-2014 Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# License https://www.backblaze.com/using_b2_code.html +# See NOTICE and LICENSE files in b2sdk/_botocore directory. 
+# +###################################################################### +from __future__ import annotations + +import io +import socket +import unittest +from unittest import mock + +from b2sdk._botocore.awsrequest import AWSHTTPConnection + + +class IgnoreCloseBytesIO(io.BytesIO): + def close(self): + pass + + +class FakeSocket: + def __init__(self, read_data, fileclass=IgnoreCloseBytesIO): + self.sent_data = b'' + self.read_data = read_data + self.fileclass = fileclass + self._fp_object = None + + def sendall(self, data): + self.sent_data += data + + def makefile(self, mode, bufsize=None): + if self._fp_object is None: + self._fp_object = self.fileclass(self.read_data) + return self._fp_object + + def close(self): + pass + + def settimeout(self, value): + pass + + +class BytesIOWithLen(io.BytesIO): + def __len__(self): + return len(self.getvalue()) + + +class TestAWSHTTPConnection(unittest.TestCase): + def create_tunneled_connection(self, url, port, response): + s = FakeSocket(response) + conn = AWSHTTPConnection(url, port) + conn.sock = s + conn._tunnel_host = url + conn._tunnel_port = port + conn._tunnel_headers = {'key': 'value'} + + # Create a mock response. + self.mock_response = mock.Mock() + self.mock_response.fp = mock.Mock() + + # Imitate readline function by creating a list to be sent as + # a side effect of the mocked readline to be able to track how the + # response is processed in ``_tunnel()``. + delimiter = b'\r\n' + side_effect = [] + response_components = response.split(delimiter) + for i in range(len(response_components)): + new_component = response_components[i] + # Only add the delimiter on if it is not the last component + # which should be an empty string. 
+ if i != len(response_components) - 1: + new_component += delimiter + side_effect.append(new_component) + + self.mock_response.fp.readline.side_effect = side_effect + + response_components = response.split(b' ') + self.mock_response._read_status.return_value = ( + response_components[0], + int(response_components[1]), + response_components[2], + ) + conn.response_class = mock.Mock() + conn.response_class.return_value = self.mock_response + return conn + + def test_expect_100_continue_returned(self): + with mock.patch('urllib3.util.wait_for_read') as wait_mock: + # Shows the server first sending a 100 continue response + # then a 200 ok response. + s = FakeSocket(b'HTTP/1.1 100 Continue\r\n\r\nHTTP/1.1 200 OK\r\n') + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + wait_mock.return_value = True + conn.request('GET', '/bucket/foo', b'body', {'Expect': b'100-continue'}) + response = conn.getresponse() + # Assert that we waited for the 100-continue response + self.assertEqual(wait_mock.call_count, 1) + # Now we should verify that our final response is the 200 OK + self.assertEqual(response.status, 200) + + def test_handles_expect_100_with_different_reason_phrase(self): + with mock.patch('urllib3.util.wait_for_read') as wait_mock: + # Shows the server first sending a 100 continue response + # then a 200 ok response. + s = FakeSocket(b'HTTP/1.1 100 (Continue)\r\n\r\nHTTP/1.1 200 OK\r\n') + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + wait_mock.return_value = True + conn.request( + 'GET', + '/bucket/foo', + io.BytesIO(b'body'), + { + 'Expect': b'100-continue', + 'Content-Length': b'4' + }, + ) + response = conn.getresponse() + # Now we should verify that our final response is the 200 OK. + self.assertEqual(response.status, 200) + # Assert that we waited for the 100-continue response + self.assertEqual(wait_mock.call_count, 1) + # Verify that we sent the request body because we got a 100 + # continue.
+ self.assertIn(b'body', s.sent_data) + + def test_expect_100_sends_connection_header(self): + # When using squid as an HTTP proxy, it will also send + # a Connection: keep-alive header back with the 100 continue + # response. We need to ensure we handle this case. + with mock.patch('urllib3.util.wait_for_read') as wait_mock: + # Shows the server first sending a 100 continue response + # then a 500 response. We're picking 500 to confirm we + # actually parse the response instead of getting the + # default status of 200 which happens when we can't parse + # the response. + s = FakeSocket( + b'HTTP/1.1 100 Continue\r\n' + b'Connection: keep-alive\r\n' + b'\r\n' + b'HTTP/1.1 500 Internal Service Error\r\n' + ) + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + wait_mock.return_value = True + conn.request('GET', '/bucket/foo', b'body', {'Expect': b'100-continue'}) + # Assert that we waited for the 100-continue response + self.assertEqual(wait_mock.call_count, 1) + response = conn.getresponse() + self.assertEqual(response.status, 500) + + def test_expect_100_continue_sends_307(self): + # This is the case where we send an Expect: 100-continue header and + # the server immediately sends a 307 + with mock.patch('urllib3.util.wait_for_read') as wait_mock: + # Shows the server skipping the 100 continue response + # and replying with a 307 redirect right away. + s = FakeSocket( + b'HTTP/1.1 307 Temporary Redirect\r\n' + b'Location: http://example.org\r\n' + ) + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + wait_mock.return_value = True + conn.request('GET', '/bucket/foo', b'body', {'Expect': b'100-continue'}) + # Assert that we waited for the 100-continue response + self.assertEqual(wait_mock.call_count, 1) + response = conn.getresponse() + # Now we should verify that our final response is the 307.
+ self.assertEqual(response.status, 307) + + def test_expect_100_continue_no_response_from_server(self): + with mock.patch('urllib3.util.wait_for_read') as wait_mock: + # Shows the server staying silent while we wait for the + # 100 continue response; the 307 is read only afterwards. + s = FakeSocket( + b'HTTP/1.1 307 Temporary Redirect\r\n' + b'Location: http://example.org\r\n' + ) + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + # Setting wait_mock to return False indicates + # that the server did not send any response. In this situation + # we should just send the request anyway. + wait_mock.return_value = False + conn.request('GET', '/bucket/foo', b'body', {'Expect': b'100-continue'}) + # Assert that we waited for the 100-continue response + self.assertEqual(wait_mock.call_count, 1) + response = conn.getresponse() + self.assertEqual(response.status, 307) + + def test_message_body_is_file_like_object(self): + # No Expect header is sent here, so the server replies + # directly with a 200 ok response. + body = BytesIOWithLen(b'body contents') + s = FakeSocket(b'HTTP/1.1 200 OK\r\n') + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + conn.request('GET', '/bucket/foo', body) + response = conn.getresponse() + self.assertEqual(response.status, 200) + + def test_no_expect_header_set(self): + # No Expect header is sent here, so the server replies + # directly with a 200 ok response. + s = FakeSocket(b'HTTP/1.1 200 OK\r\n') + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + conn.request('GET', '/bucket/foo', b'body') + response = conn.getresponse() + self.assertEqual(response.status, 200) + + def test_tunnel_readline_normal(self): + # Tests that ``_tunnel`` function behaves normally when it comes + # across the usual http ending. + conn = self.create_tunneled_connection( + url='s3.amazonaws.com', + port=443, + response=b'HTTP/1.1 200 OK\r\n\r\n', + ) + conn._tunnel() + # Ensure proper amount of readline calls were made.
+ self.assertEqual(self.mock_response.fp.readline.call_count, 2) + + def test_tunnel_raises_socket_error(self): + # Tests that ``_tunnel`` function throws appropriate error when + # not 200 status. + conn = self.create_tunneled_connection( + url='s3.amazonaws.com', + port=443, + response=b'HTTP/1.1 404 Not Found\r\n\r\n', + ) + with self.assertRaises(socket.error): + conn._tunnel() + + def test_tunnel_uses_std_lib(self): + s = FakeSocket(b'HTTP/1.1 200 OK\r\n') + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + # Test that the standard library method was used by patching out + # the ``_tunnel`` method and seeing if the std lib method was called. + with mock.patch('urllib3.connection.HTTPConnection._tunnel') as mock_tunnel: + conn._tunnel() + self.assertTrue(mock_tunnel.called) + + def test_encodes_unicode_method_line(self): + s = FakeSocket(b'HTTP/1.1 200 OK\r\n') + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + # Note the combination of unicode 'GET' and + # bytes 'Utf8-Header' value. + conn.request( + 'GET', + '/bucket/foo', + b'body', + headers={"Utf8-Header": b"\xe5\xb0\x8f"}, + ) + response = conn.getresponse() + self.assertEqual(response.status, 200) + + def test_state_reset_on_connection_close(self): + # This simulates what urllib3 does with connections + # in its connection pool logic. + with mock.patch('urllib3.util.wait_for_read') as wait_mock: + # First fast fail with a 500 response when we first + # send the expect header. + s = FakeSocket(b'HTTP/1.1 500 Internal Server Error\r\n') + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + wait_mock.return_value = True + + conn.request('GET', '/bucket/foo', b'body', {'Expect': b'100-continue'}) + self.assertEqual(wait_mock.call_count, 1) + response = conn.getresponse() + self.assertEqual(response.status, 500) + + # Now what happens in urllib3 is that when the next + # request comes along and this connection gets checked + # out. 
We see that the connection needs to be + # reset. So first the connection is closed. + conn.close() + + # And then a new connection is established. + new_conn = FakeSocket(b'HTTP/1.1 100 (Continue)\r\n\r\nHTTP/1.1 200 OK\r\n') + conn.sock = new_conn + + # And we make a request, we should see the 200 response + # that was sent back. + wait_mock.return_value = True + + conn.request('GET', '/bucket/foo', b'body', {'Expect': b'100-continue'}) + # Assert that we waited for the 100-continue response + self.assertEqual(wait_mock.call_count, 2) + response = conn.getresponse() + # This should be 200. If it's a 500 then + # the prior response was leaking into our + # current response. + self.assertEqual(response.status, 200) + + def test_handles_expect_100_with_no_reason_phrase(self): + with mock.patch('urllib3.util.wait_for_read') as wait_mock: + # Shows the server first sending a 100 continue response + # then a 200 ok response. + s = FakeSocket(b'HTTP/1.1 100\r\n\r\nHTTP/1.1 200 OK\r\n') + conn = AWSHTTPConnection('s3.amazonaws.com', 443) + conn.sock = s + wait_mock.return_value = True + conn.request( + 'GET', + '/bucket/foo', + io.BytesIO(b'body'), + { + 'Expect': b'100-continue', + 'Content-Length': b'4' + }, + ) + response = conn.getresponse() + # Now we should verify that our final response is the 200 OK. + self.assertEqual(response.status, 200) + # Assert that we waited for the 100-continue response + self.assertEqual(wait_mock.call_count, 1) + # Verify that we sent the request body because we got a 100 + # continue. + self.assertIn(b'body', s.sent_data)
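
Reviewer note: the header logic that this change adds to `get_upload_file_headers` and `upload_part` can be exercised in isolation. The sketch below is only an illustration of that logic, not code from the PR; the helper name `add_expect_100_headers` is hypothetical, while the header names and default values are taken from the diff above.

```python
from __future__ import annotations


def add_expect_100_headers(
    headers: dict[str, str],
    expect_100_continue: bool = True,
    expect_100_timeout: float = 10.0,
) -> dict[str, str]:
    """Return a copy of ``headers`` with the Expect-related entries added.

    Hypothetical helper: it mirrors the ``if expect_100_continue:`` branch
    from the diff, but returns a new dict instead of mutating in place.
    """
    result = dict(headers)
    if expect_100_continue:
        # Ask the server to confirm before the request body is sent.
        result['Expect'] = '100-continue'
        # The timeout is carried as a header value, so it is stringified.
        result['X-Expect-100-Timeout'] = str(expect_100_timeout)
    return result


enabled = add_expect_100_headers({'Authorization': 'token'})
disabled = add_expect_100_headers({'Authorization': 'token'}, expect_100_continue=False)
```

With the defaults, `enabled` carries both extra headers; with `expect_100_continue=False`, `disabled` is left with only the headers it was given, which matches the `expect_100_continue=False` call in `file_version.py`.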