Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: retry uploads only conditionally #316

Merged
merged 3 commits into from
Nov 20, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 54 additions & 13 deletions google/cloud/storage/blob.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,9 +103,11 @@
"release. The default behavior (when `num_retries` is not specified) when "
"a transient error (e.g. 429 Too Many Requests or 500 Internal Server "
"Error) occurs will be as follows: upload requests will be automatically "
"retried. Subsequent retries will be sent after waiting 1, 2, 4, 8, etc. "
"seconds (exponential backoff) until 10 minutes of wait time have "
"elapsed. At that point, there will be no more attempts to retry."
"retried if and only if `if_metageneration_match` is specified (thus "
"making the upload idempotent). Subsequent retries will be sent after "
"waiting 1, 2, 4, 8, etc. seconds (exponential backoff) until 10 minutes "
"of wait time have elapsed. At that point, there will be no more attempts "
"to retry."
)
_READ_LESS_THAN_SIZE = (
"Size {:d} was specified but the file-like object only had " "{:d} bytes remaining."
Expand Down Expand Up @@ -1550,8 +1552,14 @@ def _do_multipart_upload(
concluded once ``stream`` is exhausted (or :data:`None`).

:type num_retries: int
:param num_retries: Number of upload retries. (Deprecated: This
argument will be removed in a future release.)
:param num_retries: Number of upload retries. By default, only uploads
with if_metageneration_match set will be retried, as
uploads without the argument are not guaranteed to
be idempotent. Setting num_retries will override
this default behavior and guarantee retries even
when if_metageneration_match is not set.
(Deprecated: This argument will be removed in a
future release.)

:type predefined_acl: str
:param predefined_acl: (Optional) Predefined access control list
Expand Down Expand Up @@ -1709,8 +1717,14 @@ def _initiate_resumable_upload(
:param predefined_acl: (Optional) Predefined access control list

:type num_retries: int
:param num_retries: Number of upload retries. (Deprecated: This
argument will be removed in a future release.)
:param num_retries: Number of upload retries. By default, only uploads
with if_metageneration_match set will be retried, as
uploads without the argument are not guaranteed to
be idempotent. Setting num_retries will override
this default behavior and guarantee retries even
when if_metageneration_match is not set.
(Deprecated: This argument will be removed in a
future release.)

:type extra_headers: dict
:param extra_headers: (Optional) Extra headers to add to standard
Expand Down Expand Up @@ -1887,8 +1901,14 @@ def _do_resumable_upload(
concluded once ``stream`` is exhausted (or :data:`None`).

:type num_retries: int
:param num_retries: Number of upload retries. (Deprecated: This
argument will be removed in a future release.)
:param num_retries: Number of upload retries. By default, only uploads
with if_metageneration_match set will be retried, as
uploads without the argument are not guaranteed to
be idempotent. Setting num_retries will override
this default behavior and guarantee retries even
when if_metageneration_match is not set.
(Deprecated: This argument will be removed in a
future release.)

:type predefined_acl: str
:param predefined_acl: (Optional) Predefined access control list
Expand Down Expand Up @@ -2005,8 +2025,14 @@ def _do_upload(
concluded once ``stream`` is exhausted (or :data:`None`).

:type num_retries: int
:param num_retries: Number of upload retries. (Deprecated: This
argument will be removed in a future release.)
:param num_retries: Number of upload retries. By default, only uploads
with if_metageneration_match set will be retried, as
uploads without the argument are not guaranteed to
be idempotent. Setting num_retries will override
this default behavior and guarantee retries even
when if_metageneration_match is not set.
(Deprecated: This argument will be removed in a
future release.)

:type predefined_acl: str
:param predefined_acl: (Optional) Predefined access control list
Expand Down Expand Up @@ -2058,6 +2084,15 @@ def _do_upload(
**only** response in the multipart case and it will be the
**final** response in the resumable case.
"""
if if_metageneration_match is None and num_retries is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should update the docstring as well to make it clear that setting num_retries can be used to override the idempotency requirement (assuming that's the intention), to make sure that users know the implications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the intention. Will do.

# Uploads are only idempotent (safe to retry) if
# if_metageneration_match is set. If it is not set, the default
# num_retries should be 0. Note: Because retry logic for uploads is
# provided by the google-resumable-media-python package, it doesn't
# use the ConditionalRetryStrategy class used in other API calls in
# this library to solve this problem.
num_retries = 0

if size is not None and size <= _MAX_MULTIPART_SIZE:
response = self._do_multipart_upload(
client,
Expand Down Expand Up @@ -2161,8 +2196,14 @@ def upload_from_file(
:param content_type: (Optional) Type of content being uploaded.

:type num_retries: int
:param num_retries: Number of upload retries. (Deprecated: This
argument will be removed in a future release.)
:param num_retries: Number of upload retries. By default, only uploads
with if_metageneration_match set will be retried, as
uploads without the argument are not guaranteed to
be idempotent. Setting num_retries will override
this default behavior and guarantee retries even
when if_metageneration_match is not set.
(Deprecated: This argument will be removed in a
future release.)

:type client: :class:`~google.cloud.storage.client.Client`
:param client: (Optional) The client to use. If not passed, falls back
Expand Down
6 changes: 6 additions & 0 deletions tests/unit/test_blob.py
Original file line number Diff line number Diff line change
Expand Up @@ -2577,6 +2577,12 @@ def _do_upload_helper(
if_metageneration_not_match,
**timeout_kwarg
)

# Adjust num_retries expectations to reflect the conditional default in
# _do_upload()
if num_retries is None and if_metageneration_match is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment on to what extent the conditional is covered by test cases?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tritone the cover session under nox requires (since the merge of #308) 100% coverage (branch and statements).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah that's right, good point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Branch coverage of the code I've added is 100% according to the coverage tool. We are exercising scenarios where if_metageneration_match = true and num_retries is not None.

num_retries = 0

self.assertIs(created_json, mock.sentinel.json)
response.json.assert_called_once_with()
if size is not None and size <= _MAX_MULTIPART_SIZE:
Expand Down