Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seek file before retry #94

Merged
merged 13 commits into from
Dec 22, 2020
32 changes: 23 additions & 9 deletions deepomatic/api/http_helper.py
Original file line number Diff line number Diff line change
Expand Up @@ -209,15 +209,29 @@ def recursive_json_dump(prefix, obj, data_dict, omit_dot=False):
return new_dict

def send_request(self, requests_callable, *args, **kwargs):
# requests_callable must be a method from the requests module
thomas-riccardi marked this conversation as resolved.
Show resolved Hide resolved

# this is the timeout of requests module
requests_timeout = kwargs.pop('timeout', self.requests_timeout)

files = kwargs.pop('files', None)
if files:
for f in files.values():
# seek files before each retry
maingoh marked this conversation as resolved.
Show resolved Hide resolved
if hasattr(f, 'seek'):
f.seek(0)
Copy link
Contributor

@thomas-riccardi thomas-riccardi Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about non-seekable files? maybe we should fail instead of keeping the current bug of sending empty data?

Related issue in google storage python client: googleapis/google-resumable-media-python#165
they mandate seekable streams. (that being said I remember seeing in their code that they memory buffered everything that is in flight and not yet confirmed by API, via application-level multi part uploads, so it's strange... maybe it has changed?)

or maybe it doesn't make sense for the files argument to have non seekable inputs; but the scenario should apply to other ways of passing input data: data argument.

Copy link
Contributor Author

@maingoh maingoh Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which non seekable files are you refering to ? Overall either those are bytes or stream that once read need to be reseeked.
I don't think the bug will still be there bug after this PR.

I don't want to break the current behavior now, though we can think about it for later (I would like also to remove the magic one day #33)

Copy link
Contributor

@thomas-riccardi thomas-riccardi Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reference, after meeting with hugo and some tests:

  • files accepts non-streamable too, the real constraint is: if it's streamable and non seekable, then we cannot support retries: raise error on second try (maybe on first too?)
  • data accepts generator/iterator, or a dict of such thing: hard to check all that for now. maybe types constraints will help.


return requests_callable(*args, files=files,
timeout=requests_timeout,
verify=self.verify_ssl,
**kwargs)

def maybe_retry_send_request(self, requests_callable, *args, **kwargs):
# requests_callable must be a method from the requests module

http_retry = kwargs.pop('http_retry', self.http_retry)

functor = functools.partial(requests_callable, *args,
verify=self.verify_ssl,
timeout=requests_timeout, **kwargs)
functor = functools.partial(self.send_request,
requests_callable,
*args, **kwargs)

if http_retry is not None:
return http_retry.retry(functor)
Expand Down Expand Up @@ -274,10 +288,10 @@ def make_request(self, func, resource, params=None, data=None,
if not resource.startswith('http'):
resource = self.resource_prefix + resource

response = self.send_request(func, resource, *args,
params=params, data=data,
files=files, headers=headers,
stream=stream, **kwargs)
response = self.maybe_retry_send_request(func, resource, *args,
params=params, data=data,
files=files, headers=headers,
stream=stream, **kwargs)

# Close opened files
for file in opened_files:
Expand Down
2 changes: 1 addition & 1 deletion deepomatic/api/version.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
__title__ = 'deepomatic-api'
__description__ = 'Deepomatic API client'
__version__ = '0.9.2'
__version__ = '0.9.3'
__author__ = 'deepomatic'
__author_email__ = 'support@deepomatic.com'
__url__ = 'https://github.com/deepomatic/deepomatic-client-python'
Expand Down
2 changes: 1 addition & 1 deletion demo.py
Original file line number Diff line number Diff line change
Expand Up @@ -359,7 +359,7 @@ def download_file(url):
filename = os.path.join(tempfile.gettempdir(),
hashlib.sha1(url.encode()).hexdigest() + ext)
if os.path.exists(filename): # avoid redownloading
logger.info("Skipping download of {}: file already exist in ".format(url, filename))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe only fix the bug in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flake got bumped automatically, I think either we freeze it, or if we let it flexible we can embed those kind of small fixes in PRs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requirements-dev.txt should be fully frozen, even for python libs: it's not used by the lib users, it's OK to freeze/pin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok it wasn't in requirement.dev so i moved it there and froze it

logger.info("Skipping download of {}: file already exist in {}".format(url, filename))
return filename
r = requests.get(url, stream=True)
r.raise_for_status()
Expand Down
13 changes: 3 additions & 10 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,9 @@
import io
from setuptools import find_packages, setup

try:
# for pip >= 10
from pip._internal.req import parse_requirements
except ImportError:
# for pip <= 9.0.3
from pip.req import parse_requirements

Comment on lines -5 to -11
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe only fix the bug in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prevented me to install it on my computer, it would be nice to include it in this PR


here = os.path.abspath(os.path.dirname(__file__))


about = {}
with io.open(os.path.join(here, 'deepomatic', 'api', 'version.py'), 'r', encoding='utf-8') as f:
exec(f.read(), about)
Expand All @@ -24,7 +16,8 @@
os.chdir(os.path.normpath(os.path.join(os.path.abspath(__file__), os.pardir)))

# Read requirements
install_reqs = parse_requirements(os.path.join(here, 'requirements.txt'), session='hack')
with io.open(os.path.join(here, 'requirements.txt'), encoding='utf-8') as f:
requirements = f.readlines()

namespaces = ['deepomatic']

Expand All @@ -43,7 +36,7 @@
long_description=README,
long_description_content_type='text/markdown',
data_files=[('', ['requirements.txt'])],
install_requires=[str(ir.req) for ir in install_reqs],
install_requires=requirements,
python_requires=">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*",
classifiers=[
'Operating System :: OS Independent',
Expand Down