-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
b6d91c7
to
7fb941a
Compare
Looks like a breaking change |
Yes, this should be considered a breaking API change. It removes the download function from gluon utils. Sorry, I should have called that out in the description. |
|
||
|
||
def _download_retry_with_backoff(fname, url, max_attempts=4): | ||
"""Downloads a url with backoff applied, automatically re-requesting after a failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retry can be added to existing download function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requires using a session right? I'm not sure if that then lets us stream to a file. I'll check it out though. I'd prefer to use built in functionality provided we get backoff and streaming support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice change
backoff_coef = 50.0 | ||
while True: | ||
try: | ||
print('Downloading %s from %s...' % (fname, url)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use logging instead of print and arguments to logging instead of %
# Likely non-2** result, possibly timeout or redirection failure. | ||
attempts = attempts + 1 | ||
if attempts > max_attempts: | ||
print('Downloading %s from %s, failed after #%d attempts' % (fname, url, attempts)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above.
|
||
if overwrite or not os.path.exists(fname) or (sha1_hash and not check_sha1(fname, sha1_hash)): | ||
dirname = os.path.dirname(os.path.abspath(os.path.expanduser(fname))) | ||
if not os.path.exists(dirname): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use exist_ok and avoid the check:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's py3 feature. https://docs.python.org/2.7/library/os.html#os.makedirs
|
||
|
||
def create_dir(dirname): | ||
if not os.path.exists(dirname): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above about flag: https://docs.python.org/3/library/os.html#os.makedirs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't have this change until 2.0. You may consider adding the retry to existing download functions.
OK |
return fname | ||
|
||
|
||
def create_dir(dirname): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not os.path.exists(dirname):
os.makedirs(dirname)
This is just two lines. I don't think its worth making it a function
Can we keep the alias in test_utils and gluon to keep backward compatibility? |
@@ -61,13 +52,13 @@ def split_data(data, num_slice, batch_axis=0, even_split=True): | |||
size = data.shape[batch_axis] | |||
if size < num_slice: | |||
raise ValueError( | |||
"Too many slices for data with shape %s. Arguments are " \ | |||
"num_slice=%d and batch_axis=%d."%(str(data.shape), num_slice, batch_axis)) | |||
"Too many slices for data with shape %s. Arguments are " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the backslash required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't triple quotes useful in this case? """blahs dah"""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think triplequotes would be the equiv of
"Too many slices for data with shape %s. Arguments are \n" \
"num_slice=%d and batch_axis=%d."%(str(data.shape), num_slice, batch_axis))
but I'll try a few options.
"data with shape %s cannot be evenly split into %d slices along axis %d. " \ | ||
"Use a batch size that's multiple of %d or set even_split=False to allow " \ | ||
"uneven partitioning of data."%( | ||
"data with shape %s cannot be evenly split into %d slices along axis %d. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backslashes
@eric-haibin-lin Yes I was thinking of just pointing everything to the gluon download function for now. |
Hi, the community has passed to vote about associating the code changes with JIRA (https://lists.apache.org/thread.html/ab22cf0e35f1bce2c3bf3bec2bc5b85a9583a3fe7fd56ba1bbade55f@%3Cdev.mxnet.apache.org%3E) We have updated the guidelines for contributors in https://cwiki.apache.org/confluence/display/MXNET/Development+Process, please ensure that you have created a JIRA at https://issues.apache.org/jira/projects/MXNET/issues/ to describe your work in this pull request and include the JIRA title in your PR as [MXNET-xxxx] your title where MXNET-xxxx is the JIRA id Thanks! |
Closing, will reopen if I have time to refactor. |
Description
The primary goal of this PR was to add exponential backoff to our http request. This should be a good feature for customers with slow Internet connections, but I also hope it will help mitigate test issues such as: #9669.
I noticed that we have a lot of duplicate code between test_utils.download and gluon.download. I've attempted to combine these two functions, but can roll it back if they had a good reason to be separated.
Checklist
Essentials
make lint
)Note examples are affected by this change. I've updated them so that they do not reference download functions that no longer exist.
Comments