This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Add progress bar to Gluon download function (was: Gluon-cv cannot download model file with model_zoo.get_model) #19279

Open
moseswmwong opened this issue Oct 2, 2020 · 3 comments

moseswmwong commented Oct 2, 2020

Description

On MacOS, my python code is:

import gluoncv as gcv
...
net = gcv.model_zoo.get_model('ssd_512_mobilenet1.0_voc', 
            pretrained=True, root='.')

The problem is that when I run it, the download fails: it gets stuck at the following output and makes no progress for a long time:

Model file not found. Downloading.
Downloading ./ssd_512_mobilenet1.0_voc-37c18076.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_mobilenet1.0_voc-37c18076.zip...

When I look into the folder, only one file exists, the *.lock file. The Python script creates only this *.lock file, and it is always 0 bytes:

-rw-r--r--  1 moseswong  staff          0 Oct  2 12:22 ssd_512_mobilenet1.0_voc-37c18076.lock
...

I tried placing the file ssd_512_mobilenet1.0_voc-37c18076.zip in the download folder beforehand (which is "." as specified in the get_model call) and ran the Python script again, but it still got stuck. In this setting the script should not even try to download, yet it does try, and it gets stuck at downloading again.

When I look into the folder, two files exist, the *.zip and the *.lock. The *.zip file was placed there manually, not by the Python script; the script creates only the *.lock file, and it is always 0 bytes.

-rw-r--r--  1 moseswong  staff          0 Oct  2 12:22 ssd_512_mobilenet1.0_voc-37c18076.lock
-rw-r--r--@ 1 moseswong  staff   51421665 Oct  2 12:32 ssd_512_mobilenet1.0_voc-37c18076.zip
...

I tried many times.

In summary, it gets stuck not only when there is no model file, but even when the model file exists.
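The log prints "Model file not found" even with the *.zip in place, which suggests the loader may be looking for the extracted parameters file rather than the archive. A minimal sketch of unpacking a manually downloaded archive follows. It assumes the archive contains a *.params file and that GluonCV picks that file up from the root folder; neither is confirmed in this thread. The helper name extract_model_archive is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_model_archive(zip_path, root="."):
    """Extract a manually downloaded model archive into `root`.

    Returns the paths of any extracted `.params` files.
    Hypothetical helper, not part of GluonCV.
    """
    zip_path = Path(zip_path)
    root = Path(root)
    # Fail early on a truncated or otherwise invalid archive.
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a valid zip archive")
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(root)
        names = zf.namelist()
    return [root / name for name in names if name.endswith(".params")]
```

For the file in this report the call would be extract_model_archive("ssd_512_mobilenet1.0_voc-37c18076.zip", "."), after which get_model can be retried with the same root.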

UPDATE:
After leaving it running for a few hours, the following error appeared. It looks like a network error, but if it is a network error, why does downloading the zip file from the same URL in a browser take only 2 minutes? (https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_mobilenet1.0_voc-37c18076.zip) Also, this same Python script runs perfectly on another computer (Windows 10) connected to the Internet through the same router.

Model file not found. Downloading.
Downloading ./ssd_512_mobilenet1.0_voc-37c18076.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_mobilenet1.0_voc-37c18076.zip...
65%|######4 | 32400/50216 [06:01<03:18, 89.73KB/s]
Traceback (most recent call last):
  File "cvisionc.py", line 92, in <module>
    net = gcv.model_zoo.get_model('ssd_512_mobilenet1.0_voc', pretrained=True, root='.')
  File "/Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/site-packages/gluoncv/model_zoo/model_zoo.py", line 403, in get_model
    net = _models[name](**kwargs)
  File "/Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/site-packages/gluoncv/model_zoo/ssd/presets.py", line 544, in ssd_512_mobilenet1_0_voc
    pretrained_base=pretrained_base, **kwargs)
  File "/Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/site-packages/gluoncv/model_zoo/ssd/ssd.py", line 420, in get_ssd
    net.load_parameters(get_model_file(full_name, tag=pretrained, root=root), ctx=ctx)
  File "/Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/site-packages/gluoncv/model_zoo/model_store.py", line 293, in get_model_file
    with zipfile.ZipFile(zip_file_path) as zf:
  File "/Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/zipfile.py", line 1131, in __init__
    self._RealGetContents()
  File "/Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/zipfile.py", line 1198, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
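The progress bar stops at 32400 of 50216 KB, and the manually downloaded archive listed earlier is 51421665 bytes, so BadZipFile here is consistent with a truncated download rather than a bad file on the server. A rough way to check a downloaded archive before handing it to the loader is sketched below. It uses only the standard library; the helper name check_archive is hypothetical, and the expected size is an optional input taken from a known good copy.

```python
import os
import zipfile


def check_archive(path, expected_size=None):
    """Return a short diagnosis string for a downloaded zip archive.

    Hypothetical helper for illustration, not part of GluonCV.
    """
    actual = os.path.getsize(path)
    # A file shorter than a known good copy was cut off mid-download.
    if expected_size is not None and actual < expected_size:
        return f"truncated: {actual} of {expected_size} bytes"
    # A zip keeps its directory at the end, so a partial file fails this check.
    if not zipfile.is_zipfile(path):
        return "not a zip file"
    with zipfile.ZipFile(path) as zf:
        bad_member = zf.testzip()  # name of the first corrupt member, or None
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return "ok"
```

For this report, check_archive("ssd_512_mobilenet1.0_voc-37c18076.zip", expected_size=51421665) would flag any partial file left behind by an interrupted run.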

Error Message

Model file not found. Downloading.
Downloading ./ssd_512_mobilenet1.0_voc-37c18076.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_mobilenet1.0_voc-37c18076.zip...

Steps to reproduce

  1. On MacBook Pro - MacOS 10.15.3 - i7 intel cpu, 16G memory
  2. Install Anaconda 3 - v1.9.12
  3. Open terminal
  4. conda config --add channels conda-forge
  5. conda create -n mx4 -q -y python=3.6
  6. conda activate mx4
  7. conda install -n mx4 -q -y conda-forge::py-opencv==4.4.0
  8. pip install mxnet==1.6.0
  9. pip install gluoncv==0.8.0

Run the Python script:

import sys
import mxnet as mx
import gluoncv as gcv
import cv2

net = gcv.model_zoo.get_model('ssd_512_mobilenet1.0_voc', pretrained=True, root='.')
net.hybridize()
...

What have you tried to solve it?

  1. Added "root=" to the get_model() call to specify a local folder as the destination instead of the .mxnet folder in the current user's home directory.
  2. Manually downloaded the model zip file to the target folder, but it still gets stuck on downloading.
  3. Changed to the following code; the same error occurs:
    net = gcv.model_zoo.ssd_512_mobilenet1_0_voc(pretrained=True, pretrained_base=False)
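For the manual download step, a small standard-library script can make a stalled connection visible instead of hanging silently. The sketch below is not the Gluon download function. It is an illustration that assumes the server sends a Content-Length header, and the names copy_with_progress and download are hypothetical. The timeout applies to each blocking socket operation, so a stalled read raises an error instead of waiting for hours.

```python
import os
import urllib.request


def copy_with_progress(src, dst, total=None, chunk_size=64 * 1024, report=print):
    """Copy bytes from `src` to `dst`, reporting after each chunk.

    Returns the number of bytes copied.
    """
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        done += len(chunk)
        if total:
            report(f"{done}/{total} bytes ({100 * done // total}%)")
        else:
            report(f"{done} bytes")
    return done


def download(url, path, timeout=30):
    """Download `url` to `path` via a temporary `.part` file."""
    part = path + ".part"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        length = resp.headers.get("Content-Length")
        total = int(length) if length else None
        with open(part, "wb") as f:
            done = copy_with_progress(resp, f, total)
    # Refuse to keep a short file so a later run does not mistake it for a full one.
    if total is not None and done != total:
        os.remove(part)
        raise IOError(f"incomplete download: {done} of {total} bytes")
    os.replace(part, path)
    return path
```

Writing to a .part file and renaming only after the size check means the target path never holds a partial archive.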

Environment

We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:

curl --retry 10 -s https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/diagnose.py | python

# paste outputs here
(mx4) Mosess-MacBook-Pro:~ moseswong$ curl --retry 10 -s https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/diagnose.py | python
----------Python Info----------
Version      : 3.6.11
Compiler     : GCC Clang 10.0.1 
Build        : ('default', 'Aug  5 2020 20:19:23')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 20.2.3
Directory    : /Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/site-packages/pip
----------MXNet Info-----------
Version      : 1.6.0
Directory    : /Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/site-packages/mxnet
Commit Hash   : 6eec9da55c5096079355d1f1a5fa58dcf35d6752
Library      : ['/Users/moseswong/opt/anaconda3/envs/mx4/lib/python3.6/site-packages/mxnet/libmxnet.so']
Build features:
✖ CUDA
✖ CUDNN
✖ NCCL
✖ CUDA_RTC
✖ TENSORRT
✔ CPU_SSE
✔ CPU_SSE2
✔ CPU_SSE3
✔ CPU_SSE4_1
✔ CPU_SSE4_2
✖ CPU_SSE4A
✔ CPU_AVX
✖ CPU_AVX2
✖ OPENMP
✖ SSE
✔ F16C
✖ JEMALLOC
✖ BLAS_OPEN
✖ BLAS_ATLAS
✖ BLAS_MKL
✔ BLAS_APPLE
✔ LAPACK
✖ MKLDNN
✔ OPENCV
✖ CAFFE
✖ PROFILER
✔ DIST_KVSTORE
✖ CXX14
✖ INT64_TENSOR_SIZE
✔ SIGNAL_HANDLER
✖ DEBUG
✖ TVM_OP
----------System Info----------
Platform     : Darwin-19.3.0-x86_64-i386-64bit
system       : Darwin
node         : Mosess-MacBook-Pro.local
release      : 19.3.0
version      : Darwin Kernel Version 19.3.0: Thu Jan  9 20:58:23 PST 2020; root:xnu-6153.81.5~1/RELEASE_X86_64
----------Hardware Info----------
machine      : x86_64
processor    : i386
b'machdep.cpu.brand_string: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz'
b'machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C'
b'machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 SMEP BMI2 ERMS INVPCID FPU_CSDS RDSEED ADX SMAP IPT MDCLEAR IBRS STIBP L1DF SSBD'
b'machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI'
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0013 sec, LOAD: 0.6192 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0008 sec, LOAD: 160.7595 sec.
Error open Gluon Tutorial(cn): https://zh.gluon.ai, <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)>, DNS finished in 0.08813118934631348 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0249 sec, LOAD: 80.8874 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0247 sec, LOAD: 130.8603 sec.
Error open Conda: https://repo.continuum.io/pkgs/free/, HTTP Error 403: Forbidden, DNS finished in 0.024617910385131836 sec.

moseswmwong changed the title from "Gluon-cv cannot down model file with model_zoo.get_model" to "Gluon-cv cannot download model file with model_zoo.get_model" on Oct 2, 2020
Author

moseswmwong commented Oct 3, 2020

UPDATE: I switched my MacBook Pro from the problem network to another network (4G tethering from my iPhone) and the model downloaded instantly! Furthermore, after switching back to the originally problematic network, the download works there too: the model file now downloads instantly over the previously problematic network.

I am not closing the issue, because this means something got stuck somewhere. The OS? The cloud?

As a matter of fact, if the first deployment of MXNet code on MacOS using gcv.model_zoo.get_model() gets stuck on its first download attempt, then installation by end users will always fail. As we are working to deploy the code as a new product at customer locations, this makes MacOS installation a total failure if we depend on the download approach. Note that asking customers to switch to an alternate network and back to the original one is not an acceptable workaround for naïve end users.
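One way to avoid depending on a download at the customer site is to ship the parameters file with the product and load it locally. The sketch below assumes a *.params file for the model has been bundled in a known folder, and that building the network with pretrained=False and pretrained_base=False triggers no download; both are assumptions to verify against the installed GluonCV version. The helper names find_local_params and load_bundled_model are hypothetical.

```python
import glob
import os


def find_local_params(root, model_name):
    """Return the first `<model_name>-*.params` file in `root`, or None."""
    pattern = os.path.join(glob.escape(root), glob.escape(model_name) + "-*.params")
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None


def load_bundled_model(root, model_name="ssd_512_mobilenet1.0_voc"):
    """Build the network without downloading and load bundled weights."""
    # Imported lazily so find_local_params works without GluonCV installed.
    import gluoncv as gcv

    params = find_local_params(root, model_name)
    if params is None:
        raise FileNotFoundError(f"no bundled parameters for {model_name} in {root}")
    # Assumption: these two flags keep get_model from fetching any weights.
    net = gcv.model_zoo.get_model(model_name, pretrained=False, pretrained_base=False)
    net.load_parameters(params)
    return net
```

With this layout the installer only copies files, and a network problem like the one above cannot block the first run.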

Member

szha commented Oct 3, 2020

It looks like there are two issues:

szha added the Feature request label and removed the Bug label on Oct 3, 2020
szha changed the title from "Gluon-cv cannot download model file with model_zoo.get_model" to "Add progress bar to Gluon download function (was: Gluon-cv cannot download model file with model_zoo.get_model)" on Oct 3, 2020
@moseswmwong
Author

I believe this will solve the problem, thanks so much!

Please keep us posted when the new version is available.
