This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Flash Requires Internet to Load a Local Checkpoint #155

Closed
aribornstein opened this issue Mar 2, 2021 · 5 comments · Fixed by #237
Labels
bug / fix Something isn't working help wanted Extra attention is needed lightning Priority
Comments

Contributor

aribornstein commented Mar 2, 2021

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Turn off internet access on the device.
  2. Load a checkpoint from a local file:
     clf = ImageClassifier.load_from_checkpoint('/kaggle/input/baserazcrmodels/razcr_resnet50_base_model.pt')
  3. An error is raised by ImageClassifier.__init__, which pulls pretrained weights from torchvision on init even though the model is being loaded from a local checkpoint:
---------------------------------------------------------------------------
gaierror                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1349                 h.request(req.get_method(), req.selector, req.data, headers,
-> 1350                           encode_chunked=req.has_header('Transfer-encoding'))
   1351             except OSError as err: # timeout error
/opt/conda/lib/python3.7/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1276         """Send a complete request to the server."""
-> 1277         self._send_request(method, url, body, headers, encode_chunked)
   1278 
/opt/conda/lib/python3.7/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1322             body = _encode(body, 'body')
-> 1323         self.endheaders(body, encode_chunked=encode_chunked)
   1324 
/opt/conda/lib/python3.7/http/client.py in endheaders(self, message_body, encode_chunked)
   1271             raise CannotSendHeader()
-> 1272         self._send_output(message_body, encode_chunked=encode_chunked)
   1273 
/opt/conda/lib/python3.7/http/client.py in _send_output(self, message_body, encode_chunked)
   1031         del self._buffer[:]
-> 1032         self.send(msg)
   1033 
/opt/conda/lib/python3.7/http/client.py in send(self, data)
    971             if self.auto_open:
--> 972                 self.connect()
    973             else:
/opt/conda/lib/python3.7/http/client.py in connect(self)
   1438 
-> 1439             super().connect()
   1440 
/opt/conda/lib/python3.7/http/client.py in connect(self)
    943         self.sock = self._create_connection(
--> 944             (self.host,self.port), self.timeout, self.source_address)
    945         self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
/opt/conda/lib/python3.7/socket.py in create_connection(address, timeout, source_address)
    706     err = None
--> 707     for res in getaddrinfo(host, port, 0, SOCK_STREAM):
    708         af, socktype, proto, canonname, sa = res
/opt/conda/lib/python3.7/socket.py in getaddrinfo(host, port, family, type, proto, flags)
    751     addrlist = []
--> 752     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    753         af, socktype, proto, canonname, sa = res
gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
URLError                                  Traceback (most recent call last)
<ipython-input-8-d3a1d7810d85> in <module>
      5 #                       num_classes=len(columns))
      6 
----> 7 clf = ImageClassifier.load_from_checkpoint("../input/baserazcrmodels/razcr_resnet50_base_model.pt")
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/core/saving.py in load_from_checkpoint(cls, checkpoint_path, map_location, hparams_file, strict, **kwargs)
    155         checkpoint[cls.CHECKPOINT_HYPER_PARAMS_KEY].update(kwargs)
    156 
--> 157         model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
    158         return model
    159 
/opt/conda/lib/python3.7/site-packages/pytorch_lightning/core/saving.py in _load_model_state(cls, checkpoint, strict, **cls_kwargs_new)
    196             _cls_kwargs = {k: v for k, v in _cls_kwargs.items() if k in cls_init_args_name}
    197 
--> 198         model = cls(**_cls_kwargs)
    199 
    200         # give model a chance to load something
/opt/conda/lib/python3.7/site-packages/flash/vision/classification/model.py in __init__(self, num_classes, backbone, pretrained, loss_fn, optimizer, metrics, learning_rate, multilabel)
     61         self.save_hyperparameters()
     62 
---> 63         self.backbone, num_features = backbone_and_num_features(backbone, pretrained=pretrained)
     64 
     65         self.head = nn.Sequential(
/opt/conda/lib/python3.7/site-packages/flash/vision/backbones.py in backbone_and_num_features(model_name, fpn, pretrained, trainable_backbone_layers, **kwargs)
     69 
     70     if model_name in TORCHVISION_MODELS:
---> 71         return torchvision_backbone_and_num_features(model_name, pretrained)
     72 
     73     raise ValueError(f"{model_name} is not supported yet.")
/opt/conda/lib/python3.7/site-packages/flash/vision/backbones.py in torchvision_backbone_and_num_features(model_name, pretrained)
    128 
    129     elif model_name in RESNET_MODELS:
--> 130         model = model(pretrained=pretrained)
    131         # remove the last two layers & turn it into a Sequential model
    132         backbone = nn.Sequential(*list(model.children())[:-2])
/opt/conda/lib/python3.7/site-packages/torchvision/models/resnet.py in resnet50(pretrained, progress, **kwargs)
    263     """
    264     return _resnet('resnet50', Bottleneck, [3, 4, 6, 3], pretrained, progress,
--> 265                    **kwargs)
    266 
    267 
/opt/conda/lib/python3.7/site-packages/torchvision/models/resnet.py in _resnet(arch, block, layers, pretrained, progress, **kwargs)
    225     if pretrained:
    226         state_dict = load_state_dict_from_url(model_urls[arch],
--> 227                                               progress=progress)
    228         model.load_state_dict(state_dict)
    229     return model
/opt/conda/lib/python3.7/site-packages/torch/hub.py in load_state_dict_from_url(url, model_dir, map_location, progress, check_hash, file_name)
    553             r = HASH_REGEX.search(filename)  # r is Optional[Match[str]]
    554             hash_prefix = r.group(1) if r else None
--> 555         download_url_to_file(url, cached_file, hash_prefix, progress=progress)
    556 
    557     if _is_legacy_zip_format(cached_file):
/opt/conda/lib/python3.7/site-packages/torch/hub.py in download_url_to_file(url, dst, hash_prefix, progress)
    423     # certificates in older Python
    424     req = Request(url, headers={"User-Agent": "torch.hub"})
--> 425     u = urlopen(req)
    426     meta = u.info()
    427     if hasattr(meta, 'getheaders'):
/opt/conda/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    220     else:
    221         opener = _opener
--> 222     return opener.open(url, data, timeout)
    223 
    224 def install_opener(opener):
/opt/conda/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
    523             req = meth(req)
    524 
--> 525         response = self._open(req, data)
    526 
    527         # post-process response
/opt/conda/lib/python3.7/urllib/request.py in _open(self, req, data)
    541         protocol = req.type
    542         result = self._call_chain(self.handle_open, protocol, protocol +
--> 543                                   '_open', req)
    544         if result:
    545             return result
/opt/conda/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    501         for handler in handlers:
    502             func = getattr(handler, meth_name)
--> 503             result = func(*args)
    504             if result is not None:
    505                 return result
/opt/conda/lib/python3.7/urllib/request.py in https_open(self, req)
   1391         def https_open(self, req):
   1392             return self.do_open(http.client.HTTPSConnection, req,
-> 1393                 context=self._context, check_hostname=self._check_hostname)
   1394 
   1395         https_request = AbstractHTTPHandler.do_request_
/opt/conda/lib/python3.7/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
   1350                           encode_chunked=req.has_header('Transfer-encoding'))
   1351             except OSError as err: # timeout error
-> 1352                 raise URLError(err)
   1353             r = h.getresponse()
   1354         except:
URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
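
The traceback shows the mechanism: load_from_checkpoint merges its keyword arguments into the saved hyperparameters and then calls cls(**_cls_kwargs), so __init__ runs again with the saved pretrained value and tries to download the backbone weights. Below is a minimal, standard-library-only sketch of that pattern. ToyClassifier, fake_download, OfflineError, and the dict-based checkpoint are hypothetical stand-ins, not Flash or Lightning code.

```python
"""Toy model of the failure mode in the traceback above (not Flash code)."""


class OfflineError(RuntimeError):
    """Stands in for the URLError raised when there is no network."""


def fake_download(url):
    # Hypothetical stand-in for the weight download: always fails, as if offline.
    raise OfflineError(f"cannot reach {url}")


class ToyClassifier:
    def __init__(self, num_classes, pretrained=True):
        self.hparams = {"num_classes": num_classes, "pretrained": pretrained}
        if pretrained:
            # The constructor fetches weights whenever pretrained=True.
            fake_download("https://example.invalid/resnet50.pth")
        self.weights = None

    @classmethod
    def load_from_checkpoint(cls, checkpoint, **kwargs):
        # Same pattern as in the traceback: saved hyperparameters are updated
        # with the caller's kwargs, then the class is re-instantiated.
        hparams = dict(checkpoint["hyper_parameters"])
        hparams.update(kwargs)
        model = cls(**hparams)
        model.weights = checkpoint["state_dict"]
        return model


checkpoint = {
    "hyper_parameters": {"num_classes": 10, "pretrained": True},
    "state_dict": {"head.weight": [0.1, 0.2]},
}

try:
    ToyClassifier.load_from_checkpoint(checkpoint)
    failed_offline = False
except OfflineError:
    failed_offline = True  # same shape of failure as the reported bug

# Overriding the saved hyperparameter skips the download in this toy model.
model = ToyClassifier.load_from_checkpoint(checkpoint, pretrained=False)
```

If the real ImageClassifier behaves like this sketch, passing pretrained=False to load_from_checkpoint might serve as a workaround, since the traceback shows the keyword arguments being merged into the saved hyperparameters. This has not been verified against Flash itself.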
aribornstein added the bug / fix and help wanted labels Mar 2, 2021
Borda self-assigned this Mar 3, 2021
edenlightning assigned kaushikb11 and unassigned Borda Mar 3, 2021
@kaushikb11
Contributor

Temp fix sent to Ari to unblock for the Task.

Member

Borda commented Mar 4, 2021

> Temp fix sent to Ari to unblock for the Task.

Can you please link the PR?

Contributor

kaushikb11 commented Mar 4, 2021

@Borda The temp fix is this, to check if it's called from _load_model_state.

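import inspect
import sys
# Inside __init__: skip the pretrained download when the caller is _load_model_state.
calling_function = inspect.getframeinfo(sys._getframe(1))[2]
if calling_function == "_load_model_state":
    pretrained = False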
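
A self-contained sketch of how this caller check behaves, using only the standard library. ToyModel and build_directly are hypothetical stand-ins for illustration; only the method name _load_model_state comes from the traceback above. Note that sys._getframe is a CPython implementation detail, so this relies on the call going directly from _load_model_state to __init__.

```python
import inspect
import sys


class ToyModel:
    def __init__(self, pretrained=True):
        # Name of the function that invoked the constructor.
        calling_function = inspect.getframeinfo(sys._getframe(1)).function
        if calling_function == "_load_model_state":
            # Being rebuilt from a checkpoint: the weights come from the
            # checkpoint, so no pretrained download is needed.
            pretrained = False
        self.pretrained = pretrained

    @classmethod
    def _load_model_state(cls, checkpoint):
        # Re-instantiate from saved hyperparameters, as in the traceback.
        return cls(**checkpoint["hyper_parameters"])


def build_directly():
    # A normal construction path: the caller name does not match.
    return ToyModel(pretrained=True)


direct = build_directly()
restored = ToyModel._load_model_state({"hyper_parameters": {"pretrained": True}})
print(direct.pretrained, restored.pretrained)
```

The check is fragile: a subclass that calls super().__init__() changes which frame is inspected, so the caller name would be "__init__" and the override would not apply.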

Contributor

edgarriba commented Mar 25, 2021

@aribornstein can you confirm whether you still get this error with 0.2.2rc2?

Collaborator

ethanwharris commented Apr 22, 2021

@aribornstein I have been able to reproduce this. For anyone interested: you have to delete the cached torchvision models from ~/.cache/torch/hub/checkpoints, and then the error appears. Working on a fix now 😃
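
To check whether a machine is in the state that triggers the error, one can look for cached weight files before going offline. The sketch below assumes torch.hub's documented cache layout ($TORCH_HOME/hub, where $TORCH_HOME defaults to $XDG_CACHE_HOME/torch and $XDG_CACHE_HOME defaults to ~/.cache). The helper names hub_checkpoint_dir and cached_weights are hypothetical, and the code does not import torch.

```python
import os
from pathlib import Path


def hub_checkpoint_dir(env=None):
    """Return the directory where torch.hub is expected to cache weights."""
    env = os.environ if env is None else env
    # Assumed lookup order: TORCH_HOME, then XDG_CACHE_HOME/torch, then ~/.cache/torch.
    cache_home = env.get("XDG_CACHE_HOME") or os.path.join("~", ".cache")
    torch_home = env.get("TORCH_HOME") or os.path.join(cache_home, "torch")
    return Path(torch_home).expanduser() / "hub" / "checkpoints"


def cached_weights(checkpoint_dir):
    """List cached weight files; an empty list means a download would be attempted."""
    checkpoint_dir = Path(checkpoint_dir)
    if not checkpoint_dir.is_dir():
        return []
    return sorted(p.name for p in checkpoint_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    directory = hub_checkpoint_dir()
    print(directory, cached_weights(directory))
```

An empty list on a machine without network access corresponds to the condition described above, where constructing a pretrained backbone has nothing cached to fall back on.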

ethanwharris mentioned this issue Apr 22, 2021