WandDBLogger ddp race condition #1972
Comments
When are you guys planning the 7.7 release?
0.8.0 will be released next Friday (June 12). This will be the last release before the stable 1.0.0 release slated for mid-July. Did this PR solve the problem you were having? Test out HF + PL on master now to make sure it works well. Mainly, we made it so that pickling is no longer an issue :) This DDP implementation is much more scalable and works much better on single-node instances.
Testing it now. pip install https://github.com/PytorchLightning/pytorch-lightning/archive/master.zip --upgrade fails with AttributeError: type object 'Callable' has no attribute '_abc_registry'
----------------------------------------
ERROR: Command errored out with exit status 1: /home/shleifer/miniconda3/envs/nb/bin/python /home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-dvi7mzvd/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel Check the logs for full command output.
pip install git+https://github.com/PytorchLightning/pytorch-lightning.git@master --upgrade also fails:
(nb) ➜ ~ pip install git+https://github.com/PytorchLightning/pytorch-lightning.git@master --upgrade
Collecting git+https://github.com/PytorchLightning/pytorch-lightning.git@master
Cloning https://github.com/PytorchLightning/pytorch-lightning.git (to revision master) to /tmp/pip-req-build-q98ryeoi
Running command git clone -q https://github.com/PytorchLightning/pytorch-lightning.git /tmp/pip-req-build-q98ryeoi
Installing build dependencies ... error
ERROR: Command errored out with exit status 1:
command: /home/shleifer/miniconda3/envs/nb/bin/python /home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-le91y8yf/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel
cwd: None
Complete output (44 lines):
Traceback (most recent call last):
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/__main__.py", line 26, in <module>
sys.exit(_main())
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_internal/cli/main.py", line 73, in main
command = create_command(cmd_name, isolated=("--isolated" in cmd_args))
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_internal/commands/__init__.py", line 104, in create_command
module = importlib.import_module(module_path)
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 24, in <module>
from pip._internal.cli.req_command import RequirementCommand, with_cleanup
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_internal/cli/req_command.py", line 16, in <module>
from pip._internal.index.package_finder import PackageFinder
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_internal/index/package_finder.py", line 21, in <module>
from pip._internal.index.collector import parse_links
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_internal/index/collector.py", line 14, in <module>
from pip._vendor import html5lib, requests
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_vendor/requests/__init__.py", line 114, in <module>
from . import utils
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_vendor/requests/utils.py", line 25, in <module>
from . import certs
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_vendor/requests/certs.py", line 15, in <module>
from pip._vendor.certifi import where
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_vendor/certifi/__init__.py", line 1, in <module>
from .core import contents, where
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip/_vendor/certifi/core.py", line 12, in <module>
from importlib.resources import read_text
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/importlib/resources.py", line 11, in <module>
from typing import Iterable, Iterator, Optional, Set, Union # noqa: F401
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/typing.py", line 1357, in <module>
class Callable(extra=collections_abc.Callable, metaclass=CallableMeta):
File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/typing.py", line 1005, in __new__
self._abc_registry = extra._abc_registry
AttributeError: type object 'Callable' has no attribute '_abc_registry'
----------------------------------------
ERROR: Command errored out with exit status 1: /home/shleifer/miniconda3/envs/nb/bin/python /home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-le91y8yf/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel Check the logs for full command output.
@sshleifer can't replicate. Install works fine. https://colab.research.google.com/drive/1G-UZqDxkORegvy0oXgYU6IENN3Y9PQPz?usp=sharing Sounds like you have something weird in your env. I googled "type object 'Callable' has no attribute '_abc_registry'". FYI, we test installs on every PR for all operating systems and Python versions. I haven't seen a broken test about install, so it's likely your env.
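The traceback above shows `typing.py` being loaded from `site-packages` rather than from the standard library. On Python 3.7 this is often caused by the PyPI `typing` backport being installed alongside the built-in module. Below is a minimal sketch for checking an environment for that stray file. The helper name `find_typing_backport` is made up for illustration, and the check only looks for the file; it does not prove that the file is the cause.

```python
import os
import sys


def find_typing_backport(search_dirs):
    """Return the paths of any typing.py found directly in search_dirs.

    The standard-library typing module lives under lib/pythonX.Y/, so a
    typing.py sitting in a site-packages directory is a likely backport.
    """
    hits = []
    for directory in search_dirs:
        candidate = os.path.join(directory, "typing.py")
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Only inspect third-party package directories on the import path.
    site_dirs = [
        p for p in sys.path
        if os.path.basename(p.rstrip(os.sep)) in ("site-packages", "dist-packages")
    ]
    for path in find_typing_backport(site_dirs):
        print("possible typing backport:", path)
```

If the script prints a path, removing the backport with `pip uninstall typing` is the commonly reported remedy, though whether it resolves this particular environment is an assumption.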
I did I get terminal output
and my multi_gpu unittest also hangs with a similar message, even though I never pass a Let me know if anything jumps out there, otherwise I'll debug.
ah yes. lightning now calls the script, passing in the gpus flag. but if you called it, you would have specified gpus, so it's weird that this is not working. can you share a colab?
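If the script is relaunched with a gpus flag, as described above, its argument parser has to accept that flag or the child process will exit on an unrecognized argument. Here is a minimal sketch of a tolerant parser. It assumes the flag is spelled `--gpus` and takes an integer, which the thread does not confirm; `build_parser`, `parse_args`, and `--max_epochs` are hypothetical names for illustration.

```python
import argparse


def build_parser():
    """Build the parser for a hypothetical training script."""
    parser = argparse.ArgumentParser(description="hypothetical training script")
    # Assumption: the relaunched process is given --gpus <int>.
    parser.add_argument("--gpus", type=int, default=0)
    parser.add_argument("--max_epochs", type=int, default=1)
    return parser


def parse_args(argv=None):
    """Parse known flags and return (args, unknown) instead of exiting.

    parse_known_args collects unrecognized arguments in a list, so a
    flag appended by a launcher does not abort the script.
    """
    return build_parser().parse_known_args(argv)


if __name__ == "__main__":
    args, unknown = parse_args()
    print("gpus:", args.gpus, "ignored:", unknown)
```

Printing the ignored arguments in each process is a cheap way to see which flags the launcher actually added.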
I can't share my code, but I am using There is another bug on master in loading of checkpoints, potentially related to upgrading pl mid run?
When I run Trainer(gpus=2, logger=WandbLogger()), it tries to connect to wandb twice (I think), and then one of those attempts fails.
How to fix?
[original feature] (#627)
Code
Epic traceback
here's the 2nd connection
Then error.
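One common way to avoid each DDP process opening its own connection is to run the connecting code only in the rank-zero process. The following is a generic sketch of that guard, not the fix that was merged for this issue. It assumes the launcher exposes the process rank through a `LOCAL_RANK` environment variable, and `FakeRemoteLogger` is a made-up stand-in that does not call wandb.

```python
import functools
import os


def rank_zero_only(fn):
    """Run fn only when LOCAL_RANK is 0 or unset; otherwise return None."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # Assumption: the launcher sets LOCAL_RANK for every child process.
        if int(os.environ.get("LOCAL_RANK", "0")) == 0:
            return fn(*args, **kwargs)
        return None
    return wrapper


class FakeRemoteLogger:
    """Hypothetical logger that should open one connection per run."""

    connections = 0  # counts how many times connect() really ran

    @rank_zero_only
    def connect(self):
        FakeRemoteLogger.connections += 1
        return "connected"


if __name__ == "__main__":
    logger = FakeRemoteLogger()
    print(logger.connect(), FakeRemoteLogger.connections)
```

With this guard, a process started with `LOCAL_RANK=1` skips `connect()` entirely, so only one connection attempt is made per run.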