This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

TypeError: optimizers must be either a single optimizer or a list of optimizers. #18

Closed
guerriep opened this issue Aug 24, 2020 · 20 comments

Comments

@guerriep

guerriep commented Aug 24, 2020

Hello,

I'm trying to run main_swav.py with the following command:

python -m torch.distributed.launch --nproc_per_node=1 main_swav.py --images_path=<path to data directory> --train_annotations_path <path to data file> --epochs 400 --base_lr 0.6 --final_lr 0.0006 --warmup_epochs 0 --batch_size 32 --size_crops 224 96 --nmb_crops 2 6 --min_scale_crops 0.14 0.05 --max_scale_crops 1. 0.14 --use_fp16 true --freeze_prototypes_niters 5005 --queue_length 3840 --epoch_queue_starts 15

Some of those parameters have been added to accommodate our data. The only changes I have made to the code are minor changes to the dataset and additional/changed arguments. When I run this command I get the following error:

Traceback (most recent call last):
  File "main_swav.py", line 380, in <module>
    main()
  File "main_swav.py", line 189, in main
    model, optimizer = apex.amp.initialize(model, optimizer, opt_level="O1")
  File "/opt/conda/lib/python3.6/site-packages/apex/amp/frontend.py", line 358, in initialize
    return _initialize(models, optimizers, _amp_state.opt_properties, num_losses, cast_model_outputs)
  File "/opt/conda/lib/python3.6/site-packages/apex/amp/_initialize.py", line 158, in _initialize
    raise TypeError("optimizers must be either a single optimizer or a list of optimizers.")
TypeError: optimizers must be either a single optimizer or a list of optimizers.

Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python', '-u', 'main_swav.py', '--local_rank=0', '--images_path=/data/computer_vision_projects/rare_planes/classification_data/images/', '--train_annotations_path', '/data/computer_vision_projects/rare_planes/classification_data/annotations/instances_train_role_mislabel_category_id_033_chipped.json', '--epochs', '400', '--base_lr', '0.6', '--final_lr', '0.0006', '--warmup_epochs', '0', '--batch_size', '32', '--size_crops', '224', '96', '--nmb_crops', '2', '6', '--min_scale_crops', '0.14', '0.05', '--max_scale_crops', '1.', '0.14', '--use_fp16', 'true', '--freeze_prototypes_niters', '5005', '--queue_length', '3840', '--epoch_queue_starts', '15']' returned non-zero exit status 1.
make: *** [Makefile:69: train-rare-planes] Error 1

Immediately before the line that throws the error I placed a couple of print statements:

print("type(OPTIMIZER)", type(optimizer))
print("OPTIMIZER", optimizer)

The output from those is:
type(OPTIMIZER) <class 'apex.parallel.LARC.LARC'>
OPTIMIZER SGD (
Parameter Group 0
    dampening: 0
    lr: 0.6
    momentum: 0.9
    nesterov: False
    weight_decay: 1e-06
)

Here are some version numbers I'm using:
  • Python 3.6.9 :: Anaconda, Inc.
  • PyTorch == 1.5.0a0+8f84ded
  • torchvision == 0.6.0a0
  • CUDA == 10.2
  • apex == 0.1

Any ideas why I would be seeing this error? Thanks in advance!

@mathildecaron31
Contributor

mathildecaron31 commented Sep 7, 2020

Hi @guerriep,
It seems that the amp initialize method does not recognize your optimizer, which is strange since it has the right type (LARC). Can you add print statements in your apex library around this line https://github.com/NVIDIA/apex/blob/4ef930c1c884fdca5f472ab2ce7cb9b505d26c1a/apex/amp/_initialize.py#L149 in order to understand why ('LARC' in globals() and isinstance(optimizers, LARC)) does not return True?

@mathildecaron31
Contributor

No activity, so I am closing the issue. Feel free to re-open if you need further assistance.

@John-P

John-P commented Oct 12, 2020

I have encountered the same issue. I added prints in apex:

type(optimizers) <class 'apex.parallel.LARC.LARC'>
else hit

That line is also slightly different for me (lines 148-160):

    print("type(optimizers)", type(optimizers))
    if isinstance(optimizers, torch.optim.Optimizer) or ('LARC' in sys.modules and isinstance(optimizers, LARC)):
        print("isinstance LARC True")
        optimizers = [optimizers]
    elif optimizers is None:
        optimizers = []
    elif isinstance(optimizers, list):
        optimizers_was_list = True
        check_optimizers(optimizers)
    else:
        print("else hit")
        check_optimizers([optimizers])
        raise TypeError("optimizers must be either a single optimizer or a list of optimizers.")

I have also checked: isinstance(optimizers, LARC) does indeed return True, but 'LARC' in sys.modules is False.
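A minimal sketch of why that membership test can fail (assuming it is a plain key lookup, as in the excerpt above): Python registers modules in sys.modules under their full dotted path, so importing apex.parallel.LARC never creates a bare 'LARC' key.

```python
import sys
import types

# Register a stand-in module the way Python itself does on import:
# sys.modules is keyed by the full dotted path, never the bare leaf name.
sys.modules["apex.parallel.LARC"] = types.ModuleType("apex.parallel.LARC")

print("LARC" in sys.modules)                 # False: bare name is absent
print("apex.parallel.LARC" in sys.modules)   # True: full path is the key
```

So the guard only passes when something has put a module literally named LARC into sys.modules, which a normal `from apex.parallel.LARC import LARC` does not do.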

Version numbers:

  • python 3.8.3
  • apex 0.1
  • pytorch 1.6.0
  • CUDA 11

@kdexd

kdexd commented Oct 18, 2020

+1 facing the same issue, following this thread.

@kdexd

kdexd commented Oct 20, 2020

NVIDIA/apex#978 is probably related.

@GuoleiSun
Copy link

+1, facing the same issue. Any ideas how to solve it?

@John-P

John-P commented Dec 16, 2020

@mathildecaron31 Is it possible to re-open this issue as it appears to be affecting a number of people and is unresolved? Would you also be able to share version numbers for libraries in order to re-create your environment?

@mathildecaron31
Contributor

I tested this code with:

  • python 3.6.6
  • apex commit: 4a1aa97e31ca87514e17c3cd3bbc03f4204579d0
  • torch 1.4.0
  • cuda 10.1

@mathildecaron31
Contributor

Here is how I installed apex:

git clone "https://github.com/NVIDIA/apex"
cd apex
git checkout 4a1aa97e31ca87514e17c3cd3bbc03f4204579d0
python setup.py install --cuda_ext

python -c 'import apex; from apex.parallel import LARC' # should run and return nothing
python -c 'import apex; from apex.parallel import SyncBatchNorm; print(SyncBatchNorm.__module__)' # should run and return apex.parallel.optimized_sync_batchnorm

Hope that helps

@John-P

John-P commented Dec 23, 2020

I have been able to get it to run with these specific versions now. I am still a bit curious as to why it does not work with newer versions of apex.

For others trying to replicate, these are my steps using anaconda and pip:

conda create --name=swav python=3.6.6
# CUDA 10.1 with torchvision
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
# NVCC for CUDA 10.1
conda install -c conda-forge cudatoolkit-dev=10.1.243 pandas opencv
# Pip should return a path with env name in it
which pip
# Apex commit 4a1aa97e31ca87514e17c3cd3bbc03f4204579d0 with cuda extensions enabled
pip install git+git://github.com/NVIDIA/apex.git@4a1aa97e31ca87514e17c3cd3bbc03f4204579d0 --install-option="--cuda_ext"

@kaushal-py

Here is how I installed apex:

git clone "https://github.com/NVIDIA/apex"
cd apex
git checkout 4a1aa97e31ca87514e17c3cd3bbc03f4204579d0
python setup.py install --cuda_ext

python -c 'import apex; from apex.parallel import LARC' # should run and return nothing
python -c 'import apex; from apex.parallel import SyncBatchNorm; print(SyncBatchNorm.__module__)' # should run and return apex.parallel.optimized_sync_batchnorm

Hope that helps

Thanks a lot! This worked for me. The specific apex version seems to be an important dependency for the code to run. It would be helpful if this could be added to the README.

@mathildecaron31
Contributor

027a54a

@DreamMemory001

[quotes mathildecaron31's apex install instructions and kaushal-py's reply above]

That has not fixed my bug. I am getting: AttributeError: module 'torch.distributed' has no attribute 'deprecated'. I have no other ideas — has anyone checked this? Please help me, thank you.

@ayl

ayl commented Mar 14, 2021

To build on @John-P's work:

For building apex, make sure you have gcc > 5 and < 8. For example, the NVIDIA Docker container nvidia/cuda:10.1-base (Ubuntu 18.04) has gcc 7.5, and I was able to build apex successfully there.

conda create --name=swav python=3.6.6
conda activate swav
# CUDA 10.1 with torchvision
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
# NVCC for CUDA 10.1
conda install -c conda-forge cudatoolkit-dev=10.1.243  pandas opencv numpy scipy
# Pip should return a path with env name in it
which pip
# Apex commit 4a1aa97e31ca87514e17c3cd3bbc03f4204579d0 with cuda extensions enabled
pip install git+git://github.com/NVIDIA/apex.git@4a1aa97e31ca87514e17c3cd3bbc03f4204579d0 --install-option="--cuda_ext"
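To make the gcc constraint above checkable, here is a small sketch; the 6-7 range comes from the comment above, and the helper name is made up:

```shell
# Hypothetical helper: warn when the gcc major version is outside
# the > 5 and < 8 range reported to work for building apex.
check_gcc_ok() {
  major="$1"
  if [ "$major" -gt 5 ] && [ "$major" -lt 8 ]; then
    echo "gcc $major: OK for apex build"
  else
    echo "gcc $major: outside the 6-7 range, build may fail"
  fi
}

check_gcc_ok 7
# On a real machine: check_gcc_ok "$(gcc -dumpversion | cut -d. -f1)"
```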

@escorciav

The comment above did it for me, but the last line should be:
pip install --upgrade-strategy only-if-needed git+https://github.com/NVIDIA/apex.git@4a1aa97e31ca87514e17c3cd3bbc03f4204579d0 --install-option="--cuda_ext"

Compiled on a cluster without sudo, only Ubuntu 18.04 and the NVIDIA drivers 😉

@ClaudiaShu

[quotes mathildecaron31's apex install instructions above]

Hi, can I compile apex with CUDA 11.1?
I got this error when compiling:

torch.__version__  =  1.8.1+cu111
setup.py:46: UserWarning: Option --pyprof not specified. Not installing PyProf dependencies!
  warnings.warn("Option --pyprof not specified. Not installing PyProf dependencies!")

Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
from /usr/bin

Traceback (most recent call last):
  File "setup.py", line 106, in <module>
    check_cuda_torch_binary_vs_bare_metal(torch.utils.cpp_extension.CUDA_HOME)
  File "setup.py", line 76, in check_cuda_torch_binary_vs_bare_metal
    raise RuntimeError("Cuda extensions are being compiled with a version of Cuda that does " +
RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries.  Pytorch binaries were compiled with Cuda 11.1.
In some cases, a minor-version mismatch will not cause later errors:  https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.  You can try commenting out this check (at your own risk).

I'm working on an RTX 3090, and it only supports CUDA 11 or above.
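One thing worth trying before commenting out the check: the log above shows nvcc being picked up from /usr/bin at release 10.1, while the PyTorch wheel was built with CUDA 11.1. Pointing the build at a matching toolkit may resolve the mismatch. A sketch (the install path is an assumption; adjust to your system):

```shell
# Assumed toolkit location; adjust to wherever CUDA 11.1 is installed.
export CUDA_HOME=/usr/local/cuda-11.1
export PATH="$CUDA_HOME/bin:$PATH"

# Confirm which nvcc the apex build would now see.
if [ -x "$CUDA_HOME/bin/nvcc" ]; then
  "$CUDA_HOME/bin/nvcc" --version
else
  echo "nvcc not found at $CUDA_HOME/bin/nvcc"
fi
```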

@zcalhoun

I found that Mathilde's suggestion required a slight change to make the dependencies play nicely together: use pip to install the checked-out version of apex.

git clone "https://github.com/NVIDIA/apex"
cd apex
git checkout 4a1aa97e31ca87514e17c3cd3bbc03f4204579d0
pip install -v --disable-pip-version-check --no-cache-dir ./

python -c 'import apex; from apex.parallel import LARC' # should run and return nothing
python -c 'import apex; from apex.parallel import SyncBatchNorm; print(SyncBatchNorm.__module__)' # should run and return apex.parallel.optimized_sync_batchnorm

@yousuf907

If anyone is getting an error for the "from torch._six import container_abcs" line (line 14 in apex's "_amp_state.py"), you can replace that line with "import collections.abc as container_abcs" and it should work.
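The replacement can also be written as a guarded import, so the same file works on both old and new PyTorch. A sketch:

```python
# Fallback shim: torch._six was removed in newer PyTorch releases,
# but it only re-exported the stdlib collections.abc module anyway.
try:
    from torch._six import container_abcs  # older PyTorch
except ImportError:
    import collections.abc as container_abcs  # newer PyTorch, or no torch

# Downstream code keeps using container_abcs unchanged, e.g.:
print(issubclass(dict, container_abcs.Mapping))  # True
```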

@GSusan

GSusan commented Jun 3, 2023 via email

@Taoww21480

If you encounter a "from torch._six import string_classes" error (line 2 in apex's "_initialize.py"), comment out that line and replace it with a simple assignment:

replace "from torch._six import string_classes" with "string_classes = str".
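Same pattern as the container_abcs fix above: a guarded import keeps the file compatible with both old and new PyTorch. A sketch:

```python
# torch._six.string_classes was just `str` on Python 3; newer PyTorch
# removed the module, so fall back to str directly.
try:
    from torch._six import string_classes  # older PyTorch
except ImportError:
    string_classes = str

print(isinstance("abc", string_classes))  # True
```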
