Error when trying to train #5

Closed
jpsml opened this issue Jan 14, 2022 · 6 comments

jpsml commented Jan 14, 2022

I am getting the following error when I try to run training. How should I proceed to solve it?

(mvp) jpsml@jpsml-ubuntu:~/mvp$ python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/campus/mvp_campus.yaml


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


Traceback (most recent call last):
  File "run/train_3d.py", line 34, in <module>
    import dataset
  File "/home/jpsml/mvp/run/../lib/dataset/__init__.py", line 20, in <module>
    from dataset.h36m import H36M as h36m
  File "/home/jpsml/mvp/run/../lib/dataset/h36m.py", line 30, in <module>
    from lib.utils.cameras_cpu import camera_to_world_frame, project_pose
ModuleNotFoundError: No module named 'lib'
[The same traceback is printed seven more times, once for each of the remaining worker processes.]
Traceback (most recent call last):
  File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/home/jpsml/anaconda3/envs/mvp/bin/python', '-u', 'run/train_3d.py', '--cfg', 'configs/campus/mvp_campus.yaml']' returned non-zero exit status 1.

twangnh (Collaborator) commented Jan 17, 2022

please add lib to the PYTHONPATH

jpsml (Author) commented Jan 17, 2022

thanks

Taylorminer commented

@jpsml I have the same problem. I tried to add lib to the PYTHONPATH, for example by adding "import sys; sys.path.append('/media/chen-group/9400EADF00EAC778/cy/mvp-main/lib/')" to validate_3d.py and by creating a .pth file in the Python site-packages directory, but both attempts failed. Can you tell me how to fix it?

jpsml (Author) commented Feb 8, 2022

Run the following in the terminal before running the training:

export PYTHONPATH="${PYTHONPATH}:/home/jpsml/mvp/lib"

In your case, replace "/home/jpsml/mvp" with the local path where mvp is located.

Taylorminer commented

> Run the following in the terminal before running the training:
>
> export PYTHONPATH="${PYTHONPATH}:/home/jpsml/mvp/lib"
>
> In your case, replace "/home/jpsml/mvp" with the local path where mvp is located.

I ran that line in the terminal, and 'print(sys.path)' shows the lib path in sys.path. But I still get the error:
ModuleNotFoundError: No module named 'lib'
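
One likely reason the error persists even though the lib directory shows up in sys.path: an import such as `from lib.utils.cameras_cpu import ...` needs a path entry that contains a directory named lib, not the lib directory itself. The sketch below checks this with the standard library's `importlib.machinery.PathFinder` against a throwaway directory tree. The layout is a hypothetical stand-in for the repository, not the real checkout, and `can_import_lib` is a name made up for illustration.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def can_import_lib(path_entry):
    """Return True if `import lib` would resolve using only this path entry."""
    return PathFinder.find_spec("lib", [str(path_entry)]) is not None


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the layout: <repo>/lib/utils/cameras_cpu.py
    repo = Path(tmp) / "mvp"
    (repo / "lib" / "utils").mkdir(parents=True)
    (repo / "lib" / "utils" / "cameras_cpu.py").write_text("")

    # Roughly what PYTHONPATH=<repo>/lib provides.
    lib_dir_works = can_import_lib(repo / "lib")
    # Roughly what PYTHONPATH=<repo> provides.
    repo_root_works = can_import_lib(repo)

print("with <repo>/lib on the path:", lib_dir_works)
print("with <repo> on the path:", repo_root_works)
```

Under these assumptions the first check should come out False and the second True, which matches the behaviour reported in this thread.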

jpsml (Author) commented Feb 10, 2022

Sorry, I believe the correct command is the following:

export PYTHONPATH="${PYTHONPATH}:/home/jpsml/mvp"
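
The same effect can probably be had from inside a script, which is roughly what the sys.path.append attempt earlier in the thread was aiming for, except that the directory added should be the parent of lib. A minimal sketch, assuming the script lives one level below the repository root (as run/train_3d.py does in the traceback). The helper name `add_repo_root` is made up for illustration and is not part of the project.

```python
import sys
from pathlib import Path


def add_repo_root(script_path):
    """Put the parent of the script's directory (the repo root) on sys.path.

    Hypothetical helper: assumes a layout like <repo>/run/<script>.py
    sitting next to <repo>/lib/.
    """
    repo_root = str(Path(script_path).resolve().parent.parent)
    if repo_root not in sys.path:
        # Insert at the front so this checkout is searched first.
        sys.path.insert(0, repo_root)
    return repo_root


if __name__ == "__main__":
    # In a real script this would be add_repo_root(__file__),
    # called before the first `import dataset`.
    print(add_repo_root("/home/jpsml/mvp/run/train_3d.py"))
```

Unlike the export command, this only affects the process that runs the script, so each entry point (training, validation) would need the same call.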
