Illegal instruction when trying to evaluate pre-trained MCIL model with dataset D #70

Closed
AtteNyyssonen opened this issue Feb 16, 2024 · 5 comments

@AtteNyyssonen

Hello,

I've been trying to run the CALVIN evaluation with dataset D and the pre-trained MCIL model. I am working in a virtual machine running Ubuntu 22.04.3.

I followed the instructions for setting up the conda environment and downloading the dataset and model, but the following command fails:

python ../calvin_models/calvin_agent/evaluation/evaluate_policy.py --dataset_path ./task_D_D/ --train_folder ./D_D_static_rgb_baseline/ --checkpoint ./D_D_static_rgb_baseline/mcil_baseline.ckpt

Output: Illegal instruction (core dumped)

I ran the command from the calvin/dataset/ directory, where I downloaded the pre-trained model and dataset D.

Did I miss something in the instructions? I can't figure out what to do next.

@lukashermann
Collaborator

Hi @AtteNyyssonen ,
Is that the full output that was printed to the command line? (i.e., did it happen in the first line that was printed?)
If not, could you provide us with the complete output log?

@AtteNyyssonen
Author

AtteNyyssonen commented Feb 19, 2024

Hi @lukashermann,

Yes, that is the only thing printed on the command line. Here is the complete output:

(base) atte@atte-VirtualBox: ~ /calvin$ conda activate calvin_venv
(calvin_venv) atte@atte-VirtualBox: ~ /calvin$ cd dataset/
(calvin_venv) atte@atte-VirtualBox:~/calvin/dataset$ python ../calvin_models/calvin_agent/evaluation/evaluate_policy.py --dataset_path ./task_D_D/ --train_folder ./D_D_static_rgb_baseline/ --checkpoint ./D_D_static_rgb_baseline/mcil_baseline.ckpt
Illegal instruction (core dumped)

(I added spaces around the first two ~ characters because GitHub otherwise interpreted them as markup and struck out the text between them.)

@AtteNyyssonen
Author

Hi @lukashermann,

After a lot of trial and error and googling, I figured out that this was caused by Hyper-V still being active because of the Windows 11 Memory Integrity feature (VBS).
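
An "Illegal instruction" crash generally means the process executed a CPU instruction that the (virtual) CPU does not support. One plausible explanation, not confirmed in this thread, is that the guest did not see extensions such as AVX while Hyper-V was active. The sketch below is a way to check which extensions a Linux guest reports; the flag list and the helper name `missing_cpu_flags` are assumptions made for illustration, not part of CALVIN.

```python
from pathlib import Path

# Instruction-set extensions that prebuilt numeric wheels commonly rely on.
# This list is an illustrative assumption, not something CALVIN documents.
WANTED_FLAGS = ("sse4_1", "sse4_2", "avx", "avx2", "fma")


def missing_cpu_flags(cpuinfo_text, wanted=WANTED_FLAGS):
    """Return the wanted flags that the first CPU in cpuinfo_text does not list."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            present = set(line.split(":", 1)[1].split())
            return [flag for flag in wanted if flag not in present]
    # No "flags" line found (e.g. a non-x86 CPU): report everything as missing.
    return list(wanted)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_cpu_flags(cpuinfo.read_text())
        print("missing flags:", ", ".join(missing) if missing else "none")
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

If flags are reported missing inside the VM but are present on the host, the hypervisor configuration is a likely suspect.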

Now I've hit another problem: the correct EGL device can't be found. Running the evaluation command above prints:

Couldn't find correct EGL device. Setting EGL_VISIBLE_DEVICE=0. When using DDP with many GPUs this can lead to OOM errors. Did you install PyBullet correctly? Please refer to calvin env README
argv[0]=--width=200
argv[1]=--height=200
EGL device choice: 0 of 2 (from EGL_VISIBLE_DEVICES)
Loaded EGL 1.4 after reload.
Unable to create EGL context (eglError: 12292)
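
The numeric eglError values in these logs are the decimal form of the error constants in the Khronos EGL header, which are defined in hexadecimal from 0x3000 to 0x300E. A small lookup, sketched here with a hypothetical helper name `egl_error_name`, makes them readable:

```python
# Error codes as defined in the Khronos EGL header (egl.h).
EGL_ERRORS = {
    0x3000: "EGL_SUCCESS",
    0x3001: "EGL_NOT_INITIALIZED",
    0x3002: "EGL_BAD_ACCESS",
    0x3003: "EGL_BAD_ALLOC",
    0x3004: "EGL_BAD_ATTRIBUTE",
    0x3005: "EGL_BAD_CONFIG",
    0x3006: "EGL_BAD_CONTEXT",
    0x3007: "EGL_BAD_CURRENT_SURFACE",
    0x3008: "EGL_BAD_DISPLAY",
    0x3009: "EGL_BAD_MATCH",
    0x300A: "EGL_BAD_NATIVE_PIXMAP",
    0x300B: "EGL_BAD_NATIVE_WINDOW",
    0x300C: "EGL_BAD_PARAMETER",
    0x300D: "EGL_BAD_SURFACE",
    0x300E: "EGL_CONTEXT_LOST",
}


def egl_error_name(code):
    """Translate a decimal eglError value from a log line into its constant name."""
    return EGL_ERRORS.get(code, f"unknown (0x{code:04X})")


print(egl_error_name(12292))  # EGL_BAD_ATTRIBUTE
print(egl_error_name(12297))  # EGL_BAD_MATCH
```

Both codes mean that the device rejected the requested context configuration. On their own they do not say why.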

I looked through the other issues and found one where you asked for the output of these commands:
cd calvin_env/egl_check
bash build.sh # should have been built automatically, but try running this again
python list_egl_options.py

Here is the output:

----------Default-------------
Starting EGL query
Loaded EGL 1.4 after reload.
b'EGL device choice: -1 of 2.\nUnable to create EGL context (eglError: 12297)\n'
number of EGL devices: 2
----------Option #1 (id=0)-------------
Starting EGL query
EGL device choice: 0 of 2 (from EGL_VISIBLE_DEVICE)
Loaded EGL 1.4 after reload.
Unable to create EGL context (eglError: 12297)

----------Option #2 (id=1)-------------
Starting EGL query
EGL device choice: 1 of 2 (from EGL_VISIBLE_DEVICE)
Loaded EGL 1.5 after reload.
GL_VENDOR=Mesa
GL_RENDERER=llvmpipe (LLVM 15.0.7, 256 bits)
GL_VERSION=4.5 (Core Profile) Mesa 23.0.4-0ubuntu1~22.04.1
GL_SHADING_LANGUAGE_VERSION=4.50
Completeing EGL query

Older PyBullet versions were also mentioned as a possible cause. My calvin_venv currently has pybullet 3.2.6.

What could be the cause of this EGL issue?
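
To read the list_egl_options.py output above at a glance, the sketch below collects the device ids that managed to create a GL context, together with the renderer they report. The function name `usable_egl_devices` and the abbreviated `REPORT` string are illustrative assumptions. The parsing relies only on the section headers and `GL_RENDERER=` lines visible in the log.

```python
import re

# Matches section headers such as "----------Option #2 (id=1)-------------".
OPTION_HEADER = re.compile(r"-+Option #\d+ \(id=(\d+)\)-+")

# Abbreviated copy of the output posted above.
REPORT = """\
----------Default-------------
Starting EGL query
number of EGL devices: 2
----------Option #1 (id=0)-------------
Starting EGL query
Unable to create EGL context (eglError: 12297)
----------Option #2 (id=1)-------------
Starting EGL query
GL_VENDOR=Mesa
GL_RENDERER=llvmpipe (LLVM 15.0.7, 256 bits)
"""


def usable_egl_devices(report):
    """Map each EGL device id that produced a GL context to its GL_RENDERER."""
    usable = {}
    current_id = None
    for line in report.splitlines():
        header = OPTION_HEADER.match(line.strip())
        if header:
            current_id = int(header.group(1))
        elif current_id is not None and line.startswith("GL_RENDERER="):
            usable[current_id] = line.split("=", 1)[1]
    return usable


print(usable_egl_devices(REPORT))  # {1: 'llvmpipe (LLVM 15.0.7, 256 bits)'}
```

Here the only device that works is id=1, and its renderer is llvmpipe, Mesa's software rasterizer, so it renders on the CPU rather than on a GPU.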

@lukashermann
Collaborator

Hi @AtteNyyssonen, which GPU do you have? We have only tested the code on machines with Nvidia GPUs. In case you do have an Nvidia GPU, maybe you need to reinstall the drivers.

@AtteNyyssonen
Author

Yes, that was the issue. The VM can't access my Nvidia GPU and was using a virtualized one, which caused this error. I'm closing this issue, as the problems were solved by switching to WSL.
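
For anyone hitting the same error, a quick way to check whether an environment (VM, WSL, or bare metal) can see an Nvidia GPU at all is to query nvidia-smi. This is a minimal sketch, assuming the Nvidia driver tools are installed; the helper name `visible_nvidia_gpus` is hypothetical.

```python
import shutil
import subprocess


def visible_nvidia_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed or not on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []  # e.g. the driver is present but no device is reachable
    return [name.strip() for name in result.stdout.splitlines() if name.strip()]


print(visible_nvidia_gpus())
```

An empty list means the environment has no Nvidia GPU it can reach, which matches the situation described for the VirtualBox guest.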
