system has unsupported display driver / cuda driver combination #57

Dadiao-shuai · 2023-09-27T14:09:51Z

python3 train.py -p first_ -d /root/data --dynamic

  0           test first_test0.types
  0          train first_train0.types
  1           test first_test1.types
  1          train first_train1.types
  2           test first_test2.types
  2          train first_train2.types
WRITING solver.36822.prototxt

Traceback (most recent call last):
  File "/root/data/gnina_train/train.py", line 932, in <module>
    results = train_and_test_model(args, train_test_files[i], outname, cont)
  File "/root/data/gnina_train/train.py", line 441, in train_and_test_model
    solver = caffe.get_solver(solverf)
RuntimeError: system has unsupported display driver / cuda driver combination

SYSTEM INFORMATION

nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

ls /usr/local
bin cuda cuda-11 cuda-11.7 etc games include lib man python sbin share src

Kerro-junior · 2023-09-27T14:25:27Z

One suggestion: when running train.py, you may need to specify --gpu 0 to use the GPU in your machine.

Dadiao-shuai · 2023-09-27T14:29:52Z

I tried --gpu 0, but I still get: RuntimeError: system has unsupported display driver / cuda driver combination

For your information, I created this container with: docker run -itd --gpus '"device=0"' ...

dkoes · 2023-09-27T15:14:19Z

Sounds like a driver/cuda mismatch. Perhaps you have updated your drivers recently and need to reboot.

Dadiao-shuai · 2023-09-28T08:14:22Z

I uninstalled CUDA 11.7 and installed 11.6, and nvcc is in my path.
Please check why it still reports a RuntimeError when I run:

python3 train.py -m default2018.model -p first_ -d /root/data -i 1000 --weights crossdock_default2018.caffemodel --gpu 0 --dynamic

 0           test first_test0.types
 0          train first_train0.types
 1           test first_test1.types
 1          train first_train1.types
 2           test first_test2.types
 2          train first_train2.types
WRITING solver.2471.prototxt
Traceback (most recent call last):
 File "/root/data/gnina_train/train.py", line 932, in <module>
   results = train_and_test_model(args, train_test_files[i], outname, cont)
 File "/root/data/gnina_train/train.py", line 439, in train_and_test_model
   caffe.set_device(args.gpu)
RuntimeError: system has unsupported display driver / cuda driver combination
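
Since the failure now occurs at caffe.set_device, before any model or solver is loaded, it may help to reproduce it without caffe at all. The following is a minimal sketch, assuming a Linux system where the user-space driver library is loadable as libcuda.so.1; the helper name probe_cuda_driver is hypothetical and not part of gnina or caffe. It calls the CUDA driver API functions cuInit and cuDriverGetVersion through ctypes. Status code 803 (CUDA_ERROR_SYSTEM_DRIVER_MISMATCH) is the code behind the message "system has unsupported display driver / cuda driver combination".

```python
import ctypes
from typing import Optional, Tuple

CUDA_SUCCESS = 0
CUDA_ERROR_SYSTEM_DRIVER_MISMATCH = 803  # "unsupported display driver / cuda driver combination"


def probe_cuda_driver(libname: str = "libcuda.so.1") -> Optional[Tuple[int, Optional[int]]]:
    """Initialise the CUDA driver directly, bypassing caffe.

    Returns None if the driver library cannot be loaded at all.
    Otherwise returns (status, driver_version): status is the CUresult code,
    and driver_version is e.g. 11080 for CUDA 11.8, or None on failure.
    """
    try:
        libcuda = ctypes.CDLL(libname)
    except OSError:
        return None  # no user-space driver library visible in this environment

    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    status = libcuda.cuInit(0)
    if status != CUDA_SUCCESS:
        return status, None

    version = ctypes.c_int(0)
    libcuda.cuDriverGetVersion.argtypes = [ctypes.POINTER(ctypes.c_int)]
    libcuda.cuDriverGetVersion.restype = ctypes.c_int
    status = libcuda.cuDriverGetVersion(ctypes.byref(version))
    return status, (version.value if status == CUDA_SUCCESS else None)


if __name__ == "__main__":
    result = probe_cuda_driver()
    if result is None:
        print("libcuda.so.1 could not be loaded")
    elif result[0] == CUDA_ERROR_SYSTEM_DRIVER_MISMATCH:
        print("cuInit failed with 803: user-space driver does not match the kernel module")
    else:
        print("status:", result[0], "driver CUDA version:", result[1])
```

If this probe already returns 803 inside the container, the problem lies in the driver setup rather than in train.py, the model file, or the caffe build.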

Dadiao-shuai · 2023-09-28T08:20:20Z

Also, does this have anything to do with the model file? Do I need to modify the following:

        stratify_receptor: true
        stratify_affinity_min: 0
        stratify_affinity_max: 0
        stratify_affinity_step: 1.000000

I also deleted the cachefile lines for receptor and ligand in the model, because I only use the default cross-validation (0, 1, 2) types files.

My nvidia-smi output looks like this:

I believe this driver is sufficient for CUDA 11.6.

Dadiao-shuai · 2023-09-28T08:24:56Z

I installed CUDA 11.6 from the local *.deb package following the official website, then ran apt-get -y install cuda. Do I need to use pip/pip3 to install some CUDA library for python3?

JonasLi-19 · 2023-09-28T08:40:23Z

I'm afraid the caffe you built when installing gnina is not properly linked to your CUDA directory (/usr/local/cuda).

I am not an expert on caffe and CUDA, but you can look at caffe/CMakeLists.txt:

Dadiao-shuai · 2023-09-28T09:16:13Z

> I'm afraid the caffe you built when installing gnina is not properly linked to your CUDA directory (/usr/local/cuda).
>
> I am not an expert on caffe and CUDA, but you can look at caffe/CMakeLists.txt:

I already added /usr/local/python to PYTHONPATH, which is where caffe is found.

I don't remember whether there were any warnings about caffe and CUDA when installing gnina,
but cmake, make, and make install all finished, and I have successfully run gnina to dock ligands.

I guess the problem is with caffe, not the NVIDIA driver, because 520.61.05 is sufficient for CUDA 11.6/11.7.

dkoes · 2023-09-28T13:09:56Z

Are you running inside docker? Does the host system driver match the docker driver?
NVIDIA/nvidia-docker#1256
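
One way to answer the question above from inside the container is to compare the kernel module version with the version of the user-space driver library. This is only a sketch under stated assumptions: it assumes a Linux system where the kernel module reports itself in /proc/driver/nvidia/version and the driver library is installed under a name of the form libcuda.so.<version>. The helper names and the two library directories searched are illustrative choices, not anything provided by gnina or NVIDIA.

```python
import re
from pathlib import Path
from typing import List, Optional

# Matches dotted version strings such as 520.61.05
VERSION_RE = re.compile(r"\b(\d+\.\d+(?:\.\d+)?)\b")


def kernel_module_version(proc_text: str) -> Optional[str]:
    """Extract the driver version from the contents of /proc/driver/nvidia/version."""
    for line in proc_text.splitlines():
        if line.startswith("NVRM version"):
            match = VERSION_RE.search(line)
            return match.group(1) if match else None
    return None


def library_version(filename: str) -> Optional[str]:
    """Return the version suffix of a file named libcuda.so.<version>, else None."""
    match = re.fullmatch(r"libcuda\.so\.(\d+\.\d+(?:\.\d+)?)", filename)
    return match.group(1) if match else None


def find_library_versions(lib_dirs: List[str]) -> List[str]:
    """Collect the distinct libcuda versions present in the given directories."""
    found = set()
    for directory in lib_dirs:
        path = Path(directory)
        if not path.is_dir():
            continue
        for candidate in path.glob("libcuda.so.*"):
            version = library_version(candidate.name)
            if version:
                found.add(version)
    return sorted(found)


if __name__ == "__main__":
    proc = Path("/proc/driver/nvidia/version")
    kernel = kernel_module_version(proc.read_text()) if proc.exists() else None
    # Assumed library locations; adjust for the distribution in use.
    libs = find_library_versions(["/usr/lib/x86_64-linux-gnu", "/usr/lib64"])
    print("kernel module:", kernel)
    print("libcuda versions:", libs)
    if kernel and libs and libs != [kernel]:
        print("mismatch: user-space driver library differs from the kernel module")
```

The kernel module always belongs to the host, so a libcuda version inside the container that differs from it, or more than one libcuda version side by side, would be consistent with the error reported here.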

Dadiao-shuai · 2023-09-28T13:45:48Z

Yes, I run gnina's train.py inside docker. The NVIDIA driver on the host and in docker are both 520.61.05; the host uses CUDA 11.8 and docker uses CUDA 11.6.

I don't think anything is wrong... I was told it's OK to use an older CUDA toolkit in a docker container, right?

Dadiao-shuai · 2023-09-28T18:36:58Z

Is there anything wrong with my Docker? It is version 19.03.13, build 4484c46d9d.

I previously hit a boost::thread_resource_error runtime error when using gnina in the gnina/gnina:latest image, and now I get "system has unsupported display driver / cuda driver combination" when running train.py in another docker container.

I'm fed up with GPU errors! Or should I just run train.py on the CPU?

Dadiao-shuai · 2023-09-30T07:01:19Z

Judging by many GitHub issues, "system has unsupported display driver / cuda driver combination" is a fairly common error when using GPUs in docker.

As you mentioned, this is my ~/.bashrc:

This is what I got in directory related to train.py:

And this is the error report:

WRITING solver.2695.prototxt
Traceback (most recent call last):
  File "/root/data/gnina_iter_train/train.py", line 932, in <module>
    results = train_and_test_model(args, train_test_files[i], outname, cont)
  File "/root/data/gnina_iter_train/train.py", line 441, in train_and_test_model
    solver = caffe.get_solver(solverf)
RuntimeError: system has unsupported display driver / cuda driver combination

You might have noticed the line: WRITING solver.2695.prototxt

@SanFran-Me

Dadiao-shuai · 2023-09-30T07:05:31Z

I just found the problem solver file:

And this is part of traintrain2695.txt:

layer {
  name: "data"
  type: "MolGridData"
  top: "data"
  top: "label"
  top: "affinity"
  top: "rmsd_true"
  include {
    phase: TEST
  }
  molgrid_data_param {
    source: "first_iter_train0.types"
    batch_size: 50
    dimension: 23.5
    resolution: 0.5
    shuffle: false
    balanced: false
    root_folder: "/root/data"
    recmap: "completerec"
    ligmap: "completelig"
    has_affinity: true
    has_rmsd: true
  }
}
layer {
  name: "data"
  type: "MolGridData"
  top: "data"
  top: "label"
  top: "affinity"
  top: "rmsd_true"
  include {
    phase: TRAIN
  }
  molgrid_data_param {
    source: "first_iter_train0.types"
    batch_size: 50
    dimension: 23.5
    resolution: 0.5
    shuffle: true
    balanced: true
    root_folder: "/root/data"
    random_rotation: true
    random_translate: 6.0
    recmap: "completerec"
    ligmap: "completelig"
    has_affinity: true
    has_rmsd: true
    stratify_receptor: true
    stratify_affinity_min: 0.0
    stratify_affinity_max: 0.0
    stratify_affinity_step: 1.0
    jitter: 0.0
  }
}

Additionally, this is my types file example:

1 6.22 0 pdb2019_refi_train_gninatypes/3acl/3acl_rec.gninatypes pdb2019_refi_train_gninatypes/3acl/3acl_ligand.gninatypes
1 6.0096 0.5408 pdb2019_refi_train_gninatypes/4djw/4djw_rec.gninatypes first_pdbbind_v2019_docked_gninatypes/4djw_docked_0.gninatypes
0 0.6729999999999999 7.694 pdb2019_refi_train_gninatypes/4gzw/4gzw_rec.gninatypes first_pdbbind_v2019_docked_gninatypes/4gzw_docked_6.gninatypes
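
To rule out the types files as a cause, each line can be checked against the layout the model expects. The sketch below assumes the five columns are label, affinity, RMSD, receptor path, and ligand path, which is inferred from has_affinity: true and has_rmsd: true in the layer definition above. It also assumes paths are relative to root_folder. TypesRecord and parse_types_line are hypothetical names, not part of gnina.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TypesRecord:
    label: int       # 1 for a good pose, 0 otherwise (assumed meaning)
    affinity: float
    rmsd: float
    receptor: str    # path relative to root_folder
    ligand: str      # path relative to root_folder

    def missing_files(self, root_folder: str) -> list:
        """Return the referenced files that do not exist under root_folder."""
        root = Path(root_folder)
        return [p for p in (self.receptor, self.ligand) if not (root / p).is_file()]


def parse_types_line(line: str) -> TypesRecord:
    """Parse one whitespace-separated line of a types file."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(fields)}: {line!r}")
    label, affinity, rmsd, receptor, ligand = fields[:5]
    return TypesRecord(int(label), float(affinity), float(rmsd), receptor, ligand)


if __name__ == "__main__":
    example = ("1 6.22 0 pdb2019_refi_train_gninatypes/3acl/3acl_rec.gninatypes "
               "pdb2019_refi_train_gninatypes/3acl/3acl_ligand.gninatypes")
    record = parse_types_line(example)
    print(record)
    print("missing under /root/data:", record.missing_files("/root/data"))
```

A malformed line would raise ValueError here, whereas the RuntimeError in this thread comes from the CUDA driver, so a clean parse points away from the data files.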

dkoes · 2023-10-01T01:55:16Z

Update your docker image to match your host.

Dadiao-shuai closed this as completed Nov 3, 2023
