
Cuda not available in pytorch/torchserve:0.2.0-cuda10.1-cudnn7-runtime Docker image #640

Closed
sidd-pandey opened this issue Aug 24, 2020 · 0 comments · Fixed by #642
sidd-pandey commented Aug 24, 2020

Context

A model deployed using the latest image does not use the GPU for inference. Torch reports that CUDA is not available.

Your Environment

  • Installed using source? No
  • Are you planning to deploy it using docker container? Yes (pytorch/torchserve:0.2.0-cuda10.1-cudnn7-runtime)
  • Is it a CPU or GPU environment?: GPU
  • Using a default/custom handler?: Default handler (image_classifier)
  • What kind of model is it e.g. vision, text, audio? densenet161 (the example from the documentation; see the archiver sketch after this list)
  • Are you planning to use local models from model-store or public url being used e.g. from S3 bucket etc.?: Local
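
For reference, the archive was built with the stock image_classifier handler, roughly following the densenet161 example from the TorchServe documentation. This is only a sketch of that command; the model-file path, the weights filename, and the model_store directory are assumptions based on the docs, not details from this report:

```bash
# Package densenet161 with the default image_classifier handler
# (sketch: file paths and weights filename are placeholders from the docs example)
torch-model-archiver --model-name densenet161 \
    --version 1.0 \
    --model-file examples/image_classifier/densenet_161/model.py \
    --serialized-file densenet161-8d451a50.pth \
    --extra-files examples/image_classifier/index_to_name.json \
    --handler image_classifier \
    --export-path model_store
```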

Expected Behavior

CUDA should be available, and GPU support should work out of the box in the official image.

Current Behavior

Torch cannot detect CUDA, so inference ends up running on the CPU.
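
For context, CUDA availability inside the container can be confirmed with a one-liner; the image is the one from this report, the check itself is generic:

```bash
# Run inside the pytorch/torchserve:0.2.0-cuda10.1-cudnn7-runtime container:
# prints the torch version, the CUDA version torch was built against, and whether a GPU is visible.
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
# On the broken image this prints something like: 1.6.0 10.2 False
```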

Possible Solution

Fix the torch installation command at
https://github.com/pytorch/serve/blob/master/docker/Dockerfile#L50
Instead of `RUN pip install --no-cache-dir torch torchvision`, it should be `RUN pip install --no-cache-dir torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html`. I was able to fix the issue this way. Another solution could be to switch the default base image to one with CUDA 10.2.

The default pip install command installs the PyTorch build for CUDA 10.2, while, going by the image tag, it should be the PyTorch build for CUDA 10.1.
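
Concretely, the change in docker/Dockerfile would look like this (both lines are the ones quoted above; the exact line number may have moved on master):

```dockerfile
# Before: resolves to the default PyTorch wheels, which are built against CUDA 10.2
RUN pip install --no-cache-dir torch torchvision

# After: pin the CUDA 10.1 builds so they match the cuda10.1-cudnn7 base image
RUN pip install --no-cache-dir torch==1.6.0+cu101 torchvision==0.7.0+cu101 \
    -f https://download.pytorch.org/whl/torch_stable.html
```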

Steps to Reproduce

This should be reproducible in any environment where CUDA 10.1 is configured. Deploy a model using the official image and check whether the GPU is being used.
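
A minimal reproduction could look like the following; the --gpus flag assumes the NVIDIA container toolkit is installed, and the model-store mount path is the one used in the TorchServe Docker docs, not something stated in this report:

```bash
# Start TorchServe from the official GPU image with a local model store mounted in
docker run --rm -d --gpus all \
    -v $(pwd)/model_store:/home/model-server/model-store \
    pytorch/torchserve:0.2.0-cuda10.1-cudnn7-runtime

# In another shell, check whether torch inside the container can see the GPU
docker exec <container-id> python -c "import torch; print(torch.cuda.is_available())"
```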

Environment Details Before Fix:

>> python -m "torch.utils.collect_env"
Collecting environment information...
PyTorch version: 1.6.0
Is debug build: No
CUDA used to build PyTorch: 10.2
OS: Ubuntu 18.04.3 LTS
GCC version: Could not collect
CMake version: Could not collect
Python version: 3.6
Is CUDA available: No
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Tesla M60
Nvidia driver version: 418.87.00
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.6.0
[pip3] torch-model-archiver==0.2.0
[pip3] torchserve==0.2.0
[pip3] torchtext==0.7.0
[pip3] torchvision==0.7.0

Environment Details After Fix:

python -m "torch.utils.collect_env"
Collecting environment information...
PyTorch version: 1.6.0+cu101
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: Could not collect
CMake version: Could not collect

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Tesla M60
Nvidia driver version: 418.87.00
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.6.0+cu101
[pip3] torch-model-archiver==0.2.0
[pip3] torchserve==0.2.0
[pip3] torchtext==0.7.0
[pip3] torchvision==0.7.0+cu101
[conda] Could not collect
@harshbafna harshbafna self-assigned this Aug 24, 2020
@harshbafna harshbafna added the bug Something isn't working label Aug 24, 2020
@harshbafna harshbafna added this to the v0.3.0 milestone Aug 24, 2020