
Cuda not available in pytorch/torchserve:0.2.0-cuda10.1-cudnn7-runtime Docker image #640

Closed
sidd-pandey opened this issue Aug 24, 2020 · 0 comments · Fixed by #642
sidd-pandey commented Aug 24, 2020

Context

A model deployed using the latest image does not use the GPU for inference. Torch reports that CUDA is not available.

Your Environment

  • Installed using source? No
  • Are you planning to deploy it using docker container? Yes (pytorch/torchserve:0.2.0-cuda10.1-cudnn7-runtime)
  • Is it a CPU or GPU environment?: GPU
  • Using a default/custom handler?: Default handler (image_classifier)
  • What kind of model is it e.g. vision, text, audio? densenet161 (the example from the documentation; see the archiver sketch after this list)
  • Are you planning to use local models from model-store or public url being used e.g. from S3 bucket etc.?: Local
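
For reference, the archive was built with the stock image_classifier handler, roughly following the densenet161 example from the TorchServe documentation. This is only a sketch of that command; the model-file path, the weights filename, and the model_store directory are assumptions based on the docs, not details from this report:

```bash
# Package densenet161 with the default image_classifier handler
# (sketch: file paths and weights filename are placeholders from the docs example)
torch-model-archiver --model-name densenet161 \
    --version 1.0 \
    --model-file examples/image_classifier/densenet_161/model.py \
    --serialized-file densenet161-8d451a50.pth \
    --extra-files examples/image_classifier/index_to_name.json \
    --handler image_classifier \
    --export-path model_store
```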

Expected Behavior

CUDA should be available, and GPU support should work out of the box in the official image.

Current Behavior

Torch cannot detect CUDA, so inference ends up running on the CPU.
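
For context, CUDA availability inside the container can be confirmed with a one-liner; the image is the one from this report, the check itself is generic:

```bash
# Run inside the pytorch/torchserve:0.2.0-cuda10.1-cudnn7-runtime container:
# prints the torch version, the CUDA version torch was built against, and whether a GPU is visible.
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
# On the broken image this prints something like: 1.6.0 10.2 False
```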

Possible Solution

Fix the torch installation command at
https://github.com/pytorch/serve/blob/master/docker/Dockerfile#L50
Instead of `RUN pip install --no-cache-dir torch torchvision`, it should be `RUN pip install --no-cache-dir torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html`. I was able to fix the issue this way. Another solution could be to switch the default base image to one with CUDA 10.2.

The default pip install command installs the PyTorch build for CUDA 10.2, while, going by the image tag, it should be the PyTorch build for CUDA 10.1.
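
Concretely, the change in docker/Dockerfile would look like this (both lines are the ones quoted above; the exact line number may have moved on master):

```dockerfile
# Before: resolves to the default PyTorch wheels, which are built against CUDA 10.2
RUN pip install --no-cache-dir torch torchvision

# After: pin the CUDA 10.1 builds so they match the cuda10.1-cudnn7 base image
RUN pip install --no-cache-dir torch==1.6.0+cu101 torchvision==0.7.0+cu101 \
    -f https://download.pytorch.org/whl/torch_stable.html
```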

Steps to Reproduce

This should be reproducible in any environment where CUDA 10.1 is configured. Deploy a model using the official image and check whether the GPU is being used.
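
A minimal reproduction could look like the following; the --gpus flag assumes the NVIDIA container toolkit is installed, and the model-store mount path is the one used in the TorchServe Docker docs, not something stated in this report:

```bash
# Start TorchServe from the official GPU image with a local model store mounted in
docker run --rm -d --gpus all \
    -v $(pwd)/model_store:/home/model-server/model-store \
    pytorch/torchserve:0.2.0-cuda10.1-cudnn7-runtime

# In another shell, check whether torch inside the container can see the GPU
docker exec <container-id> python -c "import torch; print(torch.cuda.is_available())"
```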

Environment Details Before Fix:

>> python -m "torch.utils.collect_env"
Collecting environment information...
PyTorch version: 1.6.0
Is debug build: No
CUDA used to build PyTorch: 10.2
OS: Ubuntu 18.04.3 LTS
GCC version: Could not collect
CMake version: Could not collect
Python version: 3.6
Is CUDA available: No
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Tesla M60
Nvidia driver version: 418.87.00
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.6.0
[pip3] torch-model-archiver==0.2.0
[pip3] torchserve==0.2.0
[pip3] torchtext==0.7.0
[pip3] torchvision==0.7.0

Environment Details After Fix:

python -m "torch.utils.collect_env"
Collecting environment information...
PyTorch version: 1.6.0+cu101
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: Could not collect
CMake version: Could not collect

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Tesla M60
Nvidia driver version: 418.87.00
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.6.0+cu101
[pip3] torch-model-archiver==0.2.0
[pip3] torchserve==0.2.0
[pip3] torchtext==0.7.0
[pip3] torchvision==0.7.0+cu101
[conda] Could not collect
@harshbafna harshbafna self-assigned this Aug 24, 2020
@harshbafna harshbafna added the bug Something isn't working label Aug 24, 2020
@harshbafna harshbafna added this to the v0.3.0 milestone Aug 24, 2020