
Adding optional CUDA DLLs when installing onnxruntime_gpu #22506

Closed · wants to merge 50 commits

Conversation

jchen351 (Contributor)

Description

This code change enables users to install NVIDIA CUDA DLLs when installing onnxruntime_gpu, via `pip install onnxruntime_gpu[cuda_dlls]`.

It also enables onnxruntime_gpu to use dynamic libraries under site-packages/nvidia (.dll files on Windows, .so files on Linux) by temporarily updating the environment variables within an ORT inference session.

Motivation and Context

Request by

@snnn snnn requested a review from jywu-msft October 22, 2024 01:16
@snnn (Member) commented Oct 23, 2024:

There are some test failures. Please fix them. We will remove the "orttraining-linux-gpu-ci-pipeline"; the others still need to be taken care of.

@github-actions bot (Contributor) left a comment:

You can commit the suggested changes from lintrunner.

@jchen351 jchen351 requested a review from tianleiwu November 7, 2024 22:31
@jchen351 jchen351 closed this Nov 11, 2024
@jchen351 jchen351 requested a review from tianleiwu December 17, 2024 02:05
@jchen351 jchen351 requested a review from tianleiwu December 19, 2024 03:53
setup.py (outdated):
```
if cuda_version:
    f.write(f"cuda_version = '{cuda_version}'\n")
# cudart_versions are integers
cudart_versions = find_cudart_versions(build_env=True)
```
@tianleiwu (Contributor) commented Dec 20, 2024:

find_cudart_versions only works on Linux. I think we can add a check for Linux before calling find_cudart_versions, to avoid a warning message on Windows.
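The guard the reviewer suggests could look roughly like this (a sketch; `find_cudart_versions` is the setup.py helper and is stubbed here only so the example runs):

```python
import platform


def find_cudart_versions(build_env=False):
    # Stub standing in for the real setup.py helper, which relies on
    # Linux-only tooling; assume it returns a list of integer versions.
    return [12]


# Only call the Linux-only helper on Linux, so that Windows builds
# never hit the warning path inside it.
cudart_versions = (
    find_cudart_versions(build_env=True) if platform.system() == "Linux" else []
)
```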



```
# Load nvidia libraries from site-packages/nvidia if the package is onnxruntime-gpu
if cuda_version is not None and cuda_version != "":
```
@tianleiwu (Contributor) commented Dec 20, 2024:

In my test, cuda_version is still an empty string. It is imported from onnxruntime.capi.onnxruntime_validation at line 73. That module only outputs cuda_version for training, as below:

```
cuda_version = ""
if has_ortmodule:
```

We can remove the `if has_ortmodule` check there.

A maintainer (Member) replied:

Does it mean the following code usually won't get executed?

@jchen351 jchen351 requested a review from tianleiwu December 22, 2024 02:33

```
try:  # noqa: SIM105
    from .build_and_package_info import cuda_version
except Exception:
    pass
```

Check notice — Code scanning / CodeQL: Empty except (Note)

'except' clause does nothing but pass and there is no explanatory comment.
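One way to address the CodeQL note (a sketch, not the PR's actual fix) is to make the intent explicit with `contextlib.suppress`, or to keep the bare `pass` but add the explanatory comment CodeQL asks for:

```python
import contextlib

cuda_version = None  # default for CPU-only builds
with contextlib.suppress(ImportError):
    # build_and_package_info is generated at wheel-build time and may be
    # absent; silently fall back to the default in that case.
    # (Top-level module name used here so the sketch is importable on its own.)
    from build_and_package_info import cuda_version
```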
tianleiwu
tianleiwu previously approved these changes Dec 22, 2024
@snnn snnn requested a review from Copilot December 30, 2024 23:05


Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

@snnn (Member) commented Dec 30, 2024:

@gedoensmax , any comment?

@snnn (Member) commented Dec 31, 2024:

@jchen351, I tried to run nightly pipelines with your changes, but there were some failures. Could you please update your branch with main so that I can re-run the pipelines and check whether the problem still exists? Before merging this PR, we should generate some test packages and manually test them locally.

@snnn (Member) commented Jan 9, 2025:

@jchen351 , I tried the new package, but it didn't work.

```
# pip install onnxruntime-gpu[cuda_dlls] --pre --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/
Looking in indexes: https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/
Collecting onnxruntime-gpu[cuda_dlls]
  Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-gpu/1.21.dev20250108002/onnxruntime_gpu-1.21.0.dev20250108002-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (291.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 291.9/291.9 MB 11.5 MB/s eta 0:00:00
Requirement already satisfied: flatbuffers in /usr/local/lib/python3.10/dist-packages (from onnxruntime-gpu[cuda_dlls]) (24.12.23)
Requirement already satisfied: protobuf in /usr/local/lib/python3.10/dist-packages (from onnxruntime-gpu[cuda_dlls]) (5.29.3)
Requirement already satisfied: numpy>=1.21.6 in /usr/local/lib/python3.10/dist-packages (from onnxruntime-gpu[cuda_dlls]) (2.2.1)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from onnxruntime-gpu[cuda_dlls]) (1.13.3)
Requirement already satisfied: coloredlogs in /usr/local/lib/python3.10/dist-packages (from onnxruntime-gpu[cuda_dlls]) (15.0.1)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from onnxruntime-gpu[cuda_dlls]) (24.2)
Requirement already satisfied: humanfriendly>=9.1 in /usr/local/lib/python3.10/dist-packages (from coloredlogs->onnxruntime-gpu[cuda_dlls]) (10.0)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy->onnxruntime-gpu[cuda_dlls]) (1.3.0)
Installing collected packages: onnxruntime-gpu
Successfully installed onnxruntime-gpu-1.21.0.dev20250108002
```

Could you please verify?

```
)
else:
    logging.info(f"Unsupported platform: {platform.system()}")
check_and_load_cuda_libs(nvidia_path, cuda_libs)
```
A Contributor left a comment:
Please move this code into a function like preload_cuda_libs() and let the user call it explicitly (by default, it is not called).

Example usage:

```
import onnxruntime
onnxruntime.preload_cuda_libs()
```
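An opt-in loader along the lines the reviewer describes might be sketched as follows (the name `preload_cuda_libs` comes from the suggestion; the implementation details are assumptions, not the PR's actual code):

```python
import ctypes
import glob
import os
import platform
import site


def preload_cuda_libs(nvidia_path=None):
    """Explicitly load CUDA libraries from site-packages/nvidia.

    Nothing happens at import time; the user opts in by calling this.
    Returns the list of library paths that were successfully loaded.
    """
    if nvidia_path is None:
        nvidia_path = os.path.join(site.getsitepackages()[-1], "nvidia")
    pattern = "*.dll" if platform.system() == "Windows" else "*.so*"
    loaded = []
    for lib in glob.glob(os.path.join(nvidia_path, "**", pattern), recursive=True):
        try:
            ctypes.CDLL(lib)
            loaded.append(lib)
        except OSError:
            pass  # optional library missing; errors surface at session creation
    return loaded
```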

@tianleiwu tianleiwu dismissed their stale review February 11, 2025 22:57

see new comment

tianleiwu added a commit that referenced this pull request Feb 14, 2025
…age (#23659)

### Description
Add extra requires for cuda/cudnn DLLs to onnxruntime-gpu python
package.

When building the wheel, make sure to add the CUDA version parameter to the build command line, e.g. `--cuda_version 12.8`.

Note that we only add extra requires for CUDA 12 for now. If a package is built with CUDA 11, no extra requires are added.

Examples to install extra DLLs from wheel:
```
pip install onnxruntime_gpu-1.21.0-cp310-cp310-linux_x86_64.whl[cuda,cudnn]
```

To install the cuDNN DLLs but not the CUDA DLLs:
```
pip install onnxruntime_gpu-1.21.0-cp310-cp310-linux_x86_64.whl[cudnn]
```

Example section in METADATA file of dist-info:
```
Provides-Extra: cuda
Requires-Dist: nvidia-cuda-nvrtc-cu12~=12.0; extra == "cuda"
Requires-Dist: nvidia-cuda-runtime-cu12~=12.0; extra == "cuda"
Requires-Dist: nvidia-cufft-cu12~=11.0; extra == "cuda"
Requires-Dist: nvidia-curand-cu12~=10.0; extra == "cuda"
Provides-Extra: cudnn
Requires-Dist: nvidia-cudnn-cu12~=9.0; extra == "cudnn"
...
```
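The extras shown in the METADATA example could be produced by a setup.py fragment roughly like this (a sketch with a hypothetical helper name; the package list is copied from the METADATA sample above, and the CUDA-12-only behavior matches the note):

```python
def build_extras(cuda_version):
    """Compute the extras_require mapping passed to setuptools.

    Only CUDA 12 builds get extras; a CUDA 11 (or unknown) build
    yields an empty mapping, so no extra requires are added.
    """
    extras = {}
    if cuda_version and cuda_version.startswith("12"):
        extras["cuda"] = [
            "nvidia-cuda-nvrtc-cu12~=12.0",
            "nvidia-cuda-runtime-cu12~=12.0",
            "nvidia-cufft-cu12~=11.0",
            "nvidia-curand-cu12~=10.0",
        ]
        extras["cudnn"] = ["nvidia-cudnn-cu12~=9.0"]
    return extras
```

The returned mapping would then be passed as `extras_require=` to `setuptools.setup()`.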

### Motivation and Context

Jian had a PR: #22506. This adds only part of that change. The extra changes include updating the Windows GPU Python packaging pipeline to pass the CUDA version to the build command line.

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
tianleiwu added a commit that referenced this pull request Feb 15, 2025
### Description

Changes:
(1) Pass --cuda_version in the packaging pipeline to the build wheel command line so that cuda_version can be saved. Note that cuda_version is also required for generating extra_require in #23659.
(2) Update setup.py and onnxruntime_validation.py to save the CUDA version to capi/build_and_package_info.py.
(3) Add a helper function to preload dependent DLLs (MSVC, CUDA, cuDNN) in `__init__.py`. It first tries to load DLLs from the nvidia site packages, then loads the remaining DLLs with default path settings.

```
import onnxruntime
onnxruntime.preload_dlls()
```

To show the loaded DLLs, set `verbose=True`. It is also possible to disable loading certain types of DLLs:
```
onnxruntime.preload_dlls(cuda=False, cudnn=False, msvc=False, verbose=True)
```

#### PyTorch and onnxruntime on Windows

When working with PyTorch, onnxruntime will reuse the CUDA and cuDNN DLLs already loaded by PyTorch, as long as the CUDA and cuDNN major versions are compatible. Preloading DLLs might actually cause issues on Windows (see examples 2 and 3 below).

Example 1: onnxruntime and torch can work together easily. 
```
>>> import torch
>>> import onnxruntime
>>> session = onnxruntime.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
>>> onnxruntime.preload_dlls(cuda=False, cudnn=False, msvc=False, verbose=True)
----List of loaded DLLs----
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\curand64_10.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cufft64_11.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_heuristic64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_engines_precompiled64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_ops64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_adv64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cublasLt64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cublas64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\nvrtc64_120_0.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\nvrtc-builtins64_124.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_engines_runtime_compiled64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_cnn64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_graph64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\numpy.libs\msvcp140-d64049c6e3865410a7dda6a7e9f0c575.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudart64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn64_9.dll
D:\anaconda3\envs\py310\msvcp140.dll
D:\anaconda3\envs\py310\msvcp140_1.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cufftw64_11.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\caffe2_nvrtc.dll
D:\anaconda3\envs\py310\vcruntime140_1.dll
D:\anaconda3\envs\py310\vcruntime140.dll
>>> session.get_providers()
['CUDAExecutionProvider', 'CPUExecutionProvider']
```

Example 2: Calling preload_dlls after `import torch` is unnecessary. Unfortunately, multiple DLLs with the same filename end up loaded. They can be used in parallel, but this is not ideal since more memory is used.
```
>>> import torch
>>> import onnxruntime
>>> onnxruntime.preload_dlls(verbose=True)
----List of loaded DLLs----
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cufft\bin\cufft64_11.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cublas\bin\cublas64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cublas\bin\cublasLt64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\curand64_10.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cufft64_11.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_heuristic64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_engines_precompiled64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_ops64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_adv64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cublasLt64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cublas64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\nvrtc64_120_0.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\nvrtc-builtins64_124.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_engines_runtime_compiled64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_cnn64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn_graph64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cudnn\bin\cudnn_graph64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cuda_runtime\bin\cudart64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\numpy.libs\msvcp140-d64049c6e3865410a7dda6a7e9f0c575.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudart64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cudnn\bin\cudnn64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cudnn64_9.dll
D:\anaconda3\envs\py310\msvcp140_1.dll
D:\anaconda3\envs\py310\msvcp140.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\cufftw64_11.dll
D:\anaconda3\envs\py310\Lib\site-packages\torch\lib\caffe2_nvrtc.dll
D:\anaconda3\envs\py310\vcruntime140_1.dll
D:\anaconda3\envs\py310\vcruntime140.dll
```

Example 3: Calling preload_dlls before `import torch` might cause a torch import error on Windows. Later we may provide an option to load DLLs from the torch directory to avoid this issue.
```
>>> import onnxruntime
>>> onnxruntime.preload_dlls(verbose=True)
----List of loaded DLLs----
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cufft\bin\cufft64_11.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cublas\bin\cublas64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cublas\bin\cublasLt64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cudnn\bin\cudnn_graph64_9.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cuda_runtime\bin\cudart64_12.dll
D:\anaconda3\envs\py310\Lib\site-packages\numpy.libs\msvcp140-d64049c6e3865410a7dda6a7e9f0c575.dll
D:\anaconda3\envs\py310\Lib\site-packages\nvidia\cudnn\bin\cudnn64_9.dll
D:\anaconda3\envs\py310\msvcp140.dll
D:\anaconda3\envs\py310\vcruntime140_1.dll
D:\anaconda3\envs\py310\msvcp140_1.dll
D:\anaconda3\envs\py310\vcruntime140.dll
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\anaconda3\envs\py310\lib\site-packages\torch\__init__.py", line 137, in <module>
    raise err
OSError: [WinError 127] The specified procedure could not be found. Error loading "D:\anaconda3\envs\py310\lib\site-packages\torch\lib\cudnn_adv64_9.dll" or one of its dependencies.
```

#### PyTorch and onnxruntime on Linux

On Linux, PyTorch uses the nvidia site packages for its CUDA and cuDNN libraries. Preloading consistently loads the same set of libraries, which helps with maintenance.

```
>>> import onnxruntime
>>> onnxruntime.preload_dlls(verbose=True)
----List of loaded DLLs----
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn.so.9
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_graph.so.9
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cufft/lib/libcufft.so.11
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/curand/lib/libcurand.so.10
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cublas/lib/libcublas.so.12
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cublas/lib/libcublasLt.so.12
>>> import torch
>>> torch.rand(3, 3).cuda()
tensor([[0.4619, 0.0279, 0.2092],
        [0.0416, 0.6782, 0.5889],
        [0.9988, 0.9092, 0.7982]], device='cuda:0')
>>> session = onnxruntime.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
>>> session.get_providers()
['CUDAExecutionProvider', 'CPUExecutionProvider']
```

```
>>> import torch
>>> import onnxruntime
>>> session = onnxruntime.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
>>> onnxruntime.preload_dlls(cuda=False, cudnn=False, msvc=False, verbose=True)
----List of loaded DLLs----
/cuda12.8/targets/x86_64-linux/lib/libnvrtc.so.12.8.61
/cudnn9.7/lib/libcudnn_graph.so.9.7.0
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cublas/lib/libcublasLt.so.12
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cublas/lib/libcublas.so.12
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/curand/lib/libcurand.so.10
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cufft/lib/libcufft.so.11
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn.so.9
/anaconda3/envs/py310/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12
```

Without preloading, onnxruntime will load the CUDA and cuDNN libraries based on `LD_LIBRARY_PATH`, and torch will reuse the same libraries loaded by onnxruntime:
```
>>> import onnxruntime
>>> session = onnxruntime.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
>>> onnxruntime.preload_dlls(cuda=False, cudnn=False, msvc=False, verbose=True)
----List of loaded DLLs----
/cuda12.8/targets/x86_64-linux/lib/libnvrtc.so.12.8.61
/cuda12.8/targets/x86_64-linux/lib/libcufft.so.11.3.3.41
/cuda12.8/targets/x86_64-linux/lib/libcurand.so.10.3.9.55
/cuda12.8/targets/x86_64-linux/lib/libcublas.so.12.8.3.14
/cuda12.8/targets/x86_64-linux/lib/libcublasLt.so.12.8.3.14
/cudnn9.7/lib/libcudnn_graph.so.9.7.0
/cudnn9.7/lib/libcudnn.so.9.7.0
/cuda12.8/targets/x86_64-linux/lib/libcudart.so.12.8.57
>>> import torch
>>> onnxruntime.preload_dlls(cuda=False, cudnn=False, msvc=False, verbose=True)
----List of loaded DLLs----
/cuda12.8/targets/x86_64-linux/lib/libnvrtc.so.12.8.61
/cuda12.8/targets/x86_64-linux/lib/libcufft.so.11.3.3.41
/cuda12.8/targets/x86_64-linux/lib/libcurand.so.10.3.9.55
/cuda12.8/targets/x86_64-linux/lib/libcublas.so.12.8.3.14
/cuda12.8/targets/x86_64-linux/lib/libcublasLt.so.12.8.3.14
/cudnn9.7/lib/libcudnn_graph.so.9.7.0
/cudnn9.7/lib/libcudnn.so.9.7.0
/cuda12.8/targets/x86_64-linux/lib/libcudart.so.12.8.57
>>> torch.rand(3, 3).cuda()
tensor([[0.2233, 0.9194, 0.8078],
        [0.0906, 0.2884, 0.3655],
        [0.6249, 0.2904, 0.4568]], device='cuda:0')
>>> onnxruntime.preload_dlls(cuda=False, cudnn=False, msvc=False, verbose=True)
----List of loaded DLLs----
/cuda12.8/targets/x86_64-linux/lib/libnvrtc.so.12.8.61
/cuda12.8/targets/x86_64-linux/lib/libcufft.so.11.3.3.41
/cuda12.8/targets/x86_64-linux/lib/libcurand.so.10.3.9.55
/cuda12.8/targets/x86_64-linux/lib/libcublas.so.12.8.3.14
/cuda12.8/targets/x86_64-linux/lib/libcublasLt.so.12.8.3.14
/cudnn9.7/lib/libcudnn_graph.so.9.7.0
/cudnn9.7/lib/libcudnn.so.9.7.0
/cuda12.8/targets/x86_64-linux/lib/libcudart.so.12.8.57
```

### Motivation and Context
In many reported issues of `import onnxruntime` failures, the root cause is that dependent DLLs are missing or not on the path. This change makes it easier to resolve those issues.

This is based on Jian's PR #22506, with an extra change to load the MSVC DLLs.

#23659 can be used to install the CUDA/cuDNN DLLs into site packages. Example command line after the next official release (1.21):
```
pip install onnxruntime-gpu[cuda,cudnn]
```

If the user installed PyTorch on Linux, those libraries are usually installed together with torch.

```
cuda_version_ = tuple(map(int, cuda_version.split(".")))
# Get the site-packages path where nvidia packages are installed
site_packages_path = site.getsitepackages()[-1]
```

A reviewer commented:

Shouldn't we check all site-packages directories? We might also check whether things like nvidia.cudnn and nvidia.cudnn.__path__ are importable (via `import nvidia.cudnn` or importlib).
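The reviewer's suggestion could be sketched with `importlib.util.find_spec`, which resolves each nvidia.* package wherever it is installed, instead of assuming `site.getsitepackages()[-1]` (a sketch; the function name and package list are illustrative):

```python
import importlib.util


def find_nvidia_package_dirs(names=("nvidia.cudnn", "nvidia.cublas")):
    """Locate nvidia.* packages across all site-packages directories.

    Uses importlib to resolve each package, which also serves as the
    importability check the reviewer mentions. Returns a mapping from
    package name to its directory list; missing packages are skipped.
    """
    found = {}
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except ModuleNotFoundError:
            continue  # parent package (e.g. nvidia) not installed
        if spec is not None and spec.submodule_search_locations:
            found[name] = list(spec.submodule_search_locations)
    return found
```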

@tianleiwu tianleiwu closed this Feb 17, 2025