Configurable flags for the backend compiler #1617
Extra: a configurable path to ptxas would also help.
Update: I also notice that pytorch seems to have tried vendoring ptxas at one point, and ceased doing so once triton became their dependency. This is probably good: it is better if pytorch just asks some function from triton rather than guessing on its own.

The ptxas discussion can be moved to #1618.
FWIW, I ran into a related issue when using triton (2.2.0) in a conda environment. The CUDA toolkit is installed in the conda env (rather than system-wide), so the compiler can't find the CUDA libraries on its default link path.
I need to pass the CUDA library path to the compiler; the fix is to add it to the link flags built in triton/python/triton/common/build.py, line 89 (commit c9ab448). For me the path is the conda stubs directory.
The path can be automatically found with:

```python
import os

def conda_cuda_dir():
    # CUDA stubs directory shipped with the conda-installed toolkit.
    conda_path = os.environ['CONDA_PREFIX']
    return os.path.join(conda_path, "lib", "stubs")
```

This specific issue is fixed on the main branch, where an environment variable can be used to point the build at the CUDA library path.
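For illustration, here is a minimal sketch (not triton's actual code) of how such a library-directory lookup could fold the conda stubs path in alongside the usual system defaults; the helper name and the candidate system paths below are assumptions:

```python
import os

def candidate_cuda_lib_dirs():
    # Hypothetical helper: system defaults plus the conda stubs directory,
    # mirroring the conda_cuda_dir() idea above. The system paths listed
    # here are illustrative, not exhaustive.
    dirs = [d for d in ("/usr/lib", "/usr/lib/x86_64-linux-gnu", "/usr/lib64")
            if os.path.isdir(d)]
    conda_prefix = os.environ.get("CONDA_PREFIX")  # set inside an active conda env
    if conda_prefix:
        stubs = os.path.join(conda_prefix, "lib", "stubs")
        if os.path.isdir(stubs):
            dirs.append(stubs)
    return dirs

# The resulting directories would end up as -L flags on the compiler command line, e.g.:
# cc_cmd += [f"-L{d}" for d in candidate_cuda_lib_dirs()]
```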
Hi! I see that openai/triton requires a working toolchain at run-time, including CUDA Toolkit and libpython installations for the host platform. Currently, triton attempts to guess the correct compiler flags on its own: https://github.com/openai/triton/blob/deb2c71fb4f912a5298003fa3fc789885b726607/python/triton/common/build.py#L77-L82

This includes inferring the library locations: https://github.com/openai/triton/blob/deb2c71fb4f912a5298003fa3fc789885b726607/python/triton/common/build.py#L19-L22
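To make this concrete, the sketch below shows roughly what such run-time inference looks like; it is illustrative rather than a verbatim copy of the linked lines, and it assumes `whereis` is available on the host:

```python
import os
import subprocess
import sysconfig

def guess_libcuda_dirs():
    # Ask the system where libcuda.so lives; the first token of the output
    # is the "libcuda.so:" label, the rest are candidate paths.
    out = subprocess.check_output(["whereis", "libcuda.so"]).decode()
    return [os.path.dirname(p) for p in out.split()[1:]]

def guess_python_include_dir():
    # Header directory needed to satisfy "#include <Python.h>".
    return sysconfig.get_paths()["include"]
```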
What this means, in practice, is that openai/triton is taking on a job that is usually performed by tools like CMake, and that certain care has to be taken when deploying openai/triton. The current flag-inference logic is platform-specific and, of course, it isn't expected to be universal either. But we probably should work out a solution for how to make it configurable, so that e.g. distributions can set up their environments to meet triton's expectations.

Some concrete examples of issues that arise:
- On NixOS, the `libcuda.so` user-space driver is deployed in a special location, `/run/opengl-driver/lib`, and `whereis` wouldn't produce any reasonable output because `/lib` and `/usr/lib` do not exist. In python3Packages.torch: 1.13.1 -> 2.0.0 (NixOS/nixpkgs#222273) we end up patching `triton/compiler.py` to pass the correct `-L` flag to the compiler: https://github.com/NixOS/nixpkgs/blob/e4474334415ac41efb5fda33d4cc8f312397ef05/pkgs/development/python-modules/openai-triton/default.nix#L128-L147 (a sketch of this kind of override follows the list). We also have to work around triton trying to vendor a copy of ptxas.
- In pytorch/pytorch there are a number of confused issues about broken `-lcuda` and `#include <Python.h>`.
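As a sketch of the kind of configurability being asked for (an assumed interface, not something triton currently provides under this name): an environment variable that short-circuits the guessing entirely and otherwise falls back to the existing heuristics.

```python
import os

def libcuda_dirs():
    # Hypothetical override: the variable name is made up for illustration.
    explicit = os.environ.get("LIBCUDA_DIR_OVERRIDE")
    if explicit:
        return [explicit]
    # Otherwise fall back to platform-specific guessing. On NixOS the correct
    # answer would be /run/opengl-driver/lib, which no generic heuristic finds.
    return [d for d in ("/usr/lib", "/usr/lib/x86_64-linux-gnu") if os.path.isdir(d)]
```

A distribution could then export one variable (or patch a single default) instead of patching flag-construction code in several places.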
An off-the-shelf way of making the libpython and cuda flags configurable would be `pkg-config`, although I'd feel weird and conflicted about setting up pkg-config at run-time side by side with pytorch. I also note that this situation is somewhat similar to that of `torch.utils.cpp_extension`, which also attempts to guess build flags at run-time.
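For completeness, the pkg-config route would look roughly like the sketch below; the package names queried ("python3", "cuda") are assumptions, since the `.pc` files actually shipped vary across distributions and CUDA packagings.

```python
import subprocess

def pkg_config(package, *flags):
    # Query pkg-config for the requested flags of `package`, e.g. --cflags or --libs.
    out = subprocess.check_output(["pkg-config", *flags, package]).decode()
    return out.split()

# Hypothetical usage when assembling the compiler command line:
# cc_cmd += pkg_config("python3", "--cflags")   # e.g. -I.../include/python3.x
# cc_cmd += pkg_config("cuda", "--libs")        # e.g. -L... -lcuda, if a cuda.pc exists
```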