[Bugfix][CI/Build] Fix docker build where CUDA archs < 7.0 are being detected #9254
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
This is good by me
LGTM, thanks for the fix!
I saw one typo in a comment, but I think we should just land this to get the fix in.
#
# For cuda we want to be able to control which architectures we compile for on
# a per-file basis in order to cut down on compile time. So here we extract
# the set of architectures we want to compile for and remove the from the
# CMAKE_CUDA_FLAGS so that they are not applied globally.
#
Suggested change:
#
# For cuda we want to be able to control which architectures we compile for on
# a per-file basis in order to cut down on compile time. So here we extract
# the set of architectures we want to compile for and remove them from the
# CMAKE_CUDA_FLAGS so that they are not applied globally.
#
…detected (vllm-project#9254) Signed-off-by: Alvant <alvasian@yandex.ru>
…detected (vllm-project#9254) Signed-off-by: Amit Garg <mitgarg17495@gmail.com>
…detected (vllm-project#9254) Signed-off-by: Sumit Dubey <sumit.dubey2@ibm.com>
…detected (vllm-project#9254) Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
FIX #9226
In this Dockerfile the archs detected by torch are `5.0;8.0;8.6;8.9;9.0;9.0a`. We should ensure we prune off `5.0` for the kernels where we build for all target archs. This PR does that by pre-filtering `CUDA_ARCHS` (target archs) by `CUDA_SUPPORTED_ARCHS`.

(PR also includes some logging improvements / comments)
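For readers who want to see the pre-filtering idea in isolation, here is a minimal Python sketch of the logic. It is not the PR's CMake code: the function name `filter_supported_archs`, the contents of the supported list, and the choice to match `9.0a` through its base version `9.0` are all assumptions made for illustration.

```python
def filter_supported_archs(target_archs: str, supported_archs: str) -> str:
    """Keep only the target archs whose base version appears in the supported list.

    Both arguments are semicolon-separated strings, mirroring CMake list syntax.
    """
    supported = set(supported_archs.split(";"))
    kept = []
    for arch in target_archs.split(";"):
        # Assumption: an arch-specific variant such as "9.0a" is accepted
        # when its base version "9.0" is supported.
        base = arch.rstrip("a")
        if base in supported:
            kept.append(arch)
    return ";".join(kept)


# Archs detected by torch, as quoted in the PR description.
detected = "5.0;8.0;8.6;8.9;9.0;9.0a"
# Hypothetical supported list; the real CUDA_SUPPORTED_ARCHS is defined in the build files.
supported = "7.0;7.5;8.0;8.6;8.9;9.0"

print(filter_supported_archs(detected, supported))
```

With these inputs `5.0` is dropped because it is not in the supported list, while the remaining archs pass through unchanged.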
Verified using: