nightly.sh: build with local compute capability (#6608)
The default compute capabilities are 35 and 52, and XLA's NCCL build fails on CUDA 12 because CUDA 12 no longer supports capability 35.

Fix it by setting the appropriate environment variable (TF_CUDA_COMPUTE_CAPABILITIES) to the local GPU's capability. An alternative would be to set many capabilities, but that would increase build times.
cota authored Feb 26, 2024
1 parent a9a788d commit 4327d24
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion benchmarks/nightly.sh
@@ -96,7 +96,10 @@ if [[ ${IS_FRESH_RUN?} ]]; then
 
 # Set up pytorch/xla
 cd pytorch/xla
-XLA_CUDA=1 python setup.py develop
+# Query local compute capability. If that fails, assign a sane default.
+LOCAL_CAP=compute_$(nvidia-smi --query-gpu=compute_cap --format=csv | \
+             tail -1 | sed 's/\.//g' | grep -E '^[0-9]{2}$' || echo '80')
+XLA_CUDA=1 TF_CUDA_COMPUTE_CAPABILITIES=${LOCAL_CAP:?} python setup.py develop
 cd ../..
 
 # Set up torchbench deps.
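The capability query in the patch can be exercised on its own. The sketch below simulates the output of `nvidia-smi --query-gpu=compute_cap --format=csv` (the sample value `8.0` and the `fake_nvidia_smi` helper are assumptions for illustration, not part of the patch) and runs the same normalization pipeline, including the fallback path taken when the query produces nothing usable.

```shell
#!/bin/sh
# Hypothetical stand-in for `nvidia-smi --query-gpu=compute_cap --format=csv`:
# a CSV header row followed by one capability per GPU (sample value assumed).
fake_nvidia_smi() {
    printf 'compute_cap\n8.0\n'
}

# Same normalization as the patch: take the last row, strip the dot,
# and require exactly two digits; otherwise fall back to 80.
LOCAL_CAP=compute_$(fake_nvidia_smi | \
    tail -1 | sed 's/\.//g' | grep -E '^[0-9]{2}$' || echo '80')
echo "${LOCAL_CAP:?}"

# An empty query result fails the grep, exercising the `|| echo '80'` fallback.
FALLBACK=compute_$(printf '' | \
    tail -1 | sed 's/\.//g' | grep -E '^[0-9]{2}$' || echo '80')
echo "${FALLBACK:?}"
```

Both paths here happen to yield `compute_80`: the simulated GPU reports `8.0`, which normalizes to `80`, and the failed query falls back to the same default.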
