nightly.sh: build with local compute capability (#6608)
The default compute capabilities are 35 and 52, and XLA's NCCL build fails on CUDA 12 because CUDA 12 no longer supports capability 35.

Fix it by setting the appropriate environment variable (TF_CUDA_COMPUTE_CAPABILITIES) to the local GPU's capability. An alternative would be to set many capabilities, but that would increase build times.
cota authored Feb 26, 2024
1 parent a9a788d commit 4327d24
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion benchmarks/nightly.sh
@@ -96,7 +96,10 @@ if [[ ${IS_FRESH_RUN?} ]]; then
 
 # Set up pytorch/xla
 cd pytorch/xla
-XLA_CUDA=1 python setup.py develop
+# Query local compute capability. If that fails, assign a sane default.
+LOCAL_CAP=compute_$(nvidia-smi --query-gpu=compute_cap --format=csv | \
+             tail -1 | sed 's/\.//g' | grep -E '^[0-9]{2}$' || echo '80')
+XLA_CUDA=1 TF_CUDA_COMPUTE_CAPABILITIES=${LOCAL_CAP:?} python setup.py develop
 cd ../..
 
 # Set up torchbench deps.
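The capability query in the patch can be exercised on its own. The sketch below simulates the output of `nvidia-smi --query-gpu=compute_cap --format=csv` (the sample value `8.0` and the `fake_nvidia_smi` helper are assumptions for illustration, not part of the patch) and runs the same normalization pipeline, including the fallback path taken when the query produces nothing usable.

```shell
#!/bin/sh
# Hypothetical stand-in for `nvidia-smi --query-gpu=compute_cap --format=csv`:
# a CSV header row followed by one capability per GPU (sample value assumed).
fake_nvidia_smi() {
    printf 'compute_cap\n8.0\n'
}

# Same normalization as the patch: take the last row, strip the dot,
# and require exactly two digits; otherwise fall back to 80.
LOCAL_CAP=compute_$(fake_nvidia_smi | \
    tail -1 | sed 's/\.//g' | grep -E '^[0-9]{2}$' || echo '80')
echo "${LOCAL_CAP:?}"

# An empty query result fails the grep, exercising the `|| echo '80'` fallback.
FALLBACK=compute_$(printf '' | \
    tail -1 | sed 's/\.//g' | grep -E '^[0-9]{2}$' || echo '80')
echo "${FALLBACK:?}"
```

Both paths here happen to yield `compute_80`: the simulated GPU reports `8.0`, which normalizes to `80`, and the failed query falls back to the same default.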
