Commit

ptxas: Build with O2 instead of O3
This helps mitigate a performance regression in nvcc > 11.6.
nvcc 11.8 still performs worse than 11.6, but the gap is smaller now.

See #712

__original_commit__ = fairinternal/xformers@42d55eb5f438ec6907836fbd22056a50076f14d5
danthe3rd authored and xFormers Bot committed Apr 24, 2023
1 parent 936da0a commit 1c73b40
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions setup.py
@@ -227,6 +227,10 @@ def get_extensions():
         "-U__CUDA_NO_HALF_CONVERSIONS__",
         "--extended-lambda",
         "-D_ENABLE_EXTENDED_ALIGNED_STORAGE",
+        # Workaround for a regression with nvcc > 11.6
+        # See https://github.com/facebookresearch/xformers/issues/712
+        "--ptxas-options=-O2",
+        "--ptxas-options=-allow-expensive-optimizations=true",
     ] + get_extra_nvcc_flags_for_build_type()
     if os.getenv("XFORMERS_ENABLE_DEBUG_ASSERTIONS", "0") != "1":
         nvcc_flags.append("-DNDEBUG")
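
Note that the hunk above adds the two `--ptxas-options` flags unconditionally, for every nvcc version. For readers who would rather apply the workaround only to affected compilers, here is a minimal sketch of a version-gated variant. It is not what this commit does: the helper names `parse_nvcc_release` and `ptxas_workaround_flags` are hypothetical, and the sketch assumes the version banner printed by `nvcc --version` contains a `release X.Y` token.

```python
import re

# The two flags added by this commit, copied verbatim from the diff.
PTXAS_WORKAROUND_FLAGS = [
    "--ptxas-options=-O2",
    "--ptxas-options=-allow-expensive-optimizations=true",
]


def parse_nvcc_release(version_output):
    """Extract (major, minor) from `nvcc --version` text, or None if absent.

    Hypothetical helper: assumes the banner contains "release X.Y".
    """
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def ptxas_workaround_flags(version_output):
    """Return the workaround flags for nvcc > 11.6, else an empty list.

    If the version cannot be parsed, the flags are returned anyway, which
    matches the commit's unconditional behavior.
    """
    release = parse_nvcc_release(version_output)
    if release is None or release > (11, 6):
        return list(PTXAS_WORKAROUND_FLAGS)
    return []


if __name__ == "__main__":
    banner = "Cuda compilation tools, release 11.8, V11.8.89"
    print(ptxas_workaround_flags(banner))
```

In `setup.py`, the returned list could be concatenated onto `nvcc_flags` in place of the hard-coded entries. The unconditional form in the commit is simpler and avoids depending on the banner format, which is a reasonable trade-off for a temporary workaround.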