[Bug]: Device capability check incorrectly sets cuDNN benchmark on cards it should not or on multi-GPU systems, causes non-deterministic results
#12879
Closed
catboxanon opened this issue on Aug 31, 2023 · 0 comments · Fixed by #12924
Is there an existing issue for this?
What happened?
stable-diffusion-webui/modules/devices.py
Lines 61 to 64 in 5ef669d
The check above enables the cuDNN benchmark on cards it should not, as the title describes. The linked PR was intended to apply only to 16XX-series cards, but a CUDA compute capability of 7.5 covers more than just the 16XX series: https://developer.nvidia.com/cuda-gpus
As noted in the PyTorch docs, enabling this also makes results non-deterministic. https://pytorch.org/docs/stable/notes/randomness.html#cuda-convolution-benchmarking
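For context, the check in question is roughly of this shape (a minimal sketch, not the exact code at 5ef669d):

```python
import torch

# Approximate shape of the current check: if *any* visible GPU reports
# compute capability 7.5, the cuDNN benchmark is enabled globally,
# regardless of which device the webui actually runs on.
if torch.cuda.is_available():
    if any(
        torch.cuda.get_device_capability(devid) == (7, 5)
        for devid in range(torch.cuda.device_count())
    ):
        torch.backends.cudnn.benchmark = True
```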
This is particularly problematic because 1) I have a 2080 Ti, which supports fp16 operations just fine, so this enables the option when it shouldn't, and 2) I don't actually run the webui with my 2080 Ti but with my 3090, as I have two GPUs installed (this happens because the check looks for any() GPU that has this compute capability).
Steps to reproduce the problem
Run the webui with any non-16XX card that has a compute capability of 7.5 (e.g. a 2080 Ti), or with such a card installed anywhere in the system. Note that results between webui restarts will never be 1:1 because of the non-deterministic behavior.
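To see which cards the check can match, an illustrative snippet like the following (run in the webui's Python environment) lists each GPU's compute capability and the resulting flag:

```python
import torch

# Print every visible GPU with its compute capability; a 2080 Ti reports (7, 5)
# just like a 16XX card, which is why the any() check matches it.
for devid in range(torch.cuda.device_count()):
    print(
        f"cuda:{devid}",
        torch.cuda.get_device_name(devid),
        torch.cuda.get_device_capability(devid),
    )

# Inside the running webui process, this reflects whether the check fired.
print("torch.backends.cudnn.benchmark =", torch.backends.cudnn.benchmark)
```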
What should have happened?
This benchmark option should only be enabled for 16XX cards, and only if they are actually in use by the webui (either as the default or the one determined by --device-id). A rough sketch of such a check is at the end of this issue.
Sysinfo
sysinfo.txt
What browsers do you use to access the UI ?
Mozilla Firefox
Console logs
Additional information
I worked with voldy on debugging this on Discord to find the root cause, but I'm opening an issue here for more visibility and to track it better.
Also, funnily enough, I didn't see it until now, but #9359 is related to this. Since the current code doesn't fulfill the PR's original purpose of applying only to 16XX cards and only when they're actually in use (which is what made this so hard to track down), this really should be considered a bug and not a feature request.
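For reference, a check along the lines described under "What should have happened?" might look roughly like this. This is only a sketch, not the actual patch in #12924; the device variable is a placeholder for whatever GPU the webui actually selected (the default or the one given by --device-id), and the name match is an assumption about how 16XX cards could be identified:

```python
import torch

def is_16xx_card(device: torch.device) -> bool:
    # Compute capability 7.5 alone is not enough (it also covers RTX 20XX,
    # Quadro RTX, and Tesla T4 parts), so additionally require a 16XX name.
    if device.type != "cuda":
        return False
    name = torch.cuda.get_device_name(device)
    capability = torch.cuda.get_device_capability(device)
    return capability == (7, 5) and "GTX 16" in name

# Placeholder for the device the webui actually uses (default or --device-id).
device = torch.device("cuda")
if is_16xx_card(device):
    torch.backends.cudnn.benchmark = True
```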