
Blocksparse crashes in the encoder benchmark #24

Closed
blefaudeux opened this issue Oct 22, 2021 · 1 comment · Fixed by #25
Labels: bug, ongoing

@blefaudeux (Contributor) commented:

🐛 Bug

Running python3 xformers/benchmarks/benchmark_encoder.py --activations relu --plot -emb 256 -bs 32 -heads 16 -mlp MLP -a blocksparse with the latest Triton crashes with an illegal memory access.

The dedicated blocksparse matmul benchmark runs fine, and so does the unit test.

Expected behavior

Well, not crashing

Environment

PyTorch version: 1.9.1+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.1 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.8 (64-bit runtime)
Python platform: Linux-5.4.0-52-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 10.1.243
GPU models and configuration: 
GPU 0: Quadro GP100
GPU 1: Quadro GP100

Nvidia driver version: 450.80.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] pytorch-sphinx-theme==0.0.24
[pip3] torch==1.9.1+cu111
[pip3] torch-tb-profiler==0.2.1
[pip3] torchaudio==0.9.1
[pip3] torchvision==0.10.1+cu111
[conda] numpy                     1.19.5                   pypi_0    pypi
[conda] pytorch-sphinx-theme      0.0.24                   pypi_0    pypi
[conda] torch                     1.9.1+cu111              pypi_0    pypi
[conda] torch-tb-profiler         0.2.1                    pypi_0    pypi
[conda] torchaudio                0.9.1                    pypi_0    pypi
[conda] torchvision               0.10.1+cu111             pypi_0    pypi

Additional context

Looks like something that can happen depending on the layout; it is not super clear why.
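
For reference, a minimal sketch (not the xformers code itself) of the kind of block layout a blocksparse attention kernel consumes; the shapes and the banded pattern below are assumptions picked to match the benchmark flags (16 heads, block size 32):

```python
import torch

heads, seq, block = 16, 256, 32
blocks = seq // block  # 8 x 8 grid of blocks

# Example layout: a (heads, blocks, blocks) tensor of 0/1 entries, where a 1
# marks a block of the attention matrix that is actually computed.
layout = torch.zeros(heads, blocks, blocks, dtype=torch.long)
for i in range(blocks):
    for j in range(max(0, i - 1), min(blocks, i + 2)):
        layout[:, i, j] = 1  # diagonal band: each block attends to its neighbours

# Sanity checks of the sort that would catch a layout/shape mismatch before it
# turns into an illegal memory access inside the kernel.
assert seq % block == 0, "sequence length must be a multiple of the block size"
assert layout.sum(dim=-1).min() > 0, "every query block should attend to at least one key block"
```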

@blefaudeux self-assigned this on Oct 22, 2021
@blefaudeux added the bug and ongoing labels on Oct 22, 2021
@blefaudeux (Contributor, Author) commented:

Not a stride issue: by default our q and k do have a strange stride pattern (due to the head splitting), but passing in a dummy tensor with standard decreasing strides leads to the same memory access crash.
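
To illustrate the stride patterns being compared here, a small sketch with hypothetical shapes matching the benchmark flags (batch 32, 16 heads, embedding 256); this is not the xformers code itself:

```python
import torch

bs, seq, heads, emb = 32, 256, 16, 256
head_dim = emb // heads

x = torch.randn(bs, seq, emb)

# Typical head split: view + transpose leaves q with non-standard strides.
q_split = x.view(bs, seq, heads, head_dim).transpose(1, 2)  # (bs, heads, seq, head_dim)
print(q_split.stride())         # (65536, 16, 256, 1): not monotonically decreasing
print(q_split.is_contiguous())  # False

# Dummy tensor with standard decreasing strides; per the comment above,
# feeding such a tensor to the kernel still triggers the same crash.
q_dummy = q_split.contiguous()
print(q_dummy.stride())         # (65536, 4096, 16, 1): standard decreasing strides
print(q_dummy.is_contiguous())  # True
```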

xwhan pushed a commit to xwhan/xformers that referenced this issue Feb 8, 2022
[chore] update to python 3.8 and cuda 11.1 on CI