Segmentation fault for Python 3.9 for video #3367

Closed
seemethere opened this issue Feb 10, 2021 · 10 comments

Comments

@seemethere
Member

seemethere commented Feb 10, 2021

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

Build the latest torchvision with Python 3.9 against the latest PyTorch nightly, then run pytest -v test/test_video.py.

Output:

vision39 ❯ pytest -v test/test_video.py
================================================================================ test session starts =================================================================================
platform linux -- Python 3.9.1, pytest-6.2.2, py-1.10.0, pluggy-0.12.0 -- /home/eliuriegas/miniconda3/envs/vision39/bin/python
cachedir: .pytest_cache
rootdir: /home/eliuriegas/work/vision
plugins: cov-2.11.1
collected 3 items                                                                                                                                                                    

test/test_video.py::TestVideo::test_metadata PASSED                                                                                                                            [ 33%]
test/test_video.py::TestVideo::test_read_video_tensor Fatal Python error: Segmentation fault

Current thread 0x00007f618a834740 (most recent call first):
  File "/home/eliuriegas/work/vision/torchvision/io/__init__.py", line 123 in __next__
  File "/home/eliuriegas/work/vision/test/test_video.py", line 296 in test_read_video_tensor
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/unittest/case.py", line 550 in _callTestMethod
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/unittest/case.py", line 593 in run
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/unittest/case.py", line 653 in __call__
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/unittest.py", line 321 in runtest
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/runner.py", line 162 in pytest_runtest_call
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/manager.py", line 78 in <lambda>
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/manager.py", line 87 in _hookexec
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/hooks.py", line 289 in __call__
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/runner.py", line 255 in <lambda>
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/runner.py", line 311 in from_call
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/runner.py", line 254 in call_runtest_hook
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/runner.py", line 215 in call_and_report
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/runner.py", line 126 in runtestprotocol
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/runner.py", line 109 in pytest_runtest_protocol
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/manager.py", line 78 in <lambda>
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/manager.py", line 87 in _hookexec
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/hooks.py", line 289 in __call__
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/manager.py", line 78 in <lambda>
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/manager.py", line 87 in _hookexec
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/hooks.py", line 289 in __call__
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/main.py", line 323 in _main
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/main.py", line 269 in wrap_session
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/manager.py", line 78 in <lambda>
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/manager.py", line 87 in _hookexec
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/pluggy/hooks.py", line 289 in __call__
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/config/__init__.py", line 162 in main
  File "/home/eliuriegas/miniconda3/envs/vision39/lib/python3.9/site-packages/_pytest/config/__init__.py", line 185 in console_main
  File "/home/eliuriegas/miniconda3/envs/vision39/bin/pytest", line 11 in <module>
zsh: segmentation fault (core dumped)  pytest -v test/test_video.py
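
A segfault like the one above takes the whole pytest process down with it, so it can help to run the suspect call in a child interpreter while narrowing things down. The following is only a minimal sketch: `run_isolated` is a hypothetical helper, not part of torchvision or pytest, and it assumes a POSIX system where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> dict:
    """Run `code` in a fresh interpreter and report how it exited.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.run(
        # -X faulthandler makes the child dump a Python traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # On POSIX, a negative return code means the child was killed by that signal.
    killed_by = -proc.returncode if proc.returncode < 0 else None
    return {
        "returncode": proc.returncode,
        "segfault": killed_by == signal.SIGSEGV,
        "stderr": proc.stderr,
    }
```

Passing a short snippet that opens a video and pulls one frame (for example through torchvision.io.VideoReader on a local test file) should then report the crash in the `segfault` field while the parent process keeps running.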

Expected behavior

No segmentation fault

Environment

Please copy and paste the output from our
environment collection script
(or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
vision39 ❯ python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.8.0
Is debug build: False
CUDA used to build PyTorch: 11.2
ROCM used to build PyTorch: N/A

OS: Fedora release 33 (Thirty Three) (x86_64)
GCC version: (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)
Clang version: 11.0.0 (Fedora 11.0.0-2.fc33)
CMake version: version 3.18.4

Python version: 3.9 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Quadro P4000
Nvidia driver version: 460.32.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.0
[pip3] torch==1.8.0
[pip3] torchvision==0.9.0a0+81d47c3
[conda] blas                      2.108                       mkl    conda-forge
[conda] blas-devel                3.9.0                     8_mkl    conda-forge
[conda] cudatoolkit               11.2.0               h73cb219_7    conda-forge
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] libblas                   3.9.0                     8_mkl    conda-forge
[conda] libcblas                  3.9.0                     8_mkl    conda-forge
[conda] liblapack                 3.9.0                     8_mkl    conda-forge
[conda] liblapacke                3.9.0                     8_mkl    conda-forge
[conda] mkl                       2020.4             h726a3e6_304    conda-forge
[conda] mkl-devel                 2020.4             ha770c72_305    conda-forge
[conda] mkl-include               2020.4             h726a3e6_304    conda-forge
[conda] numpy                     1.20.0           py39hdbf815f_0    conda-forge
[conda] pytorch                   1.8.0           py3.9_cuda112_cudnn8.1.0_0    pytorch-test
[conda] torchvision               0.9.0a0+81d47c3           dev_0    <develop>

Additional context

This was found as part of the effort to introduce Python 3.9 to the main branch here: #3341

cc @bjuncek @fmassa @andfoy

@andfoy
Contributor

andfoy commented Feb 10, 2021

Which FFmpeg did you use?

@seemethere
Member Author

seemethere commented Feb 10, 2021

Which FFmpeg did you use?

Should be this one:

[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch

@fmassa
Member

fmassa commented Feb 10, 2021

cc @bjuncek as the segmentation fault happens in the new VideoReader implementation

@andfoy
Contributor

andfoy commented Feb 11, 2021

I think @seemethere installed the faulty 4.3 FFmpeg version, i.e., the one that had that AVX/SSE bug in assembly

@bjuncek
Contributor

bjuncek commented Feb 11, 2021

I think that's the issue - I've just built tv with 3.9 and my legacy FFmpeg 4.2, and it passes for me.
I'll try the one from the pytorch channel as well, just to make sure.

@fmassa
Member

fmassa commented Feb 11, 2021

@seemethere for reference, in #2650 (comment) we found out that the 4.3 version of FFmpeg (which was unfortunately the one present in conda) is faulty. Using FFmpeg 4.2 fixes the issue.
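
To check quickly whether an environment has picked up the problematic release, the version banner printed by ffmpeg -version can be parsed. This is a rough sketch rather than an official check: the helper names and the `AFFECTED_SERIES` set are assumptions made for illustration, and only the 4.3 series is flagged because that is the one reported in this thread.

```python
import re
from typing import Optional, Tuple

# Assumption for illustration: only the 4.3 series is treated as affected,
# based on the reports in this thread.
AFFECTED_SERIES = {(4, 3)}


def parse_ffmpeg_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the first line of `ffmpeg -version` output."""
    match = re.search(r"ffmpeg version n?(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_affected(banner: str) -> bool:
    """Return True if the banner reports a version in AFFECTED_SERIES."""
    return parse_ffmpeg_version(banner) in AFFECTED_SERIES
```

The banner itself can be obtained with something like subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True).stdout. Keep in mind that the ffmpeg binary on PATH is not necessarily the library torchvision was built against, so this is a hint, not proof.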

@bjuncek
Contributor

bjuncek commented Feb 25, 2021

If everyone agrees, I'll document this in #3460 and close this issue.

@bjuncek bjuncek closed this as completed Feb 25, 2021
@NicolasHug
Member

@prabhat00155 should we reopen this? Or create a new issue?

Looks like #3460 which was supposed to supersede this one hasn't been addressed yet, and it would be nice to have an issue that tracks the 3.9 problems since we deactivated the builds in #4417

@bjuncek
Contributor

bjuncek commented Sep 16, 2021

@NicolasHug I think we should either augment #3460 or start a new issue;
the segfault here is 90% due to the broken FFmpeg version. The segfaults we've started seeing now seem altogether different (and I'm not 100% sure what they're about; I'll be syncing with Prabhat in a bit to double-check).

@prabhat00155
Contributor

@NicolasHug I have created an issue to track it here: #4430
