Fixed missing audio with video_reader backend #3934

prabhat00155 · 2021-05-27T13:11:44Z

Resolves #3890.

from torchvision import get_video_backend, set_video_backend
from torchvision.io import read_video

video_path = "data/WUzgd7C1pWA.mp4"
set_video_backend('video_reader')
print(f'set backend: {get_video_backend()}')

visual, audio, info = read_video(video_path, pts_unit='pts')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)

visual, audio, info = read_video(video_path, pts_unit='sec')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)
---
Visual: torch.Size([327, 256, 340, 3]) Audio: torch.Size([523264, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
Visual: torch.Size([327, 256, 340, 3]) Audio: torch.Size([523264, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}

video_path = "data/WUzgd7C1pWA.mp4"
set_video_backend('video_reader')
print(f'set backend: {get_video_backend()}')

visual, audio, info = read_video(video_path, start_pts=1001, pts_unit='pts')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)

visual, audio, info = read_video(video_path, start_pts=0.0333667, pts_unit='sec')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)
---
set backend: video_reader
Visual: torch.Size([326, 256, 340, 3]) Audio: torch.Size([521216, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
Visual: torch.Size([326, 256, 340, 3]) Audio: torch.Size([521216, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}


video_path = "data/WUzgd7C1pWA.mp4"
set_video_backend('video_reader')
print(f'set backend: {get_video_backend()}')

visual, audio, info = read_video(video_path, start_pts=1001, end_pts=2002, pts_unit='pts')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)

visual, audio, info = read_video(video_path, start_pts=0.0333667, end_pts=0.1001000, pts_unit='sec')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)
---
set backend: video_reader
Visual: torch.Size([2, 256, 340, 3]) Audio: torch.Size([2048, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
Visual: torch.Size([3, 256, 340, 3]) Audio: torch.Size([3072, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
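
The `pts` and `sec` arguments above describe the same range only if the stream's time base maps one onto the other. Here is a minimal sketch of that conversion, assuming a video time base of 1/30000 s per tick. That value is inferred from the pairing of 1001 pts with 0.0333667 s above and is not stated in the PR. `pts_to_sec` and `sec_to_pts` are hypothetical helper names, not torchvision functions:

```python
from fractions import Fraction

# Assumption: one pts tick lasts 1/30000 s (inferred from 1001 pts <-> 0.0333667 s).
TIME_BASE = Fraction(1, 30000)


def pts_to_sec(pts, time_base=TIME_BASE):
    """Convert a presentation timestamp in ticks to seconds."""
    return float(pts * time_base)


def sec_to_pts(sec, time_base=TIME_BASE):
    """Convert seconds to the nearest presentation timestamp in ticks."""
    return round(sec * time_base.denominator / time_base.numerator)


for sec in (0.0333667, 0.1001000):
    print(sec, "s ->", sec_to_pts(sec), "pts")
print(2002, "pts ->", pts_to_sec(2002), "s")
```

Under that assumption, `end_pts=2002` is roughly 0.0667 s while `end_pts=0.1001000` s is 3003 ticks. The last two calls therefore do not request the same range, which may explain the 2-frame versus 3-frame outputs.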

datumbox

The changes look good. Do you think we should add a unit test for this case?

prabhat00155 · 2021-05-27T16:11:55Z

Yeah that makes sense, added the unit test.

datumbox

This failing test seems related. Marking the PR as "changes requested" to avoid accidental merges. Let's check it out next week. :)

datumbox · 2021-06-11T15:01:52Z

Following the reported internal issue, I took this patch and applied it on latest master:

$ git diff
diff --git a/torchvision/io/_video_opt.py b/torchvision/io/_video_opt.py
index a34b023b..e92ac1bd 100644
--- a/torchvision/io/_video_opt.py
+++ b/torchvision/io/_video_opt.py
@@ -155,7 +155,7 @@ def _align_audio_frames(aframes, aframe_pts, audio_pts_range):
     e_idx = num_samples
     if start < audio_pts_range[0]:
         s_idx = int((audio_pts_range[0] - start) / step_per_aframe)
-    if end > audio_pts_range[1]:
+    if audio_pts_range[1] != -1 and end > audio_pts_range[1]:
         e_idx = int((audio_pts_range[1] - end) / step_per_aframe)
     return aframes[s_idx:e_idx, :]

Then I opened the same file as in the bug report:

>>> from torchvision.io import read_video
>>> vid_path = "./---0tKA3iYI.mp4"
>>> vid, aud, meta = read_video(vid_path, 1, 2, pts_unit="sec")
>>> vid.shape
torch.Size([31, 360, 204, 3])
>>> aud.shape
torch.Size([1, 0])

Not sure if it's the same issue or a different one. Thoughts?

Edit:

The problem persists even after setting the backend. It might be unrelated to the fixes in this PR and be a separate problem instead:

>>> import torchvision
>>> torchvision.set_video_backend('video_reader')
>>> from torchvision.io import read_video
>>> vid_path = "./---0tKA3iYI.mp4"
>>> vid, aud, meta = read_video(vid_path, 1, 2, pts_unit="sec")
>>> vid.shape,  aud.shape
(torch.Size([31, 360, 204, 3]), torch.Size([1, 0]))
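
To see why the guard in the diff above matters, here is a simplified, torch-free sketch of the slicing logic. It assumes, as the patched condition suggests, that an end value of -1 means "no upper bound". `align_samples` is a hypothetical stand-in that operates on a plain list. The parts outside the diff hunk (how the first and last pts and the step are derived) are assumptions, not a copy of `_align_audio_frames`:

```python
def align_samples(samples, first_pts, last_pts, pts_range, guard_open_end=True):
    """Trim `samples` to `pts_range`, mirroring the index arithmetic in the diff."""
    num_samples = len(samples)
    # Assumption: pts ticks covered by one sample, derived from the pts span.
    step = (last_pts - first_pts + 1) / num_samples
    s_idx, e_idx = 0, num_samples
    if first_pts < pts_range[0]:
        s_idx = int((pts_range[0] - first_pts) / step)
    open_ended = pts_range[1] == -1
    if last_pts > pts_range[1] and not (guard_open_end and open_ended):
        # Negative index, counted from the end of the list.
        e_idx = int((pts_range[1] - last_pts) / step)
    return samples[s_idx:e_idx]


samples = list(range(10))  # ten samples with pts 0..9
unpatched = align_samples(samples, 0, 9, (0, -1), guard_open_end=False)
patched = align_samples(samples, 0, 9, (0, -1), guard_open_end=True)
bounded = align_samples(samples, 0, 9, (2, 7))
print(len(unpatched), len(patched), bounded)  # 0 10 [2, 3, 4, 5, 6, 7]
```

Without the guard, an open-ended range makes the end index so negative that the slice comes back empty. With it, every sample is kept, and bounded ranges are trimmed as before.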

prabhat00155 · 2021-06-11T15:18:28Z

Sorry for the confusion: this PR only fixes missing audio with the video_reader backend, which has to be selected explicitly:

set_video_backend('video_reader')

By default we use the pyav backend, which also returns empty audio frames (see #3779). I'll fix that in a separate PR.

datumbox

@prabhat00155 thanks for the PR, LGTM.

Edit:
Unfortunately I see a segmentation fault that looks relevant. See unittest_linux_cpu_py3.9

test failed

prabhat00155 · 2021-06-12T09:30:02Z

@datumbox The seg fault was caused by an incompatible ffmpeg version, which was fixed in #4041. It should be fine now.

datumbox

LGTM, thanks!

Summary:
* Fixed missing audio with video_reader backend
* Added unit test

Reviewed By: fmassa

Differential Revision: D29264318

fbshipit-source-id: de95e0bd38d2f844c756652fe42de99b1ab32210

Fixed missing audio with video_reader backend

1684455

prabhat00155 added module: io module: video labels May 27, 2021

facebook-github-bot added the cla signed label May 27, 2021

datumbox reviewed May 27, 2021

datumbox approved these changes May 27, 2021

Added unit test

e051a41

datumbox self-requested a review May 27, 2021 18:24

datumbox suggested changes May 27, 2021

prabhat00155 mentioned this pull request Jun 7, 2021

Mismatch in audio frames returned by pyav and video reader #3986 (Open, 3 tasks)

Merge branch 'master' into prabhat00155/fix_audio

786b61e

datumbox self-requested a review June 11, 2021 17:03

datumbox previously approved these changes Jun 11, 2021

Merge branch 'master' into prabhat00155/fix_audio

77ceb2a

prabhat00155 requested a review from datumbox June 12, 2021 09:27

datumbox approved these changes Jun 13, 2021

prabhat00155 added 2 commits June 15, 2021 08:44

Merge branch 'master' into prabhat00155/fix_audio

ad7b7fe

Merge branch 'master' into prabhat00155/fix_audio

ad26d82

prabhat00155 merged commit b74366b into pytorch:master Jun 15, 2021

prabhat00155 deleted the prabhat00155/fix_audio branch June 15, 2021 10:32
