Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update video reader to use new decoder #1978

Merged
merged 10 commits into from
Mar 17, 2020

Conversation

fmassa
Copy link
Member

@fmassa fmassa commented Mar 13, 2020

Synchronizing the new video reader with OSS torchvision

Copy link
Contributor

@putivsky putivsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@fmassa fmassa force-pushed the update_video_reader branch from 029b23c to a7e3d60 Compare March 16, 2020 14:26
Yuri Putivsky and others added 10 commits March 16, 2020 19:11
Summary:
Pull Request resolved: pytorch#1747

Pull Request resolved: pytorch#1746

Added the implementation of ffmpeg based decoder with functionality that can be used in VUE and TorchVision.

Reviewed By: fmassa

Differential Revision: D19358914

fbshipit-source-id: abb672f89bfaca6351dda2354f0d35cf8e47fa0f
…torch#1766)

Summary:
Pull Request resolved: pytorch#1766

Replaced FfmpegDecoder (incompativle with VUE) by base decoder (compatible with VUE).
Modified python utilities video_utils.py for internal simplification. Public interface got preserved.

Reviewed By: fmassa

Differential Revision: D19415903

fbshipit-source-id: 4d7a0158bd77bac0a18732fe4183fdd9a57f6402
Summary:
Pull Request resolved: pytorch#1852

Changed base decoder internals for a faster clip processing.

Reviewed By: stephenyan1231

Differential Revision: D19748379

fbshipit-source-id: 58a435f0a0b25545e7bd1a3edb0b1d558176a806
Summary:
Found and fix a bug in cropping algorithm (simple mistyping).
Also derived classes need access to some decoder class members, like initialization parameters - make it protected.

Reviewed By: stephenyan1231, fmassa

Differential Revision: D19895076

fbshipit-source-id: 691336c8e18526b085ae5792ac3546bc387a6db9
Summary:
Pull Request resolved: pytorch#1898

Include streams/samplers shouldn't depend on decoder headers. Add dependencies directly to the place where they are required.

Reviewed By: stephenyan1231

Differential Revision: D19911404

fbshipit-source-id: ef322a053708405c02cee4562b456b1602fb12fc
Summary: For Mothership we have found that asynchronous decoder provides a better performance.

Differential Revision: D20026194

fbshipit-source-id: 627b91844b4e3f917002031dd32cb19c239f4ba8
Summary:
Pull Request resolved: pytorch#1942

In D18720474, it introduces a bug in `read_video_from_memory` API. Thank weiyaowang for reporting it.

Reviewed By: weiyaowang

Differential Revision: D20270179

fbshipit-source-id: 66348c99a5ad1f9129b90e934524ddfaad59de03
)

Summary:
Pull Request resolved: pytorch#1924

Extend `video reader` decoder python API in Torchvision to support a new argument `video_max_dimension`. This enables the new video decoding use cases. When setting `video_width=0`, `video_height=0`, `video_min_dimension != 0`, and `video_max_dimension != 0`, we can rescale the video clips so that its spatial resolution (height, width) becomes
 - (video_min_dimension, video_max_dimension) if original height < original width
 - (video_max_dimension, video_min_dimension) if original height >= original width

This is useful at video model testing stage, where we perform fully convolution evaluation and take entire video frames without cropping as input. Previously, for instance we can only set `video_width=0`, `video_height=0`, `video_min_dimension = 128`, which will preserve aspect ratio. In production dataset, there are a small number of videos where aspect ratio is either extremely large or small, and when the shorter edge is rescaled to 128, the longer edge is still large. This will easily cause GPU memory OOM when we sample multiple video clips, and put them in a single minibatch.

Now, we can set (for instance) `video_width=0`, `video_height=0`, `video_min_dimension = 128` and `video_max_dimension = 171` so that the rescale resolution is either (128, 171) or (171, 128) depending on whether original height is larger than original width. Thus, we are less likely to have gpu OOM because the spatial size of video clips is determined.

Reviewed By: putivsky

Differential Revision: D20182529

fbshipit-source-id: f9c40afb7590e7c45e6908946597141efa35f57c
Summary:
Pull Request resolved: pytorch#1967

No-ops for torchvision diff, which fixes samplers.

Differential Revision: D20397218

fbshipit-source-id: 6dc4d04364f305fbda7ca4f67a25ceecd73d0f20
@fmassa fmassa force-pushed the update_video_reader branch from a7e3d60 to 185ce06 Compare March 16, 2020 18:12
@fmassa fmassa merged commit 32e1680 into pytorch:master Mar 17, 2020
@fmassa fmassa deleted the update_video_reader branch March 17, 2020 09:17
@fmassa
Copy link
Member Author

fmassa commented Mar 17, 2020

Thanks a lot @putivsky !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants