Respect `displaymatrix` and `sample_aspect_ratio` by default in `VideoFrame.to_ndarray()` and `VideoFrame.to_image()` #1676

lgeiger · 2024-12-05T15:15:55Z

To new users it can be hard to understand that VideoFrame.to_ndarray() and VideoFrame.to_image() do not respect sample_aspect_ratio and DISPLAYMATRIX.

This means that anyone who wants an image or NumPy array from av that is not distorted or rotated needs to write custom helper functions to handle these cases. In our code base we need to use the following helper functions every time we read a video frame.

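import av
import numpy as np

def get_size(stream) -> tuple[int, int]:
    # Display size in pixels after applying the sample aspect ratio (SAR).
    if not stream.sample_aspect_ratio or stream.sample_aspect_ratio == 1:
        return stream.width, stream.height
    if stream.sample_aspect_ratio > 1:
        # Wide pixels: stretch the width.
        return int(stream.width * stream.sample_aspect_ratio), stream.height
    # Tall pixels: stretch the height.
    return stream.width, int(stream.height / stream.sample_aspect_ratio)

def frame_to_ndarray(
    frame: av.VideoFrame, width: int, height: int, rotation: int
) -> np.ndarray:
    # Scale to the display size while converting to RGB.
    np_frame = frame.to_ndarray(format="rgb24", width=width, height=height)
    if rotation != 0:
        # np.rot90 rotates counter-clockwise for positive k; a negative k rotates clockwise.
        np_frame = np.rot90(np_frame, -rotation // 90)
    return np_frame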
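
To make the size calculation in get_size concrete, here is a small self-contained sketch. FakeStream is a hypothetical stand-in that only exposes the three attributes get_size reads (it is not part of PyAV), and the resolutions and aspect ratios are example values chosen for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional


@dataclass
class FakeStream:
    """Hypothetical stand-in for a video stream (not a PyAV class)."""

    width: int
    height: int
    sample_aspect_ratio: Optional[Fraction]


def get_size(stream) -> tuple[int, int]:
    # Same logic as the helper above: stretch whichever axis keeps the
    # stored pixel count on the other axis unchanged.
    if not stream.sample_aspect_ratio or stream.sample_aspect_ratio == 1:
        return stream.width, stream.height
    if stream.sample_aspect_ratio > 1:
        return int(stream.width * stream.sample_aspect_ratio), stream.height
    return stream.width, int(stream.height / stream.sample_aspect_ratio)


wide_pixels = FakeStream(720, 576, Fraction(16, 15))   # SAR > 1
tall_pixels = FakeStream(720, 480, Fraction(8, 9))     # SAR < 1
square_pixels = FakeStream(1920, 1080, None)           # SAR missing

print(get_size(wide_pixels))    # (768, 576)
print(get_size(tall_pixels))    # (720, 540)
print(get_size(square_pixels))  # (1920, 1080)
```

Without this correction, the first two frames would come out of to_ndarray() at their stored 720-pixel width and look squeezed or stretched when shown with square pixels.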

I think it would be much easier for users if this were handled by VideoFrame.to_ndarray() and VideoFrame.to_image() directly in the cases where one doesn't explicitly pass width or height keyword arguments. Ideally rotation would be handled in the same way.

This would be a breaking change for users. Let me know whether you think it would be useful.

hmaarrfk · 2024-12-08T15:17:10Z

(user here) I would like to chime in and say that for video processing, applying rotations and flips is pretty disastrous to performance.

You can see this by running the code:

import numpy as np
from tqdm import tqdm

a = np.zeros((1920, 1080, 3), dtype='uint8')

for i in tqdm(range(1000)):
    np.ascontiguousarray(a.transpose(1, 0, 2))

On my computer, this runs at 130 fps. This is REALLY slow for "doing nothing". Eventually, in your displaying pipeline, you will have the opportunity to rotate, ideally at the end.

As a user, I would definitely want this to be opt-in rather than opt-out. Ultimately, the "pyav-way" would be to leverage ffmpeg's filtering capabilities (you should be able to create a filter via avfilter_graph_create_filter, with transpose in there, to help you rotate things).
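
For reference, the filter-based route could look roughly like the sketch below. It assumes PyAV's av.filter.Graph interface and ffmpeg's transpose filter; the function name rotated_frames and its parameters are made up for illustration, and the sketch does not flush the graph at end of stream.

```python
from typing import Iterator


def rotated_frames(path: str, direction: str = "clock") -> Iterator["av.VideoFrame"]:
    """Yield decoded frames rotated by 90 degrees using ffmpeg's transpose filter.

    Hypothetical helper: `direction` is passed straight to the transpose
    filter (for example "clock" or "cclock").
    """
    # Imported here so this sketch can be loaded without PyAV installed.
    import av
    import av.filter

    with av.open(path) as container:
        stream = container.streams.video[0]

        # Build the graph once: buffer source -> transpose -> buffer sink.
        graph = av.filter.Graph()
        src = graph.add_buffer(template=stream)
        transpose = graph.add("transpose", direction)
        sink = graph.add("buffersink")
        src.link_to(transpose)
        transpose.link_to(sink)
        graph.configure()

        for frame in container.decode(stream):
            # Feed one decoded frame in and take the rotated frame out.
            src.push(frame)
            yield sink.pull()
```

Something like `for frame in rotated_frames("input.mp4"): arr = frame.to_ndarray(format="rgb24")` would then yield arrays that are already rotated, with the work done inside ffmpeg rather than in NumPy. A more careful version would also handle the case where the sink has no frame ready yet.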

lgeiger · 2024-12-08T21:56:49Z

@hmaarrfk Thanks for chiming in! The goal of this issue was mainly to get a discussion started. My guess would be that the right behaviour probably depends on the exact use case. But as far as I know there is currently not even a way to read the correct rotation data via PyAV, or am I missing something?

With respect to your benchmarks, it seems like you're primarily measuring the impact of np.ascontiguousarray(). Rotation itself is a lot cheaper, but I guess that depends on whether you require a C-contiguous array or not:

In [3]: %timeit np.rot90(a)
2.1 μs ± 5.51 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit np.ascontiguousarray(np.rot90(a))
7.3 ms ± 97.7 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
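
The gap between those two timings comes from np.rot90 returning a strided view while np.ascontiguousarray has to copy every pixel. A minimal sketch of that difference, using a tiny array as a stand-in for a frame (the shape is arbitrary):

```python
import numpy as np

# Small array standing in for a video frame (height, width, channels).
a = np.zeros((4, 6, 3), dtype=np.uint8)

# np.rot90 only rearranges strides, so no pixel data is moved.
rotated = np.rot90(a)
is_view = np.shares_memory(a, rotated)
rotated_contiguous = rotated.flags["C_CONTIGUOUS"]

# Forcing C order allocates a new buffer and copies the whole frame.
copied = np.ascontiguousarray(rotated)
copy_shares = np.shares_memory(a, copied)
copy_contiguous = copied.flags["C_CONTIGUOUS"]

print(rotated.shape, is_view, rotated_contiguous)   # (6, 4, 3) True False
print(copied.shape, copy_shares, copy_contiguous)   # (6, 4, 3) False True
```

Whether the cheap view is enough therefore depends on whether downstream code accepts non-contiguous input or triggers the copy itself.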

hmaarrfk · 2024-12-08T23:18:07Z

> it seems like you're measuring primarily the impact of np.ascontiguousarray().

The reason I did this is because this is an implicit operation that will likely happen in most OpenCV calls or calls to scikit-image functions.

The syntax [::1] means the array is C-contiguous.
https://github.com/scikit-image/scikit-image/blob/main/skimage/segmentation/_watershed_cy.pyx#L23

So it might "appear" that np.rot90 is "fast" but once you try to use the data, it will be "slow".

> But as far as I know currently there is not even a way to read the correct rotation data via PyAV, or am I missing something?

Yes, I think your PR will be useful! A great addition. I personally wish we also had a way to set the rotation! Would be very useful to me!
