Need to reorder the flow tensor when transpose_image is true? #57

tomyoung903 · 2024-11-20T07:46:40Z

In postprocess_size() in train_stage1.py, when transpose_img is set to true, we need to reorder the flows like this:

flows[:, [0,1]] = flows[:, [1,0]]

This makes sure that flows[:, 0] corresponds to width.

Right?
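To make the proposed operation concrete, here is a minimal sketch of the channel swap on a dummy array. It assumes the flow batch has shape (B, 2, H, W) and uses NumPy as a stand-in for PyTorch; the array contents are made up for illustration.

```python
import numpy as np

# Dummy flow batch, assumed layout (B, 2, H, W); values are illustrative only.
flows = np.zeros((1, 2, 3, 5), dtype=np.float32)
flows[:, 0] = 7.0   # first component
flows[:, 1] = -3.0  # second component

# Indexing with a list on the right-hand side returns a copy,
# so assigning it back in place does not overwrite data mid-swap.
flows[:, [0, 1]] = flows[:, [1, 0]]

print(flows[0, :, 0, 0])  # the two components at one pixel, now exchanged
```

The same indexing expression should behave the same way on a torch tensor, since list indexing there also produces a copy.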

MyNiuuu · 2024-11-20T10:38:14Z

Hi! Thank you for your interest in our work!

The function postprocess_size() is called from get_optical_flows() (https://github.com/MyNiuuu/MOFA-Video/blob/main/Training/train_stage1.py#L113), which uses the Unimatch model to estimate optical flow. Because Unimatch was trained on images whose width is greater than their height, preprocess_size() transposes the image whenever its height exceeds its width. This ensures that the model always receives input that is wider than it is tall, matching the data format it was trained on, for accurate flow estimation.

During the post-processing stage, in the postprocess_size() function, if the transpose_img flag is true, it indicates that the image was transposed during preprocessing. Therefore, we need to transpose the flow again to ensure that the flow's orientation matches the original image orientation. Note that this operation has already been done in the postprocess_size() function (https://github.com/MyNiuuu/MOFA-Video/blob/main/Training/train_stage1.py#L106).

Therefore, if we want to use the optical flow for our model in the subsequent code, we do not need to worry about whether the image was transposed, as the flow directions have already been properly adjusted in the post-processing function.
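The round trip described above can be sketched as follows. This is not the repository code: the helper names with the _sketch suffix, the (B, C, H, W) layout, and the zero-returning stand-in for Unimatch are all assumptions made for illustration, and NumPy replaces PyTorch.

```python
import numpy as np

def preprocess_size_sketch(image1, image2):
    """Hypothetical helper: transpose portrait inputs (H > W) so width > height."""
    h, w = image1.shape[-2:]
    transpose_img = h > w
    if transpose_img:
        image1 = np.swapaxes(image1, -2, -1)
        image2 = np.swapaxes(image2, -2, -1)
    return image1, image2, transpose_img

def fake_estimator(image1, image2):
    """Placeholder for the flow model: returns zeros of shape (B, 2, H, W)."""
    b, _, h, w = image1.shape
    return np.zeros((b, 2, h, w), dtype=np.float32)

def postprocess_size_sketch(flow, transpose_img):
    """Hypothetical helper: undo the spatial transpose of the flow map."""
    if transpose_img:
        flow = np.swapaxes(flow, -2, -1)
    return flow

frame1 = np.zeros((1, 3, 8, 4), dtype=np.float32)  # portrait: H=8, W=4
frame2 = np.zeros_like(frame1)

in1, in2, transposed = preprocess_size_sketch(frame1, frame2)
flow = postprocess_size_sketch(fake_estimator(in1, in2), transposed)
print(in1.shape, flow.shape)  # model input is landscape; flow is back on the portrait grid
```

Only the last two (spatial) axes are touched in this sketch; the size-2 component axis is left exactly as the estimator produced it.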

tomyoung903 · 2024-11-20T11:30:52Z

No, it's not about transposing the flow tensor.

It's about reordering the dimension of size 2, i.e. flows[:, [0,1]] = flows[:, [1,0]], so that the first slice always corresponds to width and the second one corresponds to height.

Right now, if the video is in portrait mode, flows[:, 0] corresponds to height. This cannot be right for later operations.

But I suppose this issue did not hurt your training, because most WebVid-10M videos are landscape.
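A small synthetic check can illustrate the point. The sketch below assumes a (2, H, W) flow layout with channel 0 along width and channel 1 along height, integer motion with wrap-around borders, and NumPy in place of PyTorch; is_consistent is a hypothetical helper, and the flow in the transposed frame is written by hand rather than predicted by a model.

```python
import numpy as np

def is_consistent(img1, img2, flow):
    """True if img2[y + v, x + u] == img1[y, x] everywhere (wrap-around borders).

    flow has shape (2, H, W): channel 0 = u (along width), channel 1 = v (along height).
    """
    h, w = img1.shape
    ys, xs = np.mgrid[0:h, 0:w]
    u = flow[0].astype(int)
    v = flow[1].astype(int)
    return np.array_equal(img2[(ys + v) % h, (xs + u) % w], img1)

rng = np.random.default_rng(0)
H, W = 6, 4                              # portrait frame
img1 = rng.random((H, W))
img2 = np.roll(img1, shift=1, axis=1)    # content moves 1 px along width

# In the transposed frame (shape W x H) the same motion runs along height,
# so a correct estimate there has it in channel 1.
flow_t = np.zeros((2, W, H), dtype=np.float32)
flow_t[1] = 1.0
ok_in_transposed_frame = is_consistent(img1.T, img2.T, flow_t)

# Spatial transpose only: the grid is right, the components are still exchanged.
spatial_only = flow_t.transpose(0, 2, 1)
ok_spatial_only = is_consistent(img1, img2, spatial_only)

# Spatial transpose plus channel swap.
with_swap = spatial_only[[1, 0]]
ok_with_swap = is_consistent(img1, img2, with_swap)

print(ok_in_transposed_frame, ok_spatial_only, ok_with_swap)
```

In this toy setup the flow only explains the original image pair once the two components are exchanged as well.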

MyNiuuu · 2024-11-20T11:56:38Z

Oh, I understand your point.

I think you are right, we indeed need to perform flows[:, [0,1]] = flows[:, [1,0]] when the image is transposed for optical flow prediction.
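A possible shape for the corrected post-processing step is sketched below. The function name postprocess_flow_sketch and the (B, 2, H, W) layout are assumptions, NumPy stands in for PyTorch, and this is not the code currently in the repository.

```python
import numpy as np

def postprocess_flow_sketch(flow, transpose_img):
    """Hypothetical fix. flow: (B, 2, H, W), predicted on possibly transposed inputs."""
    if transpose_img:
        flow = np.swapaxes(flow, -2, -1)  # restore the original H x W grid
        flow = flow[:, [1, 0]]            # restore (width, height) component order
    return flow

# Dummy prediction made on a transposed (landscape) input: motion sits in channel 1.
pred = np.zeros((1, 2, 4, 8), dtype=np.float32)
pred[:, 1] = 5.0

fixed = postprocess_flow_sketch(pred, transpose_img=True)
print(fixed.shape)
```

Unlike the in-place assignment, this variant returns a new array and leaves the input untouched, which may or may not suit the surrounding training code.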

I originally copied the optical flow prediction code from Unimatch:

https://github.com/autonomousvision/unimatch/blob/master/evaluate_flow.py#L642

from line 714 to line 760.

It is a little weird that the original script does not reorder the predicted flow with flows[:, [0,1]] = flows[:, [1,0]], and instead only transposes the flow in line 758.

Maybe I missed some parts?

tomyoung903 changed the title ~~Need to reorder the flow tensor when transpose_image is set to true?~~ Need to reorder the flow tensor when transpose_image is true? Nov 20, 2024
