RobustVideoMatting model conversion #373

Open
livingbeams opened this issue Nov 16, 2023 · 0 comments

Issue Type

Support

OS

Windows

OS architecture

x86_64

Programming Language

Python

Framework

ONNX

Model name and Weights/Checkpoints URL

rvm_mobilenetv3_HxW.onnx
https://github.com/PINTO0309/PINTO_model_zoo

rvm_mobilenetv3_fp16.onnx
rvm_mobilenetv3_fp32.onnx
rvm_mobilenetv3.pth
https://github.com/PeterL1n/RobustVideoMatting

Description

First of all, thanks for maintaining this amazing model conversion work.

I'm stuck trying to convert the RobustVideoMatting model and I would like to know if you could guide me.

Inside the file:
https://s3.ap-northeast-2.wasabisys.com/pinto-model-zoo/242_RobustVideoMatting/resources_mbnv3.tar.gz

I can find some fixed input versions (rvm_mobilenetv3_192x320, rvm_mobilenetv3_240x320, etc.)

I would like to generate such a model, but with different sizes for the input and for r1i, r2i, r3i and r4i.

The original model has these input sizes:
src [batch_size,3,height,width]
r1i [batch_size,channels,height,width]
r2i [batch_size,channels,height,width]
r3i [batch_size,channels,height,width]
r4i [batch_size,channels,height,width]

I see that the script "batchsize_clear.py" makes it possible to change 'batch_size' to 'N' (I don't know whether it should be 'N' or '1').

I cannot find out how to fix the "downsample_ratio" hyperparameter of the original model.

I see a file "rvm_mobilenetv3_HxW.onnx" that eliminates the "downsample_ratio" and sets generic input shapes.

I also see that with the script "set_static_shape.py" it is possible to change the input shapes, but it changes all the inputs ("src" and also the state tensors "rxi").

For example with W=1920 and H=1080 I get these input sizes:
src [1,3,1080,1920]
r1i [1,16,1080,1920]
r2i [1,20,1080,1920]
r3i [1,40,1080,1920]
r4i [1,64,1080,1920]

I am expecting sizes for the rxi inputs with different dimensions like:
r1i_dims = { 1, 16, 192, 320 };
r2i_dims = { 1, 20, 96, 160 };
r3i_dims = { 1, 40, 48, 80 };
r4i_dims = { 1, 64, 24, 40 };
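
To make the sizes I expect more concrete, here is a minimal sketch of how I believe they relate to the "src" size. It rests on assumptions I have not confirmed: that "src" is first resized by "downsample_ratio" (rounding down), and that state k then has 1/2, 1/4, 1/8 and 1/16 of that resized resolution (rounding up). The function name `expected_state_shapes` is only illustrative; the channel counts are the ones listed above.

```python
import math

# Channel counts of r1i..r4i, taken from the shapes listed in this issue.
RVM_MOBILENETV3_STATE_CHANNELS = (16, 20, 40, 64)


def expected_state_shapes(height, width, downsample_ratio=1.0, batch=1):
    """Return the assumed shapes of r1i..r4i for a given src size.

    Assumptions (not confirmed in this thread):
    - src is first resized by downsample_ratio, rounding down;
    - each successive state halves the resolution again, rounding up.
    """
    h = math.floor(height * downsample_ratio)
    w = math.floor(width * downsample_ratio)
    shapes = {}
    for k, channels in enumerate(RVM_MOBILENETV3_STATE_CHANNELS, start=1):
        h = math.ceil(h / 2)
        w = math.ceil(w / 2)
        shapes[f"r{k}i"] = (batch, channels, h, w)
    return shapes


# Under these assumptions, the sizes listed above correspond to a frame
# that is processed internally at 384x640.
print(expected_state_shapes(384, 640))
```

If these assumptions hold, the r1o..r4o outputs should have the same shapes as the corresponding inputs, since each output state is fed back as the next frame's input state.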

I also see that the sizes of the outputs "fgr" and "pha" are recalculated, but not those of the rxo outputs:
fgr [1,3,1080,1920]
pha [1,1,1080,1920]
r1o [1,16,height,width]
r2o [1,20,height,width]
r3o [1,40,height,width]
r4o [1,64,height,width]

Could you please give me some clue as to how I could generate an onnx model with a certain size of "src" that correctly sets the sizes of the "rxi" inputs and also the outputs?

Best regards

Relevant Log Output

No response

URL or source code for simple inference testing code

No response
