Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEW Feature: DeTR Model to torchvision #6922

Open
ambujpawar opened this issue Nov 7, 2022 · 7 comments
Open

NEW Feature: DeTR Model to torchvision #6922

ambujpawar opened this issue Nov 7, 2022 · 7 comments

Comments

@ambujpawar
Copy link
Contributor

🚀 The feature

Adding the first Transformer-based detection model to Torchvision. A draft PR was submitted regarding the model: #5922
but a Github Issue regarding it did not exist. Therefore creating it now.

Paper: here
Official implementation: here

Pinging @xiaohu2015 and @deepwilson if you guys would like to contribute

Motivation, pitch

First Transformer-based detection model to Torchvision, which has been missing until now in torchvision.

Alternatives

No response

Additional context

No response

@oke-aditya
Copy link
Contributor

oke-aditya commented Nov 7, 2022

One problem I do see is that DETR at one step uses scipy.linear_assigment.

How are we going to do that in torchvision? Scipy is not a hard dependency for us

cc @datumbox

@ambujpawar
Copy link
Contributor Author

Thanks, I did not know that.
Perhaps, linear_sum_assignment from scipy.optimize has to be ported to torch?

@oke-aditya
Copy link
Contributor

Officially I don't think so. But we can try to port the scipy code

https://github.com/scipy/scipy/blob/v0.18.1/scipy/optimize/_hungarian.py

I can help you with DETR I have some experience on using it.

@oke-aditya
Copy link
Contributor

My other concern is the background class choice

In current torchvision models 0 label is considered as background. But I guess for DETR label 0 is not background.
I don't remember exactly. I need to see a bit.

@datumbox
Copy link
Contributor

datumbox commented Nov 8, 2022

I think given that the original repo uses scipy.optimize.linear_sum_assignment only for training, it's OK to keep it as optional dependency. This means that we can still make the model work without it during inference and thus be jit-scriptable. Currently there are no plans to adding it on core, and give that Scipy's implementation is in C++, it would be quite slow to do this in pure python.

Looping in @fmassa who is one of the authors of the paper in case he can provide a better recommendation.

@oke-aditya
Copy link
Contributor

oke-aditya commented Nov 8, 2022

Although it's possible to write custom ops. Like we have NMS etc in cuda and C++ in torchvision we would need to package this too. But let's hear from @fmassa who knows this best.

@deepwilson
Copy link

Are we still waiting for @fmassa ? We could start building the other code blocks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants