NEW Feature: DeTR Model to torchvision #6922

ambujpawar · 2022-11-07T14:42:32Z

🚀 The feature

Add DETR, the first Transformer-based detection model, to torchvision. A draft PR for the model was already submitted (#5922),
but no GitHub issue existed for it, so I am creating one now.

Paper: here
Official implementation: here

Pinging @xiaohu2015 and @deepwilson in case you would like to contribute.

Motivation, pitch

torchvision does not yet include any Transformer-based detection model; DETR would be the first.

Alternatives

No response

Additional context

No response

oke-aditya · 2022-11-07T17:35:17Z

One problem I do see is that DETR uses scipy.optimize.linear_sum_assignment at one step.

How are we going to do that in torchvision? SciPy is not a hard dependency for us.

cc @datumbox

ambujpawar · 2022-11-07T22:04:19Z

Thanks, I did not know that.
Perhaps, linear_sum_assignment from scipy.optimize has to be ported to torch?

oke-aditya · 2022-11-08T05:17:42Z

Officially, I don't think so. But we can try to port the SciPy code:

https://github.com/scipy/scipy/blob/v0.18.1/scipy/optimize/_hungarian.py
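When porting the assignment solver, a slow but obviously correct reference can help check the port on small inputs. The sketch below is not the Hungarian algorithm and not SciPy's code; it is an exhaustive search over permutations, and the name `brute_force_assignment` is hypothetical. It assumes the cost matrix is a list of lists with at most as many rows as columns.

```python
from itertools import permutations


def brute_force_assignment(cost):
    """Return (row_indices, col_indices) that minimize the total cost.

    Exhaustive O(n!) search, intended only as a test oracle for tiny
    matrices. Assumes len(cost) <= len(cost[0]).
    """
    n_rows, n_cols = len(cost), len(cost[0])
    if n_rows > n_cols:
        raise ValueError("expected at most as many rows as columns")

    best_cols, best_total = None, float("inf")
    # Try every way of giving each row a distinct column.
    for cols in permutations(range(n_cols), n_rows):
        total = sum(cost[r][c] for r, c in enumerate(cols))
        if total < best_total:
            best_cols, best_total = cols, total
    return list(range(n_rows)), list(best_cols)


cost = [[4, 1, 3], [2, 0, 5], [3, 2, 2]]
rows, cols = brute_force_assignment(cost)
total = sum(cost[r][c] for r, c in zip(rows, cols))
print(rows, cols, total)  # [0, 1, 2] [1, 0, 2] 5
```

A ported solver could be compared against this oracle on random small matrices by checking that the total cost matches, since the optimal assignment itself need not be unique.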

I can help you with DETR; I have some experience using it.

oke-aditya · 2022-11-08T05:18:55Z

My other concern is the background class choice

In the current torchvision detection models, label 0 is treated as background. But I believe that in DETR label 0 is not background.
I don't remember exactly, so I need to look into it.
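To make the concern concrete: as far as I recall, the reference DETR implementation reserves the last class index (`num_classes`) for "no object" rather than index 0, but this should be checked against the official code. If that holds, labels would need remapping between the two conventions. The helpers below are a minimal sketch under that assumption; both function names are hypothetical and nothing here is existing torchvision API.

```python
def to_last_index_background(labels, num_classes):
    """Map 'background = 0, objects = 1..num_classes' labels to
    'objects = 0..num_classes-1, no-object = num_classes' (assumed DETR style)."""
    out = []
    for label in labels:
        if not 0 <= label <= num_classes:
            raise ValueError(f"label {label} outside [0, {num_classes}]")
        # Background moves to the last index; object classes shift down by one.
        out.append(num_classes if label == 0 else label - 1)
    return out


def to_zero_background(labels, num_classes):
    """Inverse mapping: back to 'background = 0, objects = 1..num_classes'."""
    out = []
    for label in labels:
        if not 0 <= label <= num_classes:
            raise ValueError(f"label {label} outside [0, {num_classes}]")
        out.append(0 if label == num_classes else label + 1)
    return out


zero_bg = [0, 1, 3, 2]  # 0 is background, 3 object classes
last_bg = to_last_index_background(zero_bg, num_classes=3)
print(last_bg)  # [3, 0, 2, 1]
```

Whether torchvision should convert labels like this or keep DETR's own convention is exactly the open design question.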

datumbox · 2022-11-08T09:33:37Z

I think, given that the original repo uses scipy.optimize.linear_sum_assignment only for training, it's OK to keep it as an optional dependency. This means that we can still make the model work without it during inference and thus be jit-scriptable. Currently there are no plans to add it to core, and given that SciPy's implementation is in C++, it would be quite slow to do this in pure Python.
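A minimal sketch of what the optional dependency could look like: SciPy is imported lazily inside the training-only matching step, so code paths that never call it (such as inference) do not need SciPy installed. The helper names `_get_linear_sum_assignment` and `hungarian_match` are hypothetical, and the cost matrix is a plain nested list here only to keep the example small.

```python
def _get_linear_sum_assignment():
    """Import SciPy lazily so that only training requires it."""
    try:
        from scipy.optimize import linear_sum_assignment
    except ImportError as exc:
        raise RuntimeError(
            "Hungarian matching needs SciPy, which is only required for "
            "training. Install it with `pip install scipy`."
        ) from exc
    return linear_sum_assignment


def hungarian_match(cost):
    """Match rows (predictions) to columns (targets) at minimum total cost."""
    linear_sum_assignment = _get_linear_sum_assignment()
    rows, cols = linear_sum_assignment(cost)
    # Convert NumPy integers to plain ints for a simple return type.
    return [(int(r), int(c)) for r, c in zip(rows, cols)]
```

With SciPy installed, `hungarian_match([[4, 1, 3], [2, 0, 5], [3, 2, 2]])` returns `[(0, 1), (1, 0), (2, 2)]`; without it, the call raises a `RuntimeError` that explains what is missing, while importing the module itself still succeeds.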

Looping in @fmassa who is one of the authors of the paper in case he can provide a better recommendation.

oke-aditya · 2022-11-08T16:45:41Z

It's also possible to write a custom op, as we already do for NMS and others in CUDA and C++ in torchvision, although we would need to package it too. But let's hear from @fmassa, who knows this best.

deepwilson · 2022-11-30T16:00:31Z

Are we still waiting for @fmassa? We could start building the other code blocks in the meantime.

datumbox mentioned this issue Nov 7, 2022

[RFC] Batteries Included - Phase 3 #6323

datumbox added the new feature label Nov 7, 2022
