CombinedNonMaxSuppression is not supported in ONNX #1337
Hi @jan-golda, to convert this op we can either 1) compose it out of ONNX ops or 2) implement it as a custom op. From your experimenting in TF, it seems you are finding that composing it out of ops is expensive. We have previously built relatively complicated ops out of compositions with decent performance, but sometimes it can't be done. In this case, I suspect we can get an efficient composition. Are you using any loops in your implementation, or only batched tensor ops + Gather?
Hi @TomWildenhain-Microsoft
It should be possible to add support for CombinedNonMaxSuppression, since the ONNX NMS op supports batch.
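Since the ONNX NonMaxSuppression op is batched, its interface can be mirrored directly. As a reference for the shapes involved, here is a minimal NumPy sketch of ONNX-style NMS (greedy IoU suppression). The input/output shapes and the `(batch, class, box)` index triples follow the ONNX operator spec; the helper names `iou` and `onnx_style_nms` are illustrative, not part of any library:

```python
import numpy as np

def iou(a, b):
    # Boxes as [y1, x1, y2, x2] corners (ONNX center_point_box=0 format).
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def onnx_style_nms(boxes, scores, max_out=200, iou_thr=0.5, score_thr=0.0):
    """Greedy NMS over boxes [batch, n, 4] and scores [batch, classes, n].
    Returns [num_selected, 3] rows of (batch, class, box), like the
    selected_indices output of ONNX NonMaxSuppression."""
    selected = []
    for b in range(boxes.shape[0]):
        for c in range(scores.shape[1]):
            keep = []
            for i in np.argsort(-scores[b, c]):
                if scores[b, c, i] <= score_thr:
                    break  # remaining scores are even lower
                if all(iou(boxes[b, i], boxes[b, j]) <= iou_thr for j in keep):
                    keep.append(int(i))
                if len(keep) == max_out:
                    break
            selected += [[b, c, i] for i in keep]
    return np.array(selected, dtype=np.int64).reshape(-1, 3)
```

Boxes are shared across classes here, exactly as in ONNX; this is the mismatch with the non-shared TF variant discussed below.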
@guschmue nice to hear that! Do you think it would be possible to implement that in the near future?
Yep, I'm working on it! Hopefully by end of the week.
What is the dimension of the scores tensor for your model? TF has 2 different behaviors (are you sharing boxes across classes?)
Sorry for the late reply! There are two separate places in the code where this op is used. Below you will find some example shapes for these two places:
So the answer is no: I am not sharing the boxes across classes, since the third dimension of boxes always equals the third dimension of scores.
Well... just finished implementing it for the other version: #1376. Non-sharing is a little harder, since ONNX does share boxes across classes for NMS. I could just make the boxes shared and zero out the score for all but one class, but that would be a lot of zeros (90 per box, so 90 * 4 * 1000 * 90 ≈ 32 million values, which is probably too large). Place 2 should work fine with the current implementation since there is only 1 class. For place 1, what are the max_outputs and max outputs per class?
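The shared-boxes trick described above can be sketched in NumPy: flatten each (box, class) pair into its own "shared" box, and zero out its score for every class except the one it belongs to. The sizes (1000 boxes, 90 classes) are taken from this thread; all variable names are illustrative:

```python
import numpy as np

batch, n, c = 1, 1000, 90  # assumed sizes from the thread
boxes = np.random.rand(batch, n, c, 4).astype(np.float32)   # non-shared boxes
scores = np.random.rand(batch, n, c).astype(np.float32)

# Every (box, class) pair becomes its own shared box: [batch, n * c, 4].
shared_boxes = boxes.reshape(batch, n * c, 4)

# ONNX NMS expects scores of shape [batch, classes, num_boxes]. Flattened box
# k = i * c + cls belongs to class k % c; only that class keeps its real
# score, all other entries stay zero.
shared_scores = np.zeros((batch, c, n * c), dtype=np.float32)
idx = np.arange(n * c)
shared_scores[:, idx % c, idx] = scores.reshape(batch, -1)
```

The score tensor blows up by a factor of `c`, which is exactly the "lot of zeros" concern raised in the comment above.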
max_output_size_per_class=1000,
max_total_size=100
For experimental purposes, can you test the performance of the CombinedNonMaxSuppression implementation I've done so far? Making boxes non-shared will be a decent bit harder but should have similar perf, so it would be nice to know whether the perf is sufficient. Just add a slice before your CombinedNonMaxSuppression to cut the class dim from 90 to 1, and see how the perf of the ONNX conversion compares to TF. If the perf is not good, we may have to use a custom op or try a different implementation approach.
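The suggested experiment amounts to cutting the class dimension of the scores tensor down to 1 before the op, so the converter exercises the already-supported single-class path. A NumPy sketch with assumed shapes (batch 1, 1000 boxes, 90 classes):

```python
import numpy as np

# Assumed scores shape for "place 1": [batch, num_boxes, num_classes].
scores = np.random.rand(1, 1000, 90).astype(np.float32)

# Slice the class dimension from 90 to 1, keeping rank 3, before feeding
# the tensor into CombinedNonMaxSuppression.
single_class_scores = scores[:, :, :1]
```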
Assume this is resolved.
Pure ONNX Multi-Class NonMaximumSuppression, CombinedNonMaxSuppression.
What a cool job! ☺
@hwangdeyu I used cv::dnn::NMSBoxes. I can share my work, but it is onnxruntime C++, not Python/tflite.
It works fine.
Describe the bug
I am trying to convert MaskRCNN in TensorFlow 2 to ONNX, but the conversion fails because the CombinedNonMaxSuppression op is not supported in ONNX.

Urgency
It is blocking the use of MaskRCNN in ONNX.

System information

To Reproduce
Try to convert any model that uses tf.image.combined_non_max_suppression.

Expected behaviour
The model should be converted without failing on the CombinedNonMaxSuppression op.

Additional context
This was already reported in #847 for YOLO, and I tried to apply the workaround from there: replace CombinedNonMaxSuppression with a NonMaxSuppression accompanied by a set of ops meant to recreate the "Combined" part. I spent a few days trying to get this to work, but for MaskRCNN it turned out to be more complicated than for YOLO. I had to apply NonMaxSuppression for each class in each sample of the batch separately, then pad the result, select the top results per class, select the top results per box, retrieve the class and score information, pad again, and finally gather the results across the batch. There is a reason this op was added to TF: recreating it from scratch is quite complicated. I would therefore like to ask you to add support for it in the ONNX conversion.
Moreover, when I ran the model with the partially applied changes under automatic mixed precision, I observed a significant performance drop. In short, replacing CombinedNonMaxSuppression has a noticeable impact on the original TF model, which is not ideal.