Change broadcast Add/Mul to element-wise Add/Mul in Detect layer #4811
Comments
👋 Hello @SamFC10, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution. If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you. If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available. For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Requirements
Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit. |
@SamFC10 thanks for the explanation! Expanding these may seem simple, but be advised input shapes are constantly changing, so if you expand you must check shapes and redefine for every batch, which will slow things down and is not required for pytorch inference or training. The fastest and easiest way to incorporate your ideas into the official codebase is to submit a Pull Request (PR) implementing your idea, and if applicable providing before and after profiling/inference/training results to help us understand the improvement your feature provides. This allows us to directly see the changes in the code and to understand how they affect workflows and performance. Please see our ✅ Contributing Guide to get started. |
@SamFC10 can you verify that expanding works with DNN?
self.grid[i] = self.grid[i].expand(bs, self.na, -1, -1, -1)
self.anchor_grid[i] = self.anchor_grid[i].expand(bs, -1, ny, nx, -1) |
Causes an error during creation of onnx model
|
@SamFC10 got it! Please submit a PR with the fix that works for DNN and we will take a look at it there. There's probably always going to be added ops that should be balanced against improved exportability, maybe we can introduce an --expand flag to Detect() that is true only for ONNX export. |
Hi @SamFC10, I tried your fix but I still have some issues with the integration of YOLOv5 into OpenCV DNN. Could you please share the code you use for inference? Thank you very much in advance. |
@GioFic95 Make sure you are using my fork of yolov5 as my fixes haven't been merged yet. Checkout
Code:
import numpy as np
import cv2
inp = np.random.rand(1, 3, 640, 640).astype(np.float32)
net = cv2.dnn.readNetFromONNX('yolov5s.onnx')
net.setInput(inp)
out = net.forward()
print(out.shape)
This returns (1, 25200, 85) with both the branches. |
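The (1, 25200, 85) shape can be sanity-checked with a little arithmetic. The sketch below assumes the standard YOLOv5s head: three detection scales at strides 8, 16 and 32, three anchors per grid cell, and 80 classes. None of these values are stated in this thread, so treat them (and the helper name yolov5_output_shape) as assumptions for illustration.

```python
def yolov5_output_shape(img_size=640, strides=(8, 16, 32),
                        anchors_per_cell=3, num_classes=80, batch=1):
    """Expected shape of the concatenated Detect output (assumed layout)."""
    # Each stride yields a square grid of (img_size // stride) cells per side
    cells = sum((img_size // s) ** 2 for s in strides)
    # Every cell predicts one row per anchor
    rows = anchors_per_cell * cells
    # Each row holds 4 box values, 1 objectness score and the class scores
    cols = 4 + 1 + num_classes
    return (batch, rows, cols)


print(yolov5_output_shape())  # (1, 25200, 85)
```

If an exported model reports a different row count, the input size or the number of detection scales is probably not what this sketch assumes.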
@SamFC10 I checked that I have the same output shape as you, but the results obtained via OpenCV still differ from those I obtain via PyTorch. That is, applying this code to the output in OpenCV ONNX (in C++) I get these results: While the original results, obtained directly with the trained model, are the following: Moreover, these are the results obtained using detect.py with the model exported using your repo (the same one used in OpenCV ONNX): Do you have any advice on how to solve the issue, or any hypothesis about its cause? |
@GioFic95
1. Inference using ONNXRuntime
2. Inference using OpenCV DNN
To use opencv instead of onnxruntime in
- check_requirements(('onnx', 'onnxruntime'))
- import onnxruntime
- session = onnxruntime.InferenceSession(w, None)
+ net = cv2.dnn.readNetFromONNX(w)
Line 147
- pred = torch.tensor(session.run([session.get_outputs()[0].name], {session.get_inputs()[0].name: img}))
+ net.setInput(img)
+ pred = torch.tensor(net.forward())
Again using
No visible difference
I suspect there is something wrong in the post-processing in your C++ code (see this #708 (comment)). I'm not an expert in C++, so I can't point out exactly where the mistake is. To check this, maybe try the opposite of what I did: in your C++ code, use onnxruntime instead of opencv, with the ONNX model exported from the master repository. If the outputs are still wrong, then the post-processing steps have a bug. |
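Since post-processing is the suspected culprit, it may help to spell out what that step usually involves for a (1, 25200, 85) output. The sketch below is not the code from detect.py or from the C++ snippet under discussion. It assumes each row is laid out as [cx, cy, w, h, objectness, class scores...] with activations already applied, and decode_rows, iou and nms are hypothetical helper names. The default thresholds mirror the conf_thres=0.25 and iou_thres=0.45 values that appear in the detect.py log in this thread.

```python
def decode_rows(rows, conf_thres=0.25):
    """Turn raw rows into (x1, y1, x2, y2, score, class_id) candidates."""
    detections = []
    for row in rows:
        cx, cy, w, h, obj = row[:5]
        class_scores = row[5:]
        class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
        # Final confidence is objectness times the best class score
        score = obj * class_scores[class_id]
        if score < conf_thres:
            continue
        # Convert centre/size boxes to corner coordinates
        detections.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2,
                           score, class_id))
    return detections


def iou(a, b):
    """Intersection over union of two corner-format boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_thres=0.45):
    """Greedy per-class non-maximum suppression, highest score first."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(det[5] != k[5] or iou(det, k) <= iou_thres for k in kept):
            kept.append(det)
    return kept


# Four made-up rows with two classes: two overlapping boxes of class 0,
# one box of class 1, and one low-confidence row that should be dropped.
rows = [
    [100.0, 100.0, 50.0, 50.0, 0.9, 0.8, 0.1],
    [102.0, 101.0, 50.0, 50.0, 0.8, 0.7, 0.2],
    [300.0, 300.0, 40.0, 60.0, 0.9, 0.1, 0.9],
    [500.0, 500.0, 20.0, 20.0, 0.2, 0.5, 0.5],
]
result = nms(decode_rows(rows))
print(len(result))  # 2
```

When porting this kind of logic, two steps that seem easy to get wrong are the objectness-times-class-score product and the centre-to-corner box conversion, so comparing intermediate values at those points against the PyTorch pipeline may narrow down a mismatch.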
@SamFC10 might be nice to have a |
@SamFC10 dnn (0.554s) is slower than onnxruntime (0.318) on the same ONNX file |
@msly Yes, opencv inference is slower than onnxruntime (a difference of around 50ms - 100ms on my device). The goal of this issue and the related PR is not to improve inference speed, but to make the ONNX export of yolov5 compatible with other backends rather than limiting it to onnxruntime. |
Removed TODO after PR #4833 merged. |
@SamFC10 I've opened a new PR #5136 to add DNN inference to detect.py using your example here:
But I am running into a bug on
(venv) glennjocher@Glenns-iMac yolov5 % python detect.py --weights yolov5s.onnx --dnn
detect: weights=['yolov5s.onnx'], source=data/images, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=True
YOLOv5 🚀 v5.0-509-g9d75e42 torch 1.9.1 CPU
[ERROR:0] global /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pip-req-build-vy_omupv/opencv/modules/dnn/src/onnx/onnx_importer.cpp (2127) handleNode DNN/ONNX: ERROR during processing node with 2 inputs and 1 outputs: [Unsqueeze]:(390)
Traceback (most recent call last):
File "/Users/glennjocher/PycharmProjects/yolov5/detect.py", line 306, in <module>
main(opt)
File "/Users/glennjocher/PycharmProjects/yolov5/detect.py", line 301, in main
run(**vars(opt))
File "/Users/glennjocher/PycharmProjects/yolov5/venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/Users/glennjocher/PycharmProjects/yolov5/detect.py", line 92, in run
net = cv2.dnn.readNetFromONNX(w)
cv2.error: OpenCV(4.5.3) /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pip-req-build-vy_omupv/opencv/modules/dnn/src/onnx/onnx_importer.cpp:2146: error: (-2:Unspecified error) in function 'handleNode'
> Node [Unsqueeze]:(390) parse error: OpenCV(4.5.3) /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pip-req-build-vy_omupv/opencv/modules/dnn/src/onnx/onnx_importer.cpp:1551: error: (-215:Assertion failed) node_proto.input_size() == 1 in function 'handleNode'
I created the ONNX model simply with |
Need to use latest version i.e. opencv-python is still on 4.5.3 so we need to wait till the latest version is released which contains the fix. Once it is released, Meanwhile, the following onnx models should also work with OpenCV 4.5.3:
|
@SamFC10 thanks! I've added your comments to the PR and a new commented line to check the >=4.5.4 requirement, will uncomment line once version is released.
# check_requirements(('opencv-python>=4.5.4',)) |
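The commented-out requirement boils down to a version comparison. As a rough illustration only (this is not how check_requirements is implemented), the helper below, with the hypothetical name version_at_least, compares dotted version strings numerically. It assumes purely numeric components such as '4.5.3' and would need more care for suffixes like '-dev'.

```python
def version_at_least(installed, required):
    """Return True if dotted version `installed` is >= `required`."""
    def to_tuple(version):
        # "4.5.3" -> (4, 5, 3), so components compare as numbers, not text
        return tuple(int(part) for part in version.split("."))

    return to_tuple(installed) >= to_tuple(required)


print(version_at_least("4.5.3", "4.5.4"))   # False
print(version_at_least("4.5.4", "4.5.4"))   # True
print(version_at_least("4.10.0", "4.5.4"))  # True
```

Comparing integer tuples rather than raw strings matters for the last case, where a plain string comparison would rank "4.10.0" below "4.5.4".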
🚀 Feature
Motivation
The ONNX model produced by export.py is not compatible with inference in OpenCV's DNN module (even with --simplify), as mentioned in issues #4471 and opencv/opencv#20072.
The problematic nodes are the two broadcast nodes, Add and Mul, in the final Detect layer. OpenCV's DNN module currently cannot handle these broadcast operations, which leads to errors.
Pitch
The add node comes from the broadcast add of self.grid (yolov5/models/yolo.py, line 66 in 621b6d5) and the mul node from the broadcast mul of self.anchor_grid (yolov5/models/yolo.py, line 67 in 621b6d5).
Both grid and anchor_grid are constant, so I suggest expanding these tensors to their respective input sizes using PyTorch's expand or repeat operation, so that an element-wise operation is used instead of a broadcast one. I have tried modifying the Detect layer to expand these tensors, but additional nodes are then added to the final ONNX model.
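To make the proposal concrete, here is a small NumPy sketch of the difference between a broadcast operation and an element-wise one on pre-expanded operands. NumPy is used only to keep the example light; torch.Tensor.expand plays the same role in PyTorch. The shapes (grid as (1, 1, ny, nx, 2), anchor_grid as (1, na, 1, 1, 2)), the tiny sizes and the random values are assumptions for illustration, not values taken from yolo.py.

```python
import numpy as np

bs, na, ny, nx = 1, 3, 4, 4  # small assumed sizes for illustration
rng = np.random.default_rng(0)
xy = rng.random((bs, na, ny, nx, 2)).astype(np.float32)
wh = rng.random((bs, na, ny, nx, 2)).astype(np.float32)

# Constant operands with singleton dimensions, as in the broadcast version
yv, xv = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
grid = np.stack((xv, yv), axis=2).reshape(1, 1, ny, nx, 2).astype(np.float32)
anchor_grid = rng.random((1, na, 1, 1, 2)).astype(np.float32)

# Broadcast Add/Mul: the operand shapes differ and are matched implicitly
xy_broadcast = xy + grid
wh_broadcast = wh * anchor_grid

# Element-wise Add/Mul: expand the constants first so shapes match exactly
grid_full = np.broadcast_to(grid, xy.shape)
anchor_full = np.broadcast_to(anchor_grid, wh.shape)
xy_elementwise = xy + grid_full
wh_elementwise = wh * anchor_full

print(grid.shape, "->", grid_full.shape)
print(np.array_equal(xy_broadcast, xy_elementwise))  # True
print(np.array_equal(wh_broadcast, wh_elementwise))  # True
```

The numerical results are identical either way; what changes is only whether the exported graph needs a backend that understands broadcasting for these two operations.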
I request @glenn-jocher or another contributor to take a look at this so that the exported yolov5 onnx model can be used in opencv for faster inference.