Accelerating Image Preprocessing with CUDA Programming

Project Overview

Built on CUDA and OpenCV.
Goals:
- Standalone use, to accelerate image-processing operations;
- Combined with TensorRT, to further speed up inference.

Speedup Results

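The table below compares TensorRT inference time for Deeplabv3+ with and without CUDA preprocessing.
For the code without CUDA image preprocessing, see the author's other TensorRT project.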

Deeplabv3+	FP32	FP16	INT8
C++ image preprocessing	22 ms	12 ms	10 ms
CUDA image preprocessing	15 ms	5 ms	3 ms

The same comparison for YOLOv5-v5.0, with and without CUDA preprocessing:

YOLOv5-v5.0	FP32	FP16	INT8
C++ image preprocessing	12 ms	8 ms	6 ms
CUDA image preprocessing	6 ms	3 ms	3 ms

The YOLOv5 TensorRT inference code comes from the author's other projects: C++ preprocessing and CUDA preprocessing.

File Description

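project dir
    ├── bgr2rgb  # CUDA-accelerated BGR-to-RGB conversion
    |   ├── Makefile
    |   └── bgr2rgb.cu
    ├── bilinear  # CUDA-accelerated bilinear interpolation
    |   ├── Makefile
    |   └── resize.cu
    ├── hwc2chw  # CUDA-accelerated HWC-to-CHW transpose (channel dimension moved to the front)
    |   ├── Makefile
    |   └── transpose.cu
    ├── normalize  # CUDA-accelerated normalization
    |   ├── Makefile
    |   └── normal.cu
    ├── preprocess  # combines the operations above (not a simple concatenation) into the usual preprocessing applied before the image is fed to the network
    |   ├── Makefile
    |   └── preprocess.cu
    ├── union_tensorrt  # uses the preprocessing above together with TensorRT to compare the inference speedup
    |   ├── Makefile
    |   ├── preprocess.cu
    |   ├── preprocess.h
    |   └── trt_infer.cpp  # model inference
    └── lena.jpg  # test image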
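To make the bgr2rgb and hwc2chw entries concrete, here is a host-side C++ reference of the index math involved. It is a sketch for checking GPU output against, not the code in bgr2rgb.cu or transpose.cu; the function names and the `std::vector` interface are assumptions made for illustration. Every output element depends on exactly one input element, which is why these operations map naturally to one CUDA thread per pixel.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Swap the first and third channel of every pixel in an interleaved
// 3-channel image (BGR -> RGB). Pixels are independent of each other.
std::vector<uint8_t> bgr2rgb(const std::vector<uint8_t>& src, int height, int width) {
    std::vector<uint8_t> dst(src.size());
    for (int i = 0; i < height * width; ++i) {
        dst[3 * i + 0] = src[3 * i + 2];  // R comes from the third input channel
        dst[3 * i + 1] = src[3 * i + 1];  // G stays in place
        dst[3 * i + 2] = src[3 * i + 0];  // B comes from the first input channel
    }
    return dst;
}

// Rearrange an interleaved image (HWC: pixel after pixel) into planar
// layout (CHW: one full plane per channel).
std::vector<uint8_t> hwc2chw(const std::vector<uint8_t>& src, int height, int width,
                             int channels) {
    std::vector<uint8_t> dst(src.size());
    const int plane = height * width;  // number of elements in one channel plane
    for (int p = 0; p < plane; ++p) {
        for (int c = 0; c < channels; ++c) {
            dst[static_cast<std::size_t>(c) * plane + p] =
                src[static_cast<std::size_t>(p) * channels + c];
        }
    }
    return dst;
}
```

For a 1x2 image stored as `{1,2,3, 4,5,6}`, `bgr2rgb` yields `{3,2,1, 6,5,4}` and `hwc2chw` yields `{1,4, 2,5, 3,6}`.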

Usage

Accelerating a single operation:

The directories bgr2rgb, bilinear, hwc2chw, and normalize each accelerate one image operation.
To test:

cd <dir name>
make
./<bin file> <image path>

example:
cd bgr2rgb
make
./bgr2rgb ../lena.jpg

Note: if your CUDA or OpenCV installation paths differ from those in the Makefile, change them to match your own.
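Of the single operations, bilinear interpolation has the least obvious arithmetic, so a host-side C++ reference is sketched here. It assumes the half-pixel-center coordinate mapping and edge clamping; the convention used in resize.cu may differ, and the function name `resize_bilinear` is hypothetical. Each output pixel reads at most four input pixels and writes one, so a CUDA version could run the loop body once per thread.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bilinear resize of an interleaved (HWC) uint8 image.
// Assumption: half-pixel-center mapping, coordinates clamped to the image.
std::vector<uint8_t> resize_bilinear(const std::vector<uint8_t>& src, int src_h, int src_w,
                                     int channels, int dst_h, int dst_w) {
    std::vector<uint8_t> dst(static_cast<std::size_t>(dst_h) * dst_w * channels);
    const float scale_y = static_cast<float>(src_h) / dst_h;
    const float scale_x = static_cast<float>(src_w) / dst_w;
    for (int y = 0; y < dst_h; ++y) {
        // Map the output row to a fractional source row and clamp it.
        float fy = (y + 0.5f) * scale_y - 0.5f;
        fy = std::min(std::max(fy, 0.0f), static_cast<float>(src_h - 1));
        const int y0 = static_cast<int>(fy);
        const int y1 = std::min(y0 + 1, src_h - 1);
        const float wy = fy - y0;  // weight of the lower row
        for (int x = 0; x < dst_w; ++x) {
            float fx = (x + 0.5f) * scale_x - 0.5f;
            fx = std::min(std::max(fx, 0.0f), static_cast<float>(src_w - 1));
            const int x0 = static_cast<int>(fx);
            const int x1 = std::min(x0 + 1, src_w - 1);
            const float wx = fx - x0;  // weight of the right column
            for (int c = 0; c < channels; ++c) {
                const float top = src[(y0 * src_w + x0) * channels + c] * (1.0f - wx) +
                                  src[(y0 * src_w + x1) * channels + c] * wx;
                const float bottom = src[(y1 * src_w + x0) * channels + c] * (1.0f - wx) +
                                     src[(y1 * src_w + x1) * channels + c] * wx;
                const float v = top * (1.0f - wy) + bottom * wy;
                dst[(static_cast<std::size_t>(y) * dst_w + x) * channels + c] =
                    static_cast<uint8_t>(std::lround(v));
            }
        }
    }
    return dst;
}
```

Under these assumptions, stretching the 1x2 single-channel row `{0, 100}` to width 4 gives `{0, 25, 75, 100}`, and resizing an image to its own size returns it unchanged.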

Common Image Preprocessing

Before inference, an image usually goes through Resize, BGR to RGB, HWC to CHW, and Normalize.
To test:

cd preprocess
make
./preprocess ../lena.jpg  # performs all of the operations above on the image

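Using with TensorRT

Steps:

1) Following the author's other TensorRT project, set up the environment, download the segmentation dataset, and train the Deeplabv3+ network.

2) Enter the directory: Deeplabv3+/TensorRT/C++/api_model/

3) Copy the files from this project's union_tensorrt directory into that directory (replacing the original files where they exist).

4) Run the following commands in order to perform TensorRT inference: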

python pth2wts.py
make
./trt_infer

5) Output like the following indicates a successful run; the segmentation result images are generated in the same directory.

Loading weights: ./para.wts
Succeeded building backbone!
Succeeded building aspp!
Succeeded building decoder!
Succeeded building total network!
Succeeded building serialized engine!
Succeeded building engine!
Succeeded saving .plan file!
Total image num is: 8 inference total cost is: 105ms average cost is: 19ms

emptysoal/cuda-image-preprocess
