GitHub - jizhuoran/sj_convolution: Convolution Neural Network Inference on Mobile GPUs with OpenCL

This repo is a demo accompanying the technical report

HNMTP Conv: Optimize Convolution Algorithm for Single-Image Convolution Neural Network Inference on Mobile GPUs

It achieves a 14.6x speedup over the most popular convolution algorithm (im2col) and, as far as we know, a 2.1x speedup over the fastest existing convolution algorithm (direct convolution).
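For readers unfamiliar with the two baselines named above, here is a minimal CPU-side sketch of direct convolution and of im2col followed by a matrix multiply. It is not the HNMTP Conv algorithm and is not taken from this repo's OpenCL code. The `ConvShape` struct, the function names, and the restriction to stride 1 with no padding are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layer description (not from the repo):
// a single image, square kernel, stride 1, no padding.
struct ConvShape {
    std::size_t c_in, h, w;  // input channels, height, width
    std::size_t c_out, k;    // output channels, kernel size
    std::size_t out_h() const { return h - k + 1; }
    std::size_t out_w() const { return w - k + 1; }
};

// Direct convolution: every output element sums over its receptive field.
// Layouts: in[c_in][h][w], weight[c_out][c_in][k][k], out[c_out][out_h][out_w].
std::vector<float> conv_direct(const ConvShape& s,
                               const std::vector<float>& in,
                               const std::vector<float>& weight) {
    assert(in.size() == s.c_in * s.h * s.w);
    assert(weight.size() == s.c_out * s.c_in * s.k * s.k);
    const std::size_t oh = s.out_h(), ow = s.out_w();
    std::vector<float> out(s.c_out * oh * ow, 0.0f);
    for (std::size_t co = 0; co < s.c_out; ++co)
        for (std::size_t y = 0; y < oh; ++y)
            for (std::size_t x = 0; x < ow; ++x) {
                float acc = 0.0f;
                for (std::size_t ci = 0; ci < s.c_in; ++ci)
                    for (std::size_t ky = 0; ky < s.k; ++ky)
                        for (std::size_t kx = 0; kx < s.k; ++kx)
                            acc += in[(ci * s.h + y + ky) * s.w + x + kx] *
                                   weight[((co * s.c_in + ci) * s.k + ky) * s.k + kx];
                out[(co * oh + y) * ow + x] = acc;
            }
    return out;
}

// im2col: unfold the input into a matrix of shape [c_in*k*k][out_h*out_w],
// so that the convolution becomes a single matrix multiply.
std::vector<float> im2col(const ConvShape& s, const std::vector<float>& in) {
    assert(in.size() == s.c_in * s.h * s.w);
    const std::size_t oh = s.out_h(), ow = s.out_w();
    const std::size_t cols = oh * ow;
    std::vector<float> col(s.c_in * s.k * s.k * cols);
    for (std::size_t ci = 0; ci < s.c_in; ++ci)
        for (std::size_t ky = 0; ky < s.k; ++ky)
            for (std::size_t kx = 0; kx < s.k; ++kx) {
                const std::size_t row = (ci * s.k + ky) * s.k + kx;
                for (std::size_t y = 0; y < oh; ++y)
                    for (std::size_t x = 0; x < ow; ++x)
                        col[row * cols + y * ow + x] =
                            in[(ci * s.h + y + ky) * s.w + x + kx];
            }
    return col;
}

// Convolution as weight[c_out][rows] * col[rows][cols], using a naive GEMM
// in place of a BLAS call.
std::vector<float> conv_im2col(const ConvShape& s,
                               const std::vector<float>& in,
                               const std::vector<float>& weight) {
    const std::size_t rows = s.c_in * s.k * s.k;
    const std::size_t cols = s.out_h() * s.out_w();
    assert(weight.size() == s.c_out * rows);
    const std::vector<float> col = im2col(s, in);
    std::vector<float> out(s.c_out * cols, 0.0f);
    for (std::size_t co = 0; co < s.c_out; ++co)
        for (std::size_t r = 0; r < rows; ++r) {
            const float wv = weight[co * rows + r];
            for (std::size_t c = 0; c < cols; ++c)
                out[co * cols + c] += wv * col[r * cols + c];
        }
    return out;
}
```

Both functions produce the same output tensor. The difference is that im2col first materializes a matrix of `c_in * k * k * out_h * out_w` elements, which duplicates each input value up to `k * k` times, while direct convolution reads the input in place.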

Dependencies:

OpenCL, clBLAS, OpenBLAS (for result checking)

How to use:

    mkdir build && cd build
    cmake ..
    make -j16

This is just a prototype to illustrate the idea.

Code refactoring is ongoing.

Folders and files (9 commits):

cmake
utils
.DS_Store
._.DS_Store
CMakeLists.txt
README.md
main.cpp
sj_conv.cpp
sj_conv.hpp
sj_convolution_codegen.cpp
