[x86] Depthwise conv2d #6745

lxwlaq · 2021-08-23T04:34:43Z

Add optimized x86 depthwise conv2d kernels for 3×3 stride 1 pad 1 (3×3s1p1) and 3×3 stride 2 pad 1 (3×3s2p1).

conv	old	new	reduction
1×32×112×112 stride=1	0.205	0.157	23.4%
1×64×112×112 stride=2	0.232	0.182	21.5%
1×128×56×56 stride=1	0.210	0.162	22.8%
1×128×56×56 stride=2	0.114	0.100	12.2%

model	old	new	reduction
MobileNetV1	14.69	12.34	15.9%
MobileNetV2	11.8	9.12	22.7%
MobileNetV3_large	18.82	17.83	5.3%
MobileNetV3_small	9.58	9.36	2.3%

paddle-bot-old · 2021-08-23T04:34:57Z

Thanks for your contribution!

chenjiaoAngel · 2021-08-27T08:20:19Z

lite/backends/x86/math/CMakeLists.txt

@@ -87,12 +87,16 @@ if (WITH_AVX AND AVX_FOUND)
  math_library (interpolate AVX2 TRUE DEPS math_function)
  math_library (power DEPS AVX2 TRUE DEPS avx_mathfuns)
  math_library (rnn AVX2 TRUE)
+  math_library (conv_depthwise_direct AVX2 TRUE)


Please add the performance data measured after the optimization, for example:

chenjiaoAngel · 2021-08-27T09:49:40Z

lite/backends/x86/math/conv_depthwise_3x3.cc

+          __m128i mask = _mm_setr_epi32(0x80000000, 0x80000000, 0x80000000, 0);
+          if (j + 1 == col) {
+            __m256 rmaski_ = _mm256_loadu_ps(rmask_i);
+            i0 = _mm256_mul_ps(i0, rmaski_);


This can be done with _mm256_maskload_ps, which loads only the valid data.

Alternatively, use _mm256_blend_ps to select the valid data.
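To illustrate the two suggestions, here is a minimal sketch, not the code from this PR. The helper names `load_tail_masked` and `keep_first3_blend` and the `AVX_TARGET` macro are hypothetical; only the intrinsics themselves are real. `_mm256_maskload_ps` loads a lane only when the high bit of the matching mask element is set, so lanes past the valid data are never read. `_mm256_blend_ps` still needs a full 8-float load and takes a compile-time immediate, so it fits cases where the number of valid lanes is fixed.

```cpp
#include <immintrin.h>

// Assumption: GCC/Clang. Lets the sketch build without -mavx; other compilers need AVX enabled.
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Hypothetical helper: load the first `valid` floats (0..8) from src, zero the remaining lanes.
AVX_TARGET inline void load_tail_masked(const float* src, int valid, float* out8) {
  // A sliding window over this table gives `valid` all-ones lanes followed by zeros.
  static const int kMaskTable[16] = {-1, -1, -1, -1, -1, -1, -1, -1,
                                     0,  0,  0,  0,  0,  0,  0,  0};
  const __m256i mask = _mm256_loadu_si256(
      reinterpret_cast<const __m256i*>(kMaskTable + 8 - valid));
  // Masked-out lanes are not read from memory and come back as 0.0f.
  _mm256_storeu_ps(out8, _mm256_maskload_ps(src, mask));
}

// Hypothetical helper: full load, then keep lanes 0..2 and zero lanes 3..7.
AVX_TARGET inline void keep_first3_blend(const float* src8, float* out8) {
  const __m256 v = _mm256_loadu_ps(src8);  // all 8 floats must be readable
  // Bit i of the immediate picks lane i from the second source (v).
  _mm256_storeu_ps(out8, _mm256_blend_ps(_mm256_setzero_ps(), v, 0x07));
}
```

One practical difference from multiplying by a 0/1 float mask: if a lane outside the valid range happens to hold NaN or infinity, multiplying by zero yields NaN, whereas a masked load or blend yields an exact zero.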

chenjiaoAngel · 2021-08-27T09:54:51Z

lite/backends/x86/math/conv_depthwise_3x3.cc

+                                            0x80000000);
+          if (j + 1 == col) {
+            __m256 rmaski_ = _mm256_loadu_ps(rmask_i);
+            i0 = _mm256_mul_ps(i0, rmaski_);


chenjiaoAngel · 2021-08-27T09:55:31Z

lite/backends/x86/math/conv_depthwise_direct.cc

+}  // namespace math
+}  // namespace x86
+}  // namespace lite
+}  // namespace paddl


chenjiaoAngel · 2021-08-27T09:57:17Z

lite/kernels/x86/conv_depthwise.cc

+    int ow = o_dims[3];
+    int oc = o_dims[1];
+
+    lite::x86::math::conv_depthwise_direct(


Could this call the concrete implementation directly, to reduce the nested calls?
For example, if stride == 1:
conv_depthwise_3x3s1_p1_direct(din,
dout,
num,
ch_out,
h_out,
w_out,
ch_in,
h_in,
w_in,
weights,
bias,
pad,
flag_bias,
act_param);
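The suggestion above amounts to dispatching on the stride at the call site rather than going through an intermediate wrapper. A minimal sketch of that shape follows. Everything here is hypothetical except the name `conv_depthwise_3x3s1_p1_direct`, which comes from the comment: the `DepthwiseArgs` struct, the stride-2 function name, the `Kernel` return value, and the stub bodies stand in for the real signatures, which take the individual arguments listed above.

```cpp
#include <stdexcept>

// Hypothetical tag so the sketch can report which kernel was chosen.
enum class Kernel { k3x3s1p1, k3x3s2p1 };

// Hypothetical bundle of the arguments listed in the review comment (activation omitted).
struct DepthwiseArgs {
  const float* din;
  float* dout;
  int num, ch_out, h_out, w_out, ch_in, h_in, w_in;
  const float* weights;
  const float* bias;
  int pad;
  bool flag_bias;
};

// Stubs standing in for the real kernels; the stride-2 name is an assumption.
inline Kernel conv_depthwise_3x3s1_p1_direct(const DepthwiseArgs&) {
  return Kernel::k3x3s1p1;
}
inline Kernel conv_depthwise_3x3s2_p1_direct(const DepthwiseArgs&) {
  return Kernel::k3x3s2p1;
}

// Pick the concrete kernel at the call site, one branch per supported stride.
inline Kernel run_depthwise_3x3(const DepthwiseArgs& args, int stride) {
  if (stride == 1) return conv_depthwise_3x3s1_p1_direct(args);
  if (stride == 2) return conv_depthwise_3x3s2_p1_direct(args);
  throw std::invalid_argument("unsupported stride for 3x3 depthwise conv");
}
```

Whether this beats the wrapper is a judgment call: it removes one call level, but the stride check then lives in the kernel-selection code instead of the math library.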

chenjiaoAngel

LGTM

lxwlaq added 4 commits August 20, 2021 14:43

add x86 depthwise_conv 3x3s1p1 3x3s2p1 test=develop

b1fd731

merge from develop test=develop

4da6c5e

Fix pre-commit error test=develop

08e2df6

Fix lite/kernels/x86/CMakeLists.txt confict test=develop

71c805c

lxwlaq closed this Aug 23, 2021

lxwlaq reopened this Aug 23, 2021

fix lite/kernels/x86/CMakeLists.txt conflict test=develop

e27c25f

chenjiaoAngel reviewed Aug 27, 2021

lxwlaq added 3 commits August 28, 2021 12:09

optimize mask about left and right processing test=develop

351522a

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle-Lite …

5037411

…into depthwise_conv2d

fix pre-commit

6716de2

lxwlaq closed this Aug 29, 2021

fix lite/kernels/x86/conv_depthwise.cc bug

e3530e1

lxwlaq reopened this Aug 29, 2021

fix CMakeLists.txt bug

d627821

lxwlaq closed this Sep 2, 2021

lxwlaq reopened this Sep 2, 2021

lxwlaq and others added 3 commits September 2, 2021 16:03

fix pre-commit bug

c4bc9ef

fix pre-commmit bug

1061670

Merge branch 'develop' into depthwise_conv2d

4a3bffa

lxwlaq closed this Sep 7, 2021

lxwlaq reopened this Sep 7, 2021

lxwlaq closed this Sep 7, 2021

lxwlaq reopened this Sep 7, 2021

fix conflict

7a86db4

chenjiaoAngel approved these changes Sep 10, 2021

chenjiaoAngel merged commit fa7ad7b into PaddlePaddle:develop Sep 10, 2021
