[RELAY][PASS] CombineParallelConv2D #2089
Conversation
@vinx13 can you also elaborate a bit on possible use cases of this pass?
I guess a single bigger kernel is better for performance than several small ones.
TensorRT does this optimization: https://devblogs.nvidia.com/deploying-deep-learning-nvidia-tensorrt/#attachment_6827
@vinx13 What happens if the convolutions are followed by elemwise ops (bias, batch norm, etc., which should be fused into a single op)? The elemwise ops that follow a convolution can be different for each child convolution branch. Can the folded convolution still be fused with elemwise ops?
After this pass, the original convolutions will be replaced with strided_slice ops over the output of the combined convolution.
hmm interesting. I understand that we can at least fuse elemwise ops into each strided_slice. The code itself looks good though; I understood how it works.
So we should also fold the following bias, relu and bn to keep the final graph easy for fusion. These elementwise ops should be placed before strided_slice.
Yeah, but elemwise ops that follow a convolution can be different for each child convolution branch (I don't know about the Inception network). So I think there will be less chance of folding? Since there are multiple convolution branches, not everything can be fused anyway. If we fold elemwise ops as well, we are left with multiple … I think we might as well stop fusing (or "Realize", in NNVM terms) at the folded convolution op, and let …
I am working on a new fusion pass in the new Relay IR, and hopefully we can follow up on this topic there.
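To make the folding being discussed concrete, here is a hand-built Relay sketch (shapes and variable names are made up for illustration; this is not produced by the pass itself): per-branch bias_add and relu are themselves combined along the output-channel axis, so strided_slice stays the last op of each branch and remains easy to fuse.

```python
from tvm import relay

# Two parallel branches, conv2d + bias_add + relu, sharing input x (NCHW).
x  = relay.var("x",  shape=(1, 16, 32, 32))
w1 = relay.var("w1", shape=(32, 16, 3, 3))
w2 = relay.var("w2", shape=(64, 16, 3, 3))
b1 = relay.var("b1", shape=(32,))
b2 = relay.var("b2", shape=(64,))

# Combined form: concatenate weights and biases along the output-channel
# axis, apply conv2d / bias_add / relu once, then slice per branch.
w = relay.concatenate([w1, w2], axis=0)   # (96, 16, 3, 3)
b = relay.concatenate([b1, b2], axis=0)   # (96,)
y = relay.nn.relu(relay.nn.bias_add(relay.nn.conv2d(x, w, padding=(1, 1)), b))
out1 = relay.strided_slice(y, begin=[0, 0, 0, 0],  end=[1, 32, 32, 32])
out2 = relay.strided_slice(y, begin=[0, 32, 0, 0], end=[1, 96, 32, 32])
```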
Some specific followup comments:
src/relay/pass/pattern_util.h (Outdated)
@@ -135,6 +150,20 @@ inline Constant MakeConstantScalar(DataType dtype, T value) {
   return ConstantNode::make(arr);
 }

 template<typename T, typename = typename std::enable_if<std::is_integral<T>::value>::type>
Let us use slice instead of take to get the final result
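A Relay-level illustration of the slice-vs-take suggestion (the pass itself builds these ops in C++; the tensor y and its shape are assumptions): both expressions pick output channels [0, 32) of a combined NCHW result, but strided_slice expresses a contiguous range directly instead of gathering through an index tensor.

```python
import numpy as np
from tvm import relay

y = relay.var("y", shape=(1, 96, 32, 32))  # hypothetical combined conv2d output

# take: gather the wanted channels through an explicit index tensor
via_take = relay.take(y, relay.const(np.arange(32, dtype="int32")), axis=1)

# strided_slice: a contiguous slice along the channel axis
via_slice = relay.strided_slice(y, begin=[0, 0, 0, 0], end=[1, 32, 32, 32])
```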
The new set of changes LGTM. @vinx13 can you also add support for fusing followup ops and a test case?
strided_slice is added in #2094
@MarisaKirisame @merrymercy can you please take another look and approve or request changes explicitly, per https://docs.tvm.ai/contribute/code_review.html#approve-and-request-changes-explicitly
@vinx13 please rebase against the master
Thanks @MarisaKirisame @masahi @vinx13, this is merged!
This pass replaces convolutions that share the same input node and the same arguments (except that the number of output channels can be different) with a single convolution. The weight of the new conv2d is the concatenation of the original weights. The original conv2d nodes are replaced with strided_slice ops that take slices of the output of the new conv2d. This avoids launching multiple kernels in networks with parallel convolution branches, such as Inception blocks.
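A hand-built Relay sketch of the rewrite (shapes, layout, and variable names are assumptions for illustration; the pass performs this rewrite on the graph automatically):

```python
from tvm import relay

# Before: two conv2d branches that share input x and all attributes,
# differing only in the number of output channels.
x  = relay.var("x",  shape=(1, 16, 32, 32))
w1 = relay.var("w1", shape=(32, 16, 3, 3))
w2 = relay.var("w2", shape=(64, 16, 3, 3))
branch1 = relay.nn.conv2d(x, w1, padding=(1, 1))
branch2 = relay.nn.conv2d(x, w2, padding=(1, 1))

# After: one conv2d over the concatenated weight, then strided_slice on the
# channel axis recovers each branch's output.
w = relay.concatenate([w1, w2], axis=0)            # (96, 16, 3, 3)
combined = relay.nn.conv2d(x, w, padding=(1, 1))   # (1, 96, 32, 32)
out1 = relay.strided_slice(combined, begin=[0, 0, 0, 0],  end=[1, 32, 32, 32])
out2 = relay.strided_slice(combined, begin=[0, 32, 0, 0], end=[1, 96, 32, 32])
```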
Algorithm
1. Find the parallel branches that start from a shared input, each of the form conv2d followed by a chain of ops, where each op is an elemwise or broadcast op.
2. Group branches by the kernel shape and attrs of the conv2d.
3. Combine the conv2d ops in the same group and possibly combine subsequent ops.
4. Use strided_slice to split the output of the combined op.
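A hedged usage sketch: in later TVM releases this pass is exposed from Python as relay.transform.CombineParallelConv2D with a min_num_branches option; the exact entry point at the time of this PR may differ, and the network fragment below is made up.

```python
import tvm
from tvm import relay

# Hypothetical Inception-style fragment: three 1x1 convs over one input.
x = relay.var("x", shape=(1, 64, 56, 56))
weights = [relay.var("w%d" % i, shape=(32, 64, 1, 1)) for i in range(3)]
branches = [relay.nn.conv2d(x, w) for w in weights]
body = relay.Tuple(branches)
func = relay.Function(relay.analysis.free_vars(body), body)

mod = tvm.IRModule.from_expr(func)
mod = relay.transform.InferType()(mod)
mod = relay.transform.CombineParallelConv2D(min_num_branches=2)(mod)
print(mod)  # one conv2d with 96 output channels, followed by a strided_slice per branch
```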
Please review @tqchen @jroesch @ZihengJiang @masahi