
deprecate DepthwiseConv once we have groups in standard conv #1667

Closed
CarloLucibello opened this issue Jul 15, 2021 · 17 comments · Fixed by #1921

Comments

@CarloLucibello
Member

DepthwiseConv is a grouped convolution with groups = channels_in
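For illustration, a minimal sketch of the equivalence (assuming Flux's Conv accepts the groups keyword this issue is about):

using Flux

x = rand(Float32, 28, 28, 8, 1)        # W×H×C×N input with 8 channels
dw = DepthwiseConv((3, 3), 8 => 16)    # channel multiplier 2: one group per input channel
gc = Conv((3, 3), 8 => 16; groups=8)   # the same operation as a grouped Conv

size(dw(x)) == size(gc(x))             # true: both give a 26×26×16×1 output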

@DhairyaLGandhi
Member

TensorFlow has both, and it seems like that would be a nice distinction to maintain.

@CarloLucibello
Member Author

We could keep the layer/function, but just as a wrapper around Conv/conv. This way we would also fix #459.
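Something like this hypothetical sketch, where the name survives but the separate layer type does not:

# Hypothetical wrapper: DepthwiseConv as a convenience constructor returning a grouped Conv
function DepthwiseConv(k::NTuple{N,Integer}, ch::Pair{<:Integer,<:Integer}, σ = identity;
                       kws...) where N
    cin, cout = ch
    cout % cin == 0 || throw(ArgumentError("output channels must be a multiple of input channels"))
    return Conv(k, ch, σ; groups = cin, kws...)
end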

@DhairyaLGandhi
Member

Why not add the required groups support to the dims object now that we have the infrastructure to do so? Wrapping it over Conv sounds inelegant. We have a better depthwise kernel for handling that case too.

@darsnack
Member

I think the suggestion is that both Conv and DepthwiseConv call NNlib.conv with the groups specified in dims accordingly.
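Roughly, both forward passes would then bottom out in the same call; a sketch assuming the groups-aware DenseConvDims from FluxML/NNlib.jl#289:

using NNlib

x = rand(Float32, 28, 28, 8, 1)   # input: W×H×C×N
w = rand(Float32, 3, 3, 1, 16)    # weight: kernel W×H × (cin ÷ groups) × cout

# groups = 1 gives a standard conv; groups = channels_in gives a depthwise conv
cdims = DenseConvDims(x, w; groups = 8)
y = NNlib.conv(x, w, cdims)       # size(y) == (26, 26, 16, 1)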

ToucheSir transferred this issue from FluxML/NNlib.jl on Jul 16, 2021
@ToucheSir
Member

Am I correct to think that Conv and DepthwiseConv can have the exact same forward pass now that FluxML/NNlib.jl#289 is merged? Then DepthwiseConv could become a constructor or convenience function instead of its own layer.

@darsnack
Member

You are correct; I think they could be tweaked to share the same code. I was going to suggest a convenience constructor at first, but keeping it as its own layer might be nice for two reasons: a) future optimizations (minor), and b) seeing DepthwiseConv printed out when examining model structure is useful. Keeping the type also makes dispatch easier. We could still share code while maintaining DepthwiseConv as a distinct layer.

@DhairyaLGandhi
Member

How does cuDNN handle the depthwise conv?

@CarloLucibello
Member Author

It doesn't; one uses groups = channels_in in the standard convolution (the only one cuDNN exposes).

@ToucheSir
Member

Which is covered by FluxML/NNlibCUDA.jl#9, no additional machinery required :)

My main issue with having a separate DepthwiseConv is that it gives 2 ways to represent the same thing. Since we can dispatch on the number of groups now, is there a need for a separate named type? Flux has generally erred on the side of not having layers that are a more specific version of another layer, so this would be an exception.

@CarloLucibello
Member Author

CarloLucibello commented Jul 16, 2021

We may want a separate layer for printing, discoverability, and to avoid breakage.

I would also add DepthwiseSeparableConv as its own layer, or as an option to DepthwiseConv. A DepthwiseSeparableConv is a DepthwiseConv followed by a Conv with a 1×1 kernel.
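As a sketch, the composite could just be a Chain (helper name hypothetical):

using Flux

# Hypothetical: depthwise separable conv = depthwise Conv + 1×1 pointwise Conv
function DepthwiseSeparableConv(k::NTuple{N,Integer}, ch::Pair, σ = identity; kws...) where N
    cin, cout = ch
    Chain(
        Conv(k, cin => cin, σ; groups = cin, kws...),  # depthwise step
        Conv(ntuple(_ -> 1, N), cin => cout, σ),       # pointwise 1×1 step
    )
end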

@ToucheSir
Member

So if someone creates a Conv with groups == input channels, do we still print it out as a Conv? Certainly we still want a way for users to write DepthwiseConv and get a layer back; my point is that the distinction is not so clear-cut.

@darsnack
Member

I agree that it is not straightforward, but I am inclined to think that if someone creates a Conv with groups == input channels, then either
a) they aren't mentally thinking of it as a DepthwiseConv, otherwise they would have just used that, or
b) they already know of the equivalence, and what prints out will not surprise them.

I think what's more likely is that if someone who doesn't know the equivalence calls DepthwiseConv and a Conv prints out, then they will be confused.

@DhairyaLGandhi
Member

Having a depthwise conv seems strictly better. People who don't know, or don't want to know, the groups setting can use DepthwiseConv and forget about it. Those who understand the equivalence can choose between the two, and probably should still prefer DepthwiseConv for clarity of code. We should add a note to the DepthwiseConv docstring stating that it's a special case of Conv + groups, which helps both categories of users. The Conv docs already reference the depthwise case, which means we are doing a decent job of making the feature discoverable.

@ToucheSir
Member

I think what's more likely is that if someone who doesn't know the equivalence calls DepthwiseConv and a Conv prints out, then they will be confused.

My thought was to actually leverage the type system here. Since Julia is smart enough to print out aliases when they're defined, creating a Conv with the correct number of groups could show up as a DepthwiseConv if we either parameterize the Conv type with the channels and groups, or add a bit of logic to show.

This assumes the only reason to have a separate type over a constructor is for better display of course. If so, I'm of the opinion that big compound names generally hide an abstraction waiting to be let out. For example, if we add DepthwiseSeparableConv do we add SpatialSeparableConv as well?

@DhairyaLGandhi
Member

creating a Conv with the correct number of groups could show up as a DepthwiseConv

The value of having it isn't in the printing, it's in making code readable. Adding checks to return (or print) mismatched types would also needlessly complicate Flux, which I am strictly against.

We already have overloads of conv for CUDA; I don't see why we shouldn't extend the same courtesy to depthwiseconv.

We haven't heard complaints over its existence, and I doubt anyone is going to be mad either, especially if it's made GPU friendly.

@ToucheSir
Member

The value of having it isn't in the printing, it's in making code readable.

Are you referring to user code or library code? Because for user code, nothing changes. You call DepthwiseConv, and it gives you back something that does a DepthwiseConv. With printing support, it'd even show as a DepthwiseConv instead of a plain Conv.

If this is about internal complexity, which I assume is what you mean by

Adding checks to return (or print) mismatched types would also needlessly complicate Flux, which I am strictly against.

Then I think it's worth looking at what changes would have to be made. This:

struct Conv{N,M,F,A,V}
...
end

Becomes

struct Conv{InC,OutC,Groups,N,M,F,A,V} # type params left unabbreviated for clarity
...
end

const DepthwiseConv{InC,OutC} = Conv{InC,OutC,InC} # alias: groups == input channels

Since we don't presently have an easy way of querying the number of channels or groups from a Conv layer, this could kill two birds with one stone.
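Under that parameterization the queries would fall out of dispatch for free, e.g. (accessor names hypothetical):

# Hypothetical accessors, assuming the Conv{InC,OutC,Groups,...} parameterization above
channels_in(::Conv{InC}) where {InC} = InC
channels_out(::Conv{InC,OutC}) where {InC,OutC} = OutC
groupcount(::Conv{InC,OutC,G}) where {InC,OutC,G} = G

isdepthwise(c::Conv) = groupcount(c) == channels_in(c)  # the alias condition Groups == InC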

Here are all of the potential hangups I can think of:

  1. We don't want to add more type params to Conv. I have at best a hazy grasp of what the consequences would be, but I assume more unbound type params are not ideal.
  2. This defines one type that can be printed two different ways. There is precedent for this with Vector, but one could imagine programmatically constructing a CNN and being confused as to why no Convs show up when the model is printed.
  3. Having a dedicated DepthwiseConv implementation makes it easier to support DepthwiseSeparableConv, e.g. using dispatch, or using another (Conv) layer as a "transform" as has been proposed in Fix Norm Layers, Again #1509 (comment).

@DhairyaLGandhi
Member

Also ref FluxML/NNlibCUDA.jl#22 (comment)
