-
Option 3 seems to build upon option 2, or else I don't understand what striding in the layout specifically has to do with convNd. Between options 2 and 3, I suppose it boils down to the difference in effort. If we can afford it, option 2 seems nicer.
-
We've got client requests for 1D and 3D convolutions in the next release.
From what I can tell, there are three reasonable ways to do this.
The short-term hack
One observation we could make is that conv2d is just conv3d with an implicit depth dimension of size 1 (and similarly, conv1d is conv2d with a width of 1).
We could therefore go through and transition rock.conv2d to rock.conv3d, adjusting the backwards-data and backwards-weight kernels too. Then, we'd rewrite 2D convolution into 3D convolution, either in rock-conv-to-gemm or by modifying all our clients (tosa-to-rock, rocmlir-gen, and so on) to insert the fake dimension.
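For concreteness, here's a minimal sketch of what inserting that fake dimension on the client side could look like, assuming shapes are plain dimension lists and layouts are lowercase strings like nchw or kcyx; the helper and its name are hypothetical illustrations, not the actual rocMLIR API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: turn a 2D convolution problem into a 3D one by
// inserting a unit-length depth dimension just before the first spatial
// dimension ('h' in nchw-style layouts, 'y' in kcyx-style filter layouts).
static void expandTo3D(std::vector<int64_t> &shape, std::string &layout,
                       char depthDim /* e.g. 'd' for input/output, 'z' for filter */) {
  size_t pos = layout.find_first_of("hy");
  if (pos == std::string::npos)
    return; // no spatial dimension found; leave the problem untouched
  layout.insert(layout.begin() + pos, depthDim); // "nchw" -> "ncdhw"
  shape.insert(shape.begin() + pos, 1);          // fake depth of size 1
}
```

The same trick runs in the other direction for 1D: a conv1d client would insert a unit dimension the same way to pose its problem as conv2d.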
convNd, the semi-disruptive way
We could keep our general scheme for rock.conv2d (which we'd just rename rock.conv), where you specify the layout of the filter, input, and output tensors as a series of strings in an attribute. Then, in conv2gemm, we'd just identify how many non-{batch, group, channel} dimensions we have and loop over said dimensions in order to construct things like the gemmK dimension.
This has the advantage that it'd only really require rewriting conv2gemm ... along with a lot of the auxiliary glue code (like the convDims struct) to make it take a general number of height/width/depth/... dimensions.
This is probably more implementable than the third solution here and less of an annoying hack than the first one.
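As a rough illustration of the loop this option calls for, here's a hedged sketch of the gemmK computation, assuming layouts stay strings and gemmK is the product of every filter dimension that isn't the group or output-channel dimension; the function and its surroundings are invented for illustration, not the existing conv2gemm code.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: instead of hard-coding {c, y, x}, walk the filter
// layout and fold every non-{group, output-channel} dimension into gemmK.
// Works unchanged for 1D (c, x), 2D (c, y, x), and 3D (c, z, y, x) filters.
static int64_t computeGemmK(const std::string &filLayout,
                            const std::vector<int64_t> &filShape) {
  int64_t gemmK = 1;
  for (size_t i = 0; i < filLayout.size(); ++i) {
    char dim = filLayout[i];
    if (dim == 'g' || dim == 'k')
      continue; // group and output-channel dims feed gemmG and gemmM instead
    gemmK *= filShape[i];
  }
  return gemmK;
}
```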
convNd while getting rid of layout attributes
The proposal is that we do #1140 and pin rock.conv2d to some fixed logical layout, like NCHW. Then, we'd have general-purpose code for working out things like what order the C, Y, X / K, H, W dimensions should be concatenated in. Doing this'll allow us to ditch a lot of the support code the second option requires us to modify and simplify things like conv2gemm, at the cost of moving some logic into rocmlir-gen for handling things like -transA false or -fil_layout kgcyx. But, once we've standardized on, for example, NGCHW convolutions, that trailing set of dimensions can be any length we like, and we could expect our code to handle it.
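To make the rocmlir-gen side concrete, here's a hedged sketch of how the flag handling might reduce to computing a permutation from the user's requested layout to the pinned logical layout; this is a guess at the shape of the logic, not existing code, and the names are made up.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: given the layout the user asked for (say, via
// -fil_layout kgcyx) and the pinned logical layout, compute the permutation
// a transpose would apply before handing the tensor to a layout-free
// rock.conv.
static std::vector<unsigned> permutationTo(const std::string &userLayout,
                                           const std::string &logicalLayout) {
  std::vector<unsigned> perm;
  perm.reserve(logicalLayout.size());
  for (char dim : logicalLayout)
    perm.push_back(static_cast<unsigned>(userLayout.find(dim)));
  return perm; // "kgcyx" -> "gkcyx" yields {1, 0, 2, 3, 4}
}
```

Nothing in this formulation cares about rank, which is the point: once the logical layout is, say, ngcdhw, the same loop handles any number of trailing spatial dimensions.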
My thoughts
I'm leaning toward the third option (finally doing the generalized problem key), because it's been a long-standing issue and not doing it can cause bad tuning caching for MIGraphX; but, if we don't have the time, I like the second option (where we extend the layout attributes) over turning conv2d into conv3d.