Optimize transpose operator with MKL-DNN #14545
Conversation
@mxnet-label-bot add [MKLDNN, Backend, pr-awaiting-review]
  CHECK_EQ(inputs.size(), 1U);
  CHECK_EQ(outputs.size(), 1U);
  if (SupportMKLDNNTranspose(param, inputs[0])) {
How about moving CHECK_EQ(req, kWriteTo) << "Transpose does not support inplace"; here? Then we would have the opportunity to provide fallback support instead of crashing.
I'm afraid there is no way to fall back. The check is copied from the original implementation:
https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/matrix_op-inl.h#L311
I will move the check here so that the error happens at an earlier stage.
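For context, a rough sketch of what the dispatch with the moved check could look like. This is only a sketch: the TransposeComputeExCPU and MKLDNNTransposeForward signatures and the FallBackCompute/Transpose<cpu> fallback are assumptions modeled on other MKL-DNN operators in the tree, not the merged code.

```cpp
// Sketch: validate once at dispatch time, then either take the MKL-DNN
// path or fall back to the reference CPU kernel instead of crashing later.
static void TransposeComputeExCPU(const nnvm::NodeAttrs& attrs,
                                  const OpContext& ctx,
                                  const std::vector<NDArray>& inputs,
                                  const std::vector<OpReqType>& req,
                                  const std::vector<NDArray>& outputs) {
  const TransposeParam& param = nnvm::get<TransposeParam>(attrs.parsed);
  // The check discussed above, moved here so it fails early.
  CHECK_EQ(req[0], kWriteTo) << "Transpose does not support inplace";
  CHECK_EQ(inputs.size(), 1U);
  CHECK_EQ(outputs.size(), 1U);
  if (SupportMKLDNNTranspose(param, inputs[0])) {
    MKLDNNTransposeForward(attrs, ctx, inputs[0], req[0], outputs[0]);
    return;
  }
  // Reference implementation handles unsupported shapes/dtypes.
  FallBackCompute(Transpose<cpu>, attrs, ctx, inputs, req, outputs);
}
```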
 public:
  MKLDNNTransposeForward(const TransposeParam& param,
                         const OpReqType &req,
Seems req is redundant.
Yes. Will remove that.
if (data.IsMKLDNNData()) {
  this->data_->set_data_handle(data.GetMKLDNNData()->get_data_handle());
} else {
  this->data_->set_data_handle(data.data().dptr<float>());
Seems this code can be reused for other dtypes, so we'd better avoid explicitly using float here. How about using a template dtype, or simply this->data_->set_data_handle(data.GetMKLDNNData()->get_data_handle());? The latter should always work even if the data isn't MKL-DNN data.
Will change that. Although, almost all MKL-DNN fp operators are restricted by checking data.dtype() == mshadow::kFloat32. We need to revisit those checks one day if we want to support fp64.
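A minimal sketch of the dtype-agnostic variant suggested above, assuming GetMKLDNNData() returns a usable mkldnn::memory view for plain-layout arrays as well (the helper name SetDataHandle is illustrative, not from the PR):

```cpp
// Illustrative helper: no element type is spelled out, so the same path
// can serve fp32 today and other dtypes later.
void SetDataHandle(const NDArray &data) {
  // get_data_handle() returns the raw buffer pointer regardless of
  // whether `data` carries an MKL-DNN layout or a plain one.
  this->data_->set_data_handle(data.GetMKLDNNData()->get_data_handle());
}
```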
Do we need special test cases for MKL-DNN transpose?
/*!
 * \file mkldnn_transpose.cc
 * \brief
 * \author
Add your name
                              const NDArray &data) {
  auto data_ndim = data.shape().ndim();

  if (data_ndim > 4 || data.dtype() != mshadow::kFloat32)
Does transpose work for INT8?
It should work, but it's not tested and verified, so here I limited the dimensionality and data type just like we did for other MKL-DNN operators. BTW, if we want to use INT8 transpose in a quantized network, we probably need a quantized transpose operator that accepts and outputs an additional scale argument.
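For readers following along, a hedged reconstruction of the support gate the fragment above belongs to (the function head and return values are filled in by assumption, not quoted from the PR):

```cpp
// Gate the MKL-DNN path: only tensors of up to 4 dimensions with fp32
// elements are accepted, mirroring the restrictions on other MKL-DNN ops.
static bool SupportMKLDNNTranspose(const TransposeParam& param,
                                   const NDArray &data) {
  auto data_ndim = data.shape().ndim();
  if (data_ndim > 4 || data.dtype() != mshadow::kFloat32)
    return false;
  return true;
}
```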
  dst_fmt.layout_desc.blocking.strides[1][axes[i]] = 1;

  total_stride *= shape[axes[i]];
}
Please add an explanation of the logic behind these index settings.
Will add comments for that.
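For the thread, a sketch of what those index settings compute, based on the MKL-DNN 0.x blocking descriptor; the line assigning the outer stride is elided in the fragment above and is filled in here by assumption:

```cpp
// Walk the output axes from innermost to outermost. Output dim i reads
// source axis axes[i]; because the output buffer is dense, the stride of
// that source axis equals the product of the sizes of all faster-varying
// output dims visited so far.
int total_stride = 1;
for (int i = shape.ndim() - 1; i >= 0; i--) {
  // Outer stride of the permuted axis (assumed, consistent with the fragment).
  dst_fmt.layout_desc.blocking.strides[0][axes[i]] = total_stride;
  // Plain (non-blocked) layout: the inner-block stride is always 1.
  dst_fmt.layout_desc.blocking.strides[1][axes[i]] = 1;
  total_stride *= shape[axes[i]];
}
```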
Normal use cases should be covered by
LGTM.
@pengzhao-intel It mitigates the performance issue of transpose in #14496 but doesn't help #14563. Possibly #14563 is not caused by a transpose regression. @apeforest @fhieber @samskalicky
It's really good! Regarding #14563, does the transpose go through MKL-DNN?
LGTM, agree with @TaoLv on the summary of the related issues above.
@TaoLv Can you retrigger the build? One test is failing. It seems to be a flaky test, as that test is failing on master as well.
Great improvement. Merging now.
* add mkldnn transpose
* general transpose
* support mkldnn format
* fix lint
* address comments
* add unit test
* add comments
* retrigger CI
Description
Take shapes from the GluonNLP BERT base model as an example:
Before optimization:
After optimization:
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments