This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

enable all activations in MKLDNN. #10089

Closed
wants to merge 2 commits

Conversation

zheng-da
Contributor

Description

Previously, some activation types in MKLDNN weren't used because there was a precision problem.
This is to enable all activations in MKLDNN.

Checklist

Essentials

  • Passed code style checking (make lint)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • To the best of my knowledge, examples are either not affected by this change or have been fixed to be compatible with this change

@@ -45,11 +45,9 @@ namespace op {
bool SupportMKLDNNAct(const ActivationParam& param) {
// We only enable ReLU for now. It seems other activations have some precision
Contributor:

Remove comment

Member:

why?

@marcoabreu (Contributor), Mar 13, 2018:

The comment explains why the operators have been disabled. This PR re-enables them, and thus the comment is obsolete.

@marcoabreu
Contributor

How have the precision problems been resolved? Is there a test?

@zheng-da
Contributor Author

It seems the precision problem hasn't been fixed in MKLDNN. I'm notifying them of this problem; hopefully, it can be fixed soon.

@marcoabreu
Contributor

I see, thanks a lot! So we'll wait to merge this until Intel fixes the problem, or how would you propose to move forward?

@pengzhao-intel
Contributor

@zheng-da thanks a lot. I will follow up with our team :)

@zheng-da
Contributor Author

I wanted to test if MKLDNN activation is working now. We can close the PR for now and reopen it after the bug in MKLDNN is fixed, or just keep it open. Either way is fine.

@pengzhao-intel
Contributor

Looking at the code, the two implementations of soft_relu differ, so we get different results.

mxnet:
https://github.com/apache/incubator-mxnet/blob/c9ec3118688c233a66ad847003a9e8d2d09e5952/src/operator/mshadow_op.h#L136

/*! \brief SoftReLU, also known as softplus activation */
struct softrelu : public mxnet_op::tunable {
  template<typename DType>
  MSHADOW_XINLINE static DType Map(DType a) {
    // Avoid overflow of exp for large inputs.
    // Thresholds 20.0 is chosen such that softrelu(a) = a
    // for a > 20 using floating precision
    if (a > DType(20.0f)) {
      return a;
    } else {
      return DType(math::log1p(math::exp(a)));
    }
  }
};

MXNET_UNARY_MATH_OP(softrelu_grad, -math::expm1(-a));

mkldnn:
https://github.com/intel/mkl-dnn/blob/f5218ff4fd2d16d13aada2e632afd18f2514fee3/tests/gtests/test_eltwise.cpp#L101

template <typename T>
T soft_relu_fwd(T s) {
    return logf(1 + ::expf(s));
}

template <typename T>
T soft_relu_bwd(T dd, T s) {
    return dd / (1 + ::expf(-s));
}
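
The practical difference between the two formulas can be checked numerically. Below is a minimal sketch using NumPy (it is not code from either project; the 20.0 threshold is the value quoted above from mshadow_op.h):

import numpy as np

def softrelu_thresholded(a):
    # MXNet-style soft ReLU: return a directly for large inputs
    # to avoid overflow of exp.
    a = np.float32(a)
    if a > np.float32(20.0):
        return a
    return np.float32(np.log1p(np.exp(a)))

def softrelu_direct(a):
    # Direct log(1 + exp(a)), the form used in the MKL-DNN reference test above.
    a = np.float32(a)
    return np.float32(np.log(np.float32(1.0) + np.exp(a)))

for x in (-5.0, 0.5, 10.0, 30.0, 100.0):
    print(x, softrelu_thresholded(x), softrelu_direct(x))

# At x = 100.0, exp(x) overflows float32 (NumPy prints an overflow warning),
# so the direct form yields inf while the thresholded form returns 100.0.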

@zheng-da
Contributor Author

But the error happens here: https://github.com/apache/incubator-mxnet/blob/master/tests/python/gpu/test_operator_gpu.py#L1111
Activation with sigmoid fails.
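
That check is essentially a comparison of the operator output against a reference implementation. A minimal illustrative sketch of such a comparison (the shapes and tolerances are assumptions, not values from test_operator_gpu.py, and the real test is stricter):

import numpy as np
import mxnet as mx

# Random inputs covering both tails of the sigmoid.
x = np.random.uniform(-5, 5, size=(2, 3, 4, 5)).astype(np.float32)

# Activation operator; on CPU this can dispatch to the MKLDNN implementation.
out = mx.nd.Activation(mx.nd.array(x), act_type='sigmoid').asnumpy()

# Plain NumPy reference sigmoid.
ref = 1.0 / (1.0 + np.exp(-x))

# Illustrative tolerances only; the actual test uses tighter ones.
assert np.allclose(out, ref, rtol=1e-3, atol=1e-3)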

@pengzhao-intel
Contributor

pengzhao-intel commented Mar 14, 2018

@zheng-da tests/python/unittest/test_loss.py fails too; it uses softrelu.
I will look at the case you pointed out.

======================================================================
FAIL: test_loss.test_bce_loss

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/work/mxnet/tests/python/unittest/common.py", line 157, in test_new
    orig_test(*args, **kwargs)
  File "/work/mxnet/tests/python/unittest/test_loss.py", line 100, in test_bce_loss
    assert mod.score(data_iter, eval_metric=mx.metric.Loss())[0][1] < 0.01
AssertionError:

-------------------- >> begin captured logging << --------------------

@piiswrong
Contributor

@zheng-da @pengzhao-intel ping

@zheng-da
Contributor Author

@pengzhao-intel is there any update from the Intel MKLDNN team?

@pengzhao-intel
Contributor

@zheng-da @piiswrong sorry I missed the first ping. I will raise the priority for this issue and update you soon.

@pengzhao-intel
Contributor

@zheng-da @piiswrong Fixed the issue locally; a PR by @jinhuang415 is on the way.

@zheng-da zheng-da mentioned this pull request Mar 30, 2018
@zheng-da zheng-da closed this Mar 30, 2018