
Optimize ERF with MKL math function #5

Open · wants to merge 12 commits into master
Conversation

TaoLv (Owner) commented Mar 3, 2019

Description

(Brief description of what this PR is about)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To my best knowledge, examples are either not affected by this change or have been fixed to be compatible with it

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@with_seed()
def test_math():
    ops = ['log', 'erf', 'square']
    check_value= True


nit: space after check_value

    ops = ['log', 'erf', 'square']
    check_value= True
    lshape = 1000
    rshapes = [1, 10, 100, 1000, 10000]


Is there any reason why you picked 5 shapes?


The test case was initially meant to evaluate the scalability of the MKL VML functions on different tensor sizes. Since it is not appropriate to track the calculation time here, I will change the shapes to 1-D, 2-D, and 3-D tensors.
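The reshaped test described above could look roughly like this (a hedged, NumPy-only sketch; the shapes, the `np_erf` stand-in, and the loop structure are illustrative assumptions, not the actual MXNet test):

```python
import math
import numpy as np

# Hypothetical shapes covering 1-D, 2-D and 3-D tensors, replacing the
# five flat sizes in the original test.
shapes = [(1000,), (100, 100), (10, 10, 100)]

np_erf = np.vectorize(math.erf)  # NumPy-only stand-in for mx.nd.erf

for shape in shapes:
    x = np.random.rand(*shape)
    y = np_erf(x)
    assert y.shape == shape                 # rank and shape preserved
    assert np.all((y >= 0.0) & (y < 1.0))   # erf maps [0, 1) into [0, erf(1))
```

This checks correctness across tensor ranks rather than timing, which matches the intent stated in the reply.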

def math_square(shape, dtype, check_value):
    np_x = np.random.rand(shape[0], shape[1])
    x = mx.nd.array(np_x, dtype=dtype)
    mx.nd.waitall()


Why do we need to frequently add waitall and wait_to_read in tests?


The test case was initially meant to measure the time cost of the calculations on different tensor sizes. I will remove waitall and wait_to_read from the test case, since it now targets the correctness of the calculation.
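A correctness-only check of this kind needs no synchronization primitives at all; it just compares the operator's output against a reference. Below is a minimal sketch under that assumption (`check_unary_op` and `np_erf` are illustrative names, not MXNet APIs):

```python
import math
import numpy as np

def check_unary_op(op, ref, shape, rtol=1e-5, atol=1e-6):
    """Compare an element-wise op against a scalar reference function."""
    x = np.random.rand(*shape)
    np.testing.assert_allclose(op(x), np.vectorize(ref)(x),
                               rtol=rtol, atol=atol)

np_erf = np.vectorize(math.erf)  # stand-in for mx.nd.erf
check_unary_op(np_erf, math.erf, (10, 100))
```

In the real test, `op` would be the MXNet operator and the comparison would implicitly synchronize when the output array is read, so no explicit waitall is needed.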

@@ -35,9 +35,10 @@
#include "../mxnet_op.h"
#include "../elemwise_op_common.h"
#include "../../ndarray/ndarray_function.h"

#if MSHADOW_USE_MKL == 1


I have not set USE_MKL before. Just curious: is blas=MKL also tested in mxnet CI?

Owner Author


MSHADOW_USE_MKL is widely used in mshadow and mxnet to indicate MKL is used as BLAS library. Yes, USE_BLAS=mkl is built and tested in CI:
https://github.com/apache/incubator-mxnet/blob/master/ci/docker/runtime_functions.sh#L375
https://github.com/apache/incubator-mxnet/blob/master/ci/docker/runtime_functions.sh#L553

* When this macro is set, MXNet is compiled with MKL and uses its math functions for acceleration.
* FCompute will be registered with UnaryOp::MKL_Compute() to complete the math function.
*/
#define MXNET_MKL_OPERATOR_REGISTER_UNARY(__name$) \


How is this different from MXNET_OPERATOR_REGISTER_UNARY?


The implementations of these two macros were the same; the duplicated code has been removed.


)code" ADD_FILELINE)
.set_attr<nnvm::FGradient>("FGradient", ElemwiseGradUseIn{"_backward_square"});
#else


Hmmm this does not scale to 100 ops. We need to think about better ways.

Owner Author


Yes. We need to revisit the design to make it scalable for both operator registration and kernel launching everywhere in MXNet.
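One scalable alternative to a per-operator `#if MSHADOW_USE_MKL ... #else` block is a table-driven dispatch: register each operator once with both an MKL kernel and a generic fallback, and select at one choke point. The sketch below illustrates the idea in Python for brevity; `MKL_AVAILABLE`, `mkl_erf`, and `generic_erf` are hypothetical stand-ins, not MXNet or MKL APIs:

```python
import math

MKL_AVAILABLE = False  # would come from the build configuration

def mkl_erf(xs):       # placeholder for the MKL VML kernel
    return [math.erf(v) for v in xs]

def generic_erf(xs):   # placeholder for the default mshadow kernel
    return [math.erf(v) for v in xs]

# One registration table instead of an #if/#else per operator.
KERNELS = {"erf": (mkl_erf, generic_erf)}

def compute(op, xs):
    mkl_fn, generic_fn = KERNELS[op]
    return (mkl_fn if MKL_AVAILABLE else generic_fn)(xs)
```

With this shape, adding the hundredth operator is one new table entry, and the MKL/fallback decision lives in a single place.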

4 participants