
MXNET_MKLDNN_DEBUG=1 produces errors #10026

Closed

marcoabreu opened this issue Mar 7, 2018 · 9 comments

@marcoabreu
Contributor

marcoabreu commented Mar 7, 2018

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-9995/32/pipeline/483

Setting the environment variable MXNET_MKLDNN_DEBUG=1 produces the following error in tests. This happens across all configurations and seeds, so I do not think it is an ordinary test failure.
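A minimal reproduction sketch (mine, not the CI script): the specific model, input shape, and use of the Gluon model zoo below are assumptions based on the test name in the traceback.

import os
# The flag is read by the backend, so set it before importing mxnet.
os.environ['MXNET_MKLDNN_DEBUG'] = '1'

import mxnet as mx
from mxnet.gluon.model_zoo import vision

# test_models iterates over the whole model zoo; resnet18_v1 is one example.
model = vision.resnet18_v1(pretrained=False)
model.initialize()
# With the debug flag set, OpCheck re-runs each MKL-DNN operator with the
# default CPU implementation and compares outputs; a mismatch raises the
# "Check failed: similar" MXNetError shown below.
model(mx.nd.random.uniform(shape=(2, 3, 224, 224))).wait_to_read()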

======================================================================

ERROR: test_gluon_model_zoo.test_models

----------------------------------------------------------------------

Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest

    self.test(*self.arg)

  File "/work/mxnet/tests/python/unittest/common.py", line 157, in test_new

    orig_test(*args, **kwargs)

  File "/work/mxnet/tests/python/unittest/test_gluon_model_zoo.py", line 50, in test_models

    model(mx.nd.random.uniform(shape=data_shape)).wait_to_read()

  File "/work/mxnet/python/mxnet/ndarray/ndarray.py", line 1650, in wait_to_read

    check_call(_LIB.MXNDArrayWaitToRead(self.handle))

  File "/work/mxnet/python/mxnet/base.py", line 149, in check_call

    raise MXNetError(py_str(_LIB.MXGetLastError()))

MXNetError: [17:10:12] src/operator/nn/mkldnn/mkldnn_base.cc:395: Check failed: similar 



Stack trace returned 10 entries:

[bt] (0) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::StackTrace[abi:cxx11]()+0x5b) [0x7f06ccf3745b]

[bt] (1) /work/mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f06ccf38478]

[bt] (2) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::OpCheck::Run(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)+0x3ca8) [0x7f06ccf54198]

[bt] (3) /work/mxnet/python/mxnet/../../lib/libmxnet.so(+0x2a910d9) [0x7f06cf55a0d9]

[bt] (4) /work/mxnet/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (mxnet::RunContext), mxnet::imperative::PushFComputeEx(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}>::_M_invoke(std::_Any_data const&, mxnet::RunContext&&)+0x7c) [0x7f06cf77608c]

[bt] (5) /work/mxnet/python/mxnet/../../lib/libmxnet.so(+0x3148fdb) [0x7f06cfc11fdb]

[bt] (6) /work/mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0xcb5) [0x7f06cfc0b1a5]

[bt] (7) /work/mxnet/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (std::shared_ptr<dmlc::ManualEvent>), mxnet::engine::ThreadedEnginePerDevice::PushToExecute(mxnet::engine::OprBlock*, bool)::{lambda()#1}::operator()() const::{lambda(std::shared_ptr<dmlc::ManualEvent>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<dmlc::ManualEvent>&&)+0xd9) [0x7f06cfc1d309]

[bt] (8) /work/mxnet/python/mxnet/../../lib/libmxnet.so(std::thread::_Impl<std::_Bind_simple<std::function<void (std::shared_ptr<dmlc::ManualEvent>)> (std::shared_ptr<dmlc::ManualEvent>)> >::_M_run()+0x4a) [0x7f06cfc1c43a]

[bt] (9) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7f06d7ca4c80]





-------------------- >> begin captured stdout << ---------------------

ResNetV1(

  (features): HybridSequential(

    (0): Conv2D(None -> 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)

    (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

    (2): Activation(relu)

    (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False)

    (4): HybridSequential(

      (0): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (1): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

    )

    (5): HybridSequential(

      (0): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(64 -> 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

        (downsample): HybridSequential(

          (0): Conv2D(64 -> 128, kernel_size=(1, 1), stride=(2, 2), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (1): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

    )

    (6): HybridSequential(

      (0): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(128 -> 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

        (downsample): HybridSequential(

          (0): Conv2D(128 -> 256, kernel_size=(1, 1), stride=(2, 2), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (1): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

    )

    (7): HybridSequential(

      (0): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(256 -> 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

        (downsample): HybridSequential(

          (0): Conv2D(256 -> 512, kernel_size=(1, 1), stride=(2, 2), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (1): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

    )

    (8): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True)

  )

  (output): Dense(512 -> 1000, linear)

)

ResNetV1(

  (features): HybridSequential(

    (0): Conv2D(None -> 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)

    (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

    (2): Activation(relu)

    (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False)

    (4): HybridSequential(

      (0): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (1): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (2): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

    )

    (5): HybridSequential(

      (0): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(64 -> 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

        (downsample): HybridSequential(

          (0): Conv2D(64 -> 128, kernel_size=(1, 1), stride=(2, 2), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (1): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (2): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (3): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

    )

    (6): HybridSequential(

      (0): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(128 -> 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

        (downsample): HybridSequential(

          (0): Conv2D(128 -> 256, kernel_size=(1, 1), stride=(2, 2), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (1): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (2): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (3): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (4): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (5): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

    )

    (7): HybridSequential(

      (0): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(256 -> 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

        (downsample): HybridSequential(

          (0): Conv2D(256 -> 512, kernel_size=(1, 1), stride=(2, 2), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (1): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

      (2): BasicBlockV1(

        (body): HybridSequential(

          (0): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

          (2): Activation(relu)

          (3): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)

          (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)

        )

      )

    )

    (8): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True)

  )

  (output): Dense(512 -> 1000, linear)

)



--------------------- >> end captured stdout << ----------------------

-------------------- >> begin captured logging << --------------------

common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1825457337 to reproduce.

common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1579343143 to reproduce.

--------------------- >> end captured logging << ---------------------

@pengzhao-intel
Contributor

pengzhao-intel commented Mar 22, 2018

@marcoabreu @cjolivier01 This is a very useful feature for verifying the correctness of the MKL-DNN operators.
After debugging, I found the following failures in DEBUG mode.

1. Numerical precision for convolution @cjolivier01
Currently both rtol and atol in SimilarArray are 1e-3.
In the failing case, the two values are 0.675079 and 0.677384. Their absolute difference (0.002305) exceeds atol + rtol * std::abs(data2[i]) = 0.00167738, so the check fails, but I think the difference is acceptable.
I suggest we change atol to 1e-2.

https://github.com/apache/incubator-mxnet/blob/46e47cbc6183d2812a2e405851f0b209383e72ad/src/operator/nn/mkldnn/mkldnn_base.cc#L391

[15:28:36] src/operator/nn/mkldnn/mkldnn_base.cc:346: data1[i]: 0.675079 data2[i]: 0.677384
[15:28:36] src/operator/nn/mkldnn/mkldnn_base.cc:347: atol + rtol * std::abs(data2[i]): 0.00167738

std::abs(data1[i] - data2[i]) = 0.002305

if (std::abs(data1[i] - data2[i]) > atol + rtol * std::abs(data2[i])) return false;
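A quick sketch of the tolerance arithmetic (numbers copied from the log above; the comparison mirrors the SimilarArray criterion):

# Re-checking the failing comparison with the current and the proposed atol.
data1, data2 = 0.675079, 0.677384
rtol = 1e-3

for atol in (1e-3, 1e-2):
    threshold = atol + rtol * abs(data2)
    print(atol, round(threshold, 6), abs(data1 - data2) <= threshold)
# atol=1e-3: threshold 0.001677 -> the diff of 0.002305 fails the check
# atol=1e-2: threshold 0.010677 -> the same diff passes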

2. The Flatten block in Gluon should call the symbolic API rather than implement flattening via reshape @piiswrong
https://github.com/apache/incubator-mxnet/blob/46e47cbc6183d2812a2e405851f0b209383e72ad/python/mxnet/gluon/nn/basic_layers.py#L408
I suggest changing this to F.Flatten(x) so that the MKL-DNN flatten function will be used; a small equivalence sketch follows below.
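The two implementations should produce identical values and differ only in which backend kernel gets dispatched; a minimal sketch of that equivalence (shapes chosen arbitrarily):

import mxnet as mx

x = mx.nd.random.uniform(shape=(4, 8, 2, 2))
a = x.reshape((0, -1))  # keep dim 0, flatten the rest (current Flatten block)
b = mx.nd.Flatten(x)    # the Flatten operator that F.Flatten(x) dispatches to
assert a.shape == b.shape == (4, 32)
assert (a == b).min().asscalar() == 1.0  # element-wise identical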

After I made these two changes, all cases passed.

[patric@mlt-skx080 master]$ git diff
diff --git a/3rdparty/mkldnn b/3rdparty/mkldnn
--- a/3rdparty/mkldnn
+++ b/3rdparty/mkldnn
@@ -1 +1 @@
-Subproject commit f5218ff4fd2d16d13aada2e632afd18f2514fee3
+Subproject commit f5218ff4fd2d16d13aada2e632afd18f2514fee3-dirty
diff --git a/python/mxnet/gluon/nn/basic_layers.py b/python/mxnet/gluon/nn/basic_layers.py
index 3801c84..b7bba8a 100644
--- a/python/mxnet/gluon/nn/basic_layers.py
+++ b/python/mxnet/gluon/nn/basic_layers.py
@@ -405,7 +405,7 @@ class Flatten(HybridBlock):
         super(Flatten, self).__init__(**kwargs)

     def hybrid_forward(self, F, x):
-        return x.reshape((0, -1))
+        return F.Flatten(x)

     def __repr__(self):
         return self.__class__.__name__
diff --git a/src/operator/nn/mkldnn/mkldnn_base.cc b/src/operator/nn/mkldnn/mkldnn_base.cc
index 820cca1..c9582cd 100644
--- a/src/operator/nn/mkldnn/mkldnn_base.cc
+++ b/src/operator/nn/mkldnn/mkldnn_base.cc
@@ -388,7 +388,7 @@ void OpCheck::Run(mxnet::FCompute fn, const nnvm::NodeAttrs &attrs,
     if (req[i] == kNullOp)
       continue;
     MSHADOW_TYPE_SWITCH(outputs[i].dtype(), DType, {
-      bool similar = SimilarArray<DType>(outputs[i], outputs_[i], 1e-3, 1e-3);
+      bool similar = SimilarArray<DType>(outputs[i], outputs_[i], 1e-3, 1e-2);
       if (!similar) {
         LOG(ERROR) << attrs.op->name << " fails";
       }

3. A bug in x.reshape((0, -1)) @zheng-da
In theory, this implementation should work with the MKL-DNN storage type as well, but it does not.
I think there is a bug; I am still debugging it now.

@pengzhao-intel
Contributor

@marcoabreu We found the root cause of 3) in the comments above. It is not an issue in the MKL-DNN implementation; we just need to improve the checking method under MXNET_MKLDNN_DEBUG.

@cjolivier01 Thanks for the nice functionality for checking the results of MKL-DNN.
We found a special situation that needs to be handled, described as 3) in my comments above.
A reshape changes the view of the NDArray, but the underlying data are not converted yet.
In debug mode, OpCheck fetches the MKLDNN memory without checking whether the array is a view, so the case fails.
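A small sketch of the view behaviour described above (Python side only, arbitrary shape; the MKL-DNN layout handling itself happens inside the C++ backend):

import mxnet as mx

x = mx.nd.arange(12).reshape((2, 3, 2))
y = x.reshape((0, -1))        # (2, 6): a view over the same buffer, no copy
y[0, 0] = 42.0
print(x[0, 0, 0].asscalar())  # 42.0 -- the view shares the underlying data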

https://github.com/apache/incubator-mxnet/blob/bd9b9c8b76d68b2b7cd957dc0bd07fb4fbc29c4c/src/operator/nn/mkldnn/mkldnn_base.cc#L357

So, here is a possible fix for OpCheck::Init; @zheng-da please help review it.

+    // auto mem = inputs_[i].GetMKLDNNData();
+    NDArray data = inputs_[i];
+    const TShape &ishape = inputs_[i].shape();
+    if (data.IsMKLDNNData() && data.IsView())
+      data = data.MKLDNNDataReshape(Shape2(ishape.ProdShape(0, ishape.ndim() - 1),
+                                           ishape[ishape.ndim() - 1]));
+    auto mem = data.GetMKLDNNData();

@zheng-da
Contributor

When I wrote the test, I followed the Python test: https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/test_utils.py#L470

When assert_almost_equal is called (https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/test_utils.py#L1313), it uses 1e-3 for both rtol and atol. I don't know why the test fails.

@zheng-da
Contributor

As for modifying OpCheck::Init, you can do

  if (data.IsMKLDNNData() && data.IsView())
    data = data.Reorder2Default();

Please see the example here: https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/mkldnn/mkldnn_fully_connected.cc#L95
This is how we should deal with a reshaped MKLDNN NDArray. Unfortunately, we can't do an in-place layout conversion in the NDArray, as that caused a race condition.

@pengzhao-intel
Contributor

@zheng-da I have looked into Reorder2Default, but it also converts back to the original shape rather than the new shape from the reshape.
That's another point we want to improve later.

@azai91
Contributor

azai91 commented Jul 13, 2018

@pengzhao-intel I believe that when you call CopyFrom, it converts the input memory into the same shape as the target. So if you call Reorder2Default and then CopyFrom, the MKLDNN memory will have the new shape:

NDArray data = inputs_[i];
// Allocate the check copy with the (possibly reshaped) target shape.
inputs.emplace_back(data.shape(), ctx, false, data.dtype());
if (data.IsMKLDNNData() && data.IsView())
  data = data.Reorder2Default();
// Read from the reordered array rather than the original view.
auto mem = data.GetMKLDNNData();
// CopyFrom converts the source memory to match the target's shape.
inputs[i].CopyFrom(*mem);

@azai91
Contributor

azai91 commented Aug 7, 2018

#12069

@azai91
Contributor

azai91 commented Aug 17, 2018

The PR above addresses this issue. @marcoabreu can you close?
