This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

core dump when predict using multithreading #5912

Closed
gold-mango opened this issue Apr 20, 2017 · 3 comments

Comments

@gold-mango

gold-mango commented Apr 20, 2017

I want to run prediction from multiple threads, with every thread loading the same model and predicting simultaneously (the machine has four cards).
A single thread works, but with multiple threads the process dumps core. The crash happens under the nnvm::Op::UpdateAttrMap function (full stack trace below). Does anybody know what is happening?

By the way, if I serialize the calls to MXPredCreatePartialOut with a mutex, everything works.
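
The mutex workaround described above can be sketched roughly as follows. This is a minimal illustration, not MXNet code: `FakePredictor`, `CreatePredictorUnsafe`, `g_registry`, and the other names are hypothetical stand-ins, and the real `MXPredCreatePartialOut` call would take the place of the stand-in inside the lock. Only predictor creation is serialized; each thread would still run its own forward passes outside the lock.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a predictor handle; not an MXNet type.
struct FakePredictor {
  int device_id;
};

// Stand-in for shared global state that is not thread-safe on its own
// (a plain vector that every creation call resizes).
static std::vector<int> g_registry;

// Hypothetical stand-in for MXPredCreatePartialOut: it mutates shared
// global state, so unsynchronized concurrent calls would be a data race.
FakePredictor CreatePredictorUnsafe(int device_id) {
  g_registry.resize(g_registry.size() + 1, device_id);
  return FakePredictor{device_id};
}

// Guards predictor creation only.
static std::mutex g_create_mutex;

FakePredictor CreatePredictorSerialized(int device_id) {
  std::lock_guard<std::mutex> lock(g_create_mutex);
  return CreatePredictorUnsafe(device_id);
}

// One worker thread per device; each thread creates its own predictor.
std::vector<FakePredictor> CreateOnePredictorPerDevice(int num_devices) {
  std::vector<FakePredictor> predictors(num_devices);
  std::vector<std::thread> workers;
  for (int dev = 0; dev < num_devices; ++dev) {
    workers.emplace_back([&predictors, dev] {
      // Each thread writes only its own slot, so the slots need no lock.
      predictors[dev] = CreatePredictorSerialized(dev);
      // Per-thread prediction would follow here, outside the lock.
    });
  }
  for (auto& t : workers) t.join();
  return predictors;
}
```

Calling `CreateOnePredictorPerDevice(4)` mirrors the four-card setup: creation is serialized, while the threads remain free to predict in parallel afterwards.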

Environment info

Operating System: CentOS 6.3

Compiler: gcc 4.8.2

Package used (Python/R/Scala/Julia): C++

MXNet version: 0.9.3

Or if installed from source: yes

MXNet commit hash (git rev-parse HEAD):

Error Message:

#0 0x00007f1a42c8c3f7 in raise () from /opt/compiler/gcc-4.8.2/lib/libc.so.6
#1 0x00007f1a42c8d7d8 in abort () from /opt/compiler/gcc-4.8.2/lib/libc.so.6
#2 0x00007f1a42cca554 in __libc_message () from /opt/compiler/gcc-4.8.2/lib/libc.so.6
#3 0x00007f1a42ccfdbe in malloc_printerr () from /opt/compiler/gcc-4.8.2/lib/libc.so.6
#4 0x00007f1a42cd0a97 in _int_free () from /opt/compiler/gcc-4.8.2/lib/libc.so.6
#5 0x00007f1a4b443ef9 in std::_Function_base::_Base_manager<mxnet::op::{lambda(nnvm::NodeAttrs const&)#4}>::_M_destroy () at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/functional:1926
#6 0x00007f1a4b44379e in std::_Function_base::_Base_manager<mxnet::op::{lambda(nnvm::NodeAttrs const&)#4}>::_M_manager () at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/functional:1950
#7 0x00007f1a4a8c11f9 in std::_Function_base::~_Function_base() ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/functional:2030
#8 0x00007f1a4b1fcd50 in std::function<unsigned int ()(nnvm::NodeAttrs const&)>::~function() ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/functional:2174
#9 0x00007f1a4b4487a2 in std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>::~pair() ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/stl_pair.h:96
#10 0x00007f1a4b456caa in void std::_Destroy<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> >(std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>) ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/stl_construct.h:93
#11 0x00007f1a4b4557c4 in void std::_Destroy_aux::__destroy<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>
>(std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>) ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/stl_construct.h:103
#12 0x00007f1a4b45371b in void std::_Destroy<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>>(std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>) ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/stl_construct.h:126
#13 0x00007f1a4b450899 in void std::_Destroy<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>
, std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> >(std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::allocator<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> >&) ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/stl_construct.h:151
#14 0x00007f1a4b453cd8 in std::vector<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::allocator<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> > >::_M_fill_insert(__gnu_cxx::__normal_iterator<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::vector<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::allocator<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> > > >, unsigned long, std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> const&) () at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/vector.tcc:517
#15 0x00007f1a4b450b60 in std::vector<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::allocator<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> > >::insert(__gnu_cxx::__normal_iterator<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>
, std::vector<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::allocator<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> > > >, unsigned long, std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> const&) ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/stl_vector.h:1024
#16 0x00007f1a4b44bee7 in std::vector<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int>, std::allocator<std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> > >::resize(unsigned long, std::pair<std::function<unsigned int ()(nnvm::NodeAttrs const&)>, int> const&) ()
at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/bits/stl_vector.h:687
#17 0x00007f1a4b448b31 in nnvm::Op& nnvm::Op::set_attr<std::function<unsigned int ()(nnvm::NodeAttrs const&)> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<unsigned int ()(nnvm::NodeAttrs const&)> const&, int)::{lambda(dmlc::any*)#1}::operator()(dmlc::any*) const ()
at /home/img/mxnet/mxnet/nnvm/include/nnvm/op.h:444
#18 0x00007f1a4b450c86 in std::_Function_handler<void ()(dmlc::any*), nnvm::Op& nnvm::Op::set_attr<std::function<unsigned int ()(nnvm::NodeAttrs const&)> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<unsigned int ()(nnvm::NodeAttrs const&)> const&, int)::{lambda(dmlc::any*)#1}>::_M_invoke(std::_Any_data const&, dmlc::any*) () at /home/opt/gcc-4.8.2.bpkg-r2/gcc-4.8.2.bpkg-r2/include/c++/4.8.2/functional:2071
#19 0x00007f1a4d263317 in nnvm::Op::UpdateAttrMap(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()(dmlc::any*)>) () from ./lib/libmxnet.so
#20 0x00007f1a4b448f7c in nnvm::Op& nnvm::Op::set_attr<std::function<unsigned int ()(nnvm::NodeAttrs const&)> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<unsigned int ()(nnvm::NodeAttrs const&)> const&, int) () at /home/img/mxnet/mxnet/nnvm/include/nnvm/op.h:426
#21 0x00007f1a4b5f0565 in mxnet::op::RegisterLegacyOpProp() () at src/nnvm/legacy_op_util.cc:330
#22 0x00007f1a4b48d874 in MXListAllOpNames () at src/c_api/c_api_symbolic.cc:57
#23 0x00007f1a4b4a61b6 in MXPredCreatePartialOut () at src/c_api/c_predict_api.cc:88
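
Frames #16 to #23 show a vector resize inside nnvm::Op::set_attr, reached from MXPredCreatePartialOut through MXListAllOpNames and RegisterLegacyOpProp. That is consistent with several threads running the same registration against shared state at once, although the trace alone does not prove it. As a general illustration of guarding such one-time registration, here is a minimal sketch using `std::call_once`. Every name in it (`Attr`, `g_attrs`, `RegisterOpsUnsafe`, `EnsureOpsRegistered`, `RunWorkers`) is a hypothetical stand-in; this is not MXNet's code or its actual fix.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a shared attribute table holding
// (function, priority) pairs, similar in shape to the vector in the trace.
using Attr = std::pair<std::function<unsigned(int)>, int>;
static std::vector<Attr> g_attrs;
static int g_registration_runs = 0;

// Hypothetical registration routine: resizes shared state, so it must not
// run concurrently from several threads.
void RegisterOpsUnsafe() {
  ++g_registration_runs;
  std::function<unsigned(int)> identity = [](int x) {
    return static_cast<unsigned>(x);
  };
  g_attrs.resize(8, Attr(identity, 0));
}

static std::once_flag g_register_once;

// Safe to call from any thread: the registration body runs exactly once,
// and other callers block until it has finished.
void EnsureOpsRegistered() {
  std::call_once(g_register_once, RegisterOpsUnsafe);
}

// Starts several threads that all request registration; returns how many
// times the registration body actually ran.
int RunWorkers(int num_threads) {
  std::vector<std::thread> workers;
  for (int i = 0; i < num_threads; ++i) {
    workers.emplace_back(EnsureOpsRegistered);
  }
  for (auto& t : workers) t.join();
  return g_registration_runs;
}
```

With this pattern, concurrent callers never observe the table while it is being resized, which is the situation the mutex workaround also avoids.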

@picctree

Did you solve it?

@yajiedesign
Contributor

This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!

@PapaMadeleine2022

I hit the same error. A single thread works, but prediction from multiple threads dumps core:

#0  0x00000000013a7540 in  ()
#1  0x00007f2a5d2e0738 in std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>* std::__uninitialized_copy<false>::__uninit_copy<std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>*, std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>*>(std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>*, std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>*, std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>*) () at /search/odin/gongzhenting/work/ml-tools/Mxnet/incubator-mxnet_fp16/lib/libmxnet.so
#2  0x00007f2a5d2e0f78 in std::vector<std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>, std::allocator<std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int> > >::_M_fill_insert(__gnu_cxx::__normal_iterator<std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>*, std::vector<std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int>, std::allocator<std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int> > > >, unsigned long, std::pair<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)>, int> const&) () at /search/odin/gongzhenting/work/ml-tools/Mxnet/incubator-mxnet_fp16/lib/libmxnet.so
#3  0x00007f2a5d2e2e8b in nnvm::Op& nnvm::Op::set_attr<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> >(std::string const&, std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, int)::{lambda(dmlc::any*)#1}::operator()(dmlc::any*) const () at /search/odin/gongzhenting/work/ml-tools/Mxnet/incubator-mxnet_fp16/lib/libmxnet.so
#4  0x00007f2a5d2d9d5f in nnvm::Op& nnvm::Op::set_attr<std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> >(std::string const&, std::function<void (mxnet::OpStatePtr const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, int) () at /search/odin/gongzhenting/work/ml-tools/Mxnet/incubator-mxnet_fp16/lib/libmxnet.so
#5  0x00007f2a5d2d200d in mxnet::op::RegisterLegacyOpProp() () at /xxx/incubator-mxnet/lib/libmxnet.so
#6  0x00007f2a5d7bdf21 in MXListAllOpNames () at /xxx/incubator-mxnet/lib/libmxnet.so
#7  0x00007f2a5d7e66c8 in MXPredCreatePartialOut () at /xxx/incubator-mxnet/lib/libmxnet.so
#8  0x00007f2a5d7e9871 in MXPredCreate () at /xxx/incubator-mxnet/lib/libmxnet.so

How can I fix it?
Can anyone give some advice?
