Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error of compilation ONNX BERT with tune information #7310

Closed
dlexplorer opened this issue Jan 19, 2021 · 2 comments
Closed

Error of compilation ONNX BERT with tune information #7310

dlexplorer opened this issue Jan 19, 2021 · 2 comments

Comments

@dlexplorer
Copy link
Contributor

commit [AutoScheduler] Add layout rewrite support for dense and batch matmul… breaks compilation of ONNX BERT with autotune information. Until this commit it was compiled successfully.
Some notes:

  1. The above commit changes the workload hash for dense, need to collect different statistic before it and after.
  2. I tried to collect tune statistic and compile networks with it for other networks having dense layer and have not seen issues (I tried mxnet mobilenet v2 and onnx inception v1. Both networks have dense, they participated in the tuning, no issues during compilation

The scenario and scripts for reproducing of the issue is uploaded to https://github.com/dlexplorer/tvm-autotune-bert
The callstack of failed assert during compilation:

Traceback (most recent call last):
  File "compile_bert_tuned_local.py", line 36, in <module>
    lib = relay.build(mod, target=target,
  File "tvm/python/tvm/relay/build_module.py", line 275, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)
  File "tvm/python/tvm/relay/build_module.py", line 138, in build
    self._build(mod, target, target_host)
  File "tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) tvm/build/libtvm.so(void tvm::relay::ExpandDataflow<tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#1}, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#2}>(tvm::RelayExpr, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#1}, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#2})+0xc55) [0x7ff150eb3d55]
  [bt] (7) tvm/build/libtvm.so(tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>*)#5}::_FUN(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>*)+0x2c) [0x7ff150ea52fc]
  [bt] (6) tvm/build/libtvm.so(tvm::relay::TypeInferencer::VisitExpr_(tvm::relay::FunctionNode const*)+0x260) [0x7ff150eb51e0]
  [bt] (5) tvm/build/libtvm.so(tvm::relay::TypeInferencer::GetType(tvm::RelayExpr const&)+0xee) [0x7ff150eb460e]
  [bt] (4) tvm/build/libtvm.so(void tvm::relay::ExpandDataflow<tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#1}, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#2}>(tvm::RelayExpr, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#1}, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#2})+0xc55) [0x7ff150eb3d55]
  [bt] (3) tvm/build/libtvm.so(tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>*)#5}::_FUN(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>*)+0x2c) [0x7ff150ea52fc]
  [bt] (2) tvm/build/libtvm.so(tvm::relay::TypeInferencer::VisitExpr_(tvm::relay::FunctionNode const*)+0x46) [0x7ff150eb4fc6]
  [bt] (1) tvm/build/libtvm.so(+0x4da348) [0x7ff150144348]
  [bt] (0) tvm/build/libtvm.so(+0x1090978) [0x7ff150cfa978]
  [bt] (8) tvm/build/libtvm.so(tvm::relay::TypeInferencer::VisitExpr_(tvm::relay::FunctionNode const*)+0x260) [0x7ff150eb51e0]
  [bt] (7) tvm/build/libtvm.so(tvm::relay::TypeInferencer::GetType(tvm::RelayExpr const&)+0xee) [0x7ff150eb460e]
  [bt] (6) tvm/build/libtvm.so(void tvm::relay::ExpandDataflow<tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#1}, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#2}>(tvm::RelayExpr, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#1}, tvm::relay::TypeInferencer::VisitExpr(tvm::RelayExpr const&)::{lambda(tvm::RelayExpr const&)#2})+0xc55) [0x7ff150eb3d55]
  [bt] (5) tvm/build/libtvm.so(tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>*)#5}::_FUN(tvm::runtime::ObjectRef const&, tvm::relay::ExprFunctor<tvm::Type (tvm::RelayExpr const&)>*)+0x2c) [0x7ff150ea52fc]
  [bt] (4) tvm/build/libtvm.so(tvm::relay::TypeInferencer::VisitExpr_(tvm::relay::FunctionNode const*)+0x46) [0x7ff150eb4fc6]
  [bt] (3) tvm/build/libtvm.so(tvm::relay::TypeSolver::Solve()+0x447) [0x7ff150cfc2b7]
  [bt] (2) tvm/build/libtvm.so(tvm::runtime::TypedPackedFunc<bool (tvm::runtime::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>::AssignTypedLambda<bool (*)(tvm::runtime::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>(bool (*)(tvm::runtime::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&))::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*) const+0x55f) [0x7ff15050d95f]
  [bt] (1) tvm/build/libtvm.so(bool tvm::relay::DenseRel<tvm::relay::DenseAttrs>(tvm::runtime::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)+0x751) [0x7ff150b3ac91]
  [bt] (0) tvm/build/libtvm.so(+0xeaff28) [0x7ff150b19f28]
  File "tvm/src/relay/analysis/type_solver.cc", line 622
TVMError: 
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: false == false: [22:31:18] tvm/src/relay/op/nn/nn.h:73: 
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: static_cast<int>(weight->shape.size()) == 2 == false: 
@jwfromm
Copy link
Contributor

jwfromm commented Feb 22, 2021

@merrymercy have you encountered this issue? It seems like it's quite significant given the importance of BERT.

@merrymercy
Copy link
Member

I think this problem is similar to the problems in #7475 and #7501
We can fix it by taking a similar approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants