This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Build from source is broken on USE_MKLDNN=1 #13636

Closed
juliusshufan opened this issue Dec 13, 2018 · 9 comments

Comments

@juliusshufan
Contributor

Description

Building from source with USE_MKLDNN=1 currently fails on the MXNet repository on GitHub.
The build appears to work at commit afb6703.
@azai91 @anirudh2290

Environment info (Required)

GCC 4.8.5 CentOS 7.5.1804

What to do:
1. Git clone the code
2. Run make -j USE_MKLDNN=1 USE_OPENCV=1 USE_BLAS=mkl

Package used (Python/R/Scala/Julia): Python 3.6

Build info (Required if built from source)

Compiler (gcc/clang/mingw/visual studio): GCC 4.8.5

MXNet commit hash:
5bcf2bd

Build config:
make -j USE_MKLDNN=1 USE_OPENCV=1 USE_BLAS=mkl

Error Message:

a - build/src/initialize.o
/usr/bin/ld: cannot find -lmklml_intel
/usr/bin/ld: cannot find -lmklml_gnu
collect2: error: ld returned 1 exit status
make: *** [lib/libmxnet.so] Error 1
make: *** Waiting for unfinished jobs....
/usr/bin/ld: cannot find -lmklml_intel
/usr/bin/ld: cannot find -lmklml_gnu
collect2: error: ld returned 1 exit status
make: *** [bin/im2rec] Error 1
~/workspace/validation/mxnet
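
The linker output above shows that `ld` could not resolve `-lmklml_intel` and `-lmklml_gnu`, meaning it found no matching `libmklml_intel` or `libmklml_gnu` library in its search path. As a minimal sketch for triaging a long parallel build log, the hypothetical helper below (`missing_libs` is an illustrative name, not part of MXNet) collects the distinct library names that `ld` reported as missing:

```shell
# Hypothetical helper, not part of MXNet: read a build log on stdin and
# print each library name that ld reported as missing, once, sorted.
missing_libs() {
    # Keep only "ld: cannot find -l<name>" lines and print <name>.
    sed -n 's/.*ld: cannot find -l\([A-Za-z0-9_.+-]*\).*/\1/p' | sort -u
}

# Example usage (commented out so that sourcing this file does not start a build):
#   make -j USE_MKLDNN=1 USE_OPENCV=1 USE_BLAS=mkl 2>&1 | missing_libs
```

With `make -j`, the same error is repeated for each failing link target (here `lib/libmxnet.so` and `bin/im2rec`), so de-duplicating the names makes it easier to see that only two libraries are actually missing.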

@anirudh2290
Member

Thanks for reporting, @juliusshufan. Does this build fail only on CentOS 7.5 and not on Ubuntu? An MKL build was added to CI as part of #13607.
Do we need to add a CentOS stage too? @xinyu-intel @pengzhao-intel

@vdantu
Contributor

vdantu commented Dec 13, 2018

@mxnet-label-bot add [build, breaking]

@azai91
Contributor

azai91 commented Dec 13, 2018

@Vikas89 @mseth10

@xinyu-intel
Contributor

@anirudh2290 I think this is because the CI system installs MKLML in /usr/lib before building MXNet with MKL-DNN. I'll open a PR to disable that.
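
A pre-installed copy of the library in a system directory lets the link step succeed on the CI image even when the build itself does not provide it. As a rough sketch of how one might check for such a copy, the hypothetical helper below (`find_lib` is an illustrative name, and the directories in the usage line are assumptions about a typical Linux layout, not taken from the CI configuration) lists every `lib<name>.so` or `lib<name>.a` found in the directories it is given:

```shell
# Hypothetical helper, not part of MXNet or its CI scripts.
# Usage: find_lib <name> <dir>...
# Prints the path of each lib<name>.so or lib<name>.a found in the given directories.
# The body runs in a subshell (parentheses) so its variables do not leak to the caller.
find_lib() (
    name="$1"
    shift
    for dir in "$@"; do
        for ext in so a; do
            if [ -e "$dir/lib$name.$ext" ]; then
                printf '%s\n' "$dir/lib$name.$ext"
            fi
        done
    done
)

# Example usage (the directory list is an assumption):
#   find_lib mklml_intel /usr/lib /usr/lib64 /usr/local/lib
```

Any output on a CI image would suggest that `-lmklml_intel` can be satisfied by a system-wide copy, which would hide the failure seen on a clean machine.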

@juliusshufan
Contributor Author

Thanks all, this issue is now fixed, so I am closing it.

@mseth10
Contributor

mseth10 commented Dec 15, 2018

@xinyu-intel @juliusshufan Do we have a PR adding CentOS MKL to the CI? Can you please share the status and let us know if you need any help with that? It is blocking us from getting the static linking feature pushed.

@TaoLv
Member

TaoLv commented Dec 15, 2018

@mseth10 Why do we need CentOS MKL in CI? I think @juliusshufan just shared his environment information because the issue template requires it. That doesn't necessarily mean the issue only happens on CentOS with MKL BLAS.

@mseth10
Contributor

mseth10 commented Dec 17, 2018

@TaoLv The build issue on CentOS with MKL occurred when we tried to statically link MKL-DNN (#13503), a change that passed all the MKL CI tests on Ubuntu. We temporarily reverted this static linking PR (#13540), but would like to merge it back. Having MKL in CI for other operating systems would help us identify such build issues before merging. What are your thoughts on this?
@xinyu-intel Does your PR #13643 fix the build issue? Can we go ahead and re-revert the static linking PR after your PR is merged?

@xinyu-intel
Contributor

xinyu-intel commented Dec 18, 2018

@mseth10 It's okay to add an MKL environment on CentOS if you want, but as I said before, #13503 passed all the MKL CI tests on Ubuntu last time only because of a pre-installed libmklml_intel.so in /usr/lib. So please wait for #13643 to fix the CI environment, and then you can go ahead and enable static linking.
