[MXNET-711] Website build and version dropdown update (apache#11892)
* adding param for list of tags to display on website

* using new website display argument for artifact placement in version folder

* adding display logic

* remove restricted setting for testing

* update usage instructions

* reverted Jenkinsfile to use restricted nodes

[MXAPPS-581] Fixes for broken Straight Dope tests. (apache#11923)

* Update relative paths pointing to the data directory to point to the
  correct place in the testing temporary folder.

* Enable the notebooks that were previously broken because of relative
  file paths not pointing to the correct place.

* Move some notebooks we do not plan to test to the whitelist. These
  notebooks are not published in the Straight Dope book.

* Clean-up: Convert print statements to info/warn/error logging
  statements. Add some logging statements for better status.

Disable flaky test: test_spatial_transformer_with_type (apache#11930)

apache#11839

Add linux and macos MKLDNN Building Instruction (apache#11049)

* add linux and macos doc

* update doc

* Update MKL_README.md

* Update MKL_README.md

Add convolution code to verify mkldnn backend

* add homebrew link

* rename to MKLDNN_README

* add mkl verify

* trigger

* trigger

* set mac complier to gcc47

* add VS2017 support experimentally

* improve quality

* improve quality

* modify mac build instruction since prepare_mkldnn.sh has been rm

* trigger

* add some improvement

[MXNET-531] Add download util (apache#11866)

* add changes to example

* place the file to the util

* add retry scheme

* fix the retry logic

* change the DownloadUtil to Util

* Trigger the CI

[MXNET-11241] Avoid use of troublesome cudnnFind() results when grad_req='add' (apache#11338)

* Add tests that fail due to issue 11241

* Fix apache#11241 Conv1D throws CUDNN_STATUS_EXECUTION_FAILED

* Force algo 1 when grad_req==add with large c.  Expand tests.

* Shorten test runtimes.

Improving documentation and error messages for Async distributed training with Gluon (apache#11910)

* Add description about update on kvstore

* add async check for gluon

* only raise error if user set update_on_kvstore

* fix condition

* add async nightly test

* fix case when no kvstore

* add example for trainer creation in doc

[MXNET-641] fix R windows install docs (apache#11805)

* fix R windows install docs

* addressed PR comments

* PR comments

* PR comments

* fixed line wrappings

* fixed line wrappings

a hot fix for mkldnn link (apache#11939)

re-enabling randomized test_l2_normalization (apache#11900)

[MXNET-651] MXNet Model Backwards Compatibility Checker (apache#11626)

* Added MNIST-MLP-Module-API models to check model save and load_checkpoint methods

* Added LENET with Conv2D operator training file

* Added LENET with Conv2d operator inference file

* Added LanguageModelling with RNN training file

* Added LamguageModelling with RNN inference file

* Added hybridized LENET Gluon Model training file

* Added hybridized LENET gluon model inference file

* Added license headers

* Refactored the model and inference files and extracted out duplicate code in a common file

* Added runtime function for executing the MBCC files

* Added JenkinsFile for MBCC to be run as a nightly job

* Added boto3 install for s3 uploads

* Added README for MBCC

* Added license header

* Added more common functions from lm_rnn_gluon_train and inference files into common.py to clean up code

* Added scripts for training models on older versions of MXNet

* Added check for preventing inference script from crashing in case no trained models are found

* Fixed indentation issue

* Replaced Penn Tree Bank Dataset with Sherlock Holmes Dataset

* Fixed indentation issue

* Removed training in models and added smaller models. Now we are simply checking a forward pass in the model with dummy data.

* Updated README

* Fixed indentation error

* Fixed indentation error

* Removed code duplication in the training file

* Added comments for runtime_functions script for training files

* Merged S3 Buckets for storing data and models into one

* Automated the process to fetch MXNet versions from git tags

* Added defensive checks for the case where the data might not be found

* Fixed issue where we were performing inference on state model files

* Replaced print statements with logging ones

* Removed boto install statements and move them into ubuntu_python docker

* Separated training and uploading of models into separate files so that training runs in Docker and upload runs outside Docker

* Fixed pylint warnings

* Updated comments and README

* Removed the venv for training process

* Fixed indentation in the MBCC Jenkins file and also separated out training and inference into two separate stages

* Fixed indendation

* Fixed erroneous single quote

* Added --user flag to check for Jenkins error

* Removed unused methods

* Added force flag in the pip command to install mxnet

* Removed the force-re-install flag

* Changed exit 1 to exit 0

* Added quotes around the shell command

* added packlibs and unpack libs for MXNet builds

* Changed PythonPath from relative to absolute

* Created dedicated bucket with correct permission

* Fix for python path in training

* Changed bucket name to CI bucket

* Added set -ex to the upload shell script

* Now raising an exception if no models are found in the S3 bucket

* Added regex to train models script

* Added check for performing inference only on models trained on same major versions

* Added set -ex flags to shell scripts

* Added multi-version regex checks in training

* Fixed typo in regex

* Now we will train models for all the minor versions for a given major version by traversing the tags

* Added check for validating current_version

[MXNET-531] NeuralStyle Example for Scala (apache#11621)

* add initial neuralstyle and test coverage

* Add two more test and README

* kill comments

* patch on memory leaks fix

* fix formatting issues

* remove redundant files

* disable the Gan example for now

* add ignore method

* add new download scheme to match the changes
aaronmarkham committed Aug 7, 2018
1 parent 4b3988e commit d7b0156
Showing 51 changed files with 2,316 additions and 953 deletions.
301 changes: 301 additions & 0 deletions MKLDNN_README.md
@@ -0,0 +1,301 @@
# Build/Install MXNet with MKL-DNN

Building MXNet with [Intel MKL-DNN](https://github.com/intel/mkl-dnn) improves training and inference performance on Intel Xeon CPUs. The performance gains are shown on [this page](https://mxnet.incubator.apache.org/faq/perf.html#intel-cpu). Below are build instructions for Linux, MacOS and Windows.

<h2 id="0">Contents</h2>

* [1. Linux](#1)
* [2. MacOS](#2)
* [3. Windows](#3)
* [4. Verify MXNet with python](#4)
* [5. Enable MKL BLAS](#5)
* [6. Support](#6)

<h2 id="1">Linux</h2>

### Prerequisites

```
sudo apt-get update
sudo apt-get install -y build-essential git
sudo apt-get install -y libopenblas-dev liblapack-dev
sudo apt-get install -y libopencv-dev
sudo apt-get install -y graphviz
```

### Clone MXNet sources

```
git clone --recursive https://github.com/apache/incubator-mxnet.git
cd incubator-mxnet
```

### Build MXNet with MKL-DNN

```
make -j $(nproc) USE_OPENCV=1 USE_MKLDNN=1 USE_BLAS=mkl USE_INTEL_PATH=/opt/intel
```

If you don't have the full [MKL](https://software.intel.com/en-us/intel-mkl) library installed, you can use OpenBLAS instead by setting `USE_BLAS=openblas`.
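
The choice between the two BLAS settings can be scripted. The following is a minimal sketch, not part of the MXNet build system: the helper name `make_command` is hypothetical, and it assumes that a full MKL install appears as an `mkl` directory under the Intel path, which may not match your layout.

```python
import os


def make_command(intel_path="/opt/intel", jobs=None):
    """Return the make invocation as an argument list.

    Assumption: a full MKL install shows up as an `mkl` directory
    under intel_path; adjust the check if your layout differs.
    """
    jobs = jobs or os.cpu_count() or 1
    args = ["make", "-j", str(jobs), "USE_OPENCV=1", "USE_MKLDNN=1"]
    if os.path.isdir(os.path.join(intel_path, "mkl")):
        # Full MKL found: use it as the BLAS library.
        args += ["USE_BLAS=mkl", "USE_INTEL_PATH=" + intel_path]
    else:
        # No full MKL: fall back to OpenBLAS.
        args.append("USE_BLAS=openblas")
    return args


if __name__ == "__main__":
    print(" ".join(make_command()))
```

The script only prints the command, so you can inspect it before running it from the MXNet root directory.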

<h2 id="2">MacOS</h2>

### Prerequisites

Install the following dependencies, which are required to build MXNet, using the commands below:

- [Homebrew](https://brew.sh/)
- gcc (clang in macOS does not support OpenMP)
- OpenCV (for computer vision operations)

```
# Paste this command in Mac terminal to install Homebrew
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
# install dependency
brew update
brew install pkg-config
brew install graphviz
brew tap homebrew/core
brew install opencv
brew tap homebrew/versions
brew install gcc49
brew link gcc49 #gcc-5 and gcc-7 also work
```

### Clone MXNet sources

```
git clone --recursive https://github.com/apache/incubator-mxnet.git
cd incubator-mxnet
```

### Enable OpenMP for MacOS

If you want to enable OpenMP for better performance, modify the Makefile in the MXNet root directory:

Add the `-fopenmp` flag to CFLAGS for Darwin by commenting out the `ifneq` guard:

```
ifeq ($(USE_OPENMP), 1)
# ifneq ($(UNAME_S), Darwin)
CFLAGS += -fopenmp
# endif
endif
```

### Build MXNet with MKL-DNN

```
make -j $(sysctl -n hw.ncpu) CC=gcc-4.9 CXX=g++-4.9 USE_OPENCV=0 USE_OPENMP=1 USE_MKLDNN=1 USE_BLAS=apple USE_PROFILER=1
```

*Note: OpenCV is temporarily disabled in this build (`USE_OPENCV=0`).*

<h2 id="3">Windows</h2>

We recommend building and installing MXNet yourself using [Microsoft Visual Studio 2015](https://www.visualstudio.com/vs/older-downloads/). Support for the latest [Microsoft Visual Studio 2017](https://www.visualstudio.com/downloads/) is experimental.

**Visual Studio 2015**

To build and install MXNet yourself, first install the following dependencies:

1. If [Microsoft Visual Studio 2015](https://www.visualstudio.com/vs/older-downloads/) is not already installed, download and install it. You can download and install the free community edition.
2. Download and Install [CMake 3](https://cmake.org/) if it is not already installed.
3. Download and install [OpenCV 3](http://sourceforge.net/projects/opencvlibrary/files/opencv-win/3.0.0/opencv-3.0.0.exe/download).
4. Unzip the OpenCV package.
5. Set the environment variable ```OpenCV_DIR``` to point to the ```OpenCV build directory``` (```C:\opencv\build\x64\vc14``` for example). Also, you need to add the OpenCV bin directory (```C:\opencv\build\x64\vc14\bin``` for example) to the ``PATH`` variable.
6. If you have Intel Math Kernel Library (MKL) installed, set ```MKL_ROOT``` to point to the ```MKL``` directory that contains the ```include``` and ```lib``` directories. If you want to use MKL BLAS, set ```-DUSE_BLAS=mkl``` when running cmake. Typically, you can find the directory in
```C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\mkl```.
7. If you don't have the Intel Math Kernel Library (MKL) installed, download and install [OpenBLAS](http://sourceforge.net/projects/openblas/files/v0.2.14/). Note that you should also download ```mingw64.dll.zip``` along with OpenBLAS and add both to PATH.
8. Set the environment variable ```OpenBLAS_HOME``` to point to the ```OpenBLAS``` directory that contains the ```include``` and ```lib``` directories. Typically, you can find the directory in ```C:\Program files (x86)\OpenBLAS\```.
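
Before running CMake, it can help to confirm that the environment variables from the list above point at real directories. The following is a minimal sketch under stated assumptions: the helper name `check_build_env` is hypothetical, and the only checks are the ones the steps describe (an existing `OpenCV_DIR`, and `include` and `lib` directories under either `MKL_ROOT` or `OpenBLAS_HOME`).

```python
import os


def check_build_env(env=None):
    """Return a list of problems; an empty list means the checks passed."""
    env = os.environ if env is None else env
    problems = []

    # OpenCV_DIR must point to an existing directory.
    if not os.path.isdir(env.get("OpenCV_DIR", "")):
        problems.append("OpenCV_DIR is unset or is not a directory")

    def has_include_and_lib(root):
        # A BLAS root is usable if it contains include/ and lib/.
        return bool(root) and all(
            os.path.isdir(os.path.join(root, sub)) for sub in ("include", "lib")
        )

    # Either MKL or OpenBLAS is enough.
    if not (has_include_and_lib(env.get("MKL_ROOT", ""))
            or has_include_and_lib(env.get("OpenBLAS_HOME", ""))):
        problems.append(
            "neither MKL_ROOT nor OpenBLAS_HOME contains include/ and lib/"
        )
    return problems


if __name__ == "__main__":
    for problem in check_build_env():
        print("WARNING:", problem)
```

Passing a plain dictionary instead of `os.environ` makes it easy to try the check against a planned configuration before changing system settings.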

After you have installed all of the required dependencies, build the MXNet source code:

1. Download the MXNet source code from [GitHub](https://github.com/apache/incubator-mxnet). Don't forget to pull the submodules:
```
git clone --recursive https://github.com/apache/incubator-mxnet.git
```

2. Copy file `3rdparty/mkldnn/config_template.vcxproj` to incubator-mxnet root.

3. Start a Visual Studio command prompt.

4. Use [CMake 3](https://cmake.org/) to create a Visual Studio solution in ```./build``` or some other directory. Make sure to specify the architecture in the
[CMake 3](https://cmake.org/) command:
```
mkdir build
cd build
cmake -G "Visual Studio 14 Win64" .. -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=open -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0 -DCUDA_ARCH_NAME=All -DUSE_MKLDNN=1 -DCMAKE_BUILD_TYPE=Release
```

5. In Visual Studio, open the solution file, ```.sln```, and compile it.
This produces a library called ```libmxnet.dll``` in the ```./build/Release/``` or ```./build/Debug``` folder.
```libmkldnn.dll``` will be in the ```./build/3rdparty/mkldnn/src/Release/``` folder.

6. Make sure that all the dll files used above (such as `libmkldnn.dll`, `libmklml.dll`, `libiomp5.dll`, `libopenblas.dll`, etc.) are on the system PATH. For convenience, you can put all of them in ```\windows\system32```. Otherwise, you will come across `Not Found Dependencies` when loading mxnet.
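
The PATH requirement in the last step can be checked with a short script. This is a sketch only: the helper name `missing_dlls` is hypothetical, the default list is taken from the examples named in that step, and `libopenblas.dll` is only relevant if you built against OpenBLAS.

```python
import os

# DLL names taken from the examples in the step above.
REQUIRED_DLLS = ["libmkldnn.dll", "libmklml.dll", "libiomp5.dll", "libopenblas.dll"]


def missing_dlls(names=REQUIRED_DLLS, path=None):
    """Return the names that are not found in any directory on PATH."""
    path = os.environ.get("PATH", "") if path is None else path
    dirs = [d for d in path.split(os.pathsep) if d]
    missing = []
    for name in names:
        if not any(os.path.isfile(os.path.join(d, name)) for d in dirs):
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_dlls():
        print("not on PATH:", name)
```

An empty result does not guarantee that mxnet loads, since other dependencies may be involved, but a non-empty one points at a likely cause of the `Not Found Dependencies` error.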

**Visual Studio 2017**

To build and install MXNet yourself using [Microsoft Visual Studio 2017](https://www.visualstudio.com/downloads/), first install the following dependencies:

1. If [Microsoft Visual Studio 2017](https://www.visualstudio.com/downloads/) is not already installed, download and install it. You can download and install the free community edition.
2. Download and install [CMake 3](https://cmake.org/files/v3.11/cmake-3.11.0-rc4-win64-x64.msi) if it is not already installed.
3. Download and install [OpenCV](https://sourceforge.net/projects/opencvlibrary/files/opencv-win/3.4.1/opencv-3.4.1-vc14_vc15.exe/download).
4. Unzip the OpenCV package.
5. Set the environment variable ```OpenCV_DIR``` to point to the ```OpenCV build directory``` (e.g., ```OpenCV_DIR = C:\utils\opencv\build```).
6. If you don’t have the Intel Math Kernel Library (MKL) installed, download and install [OpenBlas](https://sourceforge.net/projects/openblas/files/v0.2.20/OpenBLAS%200.2.20%20version.zip/download).
7. Set the environment variable ```OpenBLAS_HOME``` to point to the ```OpenBLAS``` directory that contains the ```include``` and ```lib``` directories (e.g., ```OpenBLAS_HOME = C:\utils\OpenBLAS```).

After you have installed all of the required dependencies, build the MXNet source code:

1. Start ```cmd``` in Windows.

2. Download the MXNet source code from GitHub using the following commands:

```
cd C:\
git clone --recursive https://github.com/apache/incubator-mxnet.git
```

3. Copy file `3rdparty/mkldnn/config_template.vcxproj` to incubator-mxnet root.

4. Follow [these instructions](https://docs.microsoft.com/en-us/visualstudio/install/modify-visual-studio) to modify your Visual Studio installation: under ```Individual components```, check ```VC++ 2017 version 15.4 v14.11 toolset```, then click ```Modify```.

5. Switch the Visual Studio 2017 toolset to v14.11 using the following command (the path shown is the default VS2017 install location):

```
"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvars64.bat" -vcvars_ver=14.11
```

6. Create a build directory and change into it, for example:

```
mkdir C:\build
cd C:\build
```

7. Run CMake on the MXNet source code using the following command. The source directory is passed explicitly because the build directory above is not inside the source tree:

```
cmake -G "Visual Studio 15 2017 Win64" C:\incubator-mxnet -T host=x64 -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=open -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0 -DCUDA_ARCH_NAME=All -DUSE_MKLDNN=1 -DCMAKE_BUILD_TYPE=Release
```

8. After CMake completes successfully, compile the MXNet source code using the following command:

```
msbuild mxnet.sln /p:Configuration=Release;Platform=x64 /maxcpucount
```

9. Make sure that all the dll files used above (such as `libmkldnn.dll`, `libmklml.dll`, `libiomp5.dll`, `libopenblas.dll`, etc.) are on the system PATH. For convenience, you can put all of them in ```\windows\system32```. Otherwise, you will come across `Not Found Dependencies` when loading mxnet.

<h2 id="4">Verify MXNet with python</h2>

```
cd python
sudo python setup.py install
python -c "import mxnet as mx;print((mx.nd.ones((2, 3))*2).asnumpy());"
Expected Output:
[[ 2. 2. 2.]
[ 2. 2. 2.]]
```

### Verify whether MKL-DNN works

After MXNet is installed, you can verify that the MKL-DNN backend works with a single Convolution layer.

```
import mxnet as mx
import numpy as np

# Convolution parameters and input shape
num_filter = 32
kernel = (3, 3)
pad = (1, 1)
shape = (32, 32, 256, 256)

# Build a one-layer symbolic graph and bind it on the CPU
x = mx.sym.Variable('x')
w = mx.sym.Variable('w')
y = mx.sym.Convolution(data=x, weight=w, num_filter=num_filter, kernel=kernel, no_bias=True, pad=pad)
exe = y.simple_bind(mx.cpu(), x=shape)

# Fill the input and the weight with random values
exe.arg_arrays[0][:] = np.random.normal(size=exe.arg_arrays[0].shape)
exe.arg_arrays[1][:] = np.random.normal(size=exe.arg_arrays[1].shape)

# Run a forward pass; asnumpy() waits for the result
exe.forward(is_train=False)
o = exe.outputs[0]
t = o.asnumpy()
```

You can enable the `MKLDNN_VERBOSE` flag by setting the environment variable:
```
export MKLDNN_VERBOSE=1
```
Running the code snippet above should then produce output like the following, which shows that the `convolution` and `reorder` primitives from MKL-DNN are called. The log also reports layout information and primitive execution time.
```
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_nchw out:f32_nChw16c,num:1,32x32x256x256,6.47681
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_oihw out:f32_OIhw16i16o,num:1,32x32x3x3,0.0429688
mkldnn_verbose,exec,convolution,jit:avx512_common,forward_inference,fsrc:nChw16c fwei:OIhw16i16o fbia:undef fdst:nChw16c,alg:convolution_direct,mb32_g1ic32oc32_ih256oh256kh3sh1dh0ph1_iw256ow256kw3sw1dw0pw1,9.98193
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_oihw out:f32_OIhw16i16o,num:1,32x32x3x3,0.0510254
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_nChw16c out:f32_nchw,num:1,32x32x256x256,20.4819
```
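
To see where the time goes, the verbose lines can be summarized per primitive. The sketch below is not an MKL-DNN tool: the function name `time_per_primitive` is hypothetical, and it assumes the comma-separated layout of the sample lines above, with the primitive name in the third field and the execution time in the last field.

```python
from collections import defaultdict

# Sample lines copied from the log output above.
SAMPLE = """\
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_nchw out:f32_nChw16c,num:1,32x32x256x256,6.47681
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_oihw out:f32_OIhw16i16o,num:1,32x32x3x3,0.0429688
mkldnn_verbose,exec,convolution,jit:avx512_common,forward_inference,fsrc:nChw16c fwei:OIhw16i16o fbia:undef fdst:nChw16c,alg:convolution_direct,mb32_g1ic32oc32_ih256oh256kh3sh1dh0ph1_iw256ow256kw3sw1dw0pw1,9.98193
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_oihw out:f32_OIhw16i16o,num:1,32x32x3x3,0.0510254
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_nChw16c out:f32_nchw,num:1,32x32x256x256,20.4819
"""


def time_per_primitive(log_text):
    """Sum the last field (execution time) of each exec line, per primitive."""
    totals = defaultdict(float)
    for line in log_text.splitlines():
        fields = line.strip().split(",")
        # Skip anything that is not an mkldnn_verbose exec record.
        if len(fields) < 4 or fields[0] != "mkldnn_verbose" or fields[1] != "exec":
            continue
        try:
            elapsed = float(fields[-1])
        except ValueError:
            continue
        totals[fields[2]] += elapsed
    return dict(totals)


if __name__ == "__main__":
    for primitive, total in sorted(time_per_primitive(SAMPLE).items()):
        print(f"{primitive}: {total:.3f}")
```

In this sample the layout conversions (`reorder`) take more total time than the convolution itself, which is the kind of observation the verbose log is useful for.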

<h2 id="5">Enable MKL BLAS</h2>

To make it convenient for customers, Intel introduced a new license called the [Intel® Simplified license](https://software.intel.com/en-us/license/intel-simplified-software-license) that allows redistribution of not only dynamic libraries but also headers, examples and static libraries.

Installing and enabling the full MKL installation enables MKL support for all operators under the linalg namespace.

1. Download and install the latest full MKL version following the instructions on the [Intel website](https://software.intel.com/en-us/mkl).

2. Run `make -j $(nproc) USE_BLAS=mkl`

3. Navigate into the python directory

4. Run `sudo python setup.py install`

### Verify whether MKL works

After MXNet is installed, you can verify that MKL BLAS works with a single `batch_dot` operation.

```
import mxnet as mx
import numpy as np
shape_x = (1, 10, 8)
shape_w = (1, 12, 8)
x_npy = np.random.normal(0, 1, shape_x)
w_npy = np.random.normal(0, 1, shape_w)
x = mx.sym.Variable('x')
w = mx.sym.Variable('w')
y = mx.sym.batch_dot(x, w, transpose_b=True)
exe = y.simple_bind(mx.cpu(), x=x_npy.shape, w=w_npy.shape)
exe.forward(is_train=False)
o = exe.outputs[0]
t = o.asnumpy()
```

You can enable the `MKL_VERBOSE` flag by setting the environment variable:
```
export MKL_VERBOSE=1
```
Running the code snippet above should then produce output like the following, which shows that the `SGEMM` primitive from MKL is called. The log also reports layout information and primitive execution time.
```
Numpy + Intel(R) MKL: THREADING LAYER: (null)
Numpy + Intel(R) MKL: setting Intel(R) MKL to use INTEL OpenMP runtime
Numpy + Intel(R) MKL: preloading libiomp5.so runtime
MKL_VERBOSE Intel(R) MKL 2018.0 Update 1 Product build 20171007 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 2.40GHz lp64 intel_thread NMICDev:0
MKL_VERBOSE SGEMM(T,N,12,10,8,0x7f7f927b1378,0x1bc2140,8,0x1ba8040,8,0x7f7f927b1380,0x7f7f7400a280,12) 8.93ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:40 WDiv:HOST:+0.000
```
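
The call records in this log can be picked apart with a regular expression. The following is a sketch under stated assumptions: the function name `parse_mkl_call` is hypothetical, the pattern is derived only from the sample line above, and the `m`, `n`, `k` positions follow the standard BLAS GEMM argument order (`transa, transb, m, n, k, ...`).

```python
import re

# Matches records such as "MKL_VERBOSE SGEMM(...) 8.93ms ...".
# The banner line ("MKL_VERBOSE Intel(R) MKL ...") does not match
# because the routine name must be all upper case.
_CALL = re.compile(r"^MKL_VERBOSE\s+([A-Z][A-Z0-9_]*)\(([^)]*)\)\s+([\d.]+)(ns|us|ms|s)\b")


def parse_mkl_call(line):
    """Return routine, arguments and timing for one call record, or None."""
    match = _CALL.match(line.strip())
    if match is None:
        return None
    routine, args, value, unit = match.groups()
    arg_list = args.split(",")
    result = {"routine": routine, "args": arg_list,
              "time": float(value), "unit": unit}
    # For GEMM routines, arguments 3 to 5 are the m, n, k dimensions.
    if routine.endswith("GEMM") and len(arg_list) >= 5:
        result["gemm_dims"] = tuple(int(a) for a in arg_list[2:5])
    return result


if __name__ == "__main__":
    sample = ("MKL_VERBOSE SGEMM(T,N,12,10,8,0x7f7f927b1378,0x1bc2140,8,"
              "0x1ba8040,8,0x7f7f927b1380,0x7f7f7400a280,12) 8.93ms "
              "CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:40 WDiv:HOST:+0.000")
    print(parse_mkl_call(sample))
```

For the sample record, the dimensions 12, 10 and 8 correspond to `shape_w` and `shape_x` in the snippet above: 8 is the shared last dimension, and 12 and 10 are the other two.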

<h2 id="6">Next Steps and Support</h2>

- For questions or support specific to MKL, visit [Intel MKL](https://software.intel.com/en-us/mkl)

- For questions or support specific to MKL-DNN, visit [Intel MKLDNN](https://github.com/intel/mkl-dnn)

- If you find bugs, please open an issue on GitHub for [MXNet with MKL](https://github.com/apache/incubator-mxnet/labels/MKL) or [MXNet with MKLDNN](https://github.com/apache/incubator-mxnet/labels/MKLDNN)
77 changes: 0 additions & 77 deletions MKL_README.md

This file was deleted.

2 changes: 1 addition & 1 deletion README.md
@@ -38,7 +38,7 @@ What's New
 * [Version 0.8.0 Release](https://github.com/dmlc/mxnet/releases/tag/v0.8.0)
 * [Updated Image Classification with new Pre-trained Models](./example/image-classification)
 * [Python Notebooks for How to Use MXNet](https://github.com/dmlc/mxnet-notebooks)
-* [MKLDNN for Faster CPU Performance](./MKL_README.md)
+* [MKLDNN for Faster CPU Performance](./MKLDNN_README.md)
 * [MXNet Memory Monger, Training Deeper Nets with Sublinear Memory Cost](https://github.com/dmlc/mxnet-memonger)
 * [Tutorial for NVidia GTC 2016](https://github.com/dmlc/mxnet-gtc-tutorial)
 * [Embedding Torch layers and functions in MXNet](https://mxnet.incubator.apache.org/faq/torch.html)
4 changes: 2 additions & 2 deletions ci/docker/install/ubuntu_python.sh
@@ -29,5 +29,5 @@ wget -nv https://bootstrap.pypa.io/get-pip.py
 python3 get-pip.py
 python2 get-pip.py

-pip2 install nose cpplint==1.3.0 pylint==1.8.3 'numpy<1.15.0,>=1.8.2' nose-timer 'requests<2.19.0,>=2.18.4' h5py==2.8.0rc1 scipy==1.0.1
-pip3 install nose cpplint==1.3.0 pylint==1.8.3 'numpy<1.15.0,>=1.8.2' nose-timer 'requests<2.19.0,>=2.18.4' h5py==2.8.0rc1 scipy==1.0.1
+pip2 install nose cpplint==1.3.0 pylint==1.8.3 'numpy<1.15.0,>=1.8.2' nose-timer 'requests<2.19.0,>=2.18.4' h5py==2.8.0rc1 scipy==1.0.1 boto3
+pip3 install nose cpplint==1.3.0 pylint==1.8.3 'numpy<1.15.0,>=1.8.2' nose-timer 'requests<2.19.0,>=2.18.4' h5py==2.8.0rc1 scipy==1.0.1 boto3
