This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Cache CUDA modules when building Mxnet #17158

Closed
cyrusbehr opened this issue Dec 23, 2019 · 4 comments

Comments

@cyrusbehr

I am trying to build MXNet with GPU support using the following options:

cmake -DUSE_CUDNN=1 -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DCMAKE_BUILD_TYPE=Release -DBLAS=open -DUSE_OPENCV=OFF -DUSE_CPP_PACKAGE=ON -DENABLE_CUDA_RTC=ON ..

Currently, the build takes quite long (~3 hours). I've noticed that the majority of the time is spent building the CUDA modules.

For my use case, I'll be rebuilding the library often (as part of my CI pipeline) and would like to reduce the build time. Is there some way to run an incremental build that caches these CUDA modules (as they rarely change) and reuses them between builds to speed things up?

@yajiedesign
Contributor

You can try ccache on Linux.
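
A minimal sketch of how ccache could be hooked into the CMake configure step, assuming ccache is installed and on the PATH. `CMAKE_<LANG>_COMPILER_LAUNCHER` is a standard CMake variable, but whether the CUDA launcher actually takes effect depends on how the MXNet version being built wires up CUDA, so treat that line as an assumption to verify. The helper name `ccache_flags` is hypothetical and only used for illustration.

```shell
# Sketch only: print the CMake flags that prefix compiler calls with ccache.
# Whether nvcc invocations are routed through ccache for a given MXNet
# checkout is an assumption; check the statistics after a build.
ccache_flags() {
  printf '%s\n' \
    -DCMAKE_C_COMPILER_LAUNCHER=ccache \
    -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
    -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache
}

# From the build directory, add the flags to the existing configure command:
#   cmake $(ccache_flags) -DUSE_CUDNN=1 [...] ..
# After a build, `ccache -s` prints hit/miss statistics.
```

In a CI pipeline the ccache directory (set through the `CCACHE_DIR` environment variable) has to persist between runs, otherwise every build starts with a cold cache.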

@leezu
Contributor

leezu commented Dec 24, 2019

In addition to ccache, you can also:

  1. Use ninja instead of make. For that, specify cmake -GNinja [...] and use ninja instead of make -j.
  2. Use a single CUDA compute architecture. With PR #17031 (Use CMake standard library to handle CUDA) you only need to set -DMXNET_CUDA_ARCH=7.0 (as an example), but until it is merged, you need to set multiple variables. See the commit message of #17031 for the variables you currently need to set. If it's not urgent, you can wait a few days until #17031 is merged.
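
The two suggestions above can be combined with the configure command from the original question. This is a sketch under assumptions: it requires a checkout that already contains #17031, Ninja must be installed, and 7.0 is only the example architecture from the comment (pick the value matching your GPU). The helper name `configure_args` is hypothetical.

```shell
# Sketch only: print configure flags for the Ninja generator and a single
# CUDA architecture. The architecture defaults to 7.0, the example value
# used above; -DMXNET_CUDA_ARCH assumes a checkout that includes #17031.
configure_args() {
  arch="${1:-7.0}"
  printf '%s\n' \
    -GNinja \
    -DUSE_CUDNN=1 \
    -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda \
    -DCMAKE_BUILD_TYPE=Release \
    -DBLAS=open \
    -DUSE_OPENCV=OFF \
    -DUSE_CPP_PACKAGE=ON \
    -DENABLE_CUDA_RTC=ON \
    "-DMXNET_CUDA_ARCH=${arch}"
}

# From the build directory:
#   cmake $(configure_args 7.0) .. && ninja
```

Compiling for one architecture means the resulting library targets only that GPU generation, which is usually acceptable for a CI build tied to known hardware.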

@cyrusbehr
Author

Excellent, thank you. I will give these a try.

@leezu
Contributor

leezu commented Dec 30, 2019

@cyrusbehr FYI, #17031 is merged.
