CNTK Docker Containers
You can set up CNTK as a Docker container on your Linux system. You can build and run CNTK using the same container, and this is the recommended approach for reproducing our reference configuration.
First you need to install Docker. It is highly recommended to follow the installation process in the official Docker documentation, because the versions that come with your Linux distribution might be outdated and will not work with nvidia-docker (which you may need to install in addition to Docker if you plan to build and run the GPU image). You should also follow the instructions in the optional section on creating a docker group.
The corresponding Dockerfiles are in the CNTK repository at https://github.com/Microsoft/CNTK/tree/master/Tools/docker
To build a docker image with CNTK and all its dependencies, simply clone the CNTK repository, navigate to CNTK/Tools/docker,
and use the Dockerfile you want to build from (either CPU-only or CPU+GPU). For example, to build CNTK's GPU docker image, execute
docker build -t cntk CNTK-GPU-Image
If you receive errors that say Could not resolve 'archive.ubuntu.com'
you will need to provide docker with the IP addresses of your DNS servers.
First find the IP addresses of your DNS servers using, for example, the command
nm-tool
or the command
nmcli dev show
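On systems that have nmcli, the DNS addresses can be pulled out of its output with a small filter. A minimal sketch (the sample output below is assumed for illustration; in practice, pipe the real `nmcli dev show` output instead):

```shell
# In practice you would run:
#   nmcli dev show | awk '/IP4\.DNS/ {print $2}'
# Here we filter an assumed sample of the output:
sample='IP4.DNS[1]:    192.168.1.1
IP4.DNS[2]:    8.8.8.8'
printf '%s\n' "$sample" | awk '/IP4\.DNS/ {print $2}'
# prints:
# 192.168.1.1
# 8.8.8.8
```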
Let's say that the IPs of your DNS servers are a.b.c.d and x.y.z.w. Then:
- on Ubuntu 15.10 and later (or other Linux that uses systemd), modify /lib/systemd/system/docker.service so that the docker daemon is started with the additional options --dns a.b.c.d --dns x.y.z.w
- on Ubuntu 15.04 and earlier (or other Linux that does not use systemd), edit /etc/default/docker so that the line #DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4" is uncommented and contains the IP addresses of your DNS servers.
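The two edits above can be sketched as follows (a.b.c.d and x.y.z.w are the placeholders from above; the exact ExecStart binary and arguments vary by Docker version, so treat this as an illustration, not verbatim configuration):

```shell
# systemd (Ubuntu 15.10+): in /lib/systemd/system/docker.service, append
# the --dns options to the ExecStart line, e.g.:
#   ExecStart=/usr/bin/dockerd -H fd:// --dns a.b.c.d --dns x.y.z.w
# then reload the unit files before restarting the daemon:
#   sudo systemctl daemon-reload

# non-systemd (Ubuntu 15.04 and earlier): in /etc/default/docker,
# uncomment and edit the DOCKER_OPTS line:
#   DOCKER_OPTS="--dns a.b.c.d --dns x.y.z.w"
```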
Note: some companies block public DNS servers such as 8.8.8.8 and 8.8.4.4. You can try using them, but if the problem persists you should use the DNS server IP addresses reported by nm-tool/nmcli.
Restart the docker daemon via
sudo service docker restart
and delete any docker images that were created with the wrong DNS settings. To delete all docker images, run
docker rmi $(docker images -q)
To delete all docker containers do
docker rm $(docker ps -a -q)
Now try again
docker build -t cntk CNTK-GPU-Image
If you have a GPU you'll want to test if you can access it through a docker container once you have built the image. Try this command:
docker run --rm cntk nvidia-smi
If it works, you are done. If it doesn't, there is a mismatch between the CUDA version and/or drivers installed on your host and those in your CNTK docker image. Specifically, the kernel-mode NVIDIA driver module and the user-mode module (which is a shared library) must match exactly, and they do not if the version on the host differs from the version in the container. Fortunately, this is easy to fix: just install nvidia-docker and use it exactly like docker (no need to rebuild the image).
nvidia-docker run --rm cntk nvidia-smi
This should work and enables CNTK to use the GPU from inside a docker container. If this does not work, search the Issues section on the [nvidia-docker GitHub](https://github.com/NVIDIA/nvidia-docker/issues) -- many solutions are already documented. Note that if your /usr and /var directories are in different partitions, you will need some extra steps as described [here](https://github.com/NVIDIA/nvidia-docker/issues/211). To get an interactive shell to a container that will not be automatically deleted after you exit, run
nvidia-docker run --name cntk_container1 -ti cntk bash
If you want to share your data and configurations between the host (your machine or VM) and the container in which you are using CNTK, use the -v option, e.g.
nvidia-docker run --name cntk_container1 -ti -v /project1/data:/data -v /project1/config:/config cntk bash
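A quick way to confirm that a mount works as expected (a sketch; it assumes the cntk image built above and the /project1/data directory exist on your system, so it cannot be run without them):

```shell
# List the mounted host directory from inside a throwaway container;
# the output should match 'ls /project1/data' on the host.
nvidia-docker run --rm -v /project1/data:/data cntk ls /data
```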
This makes /project1/data from the host visible as /data in the container, and /project1/config as /config. Such isolation reduces the chances of your containerized experiments overwriting or using the wrong data.