
CNTK Docker Containers

Andrew Wald edited this page Oct 13, 2016 · 37 revisions

You can set up CNTK as a Docker Container on your Linux system. You can build and run CNTK inside the same container; this is the recommended approach for reproducing our reference configuration.

First you need to install docker. It is highly recommended to follow the installation process in the official docker documentation: the versions that come with your Linux distribution may be outdated and will not work with nvidia-docker (which you may need to install in addition to docker if you plan to build and run the GPU image). You should also follow the instructions in the optional section titled creating a docker group.
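As a sketch, the "creating a docker group" step from the official docker post-install instructions looks like the following (command names are from the docker documentation; adjust to your distribution):

```
# Create the docker group (it may already exist) and add your user to it,
# so you can run docker without sudo:
sudo groupadd docker
sudo usermod -aG docker $USER

# Log out and back in so the new group membership takes effect, then verify:
docker run hello-world
```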

The corresponding Dockerfiles are in the CNTK repository at https://github.com/Microsoft/CNTK/tree/master/Tools/docker

To build a docker image with CNTK and all its dependencies, simply clone the CNTK repository, navigate to CNTK/Tools/docker and use the Dockerfile you want to build from (either CPU-only or CPU+GPU). For example, to build CNTK's GPU docker image, execute

docker build -t cntk CNTK-GPU-Image

If you receive errors that say Could not resolve 'archive.ubuntu.com' you will need to provide docker with the IP addresses of your DNS servers. First find the IP addresses of your DNS servers using, for example, the command

nm-tool

or the command

nmcli dev show
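To pull just the DNS server addresses out of nmcli's report, you can filter its output. This is a sketch: the field name IP4.DNS is an assumption based on recent NetworkManager versions, and older releases may label the field differently.

```shell
# extract_dns: filter DNS server IPs from `nmcli dev show` output on stdin.
# Assumes lines of the form "IP4.DNS[1]:  a.b.c.d" (recent NetworkManager).
extract_dns() {
  grep 'IP4.DNS' | awk '{print $2}'
}

# Typical usage (requires NetworkManager):
#   nmcli dev show | extract_dns
```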

Let's say that the IPs of your DNS servers are a.b.c.d and x.y.z.w. Then

  • on Ubuntu 15.10 and later (or other Linux that uses systemd) modify /lib/systemd/system/docker.service so that the docker daemon is started with the additional options --dns a.b.c.d --dns x.y.z.w
  • on Ubuntu 15.04 and earlier (or other Linux that does not use systemd) edit /etc/default/docker so that the line
    #DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4"
    
    is uncommented and contains the IP addresses of your DNS servers.
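For the systemd case, the edit would look something like the excerpt below. This is only a sketch: the exact ExecStart line (including the daemon binary name and its existing flags) varies with your docker version, and a.b.c.d / x.y.z.w stand in for your actual DNS server addresses.

```
# /lib/systemd/system/docker.service (excerpt)
[Service]
ExecStart=/usr/bin/dockerd -H fd:// --dns a.b.c.d --dns x.y.z.w
```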

Note: some companies block public DNS servers such as 8.8.8.8 and 8.8.4.4. You can try using them, but if the problem persists you should use the DNS server IP addresses reported by nm-tool/nmcli instead.

Restart the docker daemon via

sudo service docker restart

and delete any docker images that were created with the wrong DNS settings. To delete all docker images do

docker rmi $(docker images -q)

To delete all docker containers do

docker rm $(docker ps -a -q)

Now try again

docker build -t cntk CNTK-GPU-Image

If you have a GPU, once you have built the image you'll want to test whether you can access it from within a docker container. Try this command:

docker run --rm cntk nvidia-smi

If it works, you are done. If it doesn't, there is a mismatch between the CUDA version and/or drivers installed on your host and in your CNTK docker image. Specifically, the kernel-mode NVidia driver module on the host must exactly match the user-mode module (a shared library) inside the container; the failure occurs when the versions differ. Fortunately, this is easy to fix: install nvidia-docker and use it exactly like docker (no need to rebuild the image).

nvidia-docker run --rm cntk nvidia-smi

This should work and enables CNTK to use the GPU from inside a docker container. If this does not work, search the Issues section on the [nvidia-docker GitHub](https://github.com/NVIDIA/nvidia-docker/issues) -- many solutions are already documented. Note that if your /usr and /var directories are on different partitions, you will need some extra steps, as described [here](https://github.com/NVIDIA/nvidia-docker/issues/211). To get an interactive shell into a container that will not be automatically deleted after you exit, do

nvidia-docker run --name cntk_container1 -ti cntk bash

If you want to share your data and configurations between the host (your machine or VM) and the container in which you are using CNTK, use the -v option, e.g.

nvidia-docker run --name cntk_container1 -ti -v /project1/data:/data -v /project1/config:/config cntk bash

This will make /project1/data from the host visible as /data in the container, and /project1/config as /config. Such isolation reduces the chances of your containerized experiments overwriting or using the wrong data.
