Optimize size of default-cpu / gpu dockerfiles, update to latest ubuntu and cuda, remove python 2 #3557
How small are they now?
I think docker-slim reduces the size based on what is and is not used during execution. But we are providing a general-purpose image with no specific behavior, since users can do anything with it, so I am not sure docker-slim can help much. I checked the images, and it looks like they are already removing the apt files.
Okay, sounds good. By my earlier comment, "How small are they now?", I meant: how large are the images now (in bytes)? I know they used to be 5 GB, so I'd like to know how much space we saved in this PR.
I am not sure why but locally if you do Just as an experiment, I tried removing python2, and the docker image size was halved.
wget

RUN curl https://bootstrap.pypa.io/pip/2.7/get-pip.py --output get-pip.py
RUN python2 get-pip.py
We should be able to remove python 2 now.
RUN add-apt-repository ppa:deadsnakes/ppa && \
apt-get update -y && \
apt-get install python3.6 -y \
RUN apt-get install python3.6 -y \
Let's combine this with the previous apt-get command
RUN add-apt-repository ppa:deadsnakes/ppa && \
apt-get update -y && \
apt-get install python3.6 -y \
RUN apt-get install python3.6 -y \
python3.6-venv \
python3.6-dev \
python3-pip \
At the end, we should also add && rm -rf /var/lib/apt/lists/*, which will save more space; see https://stackoverflow.com/questions/61990329/dockerfile-benefits-of-repeated-apt-cache-cleans. Let's also add this to the end of all apt-get install commands.
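A minimal sketch of what the two suggestions above (one combined apt-get command, with the list cleanup at its end) might look like together. The package list is copied from the diff; it assumes add-apt-repository is already available from an earlier step, and it is not the final Dockerfile from this PR.

```dockerfile
# Sketch only: assumes software-properties-common (which provides
# add-apt-repository) was installed in an earlier step.
# Update, install, and cleanup share one RUN so the apt lists
# are never stored in any image layer.
RUN add-apt-repository ppa:deadsnakes/ppa && \
    apt-get update -y && \
    apt-get install -y \
        python3.6 \
        python3.6-venv \
        python3.6-dev \
        python3-pip && \
    rm -rf /var/lib/apt/lists/*
```

The cleanup only saves space when it runs in the same RUN instruction as the install; a separate RUN rm -rf would add a new layer on top without shrinking the earlier one.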
@@ -60,26 +54,11 @@ RUN wget http://scala-lang.org/files/archive/scala-$SCALA_VERSION.deb && \
apt-get clean && \
apt-get autoremove && \
We should combine all apt-get install commands into a single command if possible. I believe this should save a good amount of space, as we'd store far fewer layers.
RUN pip2 install -U \
tensorflow-gpu==1.12.0 \
tensorboard \
keras
RUN python3 -m pip install -U \
For both dockerfiles: we should combine all pip install commands into a single command, and also use the --no-cache-dir option to save space. See https://stackoverflow.com/questions/45594707/what-is-pips-no-cache-dir-good-for (we can probably keep the pip install command at the end with apex as-is, though).
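As a rough illustration of the suggestion above, a single combined pip command with --no-cache-dir could look like the following. The two package names are placeholders taken from the removed pip2 lines; the actual Python 3 package list is not shown in this excerpt.

```dockerfile
# Sketch only: the package names are placeholders, not the PR's real list.
# --no-cache-dir stops pip from keeping downloaded wheels in the image.
RUN python3 -m pip install --no-cache-dir -U \
    tensorboard \
    keras
```

Keeping everything in one RUN also means one layer for all Python packages instead of one per pip install command.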
@epicfaace It looks like the newer nvidia/cuda image is much larger than the older version we used before. The old base image was under 1 GB (link), while the newer image is 2.61 GB (link), so upgrading will increase our default-gpu image size by a lot. What should we do in this case?
As per our discussion, let's try the following:
sudo rm -rf /opt/ghc
sudo apt clean
docker rmi $(docker image ls -aq)
df -h
- run: python3 codalab_service.py build ${SERVICE} $([ -z "${CODALAB_DOCKER_USERNAME}" ] || echo "--push")
Updated the GitHub Actions file; it should be working now: https://hub.docker.com/layers/codalab/default-cpu/update_images/images/sha256-ee7b6df243b8e51db27fccc8f1f125dee21bddff29fa21da1377b11b75c4acd4?context=explore
Image size changes with this PR:
It might still be worth considering multi-stage builds for default-gpu to save more space. What do you think?
I think it is fine for now; we can do multi-stage builds in a separate PR.
FYI @percyliang: default-cpu size decreased because of optimizations and removing Python 2. default-gpu size increased because we upgraded ubuntu, which required us to upgrade cuda, and the newer cuda image is much larger than the earlier version. We could decrease the size of both images, especially default-gpu, by using docker multi-stage builds. However, we decided that this is beyond the scope of this PR and could be done in another PR.
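For reference, a multi-stage build for default-gpu might be sketched roughly as follows. This is only an assumption about how a follow-up PR could approach it: the nvidia/cuda tags, the /opt/venv path, and the package names are illustrative and do not come from this PR.

```dockerfile
# Sketch only: tags, paths, and packages below are illustrative assumptions.
ARG BUILD_IMAGE=nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04
ARG RUNTIME_IMAGE=nvidia/cuda:11.2.2-cudnn8-runtime-ubuntu20.04

# Stage 1: the larger "devel" image carries compilers and headers.
FROM ${BUILD_IMAGE} AS builder
RUN apt-get update -y && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        python3 python3-venv python3-dev build-essential && \
    rm -rf /var/lib/apt/lists/*
# Install Python packages into a virtual environment that can be copied out.
RUN python3 -m venv /opt/venv && \
    /opt/venv/bin/pip install --no-cache-dir -U pip && \
    /opt/venv/bin/pip install --no-cache-dir tensorboard keras

# Stage 2: the smaller "runtime" image receives only the finished venv.
FROM ${RUNTIME_IMAGE}
RUN apt-get update -y && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends python3 && \
    rm -rf /var/lib/apt/lists/*
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
```

The saving comes from leaving the build toolchain behind in the first stage. Whether it is large enough to matter here depends on how much of the devel image users of default-gpu actually need at run time.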
# Newer pip dropped the support for python 2 so we need to specify the version.
RUN python -m pip install --upgrade "pip < 21.0"

## Python packages
Remove this line (duplicated from above)
Thanks, it has been removed.
Reasons for making this change
Update (to latest ubuntu) and make default-cpu / gpu dockerfiles smaller
Related issues
fixes #3526