Docker cleanup #1946

Merged

@filiperinaldi filiperinaldi commented Jan 31, 2019

Status

PRODUCTION / DEVELOPMENT

Description

  • Unify the Dockerfiles for both Arm and x86. There was a lot of duplication in the generic parts (creating the user, installing ROS)
  • Create multiple images with different levels of support (this allows Cuda to be disabled)
  • Install dependencies on the base image using rosdep so that only what is necessary is installed
  • Add DockerHub hooks
  • When using the base images, mount the user's Autoware source code as a volume

See autowarefoundation/autoware_ai#530

Related PRs

None.

Todos

  • Documentation
  • Add run.sh
  • Fix Cuda support
  • Simplify ROS package dependencies

Tests

  • AArch64 (Host: Ubuntu 18.04)
  • x86_64 (Host: Ubuntu 16.04)
  • x86_64 + Cuda (Host: Ubuntu 16.04)

Steps to Test or Reproduce

cd docker/generic
# Building:
./build.sh --tag-prefix local

# Running with cuda
./run.sh --tag-prefix local

# Running without cuda
./run.sh --tag-prefix local --cuda off
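
As a rough illustration of how the two flags above could select an image, the sketch below composes an image reference from a tag prefix and a Cuda switch. The naming scheme is an assumption extrapolated from the image name `autoware/autoware:latest-kinetic-cuda` that a reviewer reports in this conversation. The function `image_ref`, its defaults, and the tag form used when Cuda is off are hypothetical and are not taken from the PR's run.sh.

```shell
#!/bin/sh
# Hypothetical helper: compose an image reference from run.sh-style options.
# Assumed scheme: autoware/autoware:<prefix>-<ros_distro>[-cuda]
image_ref() {
    prefix="${1:-latest}"    # value passed to --tag-prefix
    cuda="${2:-on}"          # value passed to --cuda (on/off)
    distro="${3:-kinetic}"   # ROS distribution

    suffix=""
    if [ "$cuda" = "on" ]; then
        suffix="-cuda"
    fi
    printf '%s\n' "autoware/autoware:${prefix}-${distro}${suffix}"
}

image_ref local on
image_ref local off
```

Keeping the mapping in one function means build.sh and run.sh would agree on the name by construction, if they shared it.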

@sgermanserrano sgermanserrano left a comment

@filiperinaldi Thanks for the PR!
@esteve @gbiggs @amc-nu I have been involved in the development of this PR, so I'd like at least one of you to have a look at it before merging. The changes in this PR are also needed for #2042.

Member

@amc-nu amc-nu left a comment

@filiperinaldi @sgermanserrano Thanks for all the effort put into this PR. Looks great.

I have some questions/observations:

  1. Is there any recommended nvidia-docker version?
  2. I would suggest adding --rm to the docker build commands in the scripts.
  3. After pulling the branch and running ./run.sh I get a message telling me the image autoware/autoware:latest-kinetic-cuda is not available. Do you plan to upload it to Docker Hub before or after merging this PR?
  4. I'm getting an error when building the CUDA version: liblibdpm_ttic.so: undefined reference to 'cuMemFree_v2' (plus other undefined references when linking that same file). I think this PR might be missing "Fix/colcon build" (#2000). Can you confirm? If that's the case, it will be fixed after merging, so please ignore this point.
  5. The CPU version builds flawlessly. However, it still doesn't seem to contain "[fix] Install commands for all the packages" (#1861) or "Fix install directives" (#1990), and it is also pointing to the devel space. I wasn't able to run anything inside the container.

As a note, @sgermanserrano @kfunaoka, we'll need to update the Wiki. Maybe we should open an issue and add it to the release todo list.

RUN su -c "bash -c 'source /opt/ros/$ROS_DISTRO/setup.bash; \
cd /home/$USERNAME/Autoware/ros; \
./colcon_release'" $USERNAME
RUN echo "source /home/$USERNAME/Autoware/ros/devel/setup.bash" >> \
Member

here points to devel
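
One way to avoid hard-coding the devel space is sketched below, under the assumption that a colcon build places its setup file under `install/` while a catkin build uses `devel/`. The function name `workspace_setup` is hypothetical, and the workspace path in the example comment is only illustrative.

```shell
#!/bin/sh
# Hypothetical helper: print the setup file to source for a workspace,
# preferring the colcon install space over the catkin devel space.
workspace_setup() {
    ws="$1"
    if [ -f "$ws/install/setup.bash" ]; then
        printf '%s\n' "$ws/install/setup.bash"
    elif [ -f "$ws/devel/setup.bash" ]; then
        printf '%s\n' "$ws/devel/setup.bash"
    else
        echo "no setup.bash found under $ws" >&2
        return 1
    fi
}

# Example (path is illustrative):
# echo "source $(workspace_setup "$HOME/Autoware/ros")" >> "$HOME/.bashrc"
```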

rm -rf /var/lib/apt/lists/*
COPY --from=nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 \
/usr/local/lib/x86_64-linux-gnu \
/usr/lib/x86_64-linux-gnu
Contributor

Why the change to /usr/lib? /usr/local/lib makes more sense IMHO

@esteve that change was made to address a duplication problem: filiperinaldi#10

/usr/local/share/glvnd/egl_vendor.d/10_nvidia.json
RUN echo '/usr/local/lib/x86_64-linux-gnu' >> /etc/ld.so.conf.d/glvnd.conf && \
ldconfig && \
echo '/usr/$LIB/libGL.so.1' >> /etc/ld.so.preload && \
Contributor

What's the value of $LIB? It doesn't seem to be set anywhere.
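
On the `$LIB` question, a small sketch of the quoting behaviour involved. Because the `echo` argument is in single quotes, the shell does not expand `$LIB`, so the literal token is what ends up in the file. Presumably glibc's dynamic linker then treats `$LIB` as a dynamic string token and substitutes the architecture's library directory at load time, but that part is an assumption here and is not demonstrated by the snippet. The function name `append_preload_entry` is hypothetical, and the temporary file stands in for `/etc/ld.so.preload`.

```shell
#!/bin/sh
# Hypothetical helper: append the preload entry to the given file.
# Single quotes keep $LIB literal, even if a shell variable LIB is set.
append_preload_entry() {
    echo '/usr/$LIB/libGL.so.1' >> "$1"
}

LIB="this-value-is-not-used"
preload_file="$(mktemp)"   # stand-in for /etc/ld.so.preload
append_preload_entry "$preload_file"

# The file contains the unexpanded token.
cat "$preload_file"
rm -f "$preload_file"
```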

Servando German and others added 3 commits March 1, 2019 09:35
Unify docker support with a new (and more generic) set of docker files.

Two main images are created:
- Base image containing an Ubuntu installation with all dependencies
- Build image containing a pre-built Autoware

When Cuda support is enabled, there is a further Base+Cuda support image.
These Dockerfiles can be used for both AArch64 and x86_64 images.

Co-authored-by: Filipe Rinaldi <filipe.rinaldi@arm.com>
@sgermanserrano sgermanserrano mentioned this pull request Mar 1, 2019
@esteve esteve self-requested a review March 1, 2019 10:37
@filiperinaldi filiperinaldi force-pushed the topic/docker_cleanup branch from 5d265ba to 0b014cf Compare March 1, 2019 10:44
esteve previously requested changes Mar 1, 2019
COPY ./ /tmp/Autoware
RUN apt-get update && \
apt-get install -y ros-$ROS_DISTRO-desktop-full && \
rosdep install -y --from-paths /tmp/Autoware/ros/src --ignore-src && \
Contributor

I'd advise against using rosdep here as it prevents Docker from using the cache effectively. There's also no trace of what packages are being installed, and more importantly, given that ROS1 does not guarantee ABI compatibility, any ROS package installed afterwards may cause a segfault.

Author

@filiperinaldi filiperinaldi Mar 1, 2019

The previous Dockerfile had a hard-coded list of ROS packages to install. The list was not comprehensive and was very likely to get out of sync as the dependencies change from version to version.

The rationale for doing it like this was that, for a given version of Autoware, its corresponding Docker base image will have the dependencies pre-installed. If you are using this base image to develop new features, then you will have to manually run rosdep again to install any new dependencies required.

The effect on Docker's cache could be minimised with a list, but the cache would still be discarded eventually when the list gets updated. It is a trade-off between getting the dependencies automatically sorted out (instead of someone manually finding them out and updating the Dockerfile) vs increasing the usage of cache.
What do you think would be a better balance? Perhaps splitting ros-desktop-full and a few other basic ROS packages out into their own RUN command to at least re-use that layer? Move the rosdep step to be part of the build steps?

I'm not sure I understand the compatibility issue. Aren't these all deb packages which are meant to be compatible on the same distro?

Contributor

What do you think would be a better balance? Perhaps splitting ros-desktop-full and a few other basic ROS packages out into their own RUN command to at least re-use that layer? Move the rosdep step to be part of the build steps?

I think having a list of known packages and then running rosdep install is a good compromise. OSRF has the process entirely automated in their images; perhaps in the future we could have a script that runs rosdep install and, if there are any new packages, adds them to the Dockerfile and commits the changes.

I'm not sure I understand the compatibility issue. Aren't these all deb packages which are meant to be compatible on the same distro?

Unfortunately not. If I have packages installed from a previous sync and I install new ones, they will most certainly crash because they are not ABI-compatible. ROS releases must be upgraded/installed all at once; you can't mix packages from different syncs, even if they are from the same distribution.

Contributor

Just a correction: it's not certain, as I stated, but there's a chance that the packages are incompatible.

Author

Done.
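
The compromise discussed in this thread (a pinned package list plus a rosdep check) could be supported by a small script along these lines. It only compares two lists of package names. How the resolved list is produced is left to the caller, for example by post-processing the output of `rosdep install --simulate`, whose exact output format is not assumed here. The function name `new_packages` and both file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print packages that appear in the resolved list
# but are missing from the pinned list (one package name per line).
new_packages() {
    pinned="$1"
    resolved="$2"
    tmp_pinned="$(mktemp)"
    tmp_resolved="$(mktemp)"
    # comm requires sorted input; -u also drops duplicates.
    sort -u "$pinned" > "$tmp_pinned"
    sort -u "$resolved" > "$tmp_resolved"
    # -1 hides lines only in the pinned list, -3 hides common lines,
    # leaving lines found only in the resolved list.
    comm -13 "$tmp_pinned" "$tmp_resolved"
    rm -f "$tmp_pinned" "$tmp_resolved"
}

# Example (file names are illustrative):
# new_packages pinned_packages.txt resolved_packages.txt
```

An empty result would mean the pinned list already covers everything rosdep resolves, so the cached Docker layer for the list stays valid.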

@filiperinaldi
Author

@amc-nu See my replies to your points below:

@filiperinaldi @sgermanserrano Thanks for all the effort put into this PR. Looks great.
I have some questions/observations:

1. Is there any recommended nvidia-docker version?

nvidia-docker v2 is required.

I'll add the version to the README.md.

sgermanserrano and others added 2 commits March 6, 2019 15:02
Make it explicit in the documentation that we recommend NVIDIA Docker
v2.
@filiperinaldi filiperinaldi force-pushed the topic/docker_cleanup branch from 405250d to c007199 Compare March 6, 2019 15:03
@filiperinaldi
Author

@amc-nu Update on point 1 (recommended nvidia-docker version): the README.md now states that nvidia-docker v2 is recommended. Done.


ENV LIBRARY_PATH /usr/local/cuda/lib64/stubs

# Support for Nvidia docker v2

Good job! This is where I want to add.

@MouriNaruto

Good job. You have done something I wanted to do.

I wanted to add OpenGL support to the Dockerfile, but I found you have already done it.

Thank you very much.

Mouri

@sgermanserrano sgermanserrano requested a review from esteve March 11, 2019 09:15
@esteve esteve dismissed their stale review March 12, 2019 11:26

I think the changes look good overall, but unfortunately I won't be able to review them before 1.12, so I'm dismissing my review

@sgermanserrano sgermanserrano merged commit b1a9c0b into autowarefoundation:develop Mar 13, 2019
anubhavashok pushed a commit to NuronLabs/autoware.ai that referenced this pull request Sep 7, 2021
* Docker: Add truly generic docker support

Unify docker support with a new (and more generic) set of docker files.

Two main images are created:
- Base image containing an Ubuntu installation with all dependencies
- Build image containing a pre-built Autoware

When Cuda support is enabled, there is a further Base+Cuda support image.
These Dockerfiles can be used for both AArch64 and x86_64 images.

Co-authored-by: Filipe Rinaldi <filipe.rinaldi@arm.com>

* ros/.gitignore: Ignore log and coverage folders

* Autoware built using colcon instead of catkin

* Install ROS dependencies via file

* Docker: Recommend NVIDIA Docker version 2

Make it explicit in the documentation that we recommend NVIDIA Docker
v2.
@mitsudome-r mitsudome-r added the version:autoware-ai Autoware.AI label Jun 14, 2022