Adding nvidia-docker v2 support #1167

koenlek · 2018-03-19T08:10:27Z

Status

PRODUCTION, safe for the "optional" todos mentioned below.

Description

Adding nvidia docker v2 support as discussed in issue autowarefoundation/autoware_ai#93 .

Todos

I'd like to leave these up to the Autoware maintainers.

I think these new images should become the standard (as Autoware assumes or even needs an nvidia GPU to run on, right?) You can consider renaming current images to something like "-legacy", rather than suffixing these new images with "-nvidia".
Documentation could be checked on all old references to the old nvidia-docker run command (which is docker run --runtime nvidia these days).

Steps to Test or Reproduce

See autowarefoundation/autoware_ai#93.

Make sure to install nvidia-docker v2 and see if the old image works for your (should not work. Opening RVIZ would crash for example). Then try again with the new image and see if it works.

…ipt_for_dpx2 fix a docker script for dpx2

amc-nu · 2018-03-27T04:21:16Z

@koenlek this needs to point to develop branch. Can you change the target please?
Thanks

amc-nu · 2018-03-28T07:43:47Z

@yk-fujii can you also please check this one?

jokla · 2018-03-28T15:34:20Z

@koenlek Are the line 7 and 9 in the file build.sh commented on purpose?

I tested with the nvidia-docker v2 and now Rviz is working. Thank you!

…g purposes

…docker_v2

koenlek · 2018-03-29T07:15:15Z

Well spotted, I removed the comments, my bad. It now points to develop. I merged upstream develop in my branch and resolved two small merge conflicts that arose from that.

koenlek · 2018-05-07T15:26:31Z

@messi49 : can you make sure this MR is reviewed & merged?

koenlek · 2018-06-05T14:53:25Z

@messi49 Please review :)

koenlek · 2018-06-28T16:24:41Z

Will this MR be reviewed? Its 3 months old already. Give me a heads up if I need to fix the Merge conflicts.

bdholt1 · 2018-07-30T14:32:41Z

@koenlek Thanks for this contribution! I've tried it myself and I can confirm that it successfully builds a docker image containing Autoware using nvidia-docker2.

I don't think we should maintain support for nvidia-docker (v1) any more. All (or at least most) of Nvidia's documentation now refers to version 2. We're just making life harder for ourselves by trying to support multiple setups. On that basis I would remove the second argument 'nvidia' to the run.sh and build.sh files.

I'm also doubtful that we should be compiling the Autoware source code into the docker image that we build. In most other projects the docker image only serves to provide a consistent set of dependencies. After that it's common for the build script to mount the code in the newly created image and run build commands.

koenlek · 2018-08-01T09:09:32Z

@bdholt1, I agree on both that you should ditch nvidia-docker v1 and that it would be better to exclude source code from the docker image. That is how we use docker at our company too.

The docker image should provide you with a development environment in which you can compile the autoware code (i.e. with all compile-time dependencies available). You can also make a runtime image which contains all the compiled software, but I would advice you to look into catkin install targets to make the deployed filesystem cleaner. When using the install targets you specify a file structure which contains only the files that need to be deployed (e.g. the binaries, launch files, param files, but not the src/header files). Just take a look at any official ros package github repo (e.g. https://github.com/ros-planning/navigation) for an example.

Some recommendations:

figure out catkin install targets, to more cleanly deploy the code.
consider switching to catkin_tools, rather than catkin_make. In my opinion, it is faster (especially for large catkin workspaces), cleaner, and more user friendly. It is created and maintained by the people behind ROS, but for some odd reason, it was never pushed as the official recommended successor of catkin_make (probably to avoid the migration troubles/frustrations that it would enforce to all ROS users). When switching you probably run into some cmake issues (as catkin_tools has no "cmake crosstalk" and thus is a bit more strict on your catkin correctness, i.e. the CMakeLists.xtx and package.xml). With catkin tools, just run catkin config --install (somewhat similar to catkin_make install) to enable the install space. Note that in ROS2, install targets will become obligatory (the devel workspace disappears in favor of the install workspace)..
make two images (check e.g. apollo images too https://hub.docker.com/r/apolloauto/apollo/tags/):
- autoware-dev-<version> -> no code, no actual autoware
- autoware-release-<version> -> containing deployed autoware code (no code, only binaries and "share" files like params / launch)
for each, you need to make an nvidia and a non nvidia version (the patch I provided will make opengl apps such as rviz work for nvidia, but then its broken on intel integrated graphics...). e.g.
- autoware-dev-<version>-nvidia and autoware-dev-<version>

We have a check_nvidia bash function that automatically can 'patch' a docker run command to take care, see this example:

check_nvidia(){
  if nvidia-smi &> /dev/null; then
    export NV_RUN="--runtime=nvidia"
    export NV_TAG_SUFFIX="-nvidia"
  else
    export NV_RUN=""
    export NV_TAG_SUFFIX=""
  fi
}

# adds GUI support and few other tricks
export DOCKER_COMMON_ARGS="--env=DISPLAY \
                           --env=XDG_RUNTIME_DIR \
                           --env=QT_X11_NO_MITSHM=1 \
                           --device=/dev/dri:/dev/dri \
                           -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
                           -v /etc/localtime:/etc/localtime:ro"

xhost +local:docker
check_nvidia
docker run ${NV_RUN} ${DOCKER_COMMON_ARGS:?} -it --rm \
   --name="autoware" \
   -v /dev/bus/usb:/dev/bus/usb \
   -v /home/$USER/.Xauthority:/home/autoware/.Xauthority \
   -v "$HOME/docker_shared:/home/autoware/docker_shared" \
   -v autoware_home:/home/autoware \
   -v /media:/media \
   --net=host  \
   --add-host autoware:127.0.0.1 \
   --hostname=autoware \
   --privileged \
   autoware/autoware:1.7.0-kinetic${NV_TAG_SUFFIX}

Feel free to do with this advice whatever you like. I suggest you take it over from here from me and make some issues (stories) out of it if you like.

bdholt1 · 2018-08-01T11:25:21Z

@koenlek I completely agree with your comments. I'd love to take over here but this is a fairly substantial chunk of work so I'm going to suggest that @esteve looks into it.

esteve · 2018-08-03T11:12:46Z

@koenlek thanks for your contribution, I'll take over from here.

esteve · 2018-08-03T11:36:33Z

Closing this PR, I'm working on this at #1416 Thanks @koenlek

yk-fujii and others added 4 commits March 16, 2018 22:01

fix a docker script for dpx2

e66a422

Merge pull request autowarefoundation#1160 from CPFL/hotfix/fix_a_scr…

21dcd18

…ipt_for_dpx2 fix a docker script for dpx2

Adding nvidia-docker v2 support

ac84507

Renaming confusing "nvidia" folder name

de95360

amc-nu requested a review from messi49 March 28, 2018 07:44

Accidentally forget to remove some commenting that I added for testin…

79e3fac

…g purposes

koenlek changed the base branch from master to develop March 29, 2018 07:11

Merge remote-tracking branch 'origin/develop' into bugfix/fix_nvidia_…

37cfbee

…docker_v2

dejanpan mentioned this pull request Jul 14, 2018

[Feature] Cross compile Autoware for aarch64 #1375

Merged

bdholt1 requested review from esteve and removed request for esteve July 31, 2018 10:59

kfunaoka mentioned this pull request Aug 2, 2018

[Feature] Add run_tests on Travis #1396

Closed

esteve mentioned this pull request Aug 3, 2018

Bugfix/fix nvidia docker v2 (redux) #1416

Closed

2 tasks

esteve closed this Aug 3, 2018

sgermanserrano mentioned this pull request Feb 28, 2019

Docker cleanup #1946

Merged

7 tasks

mitsudome-r added the version:autoware-ai Autoware.AI label Jun 14, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding nvidia-docker v2 support #1167

Adding nvidia-docker v2 support #1167

koenlek commented Mar 19, 2018

amc-nu commented Mar 27, 2018

amc-nu commented Mar 28, 2018

jokla commented Mar 28, 2018 •

edited

Loading

koenlek commented Mar 29, 2018

koenlek commented May 7, 2018

koenlek commented Jun 5, 2018

koenlek commented Jun 28, 2018

bdholt1 commented Jul 30, 2018

koenlek commented Aug 1, 2018

bdholt1 commented Aug 1, 2018

esteve commented Aug 3, 2018 •

edited

Loading

esteve commented Aug 3, 2018

Adding nvidia-docker v2 support #1167

Adding nvidia-docker v2 support #1167

Conversation

koenlek commented Mar 19, 2018

Status

Description

Todos

Steps to Test or Reproduce

amc-nu commented Mar 27, 2018

amc-nu commented Mar 28, 2018

jokla commented Mar 28, 2018 • edited Loading

koenlek commented Mar 29, 2018

koenlek commented May 7, 2018

koenlek commented Jun 5, 2018

koenlek commented Jun 28, 2018

bdholt1 commented Jul 30, 2018

koenlek commented Aug 1, 2018

bdholt1 commented Aug 1, 2018

esteve commented Aug 3, 2018 • edited Loading

esteve commented Aug 3, 2018

jokla commented Mar 28, 2018 •

edited

Loading

esteve commented Aug 3, 2018 •

edited

Loading