Create template for adding nvidia support to docker images #33

Closed
ruffsl opened this issue Mar 3, 2018 · 17 comments

@ruffsl
Member

ruffsl commented Mar 3, 2018

Context:

Supplementary Context:

Solution:

Perhaps we could add a template with the configuration steps needed to support nvidia hardware acceleration in containers, for users who forward the display for rviz, gzviewer, or other OpenGL-dependent programs. This could be done by mimicking nvidia's own image build steps. Then we could simply add new tags to the OSRF docker repos that use the template for new child images.

Here is a rough but working example:

FROM osrf/ros:kinetic-desktop-full

RUN apt-get update && apt-get install -y --no-install-recommends \
        pkg-config \
        libxau-dev \
        libxdmcp-dev \
        libxcb1-dev \
        libxext-dev \
        libx11-dev && \
    rm -rf /var/lib/apt/lists/*

COPY --from=nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 \
  /usr/local/lib/x86_64-linux-gnu \
  /usr/local/lib/x86_64-linux-gnu

COPY --from=nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 \
  /usr/local/share/glvnd/egl_vendor.d/10_nvidia.json \
  /usr/local/share/glvnd/egl_vendor.d/10_nvidia.json

RUN echo '/usr/local/lib/x86_64-linux-gnu' >> /etc/ld.so.conf.d/glvnd.conf && \
    ldconfig && \
    echo '/usr/local/$LIB/libGL.so.1' >> /etc/ld.so.preload && \
    echo '/usr/local/$LIB/libEGL.so.1' >> /etc/ld.so.preload

# nvidia-container-runtime
ENV NVIDIA_VISIBLE_DEVICES \
    ${NVIDIA_VISIBLE_DEVICES:-all}
ENV NVIDIA_DRIVER_CAPABILITIES \
    ${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics

This works using --runtime=nvidia and existing X11 forwarding methods.
Perhaps @flx42 would have a better recommendation for something cleaner.

Personally, I find the need for multi-stage builds and an assortment of in-image copies less elegant than mounting such files from the host and setting three lightweight variables in the child Dockerfile, as originally described in our ROS wiki:

FROM osrf/ros:indigo-desktop-full
# nvidia-docker hooks
LABEL com.nvidia.volumes.needed="nvidia_driver"
ENV PATH /usr/local/nvidia/bin:${PATH}
ENV LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64:${LD_LIBRARY_PATH}

But as nvidia docker v1 is deprecated and will become harder and harder to build from old sources, it would be nice to adopt a supported solution.

@tfoote
Contributor

tfoote commented Mar 7, 2018

I agree that the multi-stage build seems fragile and seems to break the idea of things being reproducible. From what I can see, it's just installing some libraries and a JSON file, so it might be simpler for us to just build them in a normal build/release install pipeline. If they are reproducible, that shouldn't be a problem. Hopefully by doing that we're not limited to the only 2 architectures they appear to be targeting?

@flx42

flx42 commented Mar 13, 2018

Looks like ubuntu 18.04 will provide packages for libglvnd, which would significantly simplify the dockerfile.
I think we should wait, since it's just a few weeks away.

@ruffsl
Member Author

ruffsl commented Mar 14, 2018

I suppose libglvnd could be installed through the package manager when the docker image is based on 18.04, but I'm not sure that immediately helps ROS distros targeting 16.04 LTS. Perhaps it's fair that the newer nvidia docker v2 plugin is only supported with newer 18.04-based releases.

Which package is minimally necessary in this case: libglvnd-core-dev, libglvnd-dev, libglvnd0?
https://packages.ubuntu.com/search?keywords=libglvnd&searchon=names&suite=bionic&section=all

@flx42

flx42 commented Mar 14, 2018

but I'm not sure that immediately helps ROS distros targeting 16.04 LTS. Perhaps it's fair that the newer nvidia docker v2 plugin is only supported with newer 18.04-based releases.

For 16.04 there are not many options: do something like you have above. Or use our own images for the base.
Or, though it's only a marginal improvement: our OpenGL images are generic, with an ARG for the FROM, which means you could reuse the Dockerfiles to add libglvnd on top of your own images, with something like --build-arg from=osrf/...:
https://gitlab.com/nvidia/opengl/blob/ubuntu16.04/base/Dockerfile#L3
https://gitlab.com/nvidia/opengl/blob/ubuntu16.04/1.0-glvnd/runtime/Dockerfile#L42
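
The ARG-before-FROM pattern described above can be sketched roughly as follows. This is a minimal illustration, not a copy of the linked NVIDIA Dockerfiles: only the argument name "from" comes from the comment, and the default base image is an assumption chosen for the example.

```dockerfile
# Sketch only: parameterize the base image so that one Dockerfile can be
# layered on top of different parent images.
# The default value is an assumption for illustration; override it at build
# time with --build-arg from=<image>.
ARG from=osrf/ros:kinetic-desktop-full
FROM ${from}

# The libglvnd build and install steps from the linked Dockerfiles would
# follow here, unchanged, on top of whichever base image was selected.
```

Passing a different value for "from" at build time swaps the parent image without editing the file, which is why this is only a marginal improvement over copying the steps: the libglvnd layers still have to be built.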

Which package is minimally necessary in this case: libglvnd-core-dev, libglvnd-dev, libglvnd0?

libglvnd-dev and libglvnd0.

Other suggestions @3XX0?
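
For an 18.04-based image, the packaged route discussed above might look like the sketch below. It assumes the two packages named in this comment are sufficient, picks osrf/ros:melodic-desktop-full as the base image purely for illustration, and reuses the nvidia-container-runtime ENV lines from the example at the top of the thread. It has not been verified here.

```dockerfile
# Assumed base image; any Ubuntu 18.04-based ROS image should work similarly.
FROM osrf/ros:melodic-desktop-full

# libglvnd from the Ubuntu 18.04 archive, replacing the multi-stage COPY
# steps needed on 16.04 (package names taken from the thread).
RUN apt-get update && apt-get install -y --no-install-recommends \
        libglvnd0 \
        libglvnd-dev && \
    rm -rf /var/lib/apt/lists/*

# nvidia-container-runtime: expose all GPUs by default and append the
# graphics capability to whatever capabilities are already set.
ENV NVIDIA_VISIBLE_DEVICES \
    ${NVIDIA_VISIBLE_DEVICES:-all}
ENV NVIDIA_DRIVER_CAPABILITIES \
    ${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics
```

As with the 16.04 example, the container would still be started with --runtime=nvidia and an existing X11 forwarding method.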

@tfoote
Contributor

tfoote commented Mar 14, 2018

What about just a debian package that we inject to support libglvnd? Either a backport of the 18.04 package or manually build a package from the above procedure?

@ruffsl
Member Author

ruffsl commented Jun 20, 2018

Just tried out GL Vendor-Neutral Dispatch library with nvidia-docker2 using 18.04. Works pretty slick:
NVIDIA/nvidia-docker#136 (comment)
Bump @flx42 .

@flx42

flx42 commented Jun 20, 2018

Nice @ruffsl! Are you going to target only 18.04 then?

@ruffsl
Member Author

ruffsl commented Jun 20, 2018

@flx42 , targeting melodic images on 18.04 using libglvnd for nvidia-docker2 sounds reasonable. As @tfoote mentioned, however, would it be possible to backport libglvnd to, say, 16.04 so nvidia-docker2 could easily be used on older maintained images? @tfoote , libglvnd appears to be available for debian stretch in backports.

https://packages.debian.org/search?keywords=libglvnd&searchon=names&suite=all&section=all

@tfoote
Contributor

tfoote commented Jun 20, 2018

We could backport it if there's not already a package for it available and it will let things work well on xenial. I see it's first available on artful: https://packages.ubuntu.com/source/artful/libglvnd

@abecciu

abecciu commented Jun 20, 2018

+1 for 16.04 backport. Thanks for the hard work!

@flx42

flx42 commented Jun 20, 2018

It's not handled by us, so I created a backport request:
https://bugs.launchpad.net/xenial-backports/+bug/1777944

@flx42

flx42 commented Jun 25, 2018

I've been told a backport of this change for an LTS release carries risk, since it would require changes to the Mesa packages and our nvidia driver packages. Not saying it won't happen, but this use case is probably insufficient to convince them.

So, you should use ubuntu 18.04 or debian stretch I guess.

@ruffsl
Member Author

ruffsl commented Jun 25, 2018

@tfoote , do you think it would make sense to add the libglvnd install and ENVs for nvidia-docker directly to desktop, or to go the opposite route and create a new tag building off desktop-full, e.g. desktop-full-libglvnd, to append the needed layers? I'm more inclined to migrate it directly into melodic-desktop, given libglvnd0 only adds about 25 MB to that image (currently 2.05 GB). I'm not aware of any downside, as I'd expect most folks using the desktop images would be using them for the GUI tools they include, and if not, libglvnd wouldn't really change things otherwise, but I'm not sure.

@mikaelarguedas , last time I think you modified the templates to whitelist distro-specific changes: 8a83d5f . For this I'm thinking of just adding a new config here to include the additional packages or added image. Might be easier, but I wanted to check.

@tfoote
Contributor

tfoote commented Jun 25, 2018

Certainly adding libglvnd0 for desktop and desktop-full and any other GUI-specific targets makes sense to me.

@ruffsl
Member Author

ruffsl commented Jun 28, 2018

I have some WIP here: osrf/docker_images#165
having some issues with rviz though.

@tfoote
Contributor

tfoote commented Jan 3, 2019

Now that we can run docker images under nvidia docker without baking in the nvidia support beforehand, using https://github.com/osrf/rocker, I think it makes sense not to bake those things into the core images.

@ruffsl
Member Author

ruffsl commented Jan 3, 2019

@tfoote , perhaps you could reference rocker in the docker GUI entries in the ros wiki?


4 participants