ORT Docker image size - is it possible to make it smaller? #3230

Open
woznik opened this issue Oct 22, 2020 · 9 comments
Labels
docker (About Docker topics), enhancement (Issues that are considered to be enhancements)

Comments

@woznik

woznik commented Oct 22, 2020

Hello,
Do you think it would be possible to reduce the size of the Docker image (to 2 GB at most)? Currently it is over 4 GB.
The image contains the Android SDK. Is this kit (and maybe others) necessary for ORT to find the dependency information?
The current size of the Docker image has a huge impact on the duration of the scanning process.
I use this Docker image in a GitLab pipeline with a group of shared runners. Each time a user triggers an action that pulls the image, the download takes at least 3-5 minutes to complete.
This unnecessarily increases network and energy consumption.
If it is not possible, I could try to reorganize my use case.

sschuberth added the docker and enhancement labels Oct 22, 2020
@sschuberth
Member

> The image contains the Android SDK. Is this kit (and maybe others) necessary for ORT to find the dependency information?

When analyzing Android projects, unfortunately yes. While the Android plugin for Gradle is able to bootstrap several SDK components, the basic SDK needs to be installed even if just querying the dependencies of an Android project.

But that said, the stock Dockerfile is supposed to get you started easily and to showcase all ORT capabilities. If you do not need Android project analysis, you can easily trim the Dockerfile to your needs by omitting unwanted package managers / SDKs / frameworks.

But I agree there is still general room for improvement. I was thinking about using Alpine as the base image for the final image as well, or looking into ways to dynamically install the Android SDK only when Android projects are analyzed.

@woznik
Author

woznik commented Oct 22, 2020

OK, I get it. Thank you for the explanation.

woznik closed this as completed Oct 23, 2020
@sschuberth
Member

Let's keep this open to track the ideas for further reducing the size that I mentioned before.

sschuberth reopened this Oct 23, 2020
@mnonnenmacher
Member

@sschuberth We are currently investigating bootstrapping for Pub/Flutter, because according to research by @zhernovs the Flutter installation takes up about 2 GB in the Docker image. This would of course make the analyzer slower for Pub/Flutter projects, but they are relatively rare, and users who rely heavily on them could still write their own Dockerfile that includes Flutter. Any objections to that approach?
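
To make the trade-off concrete, here is a minimal sketch of what "bootstrapping" means in this context: the tool is left out of the image and only downloaded the first time a project actually needs it. This is an illustration and not ORT's implementation (ORT is written in Kotlin). The names `ensure_tool`, `fake_install`, and the `.installed` marker file are all hypothetical, and the installer callback stands in for the real download-and-unpack step.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def ensure_tool(name: str, install_dir: Path, install: Callable[[Path], None]) -> Path:
    """Return the directory that provides `name`, installing it on first use.

    Hypothetical helper: `install` is any callable that places the tool
    into `install_dir` (for example by downloading and unpacking an archive).
    """
    # If the image (or a user's custom image) already ships the tool, use it.
    found = shutil.which(name)
    if found:
        return Path(found).parent

    # Otherwise install once and remember that via a marker file.
    marker = install_dir / ".installed"
    if not marker.exists():
        install_dir.mkdir(parents=True, exist_ok=True)
        install(install_dir)
        marker.touch()
    return install_dir


# Demo with a fake installer that only records that it ran.
calls = []


def fake_install(target: Path) -> None:
    calls.append(target)


with tempfile.TemporaryDirectory() as tmp:
    tool_dir = Path(tmp) / "tool"
    ensure_tool("tool-that-is-not-on-the-path", tool_dir, fake_install)
    ensure_tool("tool-that-is-not-on-the-path", tool_dir, fake_install)
    print(f"installer ran {len(calls)} time(s)")
```

The first call pays the installation cost and later calls in the same environment reuse the result. On CI runners that start from a fresh container each time, every job that needs the tool would pay that cost again unless the install directory is cached.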

@sschuberth
Member

No objections. Just a question: As you know I was planning to remove bootstrapping for scanners. Would some of that logic still be valuable to keep in this context?

@zhernovs
Contributor

zhernovs commented Dec 2, 2020

Here are some particular numbers related to this topic:

At the time of my research, the uncompressed image built from this Dockerfile weighed 3.91 GB. The biggest part of it is Flutter: 1.5 GB. The built-in flutter/.pub-cache is 481 MB. So bootstrapping Flutter would reduce the Docker image size by almost a factor of 2.
I also ran some local tests on HERE's GitLab instance to verify how much we would benefit from a reduced-size image. In this experiment I removed Flutter (1.5 GB) and the Android SDK (116 MB):

Compressed default image in gitlab registry: 1.78 GiB
Compressed image without flutter and android-sdk: 933.98 MiB

Time to start the Docker executor with the default image: 01:29
Time to start the Docker executor with the shrunk image: 00:52

Time to download, unpack, and prepare Flutter on the resulting image: 01:16
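
For reference, the figures above can be combined in a short back-of-the-envelope calculation. This is only a sketch based on the single measurements quoted in this comment; the variable names are mine, and the break-even estimate assumes every Flutter job pays the full bootstrap time with no caching.

```python
MIB_PER_GIB = 1024

# Compressed sizes in the GitLab registry, in MiB (from the numbers above).
default_mib = 1.78 * MIB_PER_GIB
reduced_mib = 933.98

# Executor start-up and Flutter bootstrap times, in seconds.
start_default_s = 1 * 60 + 29
start_reduced_s = 52
bootstrap_flutter_s = 1 * 60 + 16

# Fraction of the compressed size saved by dropping Flutter and the Android SDK.
size_saving = 1 - reduced_mib / default_mib

# Time saved on every job that does not need Flutter.
saved_per_job_s = start_default_s - start_reduced_s

# Extra time a Flutter job takes with the reduced image plus bootstrapping.
flutter_job_s = start_reduced_s + bootstrap_flutter_s
flutter_penalty_s = flutter_job_s - start_default_s

# Share of Flutter jobs at which the average job time is unchanged
# (assumption: each Flutter job pays the full bootstrap cost).
break_even_share = saved_per_job_s / bootstrap_flutter_s

print(f"compressed size saving: {size_saving:.1%}")
print(f"time saved per non-Flutter job: {saved_per_job_s} s")
print(f"extra time per Flutter job: {flutter_penalty_s} s")
print(f"break-even share of Flutter jobs: {break_even_share:.1%}")
```

Under these assumptions the reduced image roughly halves the compressed size and is faster on average as long as clearly fewer than half of the jobs analyze Flutter projects.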

For the future, the second-largest tool that might be optimized or bootstrapped is Go (327 MB).

The base image (adoptopenjdk:11-jre-hotspot-bionic, 229 MB) could be replaced with something lighter (for example adoptopenjdk/openjdk11:alpine-jre, 149 MB), but switching to an Alpine-based image would require completely rewriting the Dockerfile, which is currently based on Ubuntu.

@sschuberth
Member

FYI, bootstrapping Flutter was merged.

@sschuberth
Member

Maybe this is yet another topic where Buildpacks, as already mentioned here, could help; also see https://github.com/paketo-buildpacks/gradle.

@sschuberth
Member

Note to myself: To reduce the Docker image size we might want to take a look at https://github.com/docker-slim/docker-slim.

sschuberth added a commit that referenced this issue Jan 11, 2022
Build ORT with Java 17 LTS to benefit from newer bytecode optimizations
[1] and to get rid of the bogus "illegal reflective access" warning
triggered by Retrofit which caused a lot of confusion [2].

While "Alpine is not in a supported release by OpenJDK" [3],
eclipse-temurin [4] (which supersedes the deprecated adoptopenjdk [5])
does offer an Alpine image [6]. However, prefer to use the slightly
larger "17-jdk-focal" image instead as that JDK image can also be easily
used to *run* ORT in order to supersede #4178, so building and running
ORT share the same image.

At a later point, the effort to use "eclipse-temurin:17-jdk-alpine" for
running (and building) ORT could be undertaken in order to reduce the
Docker image size (see #3230). But installing all required tools and
building ScanCode on Alpine could become difficult.

[1]: #4912
[2]: https://github.com/oss-review-toolkit/ort/search?q=%22illegal+reflective+access%22&type=issues
[3]: https://hub.docker.com/_/openjdk
[4]: https://hub.docker.com/_/eclipse-temurin
[5]: https://hub.docker.com/_/adoptopenjdk
[6]: https://blog.adoptium.net/2021/09/eclipse-temurin-17-available/

Signed-off-by: Sebastian Schuberth <sebastian.schuberth@bosch.io>
sschuberth added a commit that referenced this issue Jan 11, 2022
Build ORT with Java 17 LTS to benefit from newer bytecode optimizations
[1] and to get rid of the bogus "illegal reflective access" warning
triggered by Retrofit which caused a lot of confusion [2].

While "Alpine is not in a supported release by OpenJDK" [3],
eclipse-temurin [4] (which supersedes the deprecated adoptopenjdk [5])
does offer both JRE and JDK Alpine images [6]. However, use neither of
them and instead prefer the slightly larger "17-jdk-focal" image,
as that JDK image can also be easily used to *run* ORT in order
to supersede #4178, so building and running ORT share the same image.

At a later point, the effort to use "eclipse-temurin:17-jdk-alpine" for
running (and building) ORT could be undertaken in order to reduce the
Docker image size (see #3230). But installing all required tools and
building ScanCode on Alpine could become difficult.

[1]: #4912
[2]: https://github.com/oss-review-toolkit/ort/search?q=%22illegal+reflective+access%22&type=issues
[3]: https://hub.docker.com/_/openjdk
[4]: https://hub.docker.com/_/eclipse-temurin
[5]: https://hub.docker.com/_/adoptopenjdk
[6]: https://blog.adoptium.net/2021/09/eclipse-temurin-17-available/

Signed-off-by: Sebastian Schuberth <sebastian.schuberth@bosch.io>
sschuberth added a commit that referenced this issue Jan 12, 2022 (same commit message as the previous commit)