
[Packaging] Optimize Dockerfile build #25184

Closed
wants to merge 1 commit into from

Conversation

hholst80

@hholst80 hholst80 commented Jan 19, 2023

This PR makes the built Docker artifact smaller by introducing a multi-stage build in the Dockerfile,
trimming ~250 MB from the exported image.

image

Update: The image size is now down to 444MB.



@ghost ghost added the customer-reported Issues that are reported by GitHub users external to the Azure organization. label Jan 19, 2023
@ghost

ghost commented Jan 19, 2023

Thank you for your contribution hholst80! We will review the pull request and get back to you soon.

@hholst80
Author

@microsoft-github-policy-service agree

@hholst80 hholst80 changed the title Optimize Dockerfile build [Packaging] Optimize Dockerfile build Jan 20, 2023
@yonzhan
Collaborator

yonzhan commented Jan 20, 2023

Packaging

@hholst80
Author

hholst80 commented Feb 4, 2023

Two weeks of no activity. I am pretty sure I did not break anything with my changes.
The "Check the Format of Pull Request Title and Content" failure should have been fixed two weeks ago.

Dockerfile Outdated
Comment on lines 79 to 81
COPY --from=builder /azure-cli/az.completion /root/.bashrc
COPY --from=builder /usr/local /usr/local
COPY --from=tools /usr/local/bin/jp /usr/local/bin/
Contributor


I think this is ambiguous and hard to maintain.
Alternatively, we can modify the build script to install and delete the src in one step.

Author


I think that is strictly worse, but I see the point. We can move jp into the builder step and then we have just two COPY commands that would have very little reason for change. We could also do this:

COPY --from=builder /usr/local /usr/local
COPY --from=tools /usr/local /usr/local

I do not think az.completion should be included in the environment to begin with, nor should jp, but here we are. I only want to make the existing solution smaller, without sacrificing readability or maintainability.

Author


What I mean is that I do not think az.completion.sh should be in /root. The jp binary is a single static executable; if users want it inside the az-cli container, they can inject it at container creation or mount it. There is no need to bundle it with az unless az cli itself has a runtime dependency on it.

image

Contributor


I think this line is not safe: COPY --from=builder /usr/local /usr/local. What if the installed package also modifies another folder? We can't guarantee that everything we need is in that folder (although it might be true).
I prefer using a single RUN command to do the same thing, like the CPython Dockerfile.

Author


In this multi-stage solution, we do not carry over any potential residual components from the build process. That is very hard to achieve with RUN in a single stage: there is always junk left behind, and the result is hard to read and maintain. There is also an observational bias behind the prevalence of long-winded single-stage builds: nobody dares to change anything because they do not understand the consequences of their changes. That is not best practice, and not a place where we want to be.

At first, it can seem like a reasonable fear that a generic install script could install something outside the defined install location. To overcome that fear it is enough to open the install script and see what it does: pip install, exclusively. While it is possible for a setup.py to do something incredibly stupid here, the Python pip framework is reasonably opinionated about what can and should be done (too opinionated if you ask me). If any files are installed, they will be put in the target specified by the pip configuration.

If you feel strongly about the /usr/local copy, I might be able to fix that. But if you are against the multi-stage build, I cannot fix that: it is a requirement for a better, more maintainable Dockerfile with a smaller build artifact that can also be generated more quickly.
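For context, a minimal sketch of the multi-stage pattern being argued for (simplified and hypothetical; the actual Dockerfile proposed in this PR appears further down the thread):

# Sketch only: heavy build work happens in the builder stage; the runtime stage
# copies just the installed tree, so compilers, caches, and the source checkout
# never reach the final image.
FROM python:3.11-alpine AS builder
RUN apk add --no-cache gcc make musl-dev libffi-dev openssl-dev linux-headers
COPY . /azure-cli
RUN cd /azure-cli && ./scripts/install_full.sh

FROM python:3.11-alpine
RUN apk add --no-cache bash ca-certificates && update-ca-certificates
COPY --from=builder /usr/local /usr/local
ENV AZ_INSTALLER=DOCKER
CMD bash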

Contributor

@bebound bebound Feb 9, 2023


I'm not against the multi-stage build. If the CLI were a Node or Go project, I would definitely choose a multi-stage build, because I could be sure that everything I need is in the dist folder or the binary.

The new Dockerfile also removes bash, git and other packages. To maintain compatibility, we also need to figure out which files are changed during installation, which is more difficult than for a pip installation.

PS: I know we should not keep irrelevant packages in the image. To prevent issues like #24836, we still keep them for now.

Author


Let me know if you would prefer this kind of solution instead.

https://github.com/hholst80/azure-cli/tree/dockerfile-multistage-alt

@hholst80
Author

hholst80 commented Feb 7, 2023

Including the __pycache__ cleanup, the built image is down to ~704 MB (uncompressed) on my local build. Coming from 1.3 GB, that is worthwhile. I think the same cleanup (removing __pycache__) should be included in the rpm and deb builds as well. Either ship the .py files or just the .pyc files, but not both?
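As a sketch of the "ship .py or .pyc but not both" idea (not something this PR implements): python -m compileall -b writes the .pyc files next to the sources in the legacy layout, after which the .py files can be deleted. The path below assumes the CPython 3.11 site-packages location used elsewhere in this thread.

# Illustration only: compile in place, drop the sources, then drop leftover caches.
python3 -m compileall -b /usr/local/lib/python3.11/site-packages/azure
find /usr/local/lib/python3.11/site-packages/azure -name '*.py' -delete
find /usr/local/lib/python3.11/site-packages/azure -name __pycache__ | xargs rm -rf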

@jiasli jiasli modified the milestones: Feb 2023 (2023-03-07), Backlog Mar 2, 2023
@jiasli jiasli assigned bebound and unassigned jiasli May 19, 2023
@ozbillwang

ozbillwang commented Sep 3, 2023

The latest image has been reduced to 1 GB:

mcr.microsoft.com/azure-cli             latest                    cb37d43fe470   5 weeks ago     1GB

But that is still not enough.

@hholst80

can you confirm whether, with the latest version 2.51.0, your PR can still trim an extra 200 MB?

======

Some discussion is in #19591,

and I will add my comments and discoveries here.

#19591 (comment)

The reason it is so big is that azure_cli keeps all versions of the code for each component.

For example, for az network, it keeps 33 copies when you install azure_cli in Docker.

We need to know how to install only the latest version.

$ pwd
/app/.venv/lib/python3.11/site-packages/azure/mgmt/network

$ ls -l
...
9708	v2018_11_01
9964	v2018_12_01
10180	v2019_02_01
10768	v2019_04_01
11280	v2019_06_01
11640	v2019_07_01
11860	v2019_08_01
11944	v2019_09_01
12172	v2019_11_01
12344	v2019_12_01
12736	v2020_03_01
12924	v2020_04_01
13592	v2020_05_01
14180	v2020_06_01
14420	v2020_07_01
14596	v2020_08_01
14700	v2020_11_01
14880	v2021_02_01
16984	v2022_01_01

$ ls -ld v*|wc -l
33

In effect, this component is installed about 33 times; that's why it is so big.

@jiasli
Member

jiasli commented Sep 4, 2023

The reason it is so big is that azure_cli keeps all versions of the code for each component.

For example, for az network, it keeps 33 copies when you install azure_cli in Docker.

We previously trimmed Python SDKs for Windows and Linux packages: #23946. We will apply the same logic to docker images.
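To illustrate the kind of trimming #23946 does, here is a deliberately naive sketch (this is not the repository's scripts/trim_sdk.py, which has to keep every API version the CLI still imports; the path assumes the site-packages layout quoted above):

# Hypothetical illustration: keep only the newest vYYYY_MM_DD sub-package of one SDK.
cd /usr/local/lib/python3.11/site-packages/azure/mgmt/network
keep=$(ls -d v20* | sort | tail -n 1)
for d in v20*; do [ "$d" != "$keep" ] && rm -rf "$d"; done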

@ozbillwang

ozbillwang commented Sep 4, 2023

The reason it is so big is that azure_cli keeps all versions of the code for each component.
For example, for az network, it keeps 33 copies when you install azure_cli in Docker.

We previously trimmed Python SDKs for Windows and Linux packages: #23946. We will apply the same logic to docker images.

Thanks for the update. Please share the ticket or issue for the docker image so I can follow up (subscribe to it).

@hholst80
Author

hholst80 commented Sep 6, 2023

We're down to 485 MB now (compressed). If we keep this PR open long enough, we will eventually get it below 100 MB. ;-)

FYI, keeping all API versions costs 115 MB (again, compressed).

If anyone can give a green light to removing the "useful tools" included (jq, curl, and jp), we can get the image size down to 438 MB. Actually, if we are very aggressive and remove git, perl, and such too, below 400 MB is possible. This will break workflows (I was myself using curl inside the container), and I am not sure the breakage is worth the small extra saving in size. It is however odd that the container exposed this tooling to begin with, because now there are users relying on it. Including myself...

@hholst80
Author

hholst80 commented Sep 6, 2023

If this flies with the CI/CD gods, the size is almost acceptable:

image

@azure-client-tools-bot-prd

azure-client-tools-bot-prd bot commented Sep 6, 2023

🔄AzureCLI-FullTest
Per-module test matrix (acr, acs, advisor, ams, apim, appconfig, appservice, aro, backup, batch, batchai, billing, botservice, cdn, cloud, cognitiveservices, config, configure, consumption, container, containerapp, core, cosmosdb, databoxedge, dla, dls, dms, eventgrid, eventhubs, feedback, find, hdinsight, identity, iot, keyvault, kusto, lab, managedservices, maps, marketplaceordering, monitor, mysql, netappfiles, network, policyinsights, privatedns, profile, rdbms, redis, relay, resource, role, search, security, servicebus, serviceconnector, servicefabric, signalr, sql, sqlvm, storage, synapse, telemetry, util, vm): all runs passed on Python 3.9 and 3.11 across the latest and hybrid profiles, except containerapp (latest, 3.11), which was still in progress.

@azure-client-tools-bot-prd

azure-client-tools-bot-prd bot commented Sep 6, 2023

️✔️AzureCLI-BreakingChangeTest
️✔️Non Breaking Changes

@ozbillwang

If this flies with the CI/CD gods, the size is almost acceptable:

image

Yes, you are right. I use it in CI/CD; thanks for the effort.

@idavis

idavis commented Oct 11, 2023

Do we need all of these installed? Are these just needed for building or do we shell out to them?

# We don't use openssl (3.0) for now. We only install it so that users can use it.
# libintl and icu-libs - required by azure devops artifact (az extension add --name azure-devops)
RUN apk add --no-cache \
    bash bash-completion openssh-keygen ca-certificates openssl \
    curl jq git perl zip \
    libintl icu-libs libc6-compat \
&& update-ca-certificates

The jp installation is also hard-coded to amd64. Do we want support for native arm64 such as apple silicon or surface users?

@jongio
Member

jongio commented Oct 11, 2023

@hholst80
Author

Do we need all of these installed? Are these just needed for building or do we shell out to them?

# We don't use openssl (3.0) for now. We only install it so that users can use it.
# libintl and icu-libs - required by azure devops artifact (az extension add --name azure-devops)
RUN apk add --no-cache \
    bash bash-completion openssh-keygen ca-certificates openssl \
    curl jq git perl zip \
    libintl icu-libs libc6-compat \
&& update-ca-certificates

The jp installation is also hard-coded to amd64. Do we want support for native arm64 such as apple silicon or surface users?

I think it is out of scope for this PR to remove or add to the existing tooling. Doing so will break somewhere between zero and a million users' workflows, so it requires some sensitivity, a cost-benefit analysis, etc.

Removing curl broke my own workflows, so it is a real problem, not just a theoretical issue.

This PR is taking forever to land, and it keeps getting better and better. But it will not provide much value until it is merged.

@jongio
Member

jongio commented Oct 12, 2023

@jiasli - I'm wondering if we have any CLI perf tests to see if removing the pycache impacts runtime perf.

Dockerfile (outdated): three resolved review threads, hidden
@hholst80
Author

hholst80 commented Oct 12, 2023

@jiasli - I'm wondering if we have any CLI perf tests to see if removing the __pycache__ impacts runtime perf.

Compared to the overhead of starting the container, I'd assume it's negligible.

Actually, the total time could even be larger, since the additional image layers need to be unpacked and copied onto storage. It could be cheaper to just build the bytecode on the fly.

Edit: validation of assumptions

Benchmark with and without __pycache__

Overhead of starting the Docker container itself

I] root@fedora ~/azure-cli (dev)
# time docker run -it --rm azure-cli echo foo 

Executed in  246.94 millis    fish           external
   usr time    9.20 millis  257.00 micros    8.94 millis
   sys time    6.77 millis   74.00 micros    6.70 millis

Executed in  274.07 millis    fish           external
   usr time    8.17 millis  304.00 micros    7.86 millis
   sys time    8.95 millis  107.00 micros    8.84 millis

Executed in  266.15 millis    fish           external
   usr time    9.38 millis  245.00 micros    9.13 millis
   sys time    7.52 millis   86.00 micros    7.44 millis

Running az --version with __pycache__ (692MB image)

Executed in    1.98 secs      fish           external
   usr time    8.65 millis  386.00 micros    8.26 millis
   sys time    7.43 millis    0.00 micros    7.43 millis

Executed in    1.82 secs      fish           external
   usr time   10.15 millis    0.00 micros   10.15 millis
   sys time    6.44 millis  332.00 micros    6.10 millis

Executed in    1.78 secs      fish           external
   usr time    9.04 millis  266.00 micros    8.77 millis
   sys time    7.87 millis   84.00 micros    7.79 millis

Running az --version without __pycache__ (411MB)

Executed in    2.72 secs      fish           external
   usr time   10.46 millis  254.00 micros   10.20 millis
   sys time    5.18 millis   77.00 micros    5.11 millis

Executed in    2.58 secs      fish           external
   usr time    9.28 millis  258.00 micros    9.03 millis
   sys time    7.33 millis   78.00 micros    7.25 millis

Executed in    2.56 secs      fish           external
   usr time    8.35 millis  257.00 micros    8.09 millis
   sys time    8.17 millis   79.00 micros    8.09 millis

Running something heavier, like az --help

Running az --help with __pycache__ (692MB)

Executed in    8.37 secs      fish           external
   usr time    9.06 millis  289.00 micros    8.78 millis
   sys time    8.90 millis   95.00 micros    8.80 millis

Executed in    8.17 secs      fish           external
   usr time   10.80 millis  246.00 micros   10.55 millis
   sys time    7.78 millis   82.00 micros    7.70 millis

Executed in    8.27 secs      fish           external
   usr time   10.30 millis  303.00 micros    9.99 millis
   sys time    8.12 millis  101.00 micros    8.02 millis

Running az --help without __pycache__ (411MB)

Executed in    9.59 secs      fish           external
   usr time   10.87 millis    0.00 micros   10.87 millis
   sys time    6.88 millis  353.00 micros    6.52 millis

Executed in    8.91 secs      fish           external
   usr time   11.70 millis  279.00 micros   11.42 millis
   sys time    8.00 millis   94.00 micros    7.91 millis

Executed in    8.88 secs      fish           external
   usr time   11.72 millis    0.00 micros   11.72 millis
   sys time    7.84 millis  336.00 micros    7.50 millis

Here I realized that the __pycache__ elimination was too aggressive: I removed things not only from the azure package. I fixed that in the latest commit.

diff --git a/Dockerfile b/Dockerfile
index 794af5e3b..1c85845cf 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -33,7 +33,7 @@ COPY . /azure-cli
 RUN ./scripts/install_full.sh
 RUN python3 ./scripts/trim_sdk.py
 RUN /usr/local/bin/az && bash -c "source /usr/local/bin/az.completion.sh"
-RUN find /usr/local -name __pycache__ | xargs rm -rf
+RUN find /usr/local/lib/python*/site-packages/azure -name __pycache__ | xargs rm -rf
 
 # Install jp tool
 RUN <<HERE

Now the timings are much more comparable, and the size increased only slightly. The delta is likely due to dependencies of the azure package that live outside its namespace.

[I] root@fedora ~/azure-cli (dev)
# time docker run -it --rm azure-cli az --help > /dev/null

________________________________________________________
Executed in    8.11 secs      fish           external
   usr time   15.45 millis    5.24 millis   10.21 millis
   sys time    9.74 millis    2.03 millis    7.71 millis

[I] root@fedora ~/azure-cli (dev)
# time docker run -it --rm azure-cli az --help > /dev/null

________________________________________________________
Executed in    8.07 secs      fish           external
   usr time   11.09 millis  332.00 micros   10.76 millis
   sys time    6.93 millis   89.00 micros    6.84 millis

[I] root@fedora ~/azure-cli (dev)
# time docker run -it --rm azure-cli az --version > /dev/null

________________________________________________________
Executed in    1.70 secs      fish           external
   usr time    8.79 millis    0.00 micros    8.79 millis
   sys time    7.48 millis  445.00 micros    7.04 millis

[I] root@fedora ~/azure-cli (dev)
# time docker run -it --rm azure-cli az --version > /dev/null

________________________________________________________
Executed in    1.72 secs      fish           external
   usr time   10.45 millis  330.00 micros   10.12 millis
   sys time    7.92 millis   88.00 micros    7.84 millis

[I] root@fedora ~/azure-cli (dev)
# 

Dockerfile Outdated

RUN arch=$(arch | sed s/aarch64/arm64/ | sed s/x86_64/amd64/) && curl -L https://github.com/jmespath/jp/releases/download/${JP_VERSION}/jp-linux-$arch -o /usr/local/bin/jp \
&& chmod +x /usr/local/bin/jp
RUN --mount=from=builder,target=/mnt cd /mnt/usr/local && cp -ru . /usr/local/


I don't grasp why this is needed. What's the scenario where mounting produces a benefit?

Dockerfile (outdated): two resolved review threads, hidden
@hholst80
Author

The jp installation is also hard-coded to amd64. Do we want support for native arm64 such as apple silicon or surface users?

This was an oversight in a merge from my side. It is included now, with support for the same architecture as in dev batch.

Use a multi-stage build to trim off significant excess weight.

* Utilize a multi-stage build
* Run ./scripts/trim_sdk.py
* Cleanup __pycache__
* Optimize layer merge with dockerfile:1.2 syntax

Thanks to @jongio for suggesting the awesome dive tool to find the
inefficiency in the layer merge.

Additional cleanups to improve maintainability of the Dockerfile:

* Remove useless scanelf statement
* Remove usage of dos2unix as it is a concern of the checkout
* Introduce modern dockerfile:1.4 syntax
@hholst80
Author

hholst80 commented Oct 14, 2023

@bebound what are your final thoughts on the multi-stage build: should we keep it as-is? (I reduced it to a single build stage and merged the 'tooling' stage into it.) We can reduce it down to a single build stage and remove the .build-deps, but I think the result is uglier. Also, for some reason this build is 1 MB larger than the multi-stage one; gods know why. I am OK with dropping the multi-stage if you think this PR is better without it.

# syntax=docker/dockerfile:1.4
#---------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
#---------------------------------------------------------------------------------------------

ARG PYTHON_VERSION="3.11"

FROM python:${PYTHON_VERSION}-alpine

ARG JP_VERSION="0.2.1"
ARG CLI_VERSION

# Metadata as defined at http://label-schema.org
ARG BUILD_DATE

LABEL maintainer="Microsoft" \
      org.label-schema.schema-version="1.0" \
      org.label-schema.vendor="Microsoft" \
      org.label-schema.name="Azure CLI" \
      org.label-schema.version=$CLI_VERSION \
      org.label-schema.license="MIT" \
      org.label-schema.description="The Azure CLI is used for all Resource Manager deployments in Azure." \
      org.label-schema.url="https://docs.microsoft.com/cli/azure/overview" \
      org.label-schema.usage="https://docs.microsoft.com/cli/azure/install-az-cli2#docker" \
      org.label-schema.build-date=$BUILD_DATE \
      org.label-schema.vcs-url="https://github.com/Azure/azure-cli.git" \
      org.label-schema.docker.cmd="docker run -v \${HOME}/.azure:/root/.azure -it mcr.microsoft.com/azure-cli:$CLI_VERSION"

# We don't use openssl (3.0) for now. We only install it so that users can use it.
# libintl and icu-libs - required by azure devops artifact (az extension add --name azure-devops)
RUN apk add --no-cache \
    bash bash-completion openssh-keygen ca-certificates openssl \
    curl jq git perl zip \
    libintl icu-libs libc6-compat \
 && update-ca-certificates

RUN --mount=target=/mnt --mount=type=cache,target=/root/.cache --mount=type=tmpfs,target=/root/.azure <<HERE
  apk add --no-cache --virtual .build-deps gcc make openssl-dev libffi-dev musl-dev linux-headers

  mkdir /azure-cli
  cd /mnt
  cp -ar . /azure-cli

  cd /azure-cli
  ./scripts/install_full.sh
  python3 ./scripts/trim_sdk.py

  cd /
  rm -rf /azure-cli

  /usr/local/bin/az && bash -c "source /usr/local/bin/az.completion.sh"
  ln -s /usr/local/bin/az.completion.sh /etc/profile.d/
  find /usr/local/lib/python*/site-packages/azure -name __pycache__ | xargs rm -rf

  arch=$(arch | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
  curl -L https://github.com/jmespath/jp/releases/download/${JP_VERSION}/jp-linux-${arch} -o /usr/local/bin/jp
  chmod +x /usr/local/bin/jp
  
  apk del --no-cache .build-deps
HERE

ENV AZ_INSTALLER=DOCKER
CMD bash

@hholst80 hholst80 requested a review from bebound October 14, 2023 17:01
@bebound
Contributor

bebound commented Oct 16, 2023

I've created a draft PR to remove the unnecessary packages: #27567

Removing COPY . /azure-cli is in #27683

Removing __pycache__ saves image size, but it increases the container size.
We created a PR that keeps only the pyc files: #25801, but we did not merge it.

@ozbillwang

ozbillwang commented Nov 28, 2023

Not sure if this change has been implemented, but the image size is already reduced in the latest versions (images built after 14 Nov):

mcr.microsoft.com/azure-cli          latest            9f400a9915c1   2 weeks ago     717MB
azure_cli                     latest            791d5491db53   2 months ago    479MB

But it is still 200+ MB larger than the image I build with this PR.

image

@bebound
Contributor

bebound commented Nov 28, 2023

The two enhancements mentioned in the previous comment were merged in 2.54, and the image size has dropped to 700 MB.

There is no plan to keep only the py files or only the pyc files for now.
Removing the pyc files impacts the execution time and increases the container size, which makes no sense.
Keeping only pyc is what the MSI does; I'm not sure we need to do the same on Linux:

  • It removes the source code from error stack traces.
  • This technique is not widely adopted among other projects.

@hholst80
Author

hholst80 commented Dec 7, 2023

Removing the pyc files impacts the execution time and increases the container size, which makes no sense.

That is not what my benchmark tests showed, though. There was no impact on az --version or az --help. The container-size issue can be fixed by adding PYTHONDONTWRITEBYTECODE=1 to the ENV section; it might even make execution faster, and there is no downside to adding it in this PR. I do not have time to keep my PR mergeable; just pull out the Dockerfile (with or without the multi-stage) and copy it in if anyone is interested in it.
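For reference, that would be a small change to the ENV section of the proposed Dockerfile (a sketch only; whether the size/speed trade-off is worth it is exactly what is debated below):

# Sketch: stop Python from writing .pyc files inside running containers,
# so a __pycache__-stripped image does not grow back at runtime.
ENV AZ_INSTALLER=DOCKER \
    PYTHONDONTWRITEBYTECODE=1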

@bebound
Contributor

bebound commented Dec 8, 2023

@hholst80 I highly doubt your test result that compiling bytecode each time does not slow down the execution.
According to my tests, the execution time for az increased from 2s to 7s after the removal of the .pyc files.

Let me clarify my point.

  1. Removing .pyc files from the image doesn't save disk space because when a user runs the image, the compiled .pyc files are generated and added to the container. This approach actually uses more disk space if there are multiple containers.
  2. If we remove pyc and add PYTHONDONTWRITEBYTECODE=1 in image env, this slows down the execution.

In conclusion, I prefer keeping pyc.
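One way to observe point 1 above (a sketch, run against the locally built azure-cli image from this PR, which ships without the azure __pycache__; docker diff lists the files added to the container's writable layer):

docker run --name pyc-test azure-cli az --version
docker diff pyc-test | grep -c '\.pyc$'   # number of .pyc files written at runtime
docker rm pyc-test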

@hholst80
Author

hholst80 commented Dec 8, 2023

I would too, if I saw that kind of result.

  1. Which __pycache__ files did you remove? Removing the standard library's __pycache__ offers no benefit, as it is baked into the base image anyway.

  2. How and which subcommand did you benchmark, and with what arguments?

@bebound
Contributor

bebound commented Dec 8, 2023

Here is the code:

docker run --rm -it mcr.microsoft.com/azure-cli:2.55.0 bash
find /usr/local/lib/python3.11/site-packages/azure -name '*.pyc' -exec rm -f {} +
PYTHONDONTWRITEBYTECODE=1 time az

@hholst80
Author

OK. So the raw az command, without any options, seems to take considerably more time. There are some 5.5 million lines of Python code, so if a significant part of that is pulled in, it would of course have a significant impact on the execution time.

@hholst80
Author

@bebound We can close this then? Perhaps a refactor of the Dockerfile could be interesting at a later time, but the primary purpose was to reduce image size. This has been achieved.

@bebound bebound closed this Dec 12, 2023
Labels
customer-reported Issues that are reported by GitHub users external to the Azure organization.
8 participants