
Add build option to reduce final image size #16729

Merged: 2 commits into sonic-net:master on Oct 24, 2023

Conversation

@Staphylo (Collaborator)

Why I did it

Running SONiC releases newer than 202012 has become really challenging on systems with small (4GB) storage devices.
Some of these devices are also limited to 4GB of RAM, which complicates mitigations.
The main contributor to these issues is the growth of the SONiC image.
Reducing it by a decent amount should allow these systems to run SONiC for longer.
It would also reduce the impact of other space-saving mitigations.

Work item tracking
  • Microsoft ADO (number only):

How I did it

Add a build option to reduce the image size.
The image size reduction affects the build in two ways:

  • change some of the packages installed in the rootfs
  • apply a rootfs reduction script

The script itself performs a few steps (a sketch of the hardlinking step is shown below):

  • remove file duplication by leveraging hardlinks
    • under /usr/share/sonic, since the symlinks under the device folder are lost during the build
    • under /var/lib/docker, since the files there will only ever be mounted read-only
  • remove some extra files (man pages, docs, licenses, ...)
  • some image-specific space reduction (only for Aboot images currently)

The script can be improved later, but for now it reduces the rootfs size by ~30%.
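
For illustration only, here is a minimal sketch of what the hardlink deduplication step could look like. The real logic lives in scripts/build-optimize-fs-size.py and is more careful (for instance it should also compare file ownership and permissions before linking); everything below is an assumption for the sketch, not the shipped code.

import hashlib
import os
import sys

def file_digest(path, chunk_size=1 << 20):
    # Hash file contents in chunks so large blobs do not need to fit in memory.
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def hardlink_duplicates(root):
    # Map (size, content hash) -> first path seen; later identical files
    # are replaced by hardlinks to that first copy.
    seen = {}
    saved = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.stat(path)
            if st.st_nlink > 1:
                continue  # already part of a hardlink group; skip for simplicity
            key = (st.st_size, file_digest(path))
            original = seen.get(key)
            if original is None:
                seen[key] = path
            else:
                os.unlink(path)
                os.link(original, path)
                saved += st.st_size
    return saved

if __name__ == '__main__':
    print('saved %d bytes under %s' % (hardlink_duplicates(sys.argv[1]), sys.argv[1]))

Something like python3 dedupe.py <rootfs>/usr/share/sonic would then be run against a staging copy of the rootfs during the build.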

How to verify it

Compare the size of an image built with this option enabled against one built with it disabled.
Expect the fully extracted content to be ~30% smaller.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Description for the changelog

Add build option to reduce final image size

@Staphylo (Collaborator, Author)

@StormLiangMS @lipxu I published this as a draft for now.
Please let me know what you think of this approach.


def remove_docs(self):
    self._remove_root_paths([
        'usr/share/doc',
@k-v1 (Contributor)

Maybe some actions can be enabled by default.
E.g. we already clean up /usr/share/doc here:

sudo LANG=C chroot $FILESYSTEM_ROOT bash -c 'rm -rf /usr/share/doc/* /usr/share/locale/* /var/lib/apt/lists/* /tmp/*'

I think it's possible to remove other directories with docs/mans/licenses without any conditions.

@Staphylo (Collaborator, Author)

Some of these cleanup steps, especially the removals, could definitely become part of the general build.

However, that would have to happen in more places than just here, since it would also need to apply to all docker images.
On top of that, removing licenses could have legal implications for public/redistributable builds such as the community ones.
This should, however, not be a problem for internal/private images.

Most of the savings from this change actually come from the hardlinking steps,
and for Aboot images from the removal of other platforms and unneeded kernel modules and firmware.

This mitigation, even if it merges into master, has 202305 as its main target for 4GB systems.
It therefore felt simpler to centralize the cleanup in a postprocessing script hidden behind a toggle instead of touching tens of files.
This script also has the advantage of cleaning up after newly introduced code that forgot its own cleanup steps.
Until there are checks and tooling in place to prevent new code from contributing to bloat, things will slip through and become a maintenance burden.

@Staphylo (Collaborator, Author)

To add to your comment, the proper thing to do here would be to add an /etc/dpkg/dpkg.conf.d/01_sonic conf instead of removing files the way build_debian.sh currently does.
In that configuration you can specify things like path-exclude=/usr/share/doc; however, that will not retroactively apply to previously installed packages.

In the case of docker, that means the debian base image will retain these files no matter what.
If you then remove them while building a child Dockerfile, this actually creates whiteout files, which consume space instead of releasing it.
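
For illustration, such a dpkg fragment could look like the one below (the 01_sonic name and the exact patterns are assumptions for this sketch, not what this PR ships; path-exclude/path-include only affect packages installed after the file is in place):

# /etc/dpkg/dpkg.cfg.d/01_sonic (illustrative)
path-exclude=/usr/share/doc/*
# keep copyright files to stay on the safe side for redistribution
path-include=/usr/share/doc/*/copyright
path-exclude=/usr/share/man/*
path-exclude=/usr/share/locale/*
path-include=/usr/share/locale/locale.alias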

@k-v1 (Contributor), Sep 28, 2023

In the case of docker, that means that the debian base image will retain these files no matter what.

This is already done for docker containers:

RUN find /usr/share/doc -depth \( -type f -o -type l \) ! -name copyright | xargs rm || true

COPY ["dpkg_01_drop", "/etc/dpkg/dpkg.cfg.d/01_drop"]

So my suggestion was only about the host system.

On top of that, removing licenses could have legal implications for public/redistributable builds such as the community ones.

We keep copyright files in docker containers but remove them from the host system.
Probably we shouldn't remove these files from /usr/share/doc on the host system either.
My old PR #14417 tried to fix that. I can reopen it, but I'm still not sure whether we should keep these files or not.

@k-v1 (Contributor) commented Sep 28, 2023

  1. docs, licenses and manuals
    I think it's better to use the same approach for all platforms, i.e. remove all unused docs/manuals from the host system without any additional options or flags, and don't remove licenses/copyright files for packages if keeping them is required for legal reasons.

  2. firmware
    Maybe we should remove graphics firmware for all platforms if BUILD_REDUCE_IMAGE_SIZE is set, or even without this option.

  3. hardlinks
    Can we enable this feature by default? Does it have any drawbacks?

  4. vim and vim.tiny
    I don't think we should do package-level optimisation. It's a good approach for embedded devices with busybox, a kernel without unused modules, etc., but for SONiC with dockers, python and go packages, 1-2MB gains you nothing.

@Staphylo (Collaborator, Author) commented Oct 2, 2023

  1. I can do this change.

  2. Because my change actually removes files that were installed by packages, it's a bit dirty.
    My worry with removing firmware/drivers broadly is that I would be making a selection that could later bother some vendor. They would have to come back and re-enable the drivers/firmware they want, and in the process increase the image size this change is trying to contain for a select few products.
    In most cases the image size is actually not a problem; it only is on devices with 4GB, and soon 8GB, of disk space.
    This is the reason why I decided to make this optimization only for Arista images: we fully control all the supported devices and run testing on all of them. I would happily spare myself from having to handle potential fallout and waste cycles on that.

  3. Under regular operation there should not be any fallout (none seen so far, and tests are still running, which is why this review is still a WIP).
    On the platform front, only one folder under /usr/share/sonic/device is used at runtime, so there should not be any problem. However, if two files are identical across two products or two ASICs (in a multi-asic environment), changing one file changes the other too (see the sketch below). This can have unintended consequences and confuse developers, who will waste cycles figuring out what is happening when they make changes.
    On the /var/lib/docker/ front, at the time of hardlinking only docker images are installed, so there are no containers yet. The image layers are all mounted read-only by dockerd using the overlayfs driver. So even if a hardlinked file is modified in a container at runtime, the overlayfs driver copies it into the container's rw layer and modifies it there (CoW).
    This change can definitely have an impact on developers but is invisible to an end user, because they are not expected to modify these files at runtime.
    However, it could break some user scripts that poke at the filesystem without warning.
    Because this is, once again, hacky behavior, I would much prefer not to introduce a post-processing step that diverges from the expected behavior for everyone unless it is explicitly enabled in the config.

  4. I can revert that change, but I believe you are underestimating the savings that replacing vim with vim-tiny represents.
    For a lab device I agree it's more comfortable to have a fully fledged vim editor; for production I don't think it matters that much. Once again, this really is a mitigation for devices with limited storage space.
    I could find another name for the config option to make this clearer.

vim-tiny:
Need to get 1162 kB of archives.
After this operation, 2206 kB of additional disk space will be used.

vim:
Need to get 8174 kB of archives.
After this operation, 36.9 MB of additional disk space will be used.

Considering compressed data alone, vim is roughly 7x bigger than vim-tiny, and the installed size difference is even larger.

Something I was also considering is adding levels to BUILD_REDUCE_IMAGE_SIZE,
such as n, y, aggressive/f, to differentiate between different levels of optimisation.
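
To make the aliasing pitfall from point 3 concrete, here is a tiny illustration (the device paths and file contents are hypothetical): after deduplication, an in-place write through one path is visible through the other, while replacing the file via a temp file and rename breaks the link and leaves the sibling untouched.

import os

# Hypothetical pair of identical per-product files that the reduction script
# would have hardlinked together on the host rootfs.
a = '/usr/share/sonic/device/product_a/port_config.ini'
b = '/usr/share/sonic/device/product_b/port_config.ini'

st_a, st_b = os.stat(a), os.stat(b)
if (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino):
    # Writing through 'a' in place would also change what 'b' shows.
    # A safe edit replaces the file instead, which allocates a new inode:
    tmp = a + '.tmp'
    with open(a) as src, open(tmp, 'w') as dst:
        dst.write(src.read().replace('old-value', 'new-value'))
    os.replace(tmp, a)  # 'b' keeps the original content; the hardlink is broken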

@k-v1 (Contributor) commented Oct 2, 2023

In most cases the image size is actually not a problem; it only is on devices with 4GB, and soon 8GB, of disk space.

Because this is, once again, hacky behavior, I would much prefer not to introduce a post-processing step that diverges from the expected behavior for everyone unless it is explicitly enabled in the config.

I agree. Then it's better to apply the hacks only when BUILD_REDUCE_IMAGE_SIZE is set.

For a lab device I agree it's more comfortable to have a fully fledged vim editor; for production I don't think it matters that much.
Considering compressed data alone, vim is roughly 7x bigger than vim-tiny.

I forgot about the vim-runtime package. Then it makes sense.

@StormLiangMS (Contributor)

@Staphylo, several questions.

  1. With the current vim we can open .gz files directly; how about vim-tiny?

  2. Does this change only impact the slim swi binary, or also the slim bin / bin images?

  3. Have you run tests with 202305 on 7050qx / 7060?

@lipxu (Contributor) commented Oct 19, 2023

Hi @Staphylo, I picked up this PR and enabled BUILD_REDUCE_IMAGE_SIZE; the test results look OK for the private internal image.

@Staphylo (Collaborator, Author)

@StormLiangMS @lipxu

With the current vim we can open .gz files directly; how about vim-tiny?

It is not possible to open gz compressed files directly with vim.tiny.
I'll admit that I've always used zless for that.
If the use of vim.tiny is a problem, I can remove it from this change for now and create a separate PR for it.
I believe with the savings we've been able to make so far this is not a deal breaker anymore.

Does this change only impact the slim swi binary, or also the slim bin / bin images?

It would impact any image built with this option, so .bin included.
I could make it .swi specific but I felt most of the savings could be shared independently of the image.
The only difference is that for .swi images there is some specific logic inside the scripts/build-optimize-fs-size.py script to trim some blobs that we are confident about for our platforms but cannot safely test for other platforms.

Have you run tests with 202305 on 7050qx / 7060?

This is still ongoing.

On another note, I will have to refactor this PR a bit since the new staged build feature introduced a conflict (#15924).
However, I don't believe that feature will be backported to any earlier release, so the PR on master will look slightly different from what we'll use on 202305.
I am planning to open a PR straight to 202305 as well as rebase this PR on master.

@Staphylo Staphylo force-pushed the master-reduce-image-size branch from 19038a4 to 5ce9116 Compare October 19, 2023 12:34
@Staphylo Staphylo marked this pull request as ready for review October 19, 2023 12:34
@StormLiangMS (Contributor)


Hi @Staphylo, I think it would be good to bring vim back; it is a very handy tool in production. Also, there is a merge conflict, could you take a look?

@StormLiangMS (Contributor)


Hi @Staphylo, one question about the hardlinks: if someone modifies a file in a container, does that change persist after a reboot?

@Staphylo (Collaborator, Author)

@StormLiangMS I will make the necessary changes and rebase my PR to address the conflict.

Regarding the hardlinks: if someone makes a change in a container, the original file (which might be hardlinked) remains untouched.
In SONiC, dockerd uses overlayfs as the storage driver, which means the layers that might contain hardlinked files are mounted read-only in the container.
Any file modification happening in a container ends up in the upper layer, which is rw, via a copy-on-write mechanism.
So, assuming docker_inram is disabled, yes, the change will persist across reboots and won't impact other containers.
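
For anyone who wants to sanity-check this on a device, here is a small illustrative helper (the overlay2 path is hypothetical): files in the read-only image layers can show a link count above 1 after deduplication, while a file edited inside a running container is copied up into that container's private rw layer, so the shared lower-layer inode keeps its original content.

import os

def describe(path):
    # Print the inode and hardlink count of a file inside a docker layer.
    st = os.stat(path)
    print('%s: inode=%d links=%d' % (path, st.st_ino, st.st_nlink))

# Hypothetical lower-layer file that the reduction script hardlinked across
# two image layers; editing it from inside a container triggers an overlayfs
# copy-up into the container's rw layer and never touches this inode.
# describe('/var/lib/docker/overlay2/<layer>/diff/usr/lib/libexample.so')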

@Staphylo Staphylo force-pushed the master-reduce-image-size branch from 5ce9116 to 746bbf1 Compare October 23, 2023 14:59
@StormLiangMS (Contributor)

Thanks, @Staphylo, could you also make the change for the 202305 PR?

#16948

@StormLiangMS (Contributor) left a review:

LGTM

@StormLiangMS (Contributor)

ADO:25423859

@StormLiangMS StormLiangMS merged commit e4a4971 into sonic-net:master Oct 24, 2023
16 checks passed
StormLiangMS pushed a commit that referenced this pull request Oct 24, 2023
This is a backport of #16729, with the same description as above.
lixiaoyuner pushed a commit to lixiaoyuner/sonic-buildimage that referenced this pull request Feb 6, 2024
enable reduce image size for slim image build
sonic-net#16729

Signed-off-by: xuliping <xuliping@microsoft.com>