Delete Qualcomm DTBs to free up space on aarch64 #1464

Closed
dustymabe opened this issue Apr 12, 2023 · 13 comments · Fixed by coreos/fedora-coreos-config#2367

@dustymabe
Member

With our recently implemented extended upgrade testing we are now running a matrix of extended upgrade tests when we do production builds.

With the (unreleased) 38.20230408.1.0 next build we are now hitting a problem where we are running out of space on upgrade:

Apr 11 21:43:24.721835 ostree[688]: error: ostree-finalize-staged.service failed on previous boot:
Installing kernel: Copying smk-k26-revA-sm-k26-revA-sck-kv-g-revB.dtb: regfile copy: No space left on device

Full logs for the upgrade test (38.20230322.1.0->38.20230408.1.0) are here:

The next stream is now sitting at two releases with kernel-6.2 and this would be the third:

  • 38.20230322.1.0 -> 38.20230402.1.0 -> 38.20230408.1.0

Notably, with 6.2 the kernel image format changed to EFI_ZBOOT.

The usage in /boot/ is only slightly larger but apparently it's enough to get us to a threshold where we run out of space when trying to fit 3 sets of kernel/initramfs at the same time (temporarily on upgrade).

For illustration, here is the usage of a non-6.2 kernel deployment and a 6.2 kernel deployment on the same system (a system based on the testing stream):

[builder@coreos-aarch64-builder]$ rpm-ostree status 
State: idle
warning: Failed to query journal: couldn't find current boot in journal
AutomaticUpdatesDriver: Zincati
  DriverState: active; periodically polling for updates (last checked Wed 2023-04-12 13:21:41 UTC)
Deployments:
● fedora:fedora/aarch64/coreos/testing
                  Version: 37.20230401.2.0 (2023-04-03T20:03:03Z)
                   Commit: c375f7121f9804de91d70dac1e9f96a829e61a23de4b9e7ec9d1540af37083e2
             GPGSignature: Valid signature by ACB5EE4E831C74BB7C168D27F55AD3FB5323552A

  fedora:fedora/aarch64/coreos/testing
                  Version: 37.20230322.2.0 (2023-03-22T18:10:46Z)
                   Commit: f5b62128344d8484773481f6b9fd89ca5c0b74af685022d807207c97aa242c6e
             GPGSignature: Valid signature by ACB5EE4E831C74BB7C168D27F55AD3FB5323552A


[builder@coreos-aarch64-builder]$ rpm --root /sysroot/ostree/deploy/fedora-coreos/deploy/c375f7121f9804de91d70dac1e9f96a829e61a23de4b9e7ec9d1540af37083e2.0/ -q kernel
kernel-6.2.8-200.fc37.aarch64
[builder@coreos-aarch64-builder]$ rpm --root /sysroot/ostree/deploy/fedora-coreos/deploy/f5b62128344d8484773481f6b9fd89ca5c0b74af685022d807207c97aa242c6e.0/ -q kernel
kernel-6.1.18-200.fc37.aarch64


[builder@coreos-aarch64-builder fcos]$ du -sh /boot/ostree/*
111M    /boot/ostree/fedora-coreos-42e69e7bfcdf67d34ac950fd672ec4630904f6500ed73a018e24b93dcdec442c
115M    /boot/ostree/fedora-coreos-698b48a4df1602c282cd422c92ddc0dc065b8fb51f873909bd6f968852b7d0c7

If I boot 38.20230408.1.0 clean I see:

[core@cosa-devsh ~]$ rpm-ostree status
State: idle
AutomaticUpdatesDriver: Zincati
  DriverState: active; periodically polling for updates (last checked Wed 2023-04-12 16:08:05 UTC)
Deployments:
● fedora:fedora/aarch64/coreos/next
                  Version: 38.20230408.1.0 (2023-04-11T20:00:06Z)
                   Commit: d63b65f25908f417157805850421a39af3cb719f7970f36e0e2ffc29097db139
             GPGSignature: Valid signature by 6A51BBABBA3D5467B6171221809A8D7CEB10B464

[core@cosa-devsh ~]$ sudo du -sh /boot/ostree/*
116M    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260
[core@cosa-devsh ~]$ sudo du -sh /boot/ostree/*/*
29M     /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb
72M     /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/initramfs-6.2.9-300.fc38.aarch64.img
16M     /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/vmlinuz-6.2.9-300.fc38.aarch64
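
The space pressure described above can be sketched numerically. This is a minimal sketch, not the logic ostree itself uses: it sums the per-deployment /boot/ostree sizes reported by du in this comment and checks them against a partition capacity. The capacity value and the function names are hypothetical and chosen only for illustration; the thread does not state the real size of /boot, and the three sizes were measured on different systems and streams.

```python
# Per-deployment /boot/ostree usage in MiB, taken from the `du -sh` output above.
DEPLOYMENT_SIZES_MIB = [111, 115, 116]

# ASSUMPTION: illustrative capacity only; the thread does not give the real /boot size.
HYPOTHETICAL_CAPACITY_MIB = 340


def total_usage(sizes_mib):
    """Space needed when all listed deployments coexist in /boot."""
    return sum(sizes_mib)


def fits(sizes_mib, capacity_mib):
    """True if every listed deployment can be present at the same time."""
    return total_usage(sizes_mib) <= capacity_mib


if __name__ == "__main__":
    print("two deployments:", total_usage(DEPLOYMENT_SIZES_MIB[:2]), "MiB")
    print("three deployments:", total_usage(DEPLOYMENT_SIZES_MIB), "MiB")
    print("third fits:", fits(DEPLOYMENT_SIZES_MIB, HYPOTHETICAL_CAPACITY_MIB))
```

The point of the sketch is that a few MiB of growth per deployment is multiplied by the number of deployments that must coexist during an upgrade, so a small per-kernel increase can cross the limit.
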
@dustymabe
Member Author

Note that this is basically a new occurrence of #1247, which will affect our production streams if we don't act.

@dustymabe
Member Author

Potential solutions here for our short term problem:

  • ship an update with no new kernel (i.e. no new usage in /boot) along with new ostree with autopruning (being developed by @jlebon) turned on (this unblocks ppc64le too)
  • identify some dtb files that we can remove that we don't care about for FCOS aarch64 usages (see below list):
[core@cosa-devsh ~]$ sudo du -sh /boot/ostree/*/*/*
1.1M    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/allwinner
25K     /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/amd
2.4M    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/amlogic
39K     /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/apm
567K    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/apple
232K    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/arm
158K    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/broadcom
11K     /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/cavium
4.0M    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/freescale
182K    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/hisilicon
615K    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/marvell
1.2M    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/nvidia
13M     /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/qcom
4.3M    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/rockchip
634K    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/ti
699K    /boot/ostree/fedora-coreos-eeca130c728b59599b77e10f70f5dba4d39d95d4d2d49c16dd5fb042cccf4260/dtb/xilinx

@jlebon
Member

jlebon commented Apr 12, 2023

For more information, here's a comparison of the different sizes between the very first aarch64 image we built and the latest one.

On 34.20210821.3.0:

# du -sh *
14M     dtb
68M     initramfs-5.13.12-200.fc34.aarch64.img
13M     vmlinuz-5.13.12-200.fc34.aarch64

On 38.20230408.1.0:

# du -sh *
29M     dtb
72M     initramfs-6.2.9-300.fc38.aarch64.img
16M     vmlinuz-6.2.9-300.fc38.aarch64

So the kernel only increased by 3M, the initrd by 4M, and the dtb directory by 15M.
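
The arithmetic behind that summary can be reproduced with a short sketch. The sizes are the du figures quoted above (in MiB, as rounded by du); the dictionary and function names are hypothetical.

```python
# Sizes in MiB as reported by `du -sh` for the two images compared above.
SIZES_34_20210821_3_0 = {"dtb": 14, "initramfs": 68, "vmlinuz": 13}
SIZES_38_20230408_1_0 = {"dtb": 29, "initramfs": 72, "vmlinuz": 16}


def growth(old, new):
    """Per-component size increase in MiB between two images."""
    return {name: new[name] - old[name] for name in old}


deltas = growth(SIZES_34_20210821_3_0, SIZES_38_20230408_1_0)
total_growth = sum(deltas.values())
dtb_share = deltas["dtb"] / total_growth

if __name__ == "__main__":
    for name, delta in deltas.items():
        print(f"{name}: +{delta} MiB")
    print(f"dtb share of total growth: {dtb_share:.0%}")
```

By these figures the dtb directory accounts for roughly two thirds of the per-deployment growth, which is why the devicetrees are the obvious place to look for savings.
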

It'd be nice to be able to trim some of those devicetrees though it's theoretically possible that some users have installed FCOS on some of these boards.

@nullr0ute

> It'd be nice to be able to trim some of those devicetrees though it's theoretically possible that some users have installed FCOS on some of these boards.

Depending on the firmware, if they're SystemReady devices you could in theory remove all of the DTBs. For example, in traditional Fedora, if the dtb symlink isn't there it will use the DT provided by the firmware, but of course there are some devices that won't boot in that case. What size is /boot? The recommended default from anaconda should be OK.

@nullr0ute

nullr0ute commented Apr 12, 2023

Looking at the directory of a recent F-38:

$ du -h /boot/dtb-6.2.9-300.fc38.aarch64/|sort -h
12K     /boot/dtb-6.2.9-300.fc38.aarch64/cavium
24K     /boot/dtb-6.2.9-300.fc38.aarch64/amd
40K     /boot/dtb-6.2.9-300.fc38.aarch64/apm
168K    /boot/dtb-6.2.9-300.fc38.aarch64/broadcom
188K    /boot/dtb-6.2.9-300.fc38.aarch64/hisilicon
252K    /boot/dtb-6.2.9-300.fc38.aarch64/arm
592K    /boot/dtb-6.2.9-300.fc38.aarch64/apple
640K    /boot/dtb-6.2.9-300.fc38.aarch64/marvell
660K    /boot/dtb-6.2.9-300.fc38.aarch64/ti
740K    /boot/dtb-6.2.9-300.fc38.aarch64/xilinx
1.2M    /boot/dtb-6.2.9-300.fc38.aarch64/allwinner
1.2M    /boot/dtb-6.2.9-300.fc38.aarch64/nvidia
2.5M    /boot/dtb-6.2.9-300.fc38.aarch64/amlogic
4.2M    /boot/dtb-6.2.9-300.fc38.aarch64/freescale
4.4M    /boot/dtb-6.2.9-300.fc38.aarch64/rockchip
14M     /boot/dtb-6.2.9-300.fc38.aarch64/qcom
30M     /boot/dtb-6.2.9-300.fc38.aarch64/

I would say if you removed qcom you'd get your 14 MB back and be unlikely to get a complaint. apple/amd/cavium/apm could probably go as well, but you wouldn't gain much. YMMV.
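
To put numbers on that comparison, here is a minimal sketch that parses the du listing above and totals the two candidate sets. The parser and its names are hypothetical helpers written for this illustration, and it only understands the K/M/G suffixes that `du -h` prints here; the total line for the parent directory is left out of the input.

```python
# Per-vendor lines copied from the `du -h ... | sort -h` output above.
DU_OUTPUT = """\
12K     /boot/dtb-6.2.9-300.fc38.aarch64/cavium
24K     /boot/dtb-6.2.9-300.fc38.aarch64/amd
40K     /boot/dtb-6.2.9-300.fc38.aarch64/apm
168K    /boot/dtb-6.2.9-300.fc38.aarch64/broadcom
188K    /boot/dtb-6.2.9-300.fc38.aarch64/hisilicon
252K    /boot/dtb-6.2.9-300.fc38.aarch64/arm
592K    /boot/dtb-6.2.9-300.fc38.aarch64/apple
640K    /boot/dtb-6.2.9-300.fc38.aarch64/marvell
660K    /boot/dtb-6.2.9-300.fc38.aarch64/ti
740K    /boot/dtb-6.2.9-300.fc38.aarch64/xilinx
1.2M    /boot/dtb-6.2.9-300.fc38.aarch64/allwinner
1.2M    /boot/dtb-6.2.9-300.fc38.aarch64/nvidia
2.5M    /boot/dtb-6.2.9-300.fc38.aarch64/amlogic
4.2M    /boot/dtb-6.2.9-300.fc38.aarch64/freescale
4.4M    /boot/dtb-6.2.9-300.fc38.aarch64/rockchip
14M     /boot/dtb-6.2.9-300.fc38.aarch64/qcom
"""

# `du -h` uses powers of 1024.
UNIT_KIB = {"K": 1, "M": 1024, "G": 1024 * 1024}


def parse_size_kib(text):
    """Convert a human-readable du size such as '4.2M' to KiB."""
    return float(text[:-1]) * UNIT_KIB[text[-1]]


def parse_du(output):
    """Map each vendor directory name to its size in KiB."""
    sizes = {}
    for line in output.splitlines():
        size, path = line.split(maxsplit=1)
        vendor = path.rstrip("/").rsplit("/", 1)[-1]
        sizes[vendor] = parse_size_kib(size)
    return sizes


sizes = parse_du(DU_OUTPUT)
qcom_kib = sizes["qcom"]
small_set_kib = sum(sizes[v] for v in ("apple", "amd", "cavium", "apm"))

if __name__ == "__main__":
    print(f"qcom alone: {qcom_kib / 1024:.1f} MiB")
    print(f"apple+amd+cavium+apm: {small_set_kib / 1024:.2f} MiB")
```

Removing qcom alone frees about twenty times more space than removing the four small directories combined, which matches the "you don't gain much" assessment.
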

@jlebon
Member

jlebon commented Apr 12, 2023

Draft coreos-status email: https://hackmd.io/bq_3hYQHRKapbIZW-UpVtw

dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Apr 12, 2023
…ch64

This will save us some space while we work on longer term
solutions for limited space in the /boot partition.

Fixes coreos/fedora-coreos-tracker#1464
@dustymabe
Member Author

With coreos/fedora-coreos-config#2367 we now have something like:

[core@cosa-devsh ~]$ sudo du -sh /boot/ostree/*
102M    /boot/ostree/fedora-coreos-5463cfa9e9b1e44c5f5bb7185e14fc907f7a093088a5bd542e58b45997ac60c6
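
As a rough cross-check, the saving can be worked out from the two du figures in this thread (116M for a clean 38.20230408.1.0 boot, 102M with the change applied). This is only a sketch: the function name and the choice of three deployments (the number that must coexist temporarily during an upgrade, per the issue description) are illustrative assumptions, and du rounds to whole MiB.

```python
# /boot/ostree usage per deployment in MiB, from the `du -sh` outputs in this thread.
BEFORE_MIB = 116  # clean boot of 38.20230408.1.0
AFTER_MIB = 102   # with coreos/fedora-coreos-config#2367 applied

# ASSUMPTION: three deployments coexist temporarily during an upgrade.
DEPLOYMENTS_DURING_UPGRADE = 3


def saved_mib(before, after, deployments=1):
    """Space freed in MiB across the given number of deployments."""
    return (before - after) * deployments


if __name__ == "__main__":
    print("saved per deployment:", saved_mib(BEFORE_MIB, AFTER_MIB), "MiB")
    print(
        "saved during upgrade:",
        saved_mib(BEFORE_MIB, AFTER_MIB, DEPLOYMENTS_DURING_UPGRADE),
        "MiB",
    )
```

The per-deployment saving is in line with the 13M to 14M reported for the qcom directory earlier in the thread.
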

dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Apr 12, 2023
…ch64

This will save us some space while we work on longer term
solutions for limited space in the /boot partition.

Fixes coreos/fedora-coreos-tracker#1464

(cherry picked from commit 1375c9f)
dustymabe reopened this Apr 12, 2023
dustymabe self-assigned this Apr 12, 2023
dustymabe added the jira label (for syncing to jira) Apr 12, 2023
dustymabe added a commit to coreos/fedora-coreos-config that referenced this issue Apr 12, 2023
…ch64

This will save us some space while we work on longer term
solutions for limited space in the /boot partition.

Fixes coreos/fedora-coreos-tracker#1464

(cherry picked from commit 1375c9f)
@dustymabe
Member Author

We discussed this in the community meeting today.

13:09:07*  dustymabe | #agreed we will try to remove the qcom dtb
                     | files and see if that gets us unblocked for
                     | now and then we'll deploy the ostree
                     | autoprune in the coming weeks as a more
                     | permanent "short-term" solution.

@dustymabe
Member Author

The update with the qcom files removed went out in our next stream today in 38.20230408.1.1.

jlebon changed the title from "upgrades with 6.2 kernels running out of space on aarch64" to "Delete Qualcomm DTBs to free up space on aarch64" Apr 14, 2023
@dustymabe
Member Author

The fix for this went into next stream release 38.20230408.1.1. Please try out the new release and report issues.

dustymabe added the status/pending-testing-release label (Fixed upstream. Waiting on a testing release.) Apr 14, 2023
@jlebon
Member

jlebon commented Apr 14, 2023

Note we decided not to send out an email for this issue in the end. We believe there are few users (likely none) on these platforms. Nonetheless, I've opened #1467 to track the reversal of this change and serve as a landing point for anyone trying to get FCOS to work on the now broken devices.

@dustymabe
Member Author

The fix for this went into testing stream release 38.20230414.2.0. Please try out the new release and report issues.

@dustymabe dustymabe added status/pending-stable-release Fixed upstream and in testing. Waiting on stable release. and removed status/pending-testing-release Fixed upstream. Waiting on a testing release. labels Apr 18, 2023
@dustymabe
Member Author

The fix for this went into stable stream release 38.20230414.3.0.

dustymabe removed the status/pending-stable-release label (Fixed upstream and in testing. Waiting on stable release.) May 3, 2023
c4rt0 pushed a commit to c4rt0/fedora-coreos-config that referenced this issue May 17, 2023
…ch64

This will save us some space while we work on longer term
solutions for limited space in the /boot partition.

Fixes coreos/fedora-coreos-tracker#1464
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue May 26, 2023
Now that we have autopruning ability we can re-add the qcom dtbs
on aarch64 that were dropped in
coreos/fedora-coreos-tracker#1464

Fixes coreos/fedora-coreos-tracker#1467
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue May 31, 2023
Now that we have autopruning ability we can re-add the qcom dtbs
on aarch64 that were dropped in
coreos/fedora-coreos-tracker#1464

Fixes coreos/fedora-coreos-tracker#1467
dustymabe added a commit to coreos/fedora-coreos-config that referenced this issue May 31, 2023
Now that we have autopruning ability we can re-add the qcom dtbs
on aarch64 that were dropped in
coreos/fedora-coreos-tracker#1464

Fixes coreos/fedora-coreos-tracker#1467
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
…ch64

This will save us some space while we work on longer term
solutions for limited space in the /boot partition.

Fixes coreos/fedora-coreos-tracker#1464
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
Now that we have autopruning ability we can re-add the qcom dtbs
on aarch64 that were dropped in
coreos/fedora-coreos-tracker#1464

Fixes coreos/fedora-coreos-tracker#1467