Segment fault with Ubuntu 24.04 20250120.5.0 #11471

Open
2 of 16 tasks
RadxaYuntian opened this issue Jan 26, 2025 · 6 comments

Comments

@RadxaYuntian

Description

We have a scheduled job that runs every Sunday. It failed today, even though there have been no code changes in the last 2 weeks.

According to the build log, it always fails at a dkms package installation. After changing the workflow file to print the dkms log, the error is always a gcc segmentation fault.

Changing the runner environment to ubuntu-22.04 fixed the segmentation fault. The action still failed, but only because of the change we made to investigate this issue.

What may be unusual about our setup is that we use binfmt to run aarch64 gcc in a devcontainer, because the final output is an aarch64 system image. So this is not a natively running gcc that is failing.
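
To confirm from inside a job that aarch64 binaries really are being routed through a binfmt_misc handler, the registered entries under `/proc/sys/fs/binfmt_misc` can be inspected. The following is only a minimal sketch and not part of rsdk: the function names are hypothetical, and it assumes the kernel's usual entry layout (a first line of `enabled` or `disabled`, followed by an `interpreter <path>` line).

```python
from pathlib import Path

BINFMT_DIR = Path("/proc/sys/fs/binfmt_misc")  # usual binfmt_misc mount point


def parse_binfmt_entry(text):
    """Parse one binfmt_misc entry into (enabled, interpreter_path_or_None)."""
    lines = text.splitlines()
    enabled = bool(lines) and lines[0].strip() == "enabled"
    interpreter = None
    for line in lines:
        if line.startswith("interpreter "):
            interpreter = line.split(" ", 1)[1].strip()
            break
    return enabled, interpreter


def list_handlers(directory=BINFMT_DIR):
    """Return {entry_name: (enabled, interpreter)} for each registered handler."""
    handlers = {}
    if not directory.is_dir():
        return handlers
    for entry in sorted(directory.iterdir()):
        if entry.name in ("register", "status"):  # control files, not handlers
            continue
        try:
            handlers[entry.name] = parse_binfmt_entry(entry.read_text())
        except OSError:
            continue  # entry could not be read; skip it
    return handlers


if __name__ == "__main__":
    for name, (enabled, interpreter) in list_handlers().items():
        print(f"{name}: enabled={enabled} interpreter={interpreter}")
```

Running this on both the working and the failing runner image would show whether the handler, or the interpreter it points to, differs between the two.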

Platforms affected

  • Azure DevOps
  • GitHub Actions - Standard Runners
  • GitHub Actions - Larger Runners

Runner images affected

  • Ubuntu 20.04
  • Ubuntu 22.04
  • Ubuntu 24.04
  • macOS 12
  • macOS 13
  • macOS 13 Arm64
  • macOS 14
  • macOS 14 Arm64
  • macOS 15
  • macOS 15 Arm64
  • Windows Server 2019
  • Windows Server 2022
  • Windows Server 2025

Image version and build link

20250120.5.0

Is it regression?

Yes. Last known good image version: 20250105.1.0 (https://github.com/RadxaOS-SDK/rsdk/actions/runs/12848906725)

Expected behavior

DKMS installs successfully without a gcc segfault.

Actual behavior

gcc segfault:

   2025-01-26 07:47:36,252 bdebstrap ERROR: mmdebstrap failed with exit code 25. See above for details.
  
  /workspaces/rsdk
  
  DKMS make.log for radxa-overlays-0.1.20 for kernel 6.1.68-2-stable (aarch64)
  Sun Jan 26 07:47:16 UTC 2025
  make: Entering directory '/usr/src/linux-headers-6.1.68-2-stable'
  Segmentation fault (core dumped)
  warning: the compiler differs from the one used to build the kernel
    The kernel was built by: aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
    You are using:           gcc (Debian 12.2.0-14) 12.2.0
    CC [M]  /var/lib/dkms/radxa-overlays/0.1.20/build/radxa-overlays.o
    DTC     /var/lib/dkms/radxa-overlays/0.1.20/build/arch/arm64/boot/dts/amlogic/overlays/meson-g12-disable-gpu.dtbo
    DTC     /var/lib/dkms/radxa-overlays/0.1.20/build/arch/arm64/boot/dts/amlogic/overlays/meson-g12-disable-hdmi.dtbo
    DTC     /var/lib/dkms/radxa-overlays/0.1.20/build/arch/arm64/boot/dts/rockchip/overlays/radxa-s0-ext-antenna.dtbo
  gcc: internal compiler error: Segmentation fault signal terminated program cc1
  Please submit a full bug report, with preprocessed source (by using -freport-bug).
  See <file:///usr/share/doc/gcc-12/README.Bugs> for instructions.
  make[2]: *** [scripts/Makefile.lib:409: /var/lib/dkms/radxa-overlays/0.1.20/build/arch/arm64/boot/dts/rockchip/overlays/radxa-s0-ext-antenna.dtbo] Error 4
  make[1]: *** [scripts/Makefile.build:500: /var/lib/dkms/radxa-overlays/0.1.20/build/arch/arm64/boot/dts/rockchip/overlays] Error 2
  make[1]: *** Waiting for unfinished jobs....
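
For anyone triaging many runs, the two segfault signatures in the log above can be searched for automatically. This is a hedged sketch rather than anything the project ships: `find_segfaults` is a hypothetical helper, and the patterns are taken only from the lines quoted in this report, so other compilers or locales may print different messages.

```python
import re

# Signatures copied from the make.log quoted in this issue.
SEGFAULT_PATTERNS = [
    re.compile(r"internal compiler error: Segmentation fault"),
    re.compile(r"^\s*Segmentation fault \(core dumped\)"),
]


def find_segfaults(log_text):
    """Return (line_number, stripped_line) pairs that match a segfault signature."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SEGFAULT_PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_segfaults.py path/to/make.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, line in find_segfaults(handle.read()):
            print(f"{number}: {line}")
```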

Repro steps

  1. Clone https://github.com/RadxaOS-SDK/rsdk
  2. Cherry-pick RadxaYuntian/rsdk@090908a to view the dkms log
  3. Trigger workflow_dispatch for build.yaml

@RadxaYuntian
Author

RadxaYuntian commented Jan 26, 2025

The gcc version Debian 12.2.0-14 was released on 2023/01/08, so the last successful run (2025/01/19) and today's failed run are both using the same version in the devcontainer.

@deviantintegral

I can confirm this as well at https://github.com/pbkhrv/rtl_433-hass-addons/actions/runs/12972957498/job/36181006667. That job is compiling aarch64 in Docker under QEMU (I know, proper cross compiling would be better, but this is what the official Home Assistant builder action does so 🤷 ).

Is there a way to specify the runner image version to a previous 24.04 release to confirm the regression?

@woblerr

woblerr commented Jan 26, 2025

The same problem for buildx for linux/arm64 via QEMU: https://github.com/woblerr/docker-pgbackrest/actions/runs/12965488407/job/36165276019#step:7:2658

Rollback to the ubuntu-22.04 runner solved the problem.

@MyreMylar

Chiming in to say that we have been seeing segfaults on our test runners for pygame-ce in the ppc64le architecture build since getting version 20250120.5.0. Perhaps related, our s390x architecture build now reports that it can no longer detect the GNU compiler type.

As @deviantintegral says it would be nice to have a way to roll back to a previous runner image to isolate the problem.

@RaviAkshintala
Contributor

Hi @RadxaYuntian, thank you for bringing this issue to our attention. We will look into it and update you after investigating.

stevenhorsman added a commit to stevenhorsman/cloud-api-adaptor that referenced this issue Jan 27, 2025
Due to an
[issue](actions/runner-images#11471)
with Ubuntu 24.04 20250120.5.0 runner image
we have been seeing failures in our multi-arch images for
the last few days which is blocking the release. I assume that
the issue is something related to qemu, so downgrade to 22.04
until this issue is resolved.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>

@BrianPugh

I'm also having very similar issues in tamp when using cibuildwheel to build Python wheels for ppc64le and aarch64 targets.
