Live-ISO test failing #2110

Open
lentzi90 opened this issue Dec 9, 2024 · 13 comments
Labels
kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. triage/accepted Indicates an issue is ready to be actively worked on.
Milestone

Comments

@lentzi90
Member

lentzi90 commented Dec 9, 2024

Which jobs are failing?

Periodic E2E: https://github.com/metal3-io/baremetal-operator/actions/workflows/e2e-test-periodic-main.yml

Which tests are failing?

live-iso: https://github.com/metal3-io/baremetal-operator/actions/runs/12227994617/job/34105542268

Since when has it been failing?

The first failure was on 4 December.

Jenkins link

No response

Reason for failure (if possible)

The BMH cannot be provisioned because the "image is not valid for use".

From Ironic logs:

2024-12-05 14:23:05.425 1 DEBUG ironic.common.images [None req-f1216348-13a7-4a46-bfcb-8de7d88b3fb1 - - - - - -] Image http://192.168.222.1/sysrescue-out.iso downloaded in 2.58 seconds. fetch_into /usr/lib/python3.9/site-packages/ironic/common/images.py:386
2024-12-05 14:23:05.430 1 DEBUG oslo_utils.imageutils.format_inspector [None req-f1216348-13a7-4a46-bfcb-8de7d88b3fb1 - - - - - -] Format inspector for vmdk does not match, excluding from consideration (Signature KDMV not found: b'3\xed\x90\x90') _process_chunk /usr/lib/python3.9/site-packages/oslo_utils/imageutils/format_inspector.py:1365
2024-12-05 14:23:05.435 1 DEBUG oslo_utils.imageutils.format_inspector [None req-f1216348-13a7-4a46-bfcb-8de7d88b3fb1 - - - - - -] Format inspector for vhdx does not match, excluding from consideration (Region signature not found at 30000) _process_chunk /usr/lib/python3.9/site-packages/oslo_utils/imageutils/format_inspector.py:1365
2024-12-05 14:23:05.437 1 ERROR ironic.common.images [None req-f1216348-13a7-4a46-bfcb-8de7d88b3fb1 - - - - - -] Security: The requested user image for the deployment node image cache failed to be able to be parsed by the image format checker: Multiple formats detected: iso,gpt: oslo_utils.imageutils.format_inspector.ImageFormatError: Multiple formats detected: iso,gpt
2024-12-05 14:23:05.512 1 ERROR ironic.conductor.utils [None req-f1216348-13a7-4a46-bfcb-8de7d88b3fb1 - - - - - -] Node f9ba65c4-987a-42b7-8965-1787c849a3f5 failed deploy step {'step': 'deploy', 'priority': 100, 'argsinfo': None, 'interface': 'deploy'}: The requested image is not valid for use.: ironic.common.exception.InvalidImage: The requested image is not valid for use.
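
The `Multiple formats detected: iso,gpt` error suggests the image is a hybrid ISO that carries both an ISO 9660 volume descriptor and a GPT header. The sketch below is not the oslo.utils implementation; it only illustrates, with hypothetical helper names (`detect_signatures`, `make_fake_hybrid_image`), how a single file can match both signatures. It assumes the standard offsets: `CD001` at byte 0x8001 (sector 16 of 2048 bytes, plus one) and `EFI PART` at byte 512 (LBA 1 with 512-byte sectors).

```python
import io

# Assumed standard on-disk offsets (not taken from the Ironic logs):
ISO9660_MAGIC_OFFSET = 0x8001  # sector 16 * 2048 bytes + 1 byte
GPT_HEADER_OFFSET = 512        # LBA 1, assuming 512-byte logical sectors


def detect_signatures(stream):
    """Return the names of every signature found in a seekable byte stream."""
    found = []
    stream.seek(ISO9660_MAGIC_OFFSET)
    if stream.read(5) == b"CD001":
        found.append("iso")
    stream.seek(GPT_HEADER_OFFSET)
    if stream.read(8) == b"EFI PART":
        found.append("gpt")
    return found


def make_fake_hybrid_image():
    """Build a tiny in-memory stand-in for a hybrid ISO (hypothetical test data)."""
    buf = bytearray(ISO9660_MAGIC_OFFSET + 5)
    buf[GPT_HEADER_OFFSET:GPT_HEADER_OFFSET + 8] = b"EFI PART"
    buf[ISO9660_MAGIC_OFFSET:ISO9660_MAGIC_OFFSET + 5] = b"CD001"
    return io.BytesIO(bytes(buf))


print(detect_signatures(make_fake_hybrid_image()))  # ['iso', 'gpt']
```

A checker that requires exactly one matching format would reject such a file, which matches the behaviour seen in the log.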

Anything else we need to know?

We first thought the issue was that we did not specify the image hash, but this was ruled out in #2103.

Label(s) to be applied

/kind failing-test
One or more /area label. See https://github.com/metal3-io/baremetal-operator/labels for the list of labels.

@metal3-io-bot metal3-io-bot added kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. needs-triage Indicates an issue lacks a `triage/foo` label and requires one. labels Dec 9, 2024
@lentzi90
Member Author

lentzi90 commented Dec 9, 2024

/triage accepted

@metal3-io-bot metal3-io-bot added triage/accepted Indicates an issue is ready to be actively worked on. and removed needs-triage Indicates an issue lacks a `triage/foo` label and requires one. labels Dec 9, 2024
@tuminoid
Member

tuminoid commented Dec 9, 2024

Added to CI tracker.

@tuminoid
Member

tuminoid commented Dec 9, 2024

IIRC @Rozzii said this coincides with a new Ironic build on the same date.

@tuminoid tuminoid added this to the BMO - v0.9 milestone Dec 9, 2024
@lentzi90
Member Author

lentzi90 commented Dec 9, 2024

I wonder if it could be related to this: https://bugs.launchpad.net/nova/+bug/2091114
We did a temporary workaround for it in CAPO.

@Rozzii
Member

Rozzii commented Dec 9, 2024

I was expecting this originally: https://opendev.org/openstack/ironic/commit/669304bc0c6b2762c872b71297480c4b4ffdb554

@Rozzii
Member

Rozzii commented Dec 9, 2024

Based on the quay artifacts, I would expect the root cause landed in ironic between Nov 6 and Dec 3.

@lentzi90
Member Author

lentzi90 commented Dec 9, 2024

I think you are correct that this came with oslo.utils.
The specific commit that introduced it in oslo.utils is here: openstack/oslo.utils@91af49b
Then there were some related changes that look relevant to the live-iso use case: openstack/oslo.utils@3d4ae16
I haven't figured out yet how we are supposed to "allow" multiple formats.

@Rozzii
Member

Rozzii commented Dec 9, 2024

I think in the test system we are supposed to have the safety check disabled, as we don't really care, but I had thought that enabling all the formats was the default.

@tuminoid
Member

tuminoid commented Dec 9, 2024

BMO 0.8 virtualmedia works, where ironic is pinned to 26.0: #2111

@Rozzii
Member

Rozzii commented Dec 12, 2024

I have created a bug report on the Ironic side and am continuing to work on this: https://bugs.launchpad.net/ironic/+bug/2091611

We could discuss reverting the pinning (#2112), since it is possible to turn off the feature that triggers the problematic image format inspection. That would let us continue testing other Ironic features.
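
As a rough sketch of what such a workaround could look like: the comment above does not name the setting, so this assumes it refers to Ironic's `[conductor]disable_deep_image_inspection` option. The helper name `render_workaround_conf` is hypothetical, and the snippet only renders an `ironic.conf` fragment suitable for a test environment, where the safety check is not a concern.

```python
import configparser
import io


def render_workaround_conf(disable_inspection=True):
    """Render an ironic.conf fragment that turns off deep image inspection.

    Assumption: the option lives in the [conductor] group and is named
    disable_deep_image_inspection. Intended for test environments only.
    """
    config = configparser.ConfigParser()
    config["conductor"] = {
        "disable_deep_image_inspection": str(disable_inspection),
    }
    out = io.StringIO()
    config.write(out)
    return out.getvalue()


print(render_workaround_conf())
```

How the fragment reaches the running Ironic container (environment variable, config map, or image rebuild) depends on the deployment and is not covered here.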

@tuminoid
Member

This is marked for 0.9. We have 2 rounds of workarounds implemented now, but the actual fix is pending. Should we close this as implemented in 0.9 and create another for 0.10 for the proper fix?

@lentzi90
Member Author

Sure, we can do that or just change the milestone. It doesn't matter to me.

@tuminoid
Member

It seems the revert is still failing, meaning main/0.9 is still broken. I guess whatever the fix turns out to be will need to land in 0.9 later on, so let's actually keep 0.9 here.
