AWS EKS Image Builder breaking issue with RHEL 8.10 or cloud-init 23.x version #8453

Open
saiteja313 opened this issue Jul 8, 2024 · 3 comments

Comments

@saiteja313
Contributor

What happened:

AWS EKS Image Builder breaking issue with RHEL 8.10 or cloud-init 23.x version.

Error during execution:

TASK [providers : Execute cloud-init-vmware.sh] ********************************
    vsphere-iso.vsphere: fatal: [default]: FAILED! => {"changed": true, "cmd": "bash -o errexit -o pipefail /tmp/cloud-init-vmware.sh", "delta": "0:00:00.509110", "end": "2024-06-11 15:56:52.420263", "msg": "non-zero return code", "rc": 1, "start": "2024-06-11 15:56:51.911153", "stderr": "Traceback (most recent call last):\n  File \"/usr/lib/python3.6/site-packages/cloudinit/sources/DataSourceVMwareGuestInfo.py\", line 48, in <module>\n    LOG = logging.getLogger(__name__)\nAttributeError: module 'cloudinit.log' has no attribute 'getLogger'", "stderr_lines": ["Traceback (most recent call last):", "  File \"/usr/lib/python3.6/site-packages/cloudinit/sources/DataSourceVMwareGuestInfo.py\", line 48, in <module>", "    LOG = logging.getLogger(__name__)", "AttributeError: module 'cloudinit.log' has no attribute 'getLogger'"], "stdout": "using python 3\ninstalling datasource\nvalidating datasource", "stdout_lines": ["using python 3", "installing datasource", "validating datasource"]}

==

This appears to be related to a recent RHEL 8.x cloud-init update:

2024-01-08 Jon Maloy <jmaloy@redhat.com> - 23.4-1
- Rebase to 23.4.1 [RHEL-18314]
- Resolves: RHEL-18314
([RHEL-8]Rebase cloud-init to 23.4)

It also appears to be related to the fact that your tooling uses a very old version of cloud-init-vmware.sh.

Ref:
https://github.com/aws/eks-anywhere-build-tooling

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • EKS Anywhere Release:
  • EKS Distro Release:
@saiteja313
Contributor Author

The only scenario that works right now is:

  1. image-builder v0.4.0
  2. eksa-release v0.19.6 (or any version lower than v0.19.6) under release-channel 1-29
    (eksa-release v0.19.7 or higher won't work)

And there are two "RHSM" workarounds for the build when using a RHEL 8.8 ISO:

  1. "rhsm_server_release_version": "8.8"
  2. "extra_rpms": "cloud-init-23.1.1-10.el8.noarch"
    (the default cloud-init 23.4.x does not work)

==

Configurations that do NOT work:

  • image-builder v0.5.0 or higher, even if everything else is kept the same
  • RHEL 8.10 OS version (ISO) or the latest supported RHEL 8 distro
  • eksa-release v0.19.7 (and higher)
  • EKS release-channel 1-30 (this would require image-builder v0.5.0, which doesn't work)

The root cause of most of the problems seems to be the deprecated/unsupported/archived cloud-init-vmware-guestinfo tool.

@abhay-krishna
Member

abhay-krishna commented Aug 13, 2024

@saiteja313

It also appears to be related to the fact that your tooling uses a very old version of cloud-init-vmware.sh.

Our tooling doesn't use cloud-init-vmware.sh at all. Please refer to the below explanation:

It is clear that cloud-init-vmware-guestinfo is the source of the trouble, but the "Directly install Guestinfo" step, which installs the tool, shouldn't have been executed in the first place, as it is required only if the cloud-init version is lower than 21.3. I tried reproducing this with a RHEL 8.10 source ISO but was unable to: in both my local build and CI builds, the final cloud-init version is set to 23.4, so the build doesn't execute that step at all and the image build works fine without any of the above workarounds. I also tried using this image in an EKS-A cluster, and that worked fine too.

The only thing that would make the build execute that guestinfo step is this cloud-init version being set to a value lower than 21.3. We can find out what the value is by running the image-builder CLI with the --ansible-verbosity flag set to 2 and then debugging from that point on.
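
For illustration, the version gate described above can be sketched in Python. This is not the actual Ansible condition used by image-builder; the names `parse_version` and `needs_guestinfo_install` are hypothetical, the 21.3 threshold is taken from the explanation in this comment, and plain tuple comparison may differ from Ansible's own version test in edge cases.

```python
import re

# Threshold taken from the explanation in this thread (assumption: strict "<").
GUESTINFO_THRESHOLD = (21, 3)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def needs_guestinfo_install(cloud_init_version):
    """True when the legacy guestinfo install step would be expected to run."""
    return parse_version(cloud_init_version) < GUESTINFO_THRESHOLD
```

With this sketch, `needs_guestinfo_install("23.4")` is false, so a build that detects 23.4 would be expected to skip the step, while any detected value below 21.3 would trigger it.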

Could you try running the image-builder CLI with the --ansible-verbosity flag set to 2 and provide the output you get for this version step? Here is what the output of that step looks like for me:

vsphere-iso.vsphere: TASK [providers : Set cloud-init version] **************************************
vsphere-iso.vsphere: task path: <path>/image-builder/images/capi/ansible/roles/providers/tasks/vmware-redhat.yml:29
vsphere-iso.vsphere: ok: [default] => {"ansible_facts": {"cloud_init_version": "23.4"}, "changed": false}
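
If the build output is saved to a file, the value can be pulled out of it with a small script. The following is a minimal sketch that assumes the fact appears in the log exactly as in the output above; the function name `find_cloud_init_versions` is hypothetical.

```python
import re

# Matches the fact as it appears in the "Set cloud-init version" task output.
FACT_PATTERN = re.compile(r'"cloud_init_version":\s*"([^"]+)"')


def find_cloud_init_versions(log_text):
    """Return every cloud_init_version value found in the log, in order."""
    return FACT_PATTERN.findall(log_text)


sample = (
    'vsphere-iso.vsphere: ok: [default] => '
    '{"ansible_facts": {"cloud_init_version": "23.4"}, "changed": false}'
)
print(find_cloud_init_versions(sample))  # prints ['23.4']
```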

The sample config file I used is:

{
  "cluster": "<cluster>",
  "convert_to_template": "false",
  "create_snapshot": "true",
  "datacenter": "<datacenter>",
  "datastore": "<datastore>",
  "folder": "<folder>",
  "insecure_connection": "true",
  "linked_clone": "false",
  "network": "<network>",
  "password": "<password>",
  "resource_pool": "<resource_pool>",
  "template": "",
  "username": "<username>",
  "vcenter_server": "<vcenter_server>",
  "vsphere_library_name": "<library>",
  "rhel_username": "<rhel_username>",
  "rhel_password": "<rhel_password>",
  "iso_url": "file:///tmp/rhel-8.10-x86_64-dvd.iso",
  "iso_checksum_type": "sha256",
  "iso_checksum": "9b3c8e31bc2cdd2de9cf96abb3726347f5840ff3b176270647b3e66639af291b"
}
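
Before starting a long build, it can help to sanity-check the config file. The sketch below parses the text with Python's strict `json` module, which rejects things like trailing commas, and reports missing keys. Whether image-builder is equally strict is not confirmed here, and treating the three ISO keys as required is an assumption based on the sample above; `check_config` is a hypothetical helper.

```python
import json

# Keys that appear in the sample config above; treating them as required is an assumption.
EXPECTED_KEYS = ("iso_url", "iso_checksum_type", "iso_checksum")


def check_config(text):
    """Parse config text strictly and return a list of problems (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    if not isinstance(config, dict):
        return ["top-level value is not a JSON object"]
    return [f"missing key: {key}" for key in EXPECTED_KEYS if key not in config]
```

For example, `check_config(open("vsphere.json").read())` returns an empty list when the file parses and contains the expected keys.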

The command I ran to build the image is:

image-builder build --hypervisor vsphere --os redhat --os-version 8 --vsphere-config vsphere.json --release-channel 1-29 --ansible-verbosity 2
