buildextend-virtualbox and buildextend-vmware improperly handle raw disks >=8GB #3787

Open
mtalexan opened this issue May 1, 2024 · 6 comments

Comments

@mtalexan
Contributor

mtalexan commented May 1, 2024

Bug Report

The buildextend-virtualbox and buildextend-vmware formats generate OVA files. The OVA standard defines an OVA as a USTAR-format tarball containing an OVF definition and the associated files (primarily the VMDK disk files). The USTAR tarball format only supports individual files smaller than 8GB (at most 8589934591 bytes).

If the VMDK is 8GB or larger, it needs to be generated as a split disk so that the files can be placed into the OVA (USTAR tarball); otherwise the tar command fails with a build error. Currently this isn't done.
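As a rough illustration of where the limit comes from: a USTAR header stores each member's size as 11 octal digits, so the largest representable size is 8^11 - 1 bytes. The helper name below is hypothetical and not part of coreos-assembler; it is only a sketch of the check that would be needed before archiving.

```python
# USTAR stores a member's size as 11 octal digits (plus a terminator),
# so the largest size it can represent is 8**11 - 1 bytes.
USTAR_MAX_MEMBER_SIZE = 8**11 - 1  # 8589934591 bytes, i.e. 8 GiB minus one byte


def fits_in_ustar(size_bytes: int) -> bool:
    """Return True if a file of this size can be stored in a USTAR archive.

    Hypothetical helper for illustration only.
    """
    return 0 <= size_bytes <= USTAR_MAX_MEMBER_SIZE


# 9055751168 is the VMDK size reported by tar in this issue.
print(fits_in_ustar(9055751168))
```

The upper bound matches the range that tar prints in its error message (0..8589934591).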

Environment

What operating system is being used to run coreos-assembler?

Container image on Fedora 39 and on Ubuntu 22.04.

What operating system is being assembled?

Customized Fedora CoreOS image.

Is coreos-assembler running in Podman or Docker?

Yes, either/both.

If Podman, is coreos-assembler running privileged or unprivileged?

Privileged.

Expected Behavior

The OVA file is generated successfully.

Actual Behavior

The OVA file is not generated, and a build error is reported from the tar command.
Example:

2024-05-01 17:43:54,748 INFO - Running command: ['qemu-img', 'convert', '-f', 'qcow2', '-O', 'vmdk', '/srv/tmp/tmp1lkujnoa/fedora-coreos-e37b7ac.20240501170000-qemu.x86_64.qcow2.working', '-o', 'subformat=streamOptimized', '/srv/tmp/tmp1lkujnoa/fedora-coreos-e37b7ac.20240501170000-virtualbox.x86_64.vmdk']
2024-05-01 17:49:08,455 INFO - Running command: ['qemu-img', 'info', '--output=json', '/srv/tmp/tmp1lkujnoa/fedora-coreos-e37b7ac.20240501170000-virtualbox.x86_64.vmdk']
2024-05-01 17:49:08,460 INFO - Processing work image callback
2024-05-01 17:49:08,494 INFO - Running command: ['qemu-img', 'info', '--output=json', '/srv/tmp/tmp1lkujnoa/fedora-coreos-e37b7ac.20240501170000-virtualbox.x86_64.vmdk']
2024-05-01 17:49:08,499 INFO - Running command: ['tar', '--owner=0', '--group=0', '-C', '/srv/tmp/tmp1lkujnoa', '-ch', '--format=ustar', '-f', '/srv/builds/e37b7ac.20240501170000/x86_64/fedora-coreos-e37b7ac.20240501170000-virtualbox.x86_64.ova', 'coreos.ovf', 'disk.vmdk']
tar: value 9055751168 out of off_t range 0..8589934591
tar: Exiting with failure status due to previous errors
2024-05-01 17:49:08,500 ERROR - Command returned bad exitcode
2024-05-01 17:49:08,500 ERROR - COMMAND: ['tar', '--owner=0', '--group=0', '-C', '/srv/tmp/tmp1lkujnoa', '-ch', '--format=ustar', '-f', '/srv/builds/e37b7ac.20240501170000/x86_64/fedora-coreos-e37b7ac.20240501170000-virtualbox.x86_64.ova', 'coreos.ovf', 'disk.vmdk']

Reproduction Steps

  1. Using Fedora CoreOS config, run cosa init
  2. Add a huge, hard-to-compress file to the rootfs overrides: dd if=/dev/urandom of=overrides/rootfs/somebigfile.bin bs=4M count=4000 status=progress (a 16GB file of random binary data here)
  3. Fetch & Build: cosa fetch && cosa build
  4. Build vbox (or vmware): cosa buildextend-virtualbox or cosa buildextend-vmware
  5. See the tar error

Other Information

According to the QEMU block driver documentation for the VMDK format, it appears possible to add an extra subformat option, twoGbMaxExtentSparse, which the associated code suggests would generate split VMDK files of at most 2GB each, and which isn't mutually exclusive with the existing streamOptimized option (or the other non-subformat options used for the vmware build).

This could easily be added to the QemuVariant data structures in the code that define how to generate the output disk, and would get handled when the VMDK is created. The OVF template also wouldn't need to be changed at all for this: the split VMDK format produces a disk.vmdk file that references all the split part files, so the way to refer to it in the OVF is the same regardless of whether it's a split VMDK or not.

What would need to change, though, is the set of files collected for inclusion in the OVA. Instead of only a single disk.vmdk and the generated OVF file, it would need to include all the VMDK split part files as well.

@mtalexan
Contributor Author

mtalexan commented May 1, 2024

None of the other output formats that happen to use a VMDK disk format (i.e. aws and vmware_vmdk) need to be affected by this; they all have their own QemuVariant defined.

The only thing that should be affected is the definition and implementation in the ova.py file.


As a reference, the disk conversion portion of the process can be run manually and repeatedly, on a workspace that has already been built through buildextend-qemu, by calling /usr/lib/coreos-assembler/cmd-artifact-disk. Either the definition from the ova.py file can be parsed automatically by calling it with the arguments target virtualbox and/or target vmware, or the conversion options can be specified manually with manual ..... When called with manual, only the output disk image is generated; when called with target, the whole image build process is executed (i.e. the OVA file is generated).

EDIT: Corrected what portions of the build are run.

@mtalexan
Contributor Author

mtalexan commented May 1, 2024

The easiest implementation for this change would likely be to just always use split VMDKs for the OVA formats (virtualbox and vmware) rather than trying to detect when it's needed to make the VMDK fit in the OVA. There would be no real effect on the usage of the resulting OVA since OVAs are consumed directly by the VirtualBox and VMWare tools (extraction is handled by the tools themselves, and they already support split VMDKs).

@mtalexan
Contributor Author

mtalexan commented May 1, 2024

Correction to the original post: the streamOptimized subformat is mutually exclusive with the twoGbMaxExtent* subformats.

However, the qemu-img info command can be run on the disk.vmdk file, and its output includes the list of split VMDK files corresponding to each extent.
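A minimal sketch of how that list could be collected, assuming the JSON output carries the extents under format-specific → data → extents with a filename key on each entry. That layout is an assumption about the qemu-img JSON schema and should be checked against the QEMU version in use; both function names are hypothetical.

```python
import json
import subprocess


def extent_filenames(info: dict) -> list[str]:
    """Extract extent file names from parsed `qemu-img info --output=json` data.

    Assumes the extents live at info["format-specific"]["data"]["extents"];
    returns an empty list if that structure is absent.
    """
    specific = info.get("format-specific", {})
    extents = specific.get("data", {}).get("extents", [])
    return [e["filename"] for e in extents if "filename" in e]


def vmdk_extents(path: str) -> list[str]:
    """Run qemu-img info on a VMDK descriptor and list its extent files."""
    out = subprocess.check_output(["qemu-img", "info", "--output=json", path])
    return extent_filenames(json.loads(out))
```

The returned names could then be appended to the list of files passed to tar alongside coreos.ovf and disk.vmdk.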

@mtalexan
Contributor Author

mtalexan commented May 2, 2024

I'm a moron. There's no reason to mess with the subformat, the existing VMDK just needs to be chunked up. The OVF format supports chunked files and the split command can do it for us.
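For illustration, chunking an existing file the way the split command does could look roughly like the sketch below. The function name and the nine-digit numeric suffix are illustrative choices, not taken from coreos-assembler; the chunk naming that the OVF specification actually requires should be checked before relying on this.

```python
from pathlib import Path


def split_file(src: Path, chunk_size: int, buf_size: int = 1024 * 1024) -> list[Path]:
    """Split src into numbered chunk files of at most chunk_size bytes each.

    Chunks are written next to src as "<name>.000000000", "<name>.000000001", ...
    (the suffix scheme is an assumption made for this sketch).
    """
    parts = []
    index = 0
    with src.open("rb") as fin:
        while True:
            part = src.with_name(f"{src.name}.{index:09d}")
            written = 0
            with part.open("wb") as fout:
                # Copy in small buffers so a multi-GB chunk never sits in memory.
                while written < chunk_size:
                    data = fin.read(min(buf_size, chunk_size - written))
                    if not data:
                        break
                    fout.write(data)
                    written += len(data)
            if written == 0:
                # Source exhausted: drop the empty trailing chunk and stop.
                part.unlink()
                break
            parts.append(part)
            index += 1
    return parts
```

Each chunk stays under the chosen size, so a chunk_size below 8GB would keep every archive member within the USTAR limit.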

The built-in file renaming logic will cause an issue, though. The rename appears to be a special case, and it isn't even strictly necessary for OVA output if the OVF template includes the actual VMDK name instead of hardcoding it. Other than the unnecessary rename for OVAs, ibmcloud and gcp are the only types that do (or need) renaming.

@mtalexan
Contributor Author

mtalexan commented May 6, 2024

I was working on a change that would use split VMDK disks, and discovered that VMware and VirtualBox both fail to comply with the OVF 1.0/1.1 spec. Neither accepts split disks.

However, it also turns out that neither generates or enforces the OVF standard properly. Among other things, they don't care whether the OVA archive is in POSIX or USTAR format: both always generate it in POSIX (non-compliant) format, even when explicitly generating an "OVF 1.0" archive, and both accept input archives in either POSIX or USTAR format.

So the simple fix seems to be to drop the unnecessarily strict OVF 1.0 compliance during generation and relax to the POSIX tar format, since even a round-trip export and import through the tools themselves (or between the tools) doesn't comply. Notably, they don't support the GNU format, only POSIX and USTAR.
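The difference between the two formats can be demonstrated with Python's standard tarfile module, without creating a large file. This is only a sketch of the behaviour, not the change proposed for coreos-assembler; header_for is a hypothetical helper that builds the header for a made-up member of a given size.

```python
import tarfile


def header_for(size: int, fmt: int) -> bytes:
    """Build the raw tar header for a hypothetical member of the given size."""
    info = tarfile.TarInfo(name="disk.vmdk")
    info.size = size
    return info.tobuf(format=fmt)


big = 9055751168  # the VMDK size from the tar error in this issue

try:
    header_for(big, tarfile.USTAR_FORMAT)
    ustar_ok = True
except ValueError:
    # The size does not fit in the 11-digit octal size field.
    ustar_ok = False

# The POSIX (pax) format records the oversized value in an extended header.
pax_header = header_for(big, tarfile.PAX_FORMAT)
pax_ok = b"size=9055751168" in pax_header

print(ustar_ok, pax_ok)
```

With GNU tar, the analogous switch is --format=posix in place of the --format=ustar shown in the failing command.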

@dustymabe
Member

Hey @mtalexan I've read over things here briefly. Obviously for FCOS and RHCOS we don't generate images that large so we never hit the limitation you are running into -> Yay for us, bad for you.

With all things there is always risk/reward that has to be considered. Changing things here would represent a decent amount of risk for us, with not very much reward.

Maybe we could add an option to do what you want, but probably wouldn't change the default behavior.

Also I will mention that there are efforts underway to do most of our disk building using the https://github.com/osbuild/osbuild project, so any changes we do here in COSA might be useless if that project didn't also support those changes.
