Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

manifests: Rework to allow co-located builds for each RHEL version #803

Closed
wants to merge 6 commits into from

Conversation

travier
Copy link
Member

@travier travier commented May 23, 2022

manifests: Split into common & RHEL 8.6 specific manifests


Keep RHEL 8.6 based RHCOS as the default


ci: Regroup tests in qemu & metal buckets

  • Only build the ostree aci container archive in the build step as it's
    the only artifact used for the layering test.
  • Group the remaining tests in two buckets: qemu & metal.
  • Disable the now uneeded tests.

ci: Make RHEL version explicit

Prepare for testing multiple RHEL versions.


ci: Prepare test names

Pre-create test names to be able to enable them in Prow before we merge
the corresponding test code.


Enable building RHCOS from different RHEL minor releases from the same branch and repo. This will enable us to work on RHEL 8.6 & 9.0 based RHCOS in parallel instead of using different branches or costly reverts.

Each RHEL minor release based RHCOS is now under a subfolder.

To try it out:

# RHEL 8.6 based RHCOS
$ mkdir rhcos-rhel-8.6 && cd rhcos-rhel-8.6 && cosa init https://github.com/travier/os -b splitmanifest rhel-8.6

# Upcoming PR: RHEL 9.0 based RHCOS
$ mkdir rhcos-rhel-9.0 && cd rhcos-rhel-9.0 && cosa init https://github.com/travier/os -b splitmanifest rhel-9.0

@openshift-ci openshift-ci bot requested review from miabbott and ravanelli May 23, 2022 18:39
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 23, 2022
@HuijingHei
Copy link
Contributor

HuijingHei commented May 24, 2022

The whole looks good.
Just one small question, maybe should based on the latest patch, seems missed PR #791

Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for working on this!

It looks like the default is still 8.5 for the record?

I think one thing we should do pretty quickly after this lands is adjust the CI to cover both cases (at least 8.6 optionally) and also add a periodic that builds the non-default cases.

ci/build-test-qemu-init.sh Outdated Show resolved Hide resolved
rhel-8.5.yaml Outdated Show resolved Hide resolved
@travier
Copy link
Member Author

travier commented May 24, 2022

It looks like the default is still 8.5 for the record?

Yes, I would prefer to land this one on 8.5 and then try the switch to 8.6 again.

I think one thing we should do pretty quickly after this lands is adjust the CI to cover both cases (at least 8.6 optionally) and also add a periodic that builds the non-default cases.

Agree.

@travier
Copy link
Member Author

travier commented May 24, 2022

The whole looks good. Just one small question, maybe should based on the latest patch, seems missed PR #791

Thanks, fixed.

@cgwalters
Copy link
Member

Looks like we may be missing a symlink

 + cosa buildextend-extensions
Traceback (most recent call last):
  File "/usr/lib/coreos-assembler/cmd-buildextend-extensions", line 124, in <module>
    sys.exit(main())
  File "/usr/lib/coreos-assembler/cmd-buildextend-extensions", line 34, in main
    raise Exception(f"Missing {extensions_src}")
Exception: Missing src/config/extensions.yaml 

@travier
Copy link
Member Author

travier commented May 25, 2022

Bundling this BZ in this PR: https://bugzilla.redhat.com/show_bug.cgi?id=2068148

@cgwalters
Copy link
Member

/test validate

@HuijingHei
Copy link
Contributor

See error about this, do we need to add the workaround?

+ cosa buildextend-metal
Config commit: fatal: unsafe repository ('/tmp/cosa/src/config' is owned by someone else)
To add an exception for this directory, call:
	git config --global --add safe.directory /tmp/cosa/src/config

@HuijingHei
Copy link
Contributor

/retest

@cgwalters
Copy link
Member

I went ahead and pushed #810 to this PR.

@HuijingHei
Copy link
Contributor

/retest

@cgwalters
Copy link
Member

I'm OK with this as is, though I think long term it will be cleaner to do the same as fedora-coreos-config instead of having subdirectories. Which means we will either need to branch as we do there (e.g. with main-rhel-9 type branch) or (my preference) close the last gap and have coreos-assembler know how to take an optional build type, and use that to e.g. choose which manifest.yaml and image.yaml to use.

@cgwalters
Copy link
Member

Are we intending to block on getting this in for 4.11 to do the 8.6 switch? Or are we going to try to re-revert again?

@cgwalters
Copy link
Member

Since we know we need to ship 8.6, I think my vote is:

  • Do the reverts again to get us back to 8.6 on master without landing this PR
  • Cherry pick to release-4.11 (also we need to evaluate what other things should be cherry picked)
  • Look at landing this PR on master

@miabbott
Copy link
Member

miabbott commented Jun 3, 2022

* Cherry pick to release-4.11 (also we need to evaluate what _other_ things should be cherry picked)

Looking at the diff between master and release-4.11, I think we should probably pull in a939a71 and fe25079 to 4.11

@cgwalters
Copy link
Member

Looking at the diff between master and release-4.11, I think we should probably pull in a939a71 and fe25079 to 4.11

Done

@mike-nguyen
Copy link
Member

Building SCOS live ISO, I got

$ cosa buildextend-live
Targeting build: 411.91.202206031601-0
Traceback (most recent call last):
  File "/usr/lib/coreos-assembler/cmd-buildextend-live", line 55, in <module>
    raise Exception(f"missing directory {srcdir_prefix}")
Exception: missing directory src/config/live/

@miabbott
Copy link
Member

miabbott commented Jun 15, 2022

I fought the black box of Prow for most of today trying to get this PR to work, but keep bumping into problems where the symlinked files to/from the rhel-8.6 subdir cannot be found during the test:

Formatting 'cache2.qcow2.tmp', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=10737418240 lazy_refcounts=off refcount_bits=16
Disabled fsync on cache repositories
error: Can't open file "/tmp/cosa/src/config/common.yaml" for reading: No such file or directory (os error 2) 

Interestingly, if I cosa init from a git URL where this change lives, things seem to work fine. But cosa init from the copied directory runs into the described problem.

I've not yet uncovered a reasonable workaround for this, which allows for using the new layout in both local use cases and the Prow case.

(I've reset the state of the PR to basically where I started hacking on it, with some small changes to the commits)

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 15, 2022

@travier: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/rhcos-86-build-test-qemu a45ed57 link true /test rhcos-86-build-test-qemu
ci/prow/rhcos-86-build-test-metal a45ed57 link true /test rhcos-86-build-test-metal

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@cgwalters
Copy link
Member

The problem likely has something to do with the special cases in the virtio bits related to src/config - see coreos/coreos-assembler#2774

In fact I bet its: "Note: This only works for absolute symlinks today." right?

@miabbott
Copy link
Member

The problem likely has something to do with the special cases in the virtio bits related to src/config - see coreos/coreos-assembler#2774

In fact I bet its: "Note: This only works for absolute symlinks today." right?

I've not tested the use of absolute symlinks here, but this sounds likely

@miabbott

This comment was marked as outdated.

@miabbott
Copy link
Member

I hacked up cosa init to work in this situation of local source dir + subdirectory, which seems to get me out of the described problems, but I'm not confident it won't break existing workflows that use the source dir + subdir setup:

Hmph, well it seemed to work locally, but not in a pod on build02. Guess that is a sign of my brain being fried for the day.

@cgwalters
Copy link
Member

If you want to test what is happening in Prow, you'll need to be sure to
export FORCE_UNPRIVILEGED=1 to ensure everything goes through qemu/supermin.

@miabbott
Copy link
Member

miabbott commented Jun 16, 2022

I tested two additional commits on top of this branch that got things working for the FORCE_UNPRIVILEGED=1 case, but the additional changes feel like the approach is wrong. See the top commits on my branch - https://github.com/miabbott/os/commits/splitmanifest

My branch also appears to work in the local, unprivileged container case.

@travier WDYT?

@miabbott
Copy link
Member

miabbott commented Jun 16, 2022

I tested two additional commits on top of this branch that got things working for the FORCE_UNPRIVILEGED=1 case, but the additional changes feel like the approach is wrong. See the top commits on my branch - https://github.com/miabbott/os/commits/splitmanifest

Maybe we should invert the relationship here, where we just pick a default version and symlink manifest.yaml and friends at the top-level to the contents in rhel-8.6?

I think this is kind of what Jonathan suggested - #803 (comment)

@miabbott
Copy link
Member

I tested two additional commits on top of this branch that got things working for the FORCE_UNPRIVILEGED=1 case, but the additional changes feel like the approach is wrong. See the top commits on my branch - https://github.com/miabbott/os/commits/splitmanifest

My branch also appears to work in the local, unprivileged container case.

@travier WDYT?

It's ugly but it seems to work - #849

@travier
Copy link
Member Author

travier commented Jun 17, 2022

I have another bug with cosa replacing the config -> config-git symlink to an absolute symlink when buiding locally.
I think we should drop the sub dir setup and go back to a simpler "oneshot" symlink setup.

@travier
Copy link
Member Author

travier commented Jun 17, 2022

Prep commits split into #850 & #851

@travier
Copy link
Member Author

travier commented Jun 17, 2022

Alternative option in #852

travier added a commit to travier/coreos-assembler that referenced this pull request Jun 20, 2022
Remove support for subdirectory in cosa init given that:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
@travier
Copy link
Member Author

travier commented Jun 20, 2022

We've merged #852 (see #852 (comment)) so I'll close this one. We can always revisit the directory layout if we figure out something better.

@jlebon
Copy link
Member

jlebon commented Jun 20, 2022

coreos/coreos-assembler#2936 should make cosa handle symlinks better.

travier added a commit to travier/coreos-assembler that referenced this pull request Jul 29, 2022
Remove support for subdirectory in cosa init given that:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to travier/coreos-assembler that referenced this pull request Aug 2, 2022
Remove support for subdirectory in cosa init given that:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to travier/coreos-assembler that referenced this pull request Aug 4, 2022
Remove support for subdirectory in cosa init given that:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to travier/coreos-assembler that referenced this pull request Aug 22, 2022
Remove support for subdirectory in cosa init given that:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to travier/coreos-assembler that referenced this pull request Aug 23, 2022
Remove support for subdirectory in cosa init given that:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to travier/coreos-assembler that referenced this pull request Aug 25, 2022
Remove that support as:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to travier/coreos-assembler that referenced this pull request Aug 26, 2022
Remove that support as:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to travier/coreos-assembler that referenced this pull request Aug 26, 2022
Remove that support as:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to travier/coreos-assembler that referenced this pull request Aug 26, 2022
Remove that support as:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
travier added a commit to coreos/coreos-assembler that referenced this pull request Aug 29, 2022
Remove that support as:
- we currently don't use this feature,
- we chose another path for multiple version support in openshift/os as
  using subdirectories lead to a lot of additionnal complexity.

See: openshift/os#803

This reverts commit 4781e46.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants