[kubevirt] Add auto-detection of the correct RHCOS base image #994
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: rmohr. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
(force-pushed from f61bee4 to b9511a6)
Hey, I started thinking about this more. One thing that this PR exposed to me is that we already have the ability to detect the openstack url correctly. I didn't realize that.
Since we have this ability, I think it would be good for us to go ahead and support using the openstack images with DataVolumeTemplates, so we have a somewhat practical path forward for users even if RFE-2501 doesn't land in 4.11.
Thoughts?
func kubevirtMachineTemplateSpec(nodePool *hyperv1.NodePool, containerDiskChecksum string) *capikubevirt.KubevirtMachineTemplateSpec {
	template := nodePool.Spec.Platform.Kubevirt.NodeTemplate.DeepCopy()
	dataVolumeTemplates := template.Spec.DataVolumeTemplates
	for i, dv := range dataVolumeTemplates {
		if dv.Name == "containervolume" {
			if dv.Spec.Source != nil && dv.Spec.Source.Registry != nil &&
				dv.Spec.Source.Registry.URL != nil && strings.Contains(*dv.Spec.Source.Registry.URL, "{rhcos:version}") {
				image := strings.ReplaceAll(*dv.Spec.Source.Registry.URL, "{rhcos:version}", containerDiskChecksum)
				dataVolumeTemplates[i].Spec.Source.Registry.URL = &image
			}
		}
	}
	volumes := template.Spec.Template.Spec.Volumes
	for i, volume := range volumes {
		if volume.Name == "containervolume" {
			if volume.ContainerDisk != nil && strings.Contains(volume.ContainerDisk.Image, "{rhcos:version}") {
				volumes[i].ContainerDisk.Image = strings.ReplaceAll(volume.ContainerDisk.Image, "{rhcos:version}", containerDiskChecksum)
			}
		}
	}
While we're looking at this, I think it would be wise to go ahead and shore up the ability to use the openstack urls with DataVolumes so we have a solution we can support easily even without RFE-2501
To do that, can we make multiple magic template strings here?
- `{rhcos:openstack:url}` - used to template the full https:// url of the openstack qcow
- `{rhcos:openstack:checksum}` - used to template just the checksum part
- `{rhcos:version:major}` - used to template the major version in use
- `{rhcos:version:minor}` - used to template the minor version in use
Then we can keep everything exactly the way it is today. Someone could set the container disk image using the cli with
--containerdisk=quay.io/containerdisks/rhcos-{rhcos:major}.{rhcos:minor}:{rhcos:openstack:checksum}
and if someone wanted to use a DataVolumeTemplate that referenced the openstack qcow url directly, they'd construct the NodePool with a DataVolumeTemplate in the kubevirt platform bit that looked like this.
dataVolumeTemplates:
- metadata:
name: rhcos-dv
spec:
pvc:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
source:
http:
url: {rhcos:openstack:url}
The big idea I'm trying to work with here is a way to get us where we want to be ("consuming officially published container disks") while allowing us to have a stop-gap for end users ("consume openstack urls for DataVolumes") and a path for development to "unofficially consume container disks".
If we template all of this, then we can meet all those use cases in a way that doesn't necessarily force us to reverse course.
I am undecided. At least `{rhcos:openstack:url}` probably makes sense, but I am not really a fan of http-based import. I would prefer to import into datavolumes via a container disk as well. The involved raw-conversion and missing integrity checks are something I really don't like about datavolumes in general.
Regarding
--containerdisk=quay.io/containerdisks/rhcos-{rhcos:major}.{rhcos:minor}:{rhcos:openstack:checksum}
Could you explain what the purpose of the major and minor template parameters is? I can understand checksum, because it gives the exactly matching disk. If I want a specific image I can specify it with a clear tag; this would for instance mean that I don't know the version I install, but I assume that for that specific version a dedicated repo exists.
cmd/nodepool/kubevirt/create.go
Outdated
// TODO (nargaman): replace with official container image, after RFE-2501 is completed
// As long as there is no official container image
// The image must be provided by user
// Otherwise it must fail
if o.ContainerDiskImage != "" && o.ContainerDiskRepo != "" {
	return errors.New("only one of \"--containerdisk\" and \"--containerdisk-repository\" can be specified")
}
if o.ContainerDiskImage == "" && o.ContainerDiskRepo == "" {
	return errors.New("the container disk image for the Kubevirt machine must be provided by user (\"--containerdisk\" or \"containerdisk-repository\" flag)")
}
if o.ContainerDiskRepo != "" {
	o.ContainerDiskImage = o.ContainerDiskRepo + ":{rhcos:version}"
}
maybe we can get away with not needing to specify a new container-disk-repo cli arg at all, and just simply use the templated values in the container-disk string.
I am unsure. I think that --containerdisk-repository
is much clearer than asking people to add various different template parameters. At least when finally dealing with air-gapped installations it is much clearer to just point to the mirror repo. What do you think?
@rmohr to summarize our discussion about this issue: we'll move ahead with two templated values. That allows us to use the http import immediately and begin using the "unofficial" container disks as well.
(force-pushed from b9511a6 to 1b0e13d)
@@ -342,13 +356,14 @@ func (r *NodePoolReconciler) reconcile(ctx context.Context, hcluster *hyperv1.Ho
	})

	// Validate platform specific input.
	var ami string
	if nodePool.Spec.Platform.Type == hyperv1.AWSPlatform {
	rhcosInfo := &rhcosDetails{}
@enxebre I guess this would be another interesting point where potentially every provider except none
would have to diverge regarding what exactly to render in the machine template. Did you already think about a structure for how you want this to look?
I just hacked something in here at the moment.
@davidvossel I think the draft should now look like we agreed. Can you take a look?
@rmohr makes sense. The only thing I question is whether we should support default behavior that automatically uses a datavolume + openstack rhcos url when people perform a hypershift create cluster.
Ideally, I'd like for this to just work out of the box today with hypershift create cluster kubevirt --name my-cluster,
if possible without needing to know anything about disks.
Do you think using the openstack url + datavolume would be a practical default at this point?
cmd/cluster/kubevirt/create.go
Outdated
-	if opts.KubevirtPlatform.ContainerDiskImage == "" {
-		return errors.New("the container disk image for the Kubevirt machine must be provided by user (\"--containerdisk\" flag)")
+	if opts.KubevirtPlatform.ContainerDiskImage != "" && opts.KubevirtPlatform.ContainerDiskRepo != "" {
+		return errors.New("only one of \"--containerdisk\" and \"--containerdisk-repository\" can be specified")
 	}
+	if opts.KubevirtPlatform.ContainerDiskImage == "" && opts.KubevirtPlatform.ContainerDiskRepo == "" {
+		return errors.New("the container disk image for the Kubevirt machine must be provided by user (\"--containerdisk\" or \"--containerdisk-repository\" flag)")
+	}
+	if opts.KubevirtPlatform.ContainerDiskRepo != "" {
+		opts.KubevirtPlatform.ContainerDiskImage = fmt.Sprintf("%v:%v", opts.KubevirtPlatform.ContainerDiskRepo, util.RHCOSOpenStackChecksumParameter)
If neither the container disk nor the repo are provided, can we just default to using the datavolume http import?
(force-pushed from 1b0e13d to 7aa5d4a)
Makes sense, working on it 👍
(force-pushed from 3ba071a to 10c7d55)
Getting close. I just have reservations about the modifications to the nodepool api. Maybe we can get what we're looking for using the hypershift cli with the existing api?
cmd/nodepool/kubevirt/create.go
Outdated
	Disks: []kubevirtv1.Disk{
		{
-			Name: "containervolume",
+			Name: util.RHCOSMagicVolumeName,
			DiskDevice: kubevirtv1.DiskDevice{
				Disk: &kubevirtv1.DiskTarget{
					Bus: "virtio",
I think we need to do the same sort of disk/volume defaulting here as in api/fixtures/example.go? It's unclear to me why we have two locations where it looks like the VirtualMachineInstance is specified for the create command.
I was experimenting here a little bit with the API. The idea is that people and the hypershift command-line tool by default no longer specify the root disk at all. It just gets added automatically. As a fallback, the rhcos
volume can be overwritten completely. But instead of having to always completely override it, the RootVolume section is added, similar to AWS, which only exposes size and storage class.
api/v1alpha1/nodepool_types.go
Outdated
// RootVolume specifies configuration for the root volume of node instances.
//
// +optional
RootVolume *KubeVirtRootVolume `json:"rootVolume,omitempty"`
can we get away with not adding this RootVolume info to the nodepool?
Would it be possible to just have the size and storage class as cmd options that render the NodeTemplate with the expected size/storage class on the datavolumetemplate?
api/v1alpha1/nodepool_types.go
Outdated
// RootVolume specifies configuration for the root volume of node instances.
//
// +optional
RootVolume *KubeVirtRootVolume `json:"rootVolume,omitempty"`
can we get away with not adding this RootVolume info to the nodepool?
Would it be possible to just have the size and storage class as cmd options that render the NodeTemplate with the expected size/storage class on the datavolumetemplate?
Weird, this exact same comment got posted twice. Not sure how that happened.
(force-pushed from 10c7d55 to 0d3820e)
@davidvossel PTAL
/lgtm
This looks great. Maybe it's time to revise the title and description to make it a real non-WIP PR?
I'll check this code out and give it a run locally at some point soon to make sure the default workflow works as expected with the datavolume support. The kubevirt e2e lanes will verify the container disk flow.
if (opts.KubevirtPlatform.ContainerDiskImage != "" && opts.KubevirtPlatform.ContainerDiskRepo != "") && (opts.KubevirtPlatform.RootVolumeSize > 0 || opts.KubevirtPlatform.RootVolumeStorageClass != "") {
	return errors.New("container disk options (\"--containerdisk*\" flag) can not be used together with customized root volume options (\"--root-volume-*\" flags)")
}
I think this is fine for now. We could technically support this combination in the future if we wanted to though.
/lgtm cancel
found a minor issue during testing.
api/fixtures/example_kubevirt.go
Outdated
} else {
	dataVolume := defaultDataVolume()
	setDataVolumeDefaults(&dataVolume, o)
	exampleTemplate.NodeTemplate.Spec.DataVolumeTemplates = []kubevirtv1.DataVolumeTemplateSpec{dataVolume}
}
return exampleTemplate
You need to set an entry in exampleTemplate.NodeTemplate.Spec.Template.Spec.Volumes for this data volume.
We need three things:
- DataVolumeTemplate
- Volume
- Disk
Right now this just lacks the volume part. I discovered this while testing.
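A rough sketch of how the three pieces fit together in a KubeVirt VM template. The names, storage size, and exact field layout here are illustrative assumptions following the KubeVirt VirtualMachine API, not the PR's actual fixture:

```yaml
# Illustrative only: the three pieces listed above, wired together by name.
dataVolumeTemplates:              # 1. DataVolumeTemplate: imports the image into a PVC
  - metadata:
      name: rhcos
    spec:
      pvc:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 32Gi         # size is an assumption
      source:
        http:
          url: "{rhcos:openstack:url}"
template:
  spec:
    domain:
      devices:
        disks:                    # 3. Disk: attaches the volume to the guest
          - name: rhcos
            disk:
              bus: virtio
    volumes:                      # 2. Volume: references the DataVolume by name
      - name: rhcos
        dataVolume:
          name: rhcos
```

Omitting any one of the three leaves the VM template inconsistent, which is the bug described in the comment above.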
checking
Fixed.
@rmohr I'm not familiar with the kubevirt images here, but wanted to mention that since 4.10 the RHCOS live-iso image is shipped in the release payload.
@hardys could you point me to the images which are in there? Is it identical to the openstack rhcos image? In general, kubevirt would probably have to support a different disk source type to leverage it. Containerdisks are the canonical way to get images into kubevirt and they have a certain layout.
Prepares the hypershift cli to auto-detect the required rhcos version from the cluster release payload. Since this is not known at cluster creation time, template parameters are introduced: `{rhcos:openstack:checksum}` for container disks, and `{rhcos:openstack:url}` for datavolume imports.
The hypershift cli is extended in the following way:
* By default the hypershift CLI now adds a datavolume which will http-import the later-on auto-detected rhcos image.
* `--containerdisk-repository` can be specified to let the cluster figure out the right rhcos version when containerdisks should be used. Example: `--containerdisk-repository=quay.io/containerdisks/rhcos`.
* `--root-volume-size` and `--root-volume-storage-class` can be used to customize the datavolume.
Finally, the nodepool-create and the cluster-create commands now share the same kubevirt template to remove the duplication.
Signed-off-by: Roman Mohr <rmohr@redhat.com>
First, the OpenStack RHCOS image belonging to the current release is detected. Then, if `{rhcos:openstack:checksum}` or `{rhcos:openstack:url}` are found in the machine template, they are replaced with the corresponding detected RHCOS version on machine creation.
A new `ValidRHCOSImage` condition is added which has the same function as `ValidAMI` but reports the actual detected RHCOS version. It can later be used by a broad variety of platforms, not just AWS.
Signed-off-by: Roman Mohr <rmohr@redhat.com>
(force-pushed from 0d3820e to bd2113f)
@rmohr I think it only contains the live-iso image currently, with some scripts to extract the ISO (potentially with modifications performed via coreos-installer). It's not identical to the openstack qcow image, but AFAIK the main difference (other than the ISO/QCOW format) is the default platform that decides which ignition data source is enabled; that could be set via another copy script. Maybe this isn't a good fit for kubevirt, but I thought it worth mentioning, since baremetal IPI recently moved from the openstack qcow image to this release payload ISO image, and it potentially makes lifecycling the bootimage over multiple versions a lot easier.
@rmohr: all tests passed!
As far as I can see, what we have in common is that we are shipping content in containers. KubeVirt just expects a fully working and configured RHCOS, like for openstack, and it can't modify the image on the fly. There is a well-established and popular format on the kubevirt side which we want to integrate with.
/hold
The openstack url download is not really working. Going now for the more final flow. Sorting out details in coreos/stream-metadata-go#41
@rmohr: PR needs rebase.
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close.
@rmohr just doing some housekeeping, please reopen if we still want to pursue this
@enxebre: Closed this PR.
👍 was great working a little bit on hypershift with you.
What this PR does / why we need it:
The RHCOS version is now automatically detected based on the release payload. In order to leverage this new information, new ways exist to inject a proper RHCOS image into the NodePool:
Full automatic injection
In this case users simply don't specify a root disk on the NodePool template. Hypershift will figure out the right RHCOS version and inject a DataVolume which imports the correct RHCOS version from the official RHCOS openstack download sources. The new command-line flags `--root-volume-storage-class` and `--root-volume-size` can be used to fill the new `KubeVirtRootVolume` API parts of the platform. If a working storage provider is present, a cluster can now be created without any disk configuration at all, or, in a more advanced example, with a customized root volume.
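A minimal sketch of such invocations, using the flags described above; the cluster name and root-volume values are illustrative assumptions, not taken from the PR:

```sh
# Minimal sketch: rely on the auto-injected DataVolume
# (assumes a default storage class is available).
hypershift create cluster kubevirt --name my-cluster

# More advanced sketch: customize the auto-injected root volume.
hypershift create cluster kubevirt --name my-cluster \
  --root-volume-size 64 \
  --root-volume-storage-class my-storage-class
```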
ContainerDisk
The checksum is used to cross-reference the release payload version with the quay.io/containerdisks/rhcos version. Hypershift checks for containerDisks or DataVolumeTemplates which are named `rhcos` and have `{rhcos:openstack:checksum}` in their container source, and replaces it with the discovered rhcos container version. As an example, `quay.io/containerdisks/rhcos:{rhcos:openstack:checksum}` would be modified to `quay.io/containerdisks/rhcos:mytag` if `mytag` were discovered in the release payload.
On the `hypershift` client, a new flag called `--containerdisk-repository` was added to the kubevirt platform which can be pointed to any rhcos repository (e.g. `quay.io/containerdisks/rhcos`) which contains containerdisks tagged with the `sha256sum` of the openstack qcow2 image of rhcos. A minimal example to make use of this feature sets only that flag.
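A hedged sketch of such a minimal invocation; the cluster name is illustrative, the repository is the one named in the text:

```sh
# Minimal sketch: let the cluster resolve the right rhcos tag
# from the named containerdisk repository.
hypershift create cluster kubevirt --name my-cluster \
  --containerdisk-repository quay.io/containerdisks/rhcos
```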
Customized HTTP DataVolume import
Hypershift checks for a DataVolumeTemplate which is named `rhcos` and has `{rhcos:openstack:url}` in its http source, and replaces it with the discovered rhcos download url.
Which issue(s) this PR fixes (optional, use `fixes #<issue_number>(, fixes #<issue_number>, ...)` format, where issue_number might be a GitHub issue, or a Jira story):
Fixes #
Checklist