Support cloud-specific instance storage #1126
This problem domain clearly generalizes into e.g. bare metal scenarios with heterogeneous server hardware, where one wants to be able to say something dynamic like "RAID 1 all drives you see matching this set of hardware vendors".
One possibility I guess would be to go to a "two phase" approach where the instance boots in an ephemeral mode (…).
Ignition generally does exactly what it's told to do, and doesn't automatically detect things. But Afterburn is all about querying the cloud platform for instance metadata. Spitballing a differently-hacky idea: add an Afterburn mode that runs before Ignition config fetch, generates an Ignition config fragment, and drops it in the base config directory. That mode would (currently) have to run before config fetch, so the config fragment couldn't be based on any user-provided configuration. However, it could aggregate all the instance disks into a RAID with a well-known name. The user config could then change the RAID level, if desired, and put a filesystem on top. Downside: the automatically-generated RAID would preclude the user from putting filesystems directly on individual instance disks.
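To make that concrete, here is a minimal sketch of what such a generated base-config fragment could look like, assuming two instance disks at fixed paths and an arbitrary `instance-store` array name (the RAID level is just a placeholder; nothing like this exists in Afterburn today):

```json
{
  "ignition": { "version": "3.2.0" },
  "storage": {
    "raid": [
      {
        "name": "instance-store",
        "level": "raid1",
        "devices": [
          "/dev/nvme1n1",
          "/dev/nvme2n1"
        ]
      }
    ]
  }
}
```

A user config merged on top of this fragment could then reference `/dev/md/instance-store` to put a filesystem on the array, or override the RAID level as described above.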
Hmm, at least in AWS it doesn't seem like the instance store devices are in the metadata; they just show up as block devices to the instance. Further, we can't hardcode a policy in Afterburn; it needs to remain possible for something else to use the instance storage (e.g. Ceph, a database cache, etc.). It'd be a backwards incompatible change for us to default to consuming it. Perhaps Afterburn could try to gather a convenient list of block devices (something like symlinks in …).

The block level aspect makes this much more Ignition than Afterburn though, I think.
Yeah, I know. But... ugly tradeoffs abound. This would be very "cloud native" at least.
It appears that non-NVMe devices should show up in instance metadata, and NVMe devices can be distinguished by device model.
Yeah, I'm not immediately seeing a clean solution. It seems worth more discussion, though. I think the approach most consistent with Ignition's design is to say "the Ignition config is expected to understand any hardware it wants to configure", but as you say, the rest of the stack may not be equipped to deal with that.
Hum. I guess at least for OpenShift, the fact that we always perform an OS update+reboot means we could wedge this whole thing into the MCO or into custom Ignition to start: basically blow away + remount …. We can experiment with that and, if successful, try to drive it into base CoreOS.
Yeah, the ones I'm interested in here are NVMe.
Right, but... hm, I guess maybe we could add "model matching" into Ignition? That could be generic enough to work across bare metal too.
At least for the IPI case, OCP would know exactly what storage is present in the instance type (or what it has configured to be added) and could automatically generate the relevant config snippet to do this under the current Ignition model (without the additional symlinks).
PoC implementation here: https://github.com/cgwalters/coreos-cloud-instance-store-provisioner
It could know, but doesn't today, and fixing it is nontrivial. We currently have a single pointer config applied to all instance types. The thing provisioning VMs (https://github.com/openshift/machine-api-operator) is distinct from the thing generating Ignition configs (https://github.com/openshift/machine-config-operator/), with just a few links between them. We absolutely could rearchitect this; that's openshift/machine-config-operator#1619.

I thought more about this though and agree taking that direction long term would be cleaner. We might need some better mechanisms in either Ignition or CoreOS to do the "matching"; maybe a udev rule that generates e.g. ….

Anyways, closing based on the above for now.
For context, we already have cases where we do this cloud-specific symlinking for block devices in initramfs via udev rules.
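As an illustrative sketch of that kind of rule (the model string below is what EC2 NVMe instance-store devices report; the `/dev/instance-store/` grouping is an invented convention, not one of the existing rules):

```
# Hypothetical rule: group EC2 NVMe instance-store disks under /dev/instance-store/
# so configs can refer to them without knowing how many the instance type provides.
KERNEL=="nvme*", ENV{DEVTYPE}=="disk", ATTRS{model}=="Amazon EC2 NVMe Instance Storage*", SYMLINK+="instance-store/%k"
```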
Has come up a few times, see also coreos/ignition#1126
I think conceptually this was moved to e.g. coreos/fedora-coreos-tracker#1122, coreos/fedora-coreos-tracker#601, coreos/fedora-coreos-tracker#1165, etc...
Yeah, also worth noting though that today we have a Live ISO, in which one can execute completely arbitrary code before the install to disk; specifically, one can inspect the hardware and e.g. dynamically generate Ignition that is passed to ….

The Live ISO is not necessarily ergonomic to use in all clouds though.
This is related to an effort I was looking at around making use of instance-local storage in OpenShift 4: https://hackmd.io/dTUvY7BIQIu_vFK5bMzYvg
It works well to use Ignition to configure the instance store disks, e.g. to mount them at `/var`. But the problem comes in naming and enumerating them. Take e.g. the AWS `m5d` instances (docs): instance storage can be any of 1, 2, or 4 disks. In GCP it's supported to attach up to 9.

As far as I can tell one could generally rely on `/dev/nvme0n1` being the boot drive (which we don't want to format, obviously) and `/dev/nvme1n1` and beyond being the instance storage. The disk IDs make it quite clear:
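For illustration only (this listing is reconstructed rather than captured from a real instance; only the shape of the names matters):

```
$ ls -l /dev/disk/by-id/   # abbreviated, illustrative output
nvme-Amazon_Elastic_Block_Store_vol0123456789abcdef0      -> ../../nvme0n1
nvme-Amazon_EC2_NVMe_Instance_Storage_AWS2DB46DE31B58F726F -> ../../nvme1n1
```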
But... that `2DB46DE31B58F726F` value is dynamic.

Anyways, one idea is to directly support this in Ignition:
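Purely as a sketch of the shape of the idea, a hypothetical stanza might look something like this (the `deviceSelector` matcher with an `instanceStorage` key is invented for illustration and is not part of any Ignition spec):

```json
{
  "ignition": { "version": "3.2.0" },
  "storage": {
    "raid": [
      {
        "name": "instance-data",
        "level": "raid0",
        "deviceSelector": { "instanceStorage": true }
      }
    ],
    "filesystems": [
      {
        "device": "/dev/md/instance-data",
        "format": "xfs",
        "path": "/var",
        "wipeFilesystem": true
      }
    ]
  }
}
```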
This would automatically find all instance-local disks and use RAID0 if appropriate (or just match the single block device directly).
Now clearly the MCO and machine API (for example) could be set up to pass correct Ignition userdata to the instance depending on its type... but that requires exact coordination between the thing provisioning the VM and the provided Ignition, and also encoding an understanding of instance types into the thing rendering the Ignition (in the AWS case).
I suspect support for striping would cover 90% of cases and allow people to use a common Ignition config for multiple scenarios.
But it would add more cloud specifics into Ignition.
Another approach is basically to punt and not use Ignition partitioning: have a systemd unit that runs in the real root, is cloud-aware, and e.g. specifically generates a mount unit just for `/var/lib/containers` as opposed to all of `/var`. A sketch of what such a generated unit might look like is below.
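A minimal sketch, assuming the instance-store device has already been discovered and formatted (the device path and filesystem type are placeholders a real cloud-aware generator would fill in):

```
# var-lib-containers.mount (hypothetical; would be written at runtime by a cloud-aware service)
[Unit]
Description=Mount instance storage at /var/lib/containers
Before=local-fs.target

[Mount]
What=/dev/nvme1n1
Where=/var/lib/containers
Type=xfs

[Install]
WantedBy=local-fs.target
```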
But supporting `/var` as instance storage is so much more elegant.