InfrastructureMachineSpec.FailureDomain is required to be a String #8096
Comments
If I'm reading this right the infrastructureMachine field is not designed to be used with newer versions of the API - it exists for backward compatibility. From https://cluster-api.sigs.k8s.io/developer/providers/machine-infrastructure.html:
|
Ah, so it was an upgrade thing at some point? Sounds like we should remove it? It's just a trip hazard now. |
I think this will be part of the conversation around removing the v1alpha3 API types in #8071. Part of that discussion involves removing old code which only exists for migrating and dealing with issues on older API types. |
I think I will move our 'hydrated' failure domain to the InfrastructureMachineStatus instead. This is a bit of a workaround, but it was a reasonable candidate for somewhere to stash it anyway. It's a bit of a grey area because, while the failure domain is part of the specification of the machine, it's not a field I expect a user to typically enter manually: the machine controller copies the hydrated value based on the reference in the MachineSpec before creating the machine.
The primary issue here is that we need to ensure the InfrastructureMachineSpec remains immutable for its entire lifetime. This is complex to do when you only have a reference to some of it, and the referenced object may be altered or deleted. The simplest solution is to just copy it and ignore the reference thereafter. This also leaves us maximum flexibility for wider scoped solutions in the future. |
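To make the "copy it and ignore the reference thereafter" idea concrete, here is a minimal sketch. All of the type and function names (FailureDomain, InfraMachine, hydrateFailureDomain) are hypothetical and invented for illustration; they are not the actual provider or Cluster API types, and a real controller would resolve the reference through the API server rather than a map.

```go
package main

import "fmt"

// FailureDomain is a hypothetical hydrated failure domain definition.
type FailureDomain struct {
	Name string
	Zone string
}

// InfraMachineSpec holds only a reference (by name) to a failure domain.
type InfraMachineSpec struct {
	FailureDomainRef string
}

// InfraMachineStatus holds the hydrated copy taken when the machine is created.
type InfraMachineStatus struct {
	FailureDomain *FailureDomain
}

// InfraMachine is a hypothetical infrastructure machine object.
type InfraMachine struct {
	Spec   InfraMachineSpec
	Status InfraMachineStatus
}

// hydrateFailureDomain copies the referenced failure domain into status once.
// After the first successful copy the reference is never consulted again, so
// later changes to (or deletion of) the referenced object cannot affect the
// machine.
func hydrateFailureDomain(m *InfraMachine, known map[string]FailureDomain) error {
	if m.Status.FailureDomain != nil {
		return nil // already hydrated: ignore the reference from now on
	}
	if m.Spec.FailureDomainRef == "" {
		return nil // no failure domain requested
	}
	fd, ok := known[m.Spec.FailureDomainRef]
	if !ok {
		return fmt.Errorf("failure domain %q not found", m.Spec.FailureDomainRef)
	}
	copied := fd // value copy, detached from the source map
	m.Status.FailureDomain = &copied
	return nil
}

func main() {
	known := map[string]FailureDomain{"fd-a": {Name: "fd-a", Zone: "zone-1"}}
	m := &InfraMachine{Spec: InfraMachineSpec{FailureDomainRef: "fd-a"}}

	if err := hydrateFailureDomain(m, known); err != nil {
		fmt.Println("error:", err)
		return
	}

	// The referenced definition is altered after the machine was hydrated.
	known["fd-a"] = FailureDomain{Name: "fd-a", Zone: "zone-2"}
	_ = hydrateFailureDomain(m, known)

	fmt.Println("zone recorded in status:", m.Status.FailureDomain.Zone)
}
```

The point of the sketch is only the ordering: hydrate once before the machine is created, then treat the copy as the source of truth.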
Just referencing my comment from here: #8071 (comment)
|
/triage accepted |
This issue has not been updated in over 1 year, and should be re-triaged. For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/ /remove-triage accepted |
/priority important-longterm |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
I just checked https://cluster-api.sigs.k8s.io/developer/providers/machine-infrastructure and failure domain is defined there. We can improve, but I think this answers the question in the issue above. Orthogonal: as I said many times, there is huge room for improvement in how we manage failure domains... /close |
@fabriziopandini: Closing this issue. |
While writing kubernetes-sigs/cluster-api-provider-openstack#1466 I encountered this code:
cluster-api/internal/controllers/machine/machine_controller_phases.go
Lines 305 to 314 in 327aae2
This code copies failureDomain from the infrastructure machine spec to the machine spec.
This places an additional constraint on the infrastructure machine spec in that a field called failureDomain, if it exists, must be a string. If this limitation is intentional it is not documented with the rest of the failure domain interface in the cluster API book: https://cluster-api.sigs.k8s.io/developer/providers/cluster-infrastructure.html
The behaviour of this code is to overwrite MachineSpec.FailureDomain with the value in InfrastructureMachineSpec.FailureDomain if it is defined. There are only 2 cases where this would do anything:
Hopefully no providers have the behaviour in (1). It would be almost impossible to provide reliable infrastructure in this case.
So my best guess is that the intention was (2)? However, in this case the CAPI controller isn't going to use the failure domain so we've placed a restriction on the infrastructure provider for aesthetic reasons?
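For anyone who wants to see the constraint without digging through the controller, here is a minimal, stdlib-only sketch of the pattern. It is not the code from machine_controller_phases.go: nestedString and syncFailureDomain are hypothetical helpers written for this illustration, and the error text is made up rather than the error reported in this issue. It only shows why a typed string lookup on an unstructured spec rejects a failureDomain field that is an object.

```go
package main

import (
	"fmt"
	"strings"
)

// nestedString is a hypothetical stand-in for a typed lookup on an
// unstructured object. It walks the path and requires the leaf to be a string.
func nestedString(obj map[string]interface{}, fields ...string) (string, bool, error) {
	var cur interface{} = obj
	for _, f := range fields {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return "", false, fmt.Errorf("%s: parent of %q is not an object", strings.Join(fields, "."), f)
		}
		cur, ok = m[f]
		if !ok {
			return "", false, nil // field absent: not an error
		}
	}
	s, ok := cur.(string)
	if !ok {
		return "", false, fmt.Errorf("%s is %T, not string", strings.Join(fields, "."), cur)
	}
	return s, true, nil
}

// syncFailureDomain overwrites the machine's failure domain with the value
// from the infrastructure machine spec, if that value is defined.
func syncFailureDomain(machineFD **string, infra map[string]interface{}) error {
	fd, found, err := nestedString(infra, "spec", "failureDomain")
	if err != nil {
		return err
	}
	if found {
		*machineFD = &fd
	}
	return nil
}

func main() {
	var machineFD *string

	// A provider whose failureDomain is a plain string works.
	asString := map[string]interface{}{
		"spec": map[string]interface{}{"failureDomain": "az-1"},
	}
	fmt.Println("string field error:", syncFailureDomain(&machineFD, asString))

	// A provider whose failureDomain is a structured object is rejected.
	asObject := map[string]interface{}{
		"spec": map[string]interface{}{
			"failureDomain": map[string]interface{}{"name": "az-1"},
		},
	}
	fmt.Println("object field error:", syncFailureDomain(&machineFD, asObject))
}
```

In this sketch a provider is free to omit the field entirely, but as soon as it exists with any non-string type the sync step fails, which is the restriction described above.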
Anything else you would like to add:
The specific error produced is:
/kind bug