Make subnet spec id field required for SSA to work with CC #3748
73391a0 to 55a78fa
/test ?
4b82f1b to dfa4e0a
/test pull-cluster-api-provider-aws-e2e-clusterclass
dfa4e0a to 29aa73c
/test pull-cluster-api-provider-aws-e2e-clusterclass
29aa73c to 042ce45
/test pull-cluster-api-provider-aws-e2e-clusterclass
@sedefsavas Hi 👋 For my sake, would you mind clarifying the abbreviations in the title, please? :) Is CC
@Skarlso I added the issues that this PR solves.
@sedefsavas This is now ready for review, right? :)
Yes, ready. Passed ClusterClass tests.
Awesome, I shall take a crack at it. :)
/test pull-cluster-api-provider-aws-e2e
@@ -242,7 +242,7 @@ func (v *VPCSpec) IsIPv6Enabled() bool {
 // SubnetSpec configures an AWS Subnet.
 type SubnetSpec struct {
 	// ID defines a unique identifier to reference this resource.
-	ID string `json:"id,omitempty"`
+	ID string `json:"id"`
So, what happens in case of existing clusters is that their ID will be set to a default empty string, right?
Does that affect the cluster at all?
This means that if a subnet spec is defined, it should have the id field.
Making id a unique identifier solves the SSA co-authoring issue. More info in the CAPI issue.
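Why a unique key helps with co-authoring can be illustrated with a toy merge. In Kubernetes API types, a list can only be merged item by item under server-side apply if it is declared as a keyed list (`listType=map` with a list map key); otherwise one applier's list replaces the other's. The sketch below is not the actual SSA implementation: `Subnet` and `mergeByID` are hypothetical names used only to show how entries from two authors can be matched up once every entry carries an id.

```go
package main

import "fmt"

// Subnet is a simplified, hypothetical stand-in for a subnet spec entry.
type Subnet struct {
	ID               string
	CidrBlock        string
	AvailabilityZone string
}

// mergeByID overlays applied entries onto existing ones, matching on ID.
// Non-empty fields of an applied entry overwrite the existing values;
// entries whose ID has not been seen before are appended.
func mergeByID(existing, applied []Subnet) []Subnet {
	out := make([]Subnet, len(existing))
	copy(out, existing)

	// Remember where each ID lives so applied entries can find their match.
	index := make(map[string]int, len(out))
	for i, s := range out {
		index[s.ID] = i
	}

	for _, a := range applied {
		i, ok := index[a.ID]
		if !ok {
			index[a.ID] = len(out)
			out = append(out, a)
			continue
		}
		if a.CidrBlock != "" {
			out[i].CidrBlock = a.CidrBlock
		}
		if a.AvailabilityZone != "" {
			out[i].AvailabilityZone = a.AvailabilityZone
		}
	}
	return out
}

func main() {
	// One author (for example a controller) owns the CIDR block.
	existing := []Subnet{{ID: "subnet-a", CidrBlock: "10.0.0.0/24"}}
	// Another author sets the availability zone and adds a second subnet.
	applied := []Subnet{
		{ID: "subnet-a", AvailabilityZone: "eu-west-1a"},
		{ID: "subnet-b", CidrBlock: "10.0.1.0/24"},
	}
	for _, s := range mergeByID(existing, applied) {
		fmt.Printf("%+v\n", s)
	}
}
```

Without an id there is nothing to match on, so two authors writing to the same list cannot be reconciled entry by entry.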
Yeah, I saw this one. Just was wondering about existing configs since now it will not just omit the value but update it to an empty string.
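The serialization side of this question can be checked with a small standalone program. It only shows what Go's `encoding/json` does with the two struct tags from the diff; `oldSubnet`, `newSubnet` and `mustMarshal` are hypothetical names for this sketch, and it says nothing about how the API server or the CAPA controllers treat the resulting value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldSubnet mirrors the tag before the change: an empty ID is left out.
type oldSubnet struct {
	ID        string `json:"id,omitempty"`
	CidrBlock string `json:"cidrBlock,omitempty"`
}

// newSubnet mirrors the tag after the change: the id key is always written.
type newSubnet struct {
	ID        string `json:"id"`
	CidrBlock string `json:"cidrBlock,omitempty"`
}

// mustMarshal is a small helper that panics on marshalling errors.
func mustMarshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(mustMarshal(oldSubnet{CidrBlock: "10.0.0.0/24"}))
	// prints {"cidrBlock":"10.0.0.0/24"}
	fmt.Println(mustMarshal(newSubnet{CidrBlock: "10.0.0.0/24"}))
	// prints {"id":"","cidrBlock":"10.0.0.0/24"}
}
```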
For existing configs (by this I mean existing clusters), even though the id field was initially empty, the ids are filled in by the CAPA controllers after creation, so no existing cluster should have empty id fields when upgrading to v1beta2.
If you are asking about existing templates, they will no longer work for creating new clusters.
This is a breaking change for existing templates that rely on the unsupported use cases, which is why we needed a new API version. We will need to document this properly with the v1beta2 release.
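What "templates without ids will no longer work" amounts to can be sketched as a validation pass over the subnet list. This is an illustration only, not the webhook code from this PR: `SubnetSpec` is cut down to two fields and `validateSubnetIDs` is a hypothetical helper that rejects entries with an empty or duplicated id.

```go
package main

import "fmt"

// SubnetSpec is a reduced, hypothetical version of the real type.
type SubnetSpec struct {
	ID        string
	CidrBlock string
}

// validateSubnetIDs returns one error per subnet entry that has an empty id
// or reuses an id already seen earlier in the list. An empty result means
// the list is acceptable.
func validateSubnetIDs(subnets []SubnetSpec) []error {
	seen := make(map[string]struct{}, len(subnets))
	var errs []error
	for i, s := range subnets {
		if s.ID == "" {
			errs = append(errs, fmt.Errorf("subnets[%d]: id is required", i))
			continue
		}
		if _, dup := seen[s.ID]; dup {
			errs = append(errs, fmt.Errorf("subnets[%d]: duplicate id %q", i, s.ID))
		}
		seen[s.ID] = struct{}{}
	}
	return errs
}

func main() {
	// An old-style template entry that only sets a CIDR block is rejected.
	template := []SubnetSpec{{CidrBlock: "10.0.0.0/24"}}
	for _, err := range validateSubnetIDs(template) {
		fmt.Println(err)
	}
}
```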
@@ -0,0 +1,28 @@
- op: add
So... (and this is probably my lack of knowledge in this area, but please bear with me) The way these differ from currently configurable networking, like just plain adding your VPC ID into the cluster config, is that these values will now be managed fields and used in SSA. So the user doesn't have to change things in a config file but rather apply new patch values? Did I get that right?
Yes, this particular scenario tests BYO VPC and subnets, which was not working before SSA and the id field being compulsory.
042ce45 to 342dd42
ClusterClass and EKS tests passed. On the unmanaged side, only the multi-workload test failed, because it is no longer a supported workflow (see the explanation in the PR description). Removed the test. In summary, if the infrastructure is not external (i.e., id is not set), then the fields below cannot be filled (as we had in the multi-workload test).
@sedefsavas: The following test failed, say
c120f3c to 39f088b
All e2e tests passed before; the new commit only removed the multi-az test, which was failing because its use case became unsupported with this PR. So, not running them again.
/lgtm
@sedefsavas - sorry, but are you saying that the previous logic of only setting az/cidrblock (and not id) didn't work previously? If it did work, this seems quite a change in behaviour.
Based on a conversation in slack with @sedefsavas:
Not from the conversation, but it would be good to [re-]add an e2e test that explicitly selects a subset of AZs. So based on this: /approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: richardcase
Fixes: #3536
Although I understand the intent behind this change, it takes away the following functionalities from the user.
Example:
Would there be a possibility to provide solutions to the above?
This is a huge breaking change for us! We currently rely on being able to have complex subnet layouts that are still managed by CAPA, and I have put a lot of work into CAPA recently to make these complex subnet layouts workable via filters and the like. Currently we do something similar to the following:

spec:
  network:
    subnets:
    - cidrBlock: 10.0.0.0/24
      availabilityZone: eu-west-1a
      tags:
        subnet-role: control-plane
    - cidrBlock: 10.0.1.0/24
      availabilityZone: eu-west-1b
      tags:
        subnet-role: control-plane
    ... etc ...

This allowed us to use CAPA to manage all the network infrastructure while still having the ability to segment out infrastructure into specific subnets by making use of the subnet filters based on tags. We also have enterprise customers with policies in place about specific tags needing to exist on all AWS resources; this would also no longer be possible for them when using CAPA on its own. This PR completely blocks that possibility and now means we need an external solution to configure all the network setup for our CAPA clusters.

@richardcase You stated "We'll try to come up with an alternative that people can use (perhaps explicitly specifying failure domains or something similar)". Did this happen? Is there anywhere I can track / contribute to this?

I see that we're not the only ones hitting issues with this, and an issue was opened (and closed without discussion) covering this topic - #3883

There has also been discussion on Slack about this problem: https://kubernetes.slack.com/archives/CD6U2V71N/p1675159432540519

I'd like to re-open the discussion around this. I know the problem was to solve for Cluster Class and server-side apply, but it seems short-sighted to me to reduce the feature-set of CAPA to this degree.
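The tag-based segmentation described above can be pictured with a short sketch. It is not CAPA's filter implementation: `Subnet`, `filterByTags` and `hasAllTags` are hypothetical names, and the matching rule (every wanted tag must be present with the same value) is an assumption made for illustration.

```go
package main

import "fmt"

// Subnet is a simplified, hypothetical view of a managed subnet.
type Subnet struct {
	CidrBlock        string
	AvailabilityZone string
	Tags             map[string]string
}

// hasAllTags reports whether have contains every key/value pair in want.
func hasAllTags(have, want map[string]string) bool {
	for k, v := range want {
		if got, ok := have[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// filterByTags returns the subnets that carry all of the wanted tags.
func filterByTags(subnets []Subnet, want map[string]string) []Subnet {
	var out []Subnet
	for _, s := range subnets {
		if hasAllTags(s.Tags, want) {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	subnets := []Subnet{
		{CidrBlock: "10.0.0.0/24", AvailabilityZone: "eu-west-1a", Tags: map[string]string{"subnet-role": "control-plane"}},
		{CidrBlock: "10.0.1.0/24", AvailabilityZone: "eu-west-1b", Tags: map[string]string{"subnet-role": "control-plane"}},
		{CidrBlock: "10.0.2.0/24", AvailabilityZone: "eu-west-1a", Tags: map[string]string{"subnet-role": "worker"}},
	}
	for _, s := range filterByTags(subnets, map[string]string{"subnet-role": "control-plane"}) {
		fmt.Println(s.CidrBlock, s.AvailabilityZone)
	}
}
```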
@MarcusNoble - Coming up with the alternative did not happen yet and, in all honesty, it was forgotten about. The problem is that there was significant pressure from certain areas to get ClusterClass support into CAPA, and the lack of a key on the subnets caused issues. The specifying of subnets has been an area of pain for as long as I can remember, as the original implementation wasn't great.

In the Slack discussion on this I did raise that it was a change in behaviour, but was told it's undocumented and is a side effect. In hindsight, we should've been more diligent on this and not bowed to pressure to get ClusterClass support out.

If we plan to change this then it will be a breaking API change, and so we will need to bump the API version / CAPA version number. We should take this opportunity to rethink how we handle the networking side of CAPA and once and for all deprecate the old subnet code, which has been problematic for years. Let's add this to: #1484 (and also create a specific issue for this problem).
This change has also highlighted that we didn't include this change in the release notes, which is a good example as to why we should go back to using the Prow release notes plugin.
What this PR does / why we need it:
This PR makes the id field in the subnet spec required, to solve the SSA co-authoring issue in kubernetes-sigs/cluster-api#6320.
Before this PR, some fields in the network.subnets list could be set even though the infrastructure was managed by CAPA. This was unintentional and only possible because there were no webhook checks in place.
After this PR, the only 2 possible cases we support are:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3530
Fixes #3528