
✨ Handle config secret updates #565

Open · wants to merge 5 commits into base: main from handle-config-secret-updates

Conversation

@tjamet tjamet commented Jul 13, 2024

What this PR does / why we need it:

As a cluster operator, I want to iterate my infrastructure provisioner
configuration and ensure the relevant cluster-api-providers are
provisioned as configured.

Currently, the GenericProvider reconciler considers exclusively the
Provider's spec and ignores the data pointed to by the ConfigSecret field.

If any of the values in that secret changes, the deployment of the
provider is left unchanged.

For example, if an infrastructure provider is defined as:

```yaml
apiVersion: operator.cluster.x-k8s.io/v1alpha2
kind: InfrastructureProvider
metadata:
  name: aws
  namespace: infrastructure-aws-system
spec:
  version: v2.5.2
  configSecret:
    name: aws-variables
  deployment:
    replicas: 1
---
apiVersion: v1
kind: Secret
metadata:
  name: aws-variables
  namespace: capi-config
type: Opaque
stringData:
  AWS_B64ENCODED_CREDENTIALS: "SOME_BASE_64_CREDS"
```

It is impossible to pick up the latest value of
`AWS_B64ENCODED_CREDENTIALS` without also changing the provider version,
the deployment spec, or the verbosity level.

This commit proposes to also take the content of the configuration into
account, so that any change to it leads to an adjustment of the
deployment.

Currently, when a ConfigSecret is updated, the change is not automatically reflected in all providers using it. This PR adds an optional reconciler that triggers an update of all providers using the secret.

Combined, these two changes ensure that any configuration update leads to an update of the provider deployment, with the smallest possible change in behaviour.
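
To make the first part concrete, here is a minimal sketch of how the secret content could be folded into a deterministic digest that a reconciler compares on each pass (for example via an annotation on the provider deployment); the function name and the annotation mechanism are illustrative assumptions, not the operator's actual code:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// computeConfigHash digests the ConfigSecret's data in sorted key order,
// so any changed, added, or removed entry yields a different hash.
func computeConfigHash(secretData map[string][]byte) string {
	keys := make([]string, 0, len(secretData))
	for k := range secretData {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for determinism

	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write(secretData[k])
	}

	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	data := map[string][]byte{
		"AWS_B64ENCODED_CREDENTIALS": []byte("SOME_BASE_64_CREDS"),
	}
	// A reconciler could store this digest on the provider deployment and
	// trigger an update whenever the recomputed value differs.
	fmt.Println(computeConfigHash(data))
}
```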

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 13, 2024
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot (Contributor)

Welcome @tjamet!

It looks like this is your first PR to kubernetes-sigs/cluster-api-operator 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-operator has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot (Contributor)

Hi @tjamet. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 13, 2024

netlify bot commented Jul 13, 2024

Deploy Preview for kubernetes-sigs-cluster-api-operator ready!

🔨 Latest commit: e09838c
🔍 Latest deploy log: https://app.netlify.com/sites/kubernetes-sigs-cluster-api-operator/deploys/66c5f712f84c4600073b4f26
😎 Deploy Preview: https://deploy-preview-565--kubernetes-sigs-cluster-api-operator.netlify.app

@furkatgofurov7 (Member)

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 30, 2024
@tjamet (Author) commented Jul 31, 2024

Digging into the failing test code, I don't understand how the proposed changes can influence the status.

The failing test (`TestCheckCAPIOpearatorAvailability`) seems to test that deployments are working as expected. It creates a deployment object running nginx with `generateCAPIOperatorDeployment`, updates its status to consider it running, and ensures that `CheckDeploymentAvailability` reports the deployment status as expected.

I can see the test failing similarly in a dependabot PR, which points towards test instability.

/retest

cmd/main.go Outdated
```diff
@@ -286,6 +291,24 @@ func setupReconcilers(mgr ctrl.Manager) {
	setupLog.Error(err, "unable to create controller", "controller", "Healthcheck")
	os.Exit(1)
}

if watchConfigSecretChanges {
```
@Danil-Grigorev (Member) commented Aug 19, 2024

I like the idea of watching secret changes, but I don't like that it requires a separate controller. A hash of both objects can be combined and stored under the same annotation.

The controller can establish a watch on secrets with an object mapper, which will trigger a reconcile for all providers where the secret is referenced.

@tjamet (Author)

Thanks for the review.
I have updated the code to watch for secret changes and trigger a reconciliation of each Provider using them.

Let me know what you think.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 20, 2024
@tjamet tjamet force-pushed the handle-config-secret-updates branch 3 times, most recently from c8563e7 to a5a1fed Compare August 20, 2024 15:03
@Danil-Grigorev (Member) left a comment

Left some comments regarding CR usage, but overall looks good.

cc @furkatgofurov7 @alexander-demicev PTAL

internal/envtest/environment.go Outdated (resolved)
```go
builder := ctrl.NewControllerManagedBy(mgr).
	For(r.Provider)
if r.WatchConfigSecretChanges {
	builder = builder.Watches(&corev1.Secret{}, &providerSecretMapper{
```
@Danil-Grigorev (Member)

It may be better to use handler.MapFunc, similar to this, since using the queue directly is an “internal-ish” way of doing that.

```go
builder.Watch(
	source.Kind(mgr.GetCache(), &corev1.Secret{}),
	handler.EnqueueRequestsFromMapFunc(secretToProviders),
)

// where
func secretToProviders(ctx context.Context, secret client.Object) []ctrl.Request
// loop over listConfigSecretUsers
```

@tjamet (Author)

I addressed your comment.
Apparently, builder does not have a `Watch` function but a `Watches` one, which accepts a plain object as its first parameter.
I therefore used `handler.EnqueueRequestsFromMapFunc` as suggested and adapted the rest of the code.
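
For reference, a sketch of what that wiring could look like with controller-runtime's `Watches`; `GenericProviderReconciler`, `WatchConfigSecretChanges`, and `secretToProviders` follow the names used in this thread, but the exact shape is an assumption:

```go
package controller

import (
	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/handler"
)

func (r *GenericProviderReconciler) SetupWithManager(mgr ctrl.Manager) error {
	builder := ctrl.NewControllerManagedBy(mgr).
		For(r.Provider)

	if r.WatchConfigSecretChanges {
		// Watches accepts the watched object directly; the map function
		// turns each Secret event into reconcile requests for the
		// providers that reference that secret.
		builder = builder.Watches(
			&corev1.Secret{},
			handler.EnqueueRequestsFromMapFunc(r.secretToProviders),
		)
	}

	return builder.Complete(r)
}
```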

@tjamet tjamet force-pushed the handle-config-secret-updates branch from a5a1fed to c748db9 Compare August 21, 2024 07:56
Currently, if a ConfigSecret is updated, the change is not reflected automatically in all providers using it.

Implement this behaviour under an opt-in flag; a sketch of the flag wiring follows.
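
A minimal sketch of what such an opt-in flag could look like in cmd/main.go; the flag name and default are hypothetical, since only the `watchConfigSecretChanges` variable appears in the diff earlier in the thread:

```go
package main

import "flag"

// watchConfigSecretChanges gates the new secret-watching behaviour; the
// flag name and default below are illustrative assumptions.
var watchConfigSecretChanges bool

func init() {
	flag.BoolVar(&watchConfigSecretChanges, "watch-configsecret", false,
		"Watch referenced config secrets and reconcile providers when their content changes (opt-in).")
}
```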
@tjamet tjamet force-pushed the handle-config-secret-updates branch 4 times, most recently from 5d5c555 to 63f8947 Compare August 21, 2024 13:55
As requested in the reviews, change the reconcile logic and remove the
added secret reconciler. Instead, all generic reconcilers listen to
secret changes and trigger a reconciliation for each Provider using the
secret, as sketched below.
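
A sketch of what such a mapper could look like, using the InfrastructureProvider kind from the example above; the list type, import paths, the shape of the `spec.configSecret` field, and the error handling are assumptions (the real change would cover every provider kind):

```go
package controller

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	operatorv1 "sigs.k8s.io/cluster-api-operator/api/v1alpha2"
)

// secretToProviders maps a changed Secret to reconcile requests for every
// provider whose spec.configSecret references it.
func (r *GenericProviderReconciler) secretToProviders(ctx context.Context, obj client.Object) []reconcile.Request {
	secret, ok := obj.(*corev1.Secret)
	if !ok {
		return nil
	}

	providers := &operatorv1.InfrastructureProviderList{}
	if err := r.Client.List(ctx, providers); err != nil {
		return nil // on list errors, enqueue nothing rather than fail the watch
	}

	requests := []reconcile.Request{}
	for _, p := range providers.Items {
		ref := p.Spec.ConfigSecret
		// Match on name, and on namespace when the reference sets one.
		if ref != nil && ref.Name == secret.GetName() &&
			(ref.Namespace == "" || ref.Namespace == secret.GetNamespace()) {
			requests = append(requests, reconcile.Request{
				NamespacedName: types.NamespacedName{Namespace: p.Namespace, Name: p.Name},
			})
		}
	}

	return requests
}
```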
@tjamet tjamet force-pushed the handle-config-secret-updates branch from 63f8947 to e09838c Compare August 21, 2024 14:17
Labels
cncf-cla: yes · ok-to-test · size/XL
4 participants