Add tests for reconciliation events #255

phillebaba · 2021-01-24T23:02:20Z

This adds tests meant to standardize the expected behavior of events generated by kustomize-controller (KC). In the end, the tests should act as a guide for any future changes to how events are generated.

Steps:

1. Create a Kustomization that deploys a Pod.
2. Set the Pod to an unhealthy state and wait for the Kustomization to receive a status condition for the failed health check.
3. Trigger a reconcile and check the events to make sure that only one event has been created for the failed health check.
4. Set the Pod to a healthy state and wait for the Kustomization to receive a healthy status condition.
5. Trigger a reconcile and check the events to make sure that only one event has been created for the passing health check.
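The "only one event" checks in the steps above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the test code from this PR: the `Event` struct, the `countEvents` helper, and the reason strings are hypothetical stand-ins for whatever the real test reads from the cluster.

```go
package main

import "fmt"

// Event is a hypothetical, trimmed-down stand-in for a Kubernetes event.
type Event struct {
	Reason  string
	Message string
}

// countEvents returns how many events carry the given reason.
func countEvents(events []Event, reason string) int {
	n := 0
	for _, e := range events {
		if e.Reason == reason {
			n++
		}
	}
	return n
}

func main() {
	// Assumed events after the failed health check plus one extra reconcile.
	// The reason strings are illustrative only.
	events := []Event{
		{Reason: "Progressing", Message: "reconciliation in progress"},
		{Reason: "HealthCheckFailed", Message: "health check timed out"},
	}

	// The test would fail if the extra reconcile produced a duplicate event.
	if n := countEvents(events, "HealthCheckFailed"); n != 1 {
		fmt.Printf("expected exactly 1 HealthCheckFailed event, got %d\n", n)
		return
	}
	fmt.Println("exactly one HealthCheckFailed event recorded")
}
```

Counting by reason rather than comparing whole event lists keeps the assertion stable if unrelated events are added later.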

TBD:

- Verify that the external event recorder's event message matches the status.
- Update the revision and make sure only one event is created for the health check.

phillebaba · 2021-01-24T23:06:57Z

controllers/kustomization_controller.go

@@ -237,7 +237,7 @@ func (r *KustomizationReconciler) Reconcile(ctx context.Context, req ctrl.Reques
 	if reconcileErr != nil {
 		// record the reconciliation error
 		r.recordReadiness(ctx, reconciledKustomization)
-		return ctrl.Result{RequeueAfter: kustomization.Spec.Interval.Duration}, reconcileErr
+		return ctrl.Result{RequeueAfter: kustomization.Spec.Interval.Duration}, nil


If we return an error here, the resource will be reconciled again immediately. So what is the point of setting RequeueAfter? Reconciling immediately means that the Kustomization will not reach an unhealthy condition until the backoff is large enough.

Hmm, something is very wrong if it doesn't schedule at the retry interval. Please undo this change as it breaks the retry behavior.

I am trying to find docs about the Reconcile behavior to be sure, but it looks like RequeueAfter is ignored if an error is returned. When a health check timeout error occurs, the resource is reconciled again after 1 ms. As I understand it, that is not what we want to happen?
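To illustrate why an error-driven retry starts almost immediately and only slows down after repeated failures, here is a minimal sketch of per-item exponential backoff. It is not controller-runtime's rate limiter. The `backoff` function is hypothetical, and the 5 ms base and 1000 s cap are assumed values chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// backoff returns the delay before the next retry after `failures` previous
// failures: base * 2^failures, capped at max.
func backoff(base, max time.Duration, failures int) time.Duration {
	d := base
	if d >= max {
		return max
	}
	for i := 0; i < failures; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	return d
}

func main() {
	base := 5 * time.Millisecond // assumed base delay
	max := 1000 * time.Second    // assumed cap

	// Print how the retry delay grows with consecutive failures.
	for failures := 0; failures < 12; failures++ {
		fmt.Printf("after %2d failures: retry in %v\n", failures, backoff(base, max, failures))
	}
}
```

Under these assumptions the first retries happen within milliseconds, which is consistent with the near-immediate reconciliation described in this thread.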

We want to retry the reconciliation based on what GetRetryInterval gives us. Maybe something changed in controller-runtime 0.8, is your PR based on the latest main branch?

Yep, I rebased from main yesterday. So it might be that, or the in-memory cluster not behaving like a "normal" cluster. I am seeing the same behavior in a real cluster, where the condition goes from progressing to progressing because it is immediately rescheduled. Also, the way we are patching status updates means that we will always override conditions in that status. The only way to determine if the health check timed out is to check the events.

We need to introduce a new condition called healthCheck that records the last check result. But I would not mix this into this PR. First let’s figure out why the retry interval doesn’t work.

I will investigate why it is reconciled immediately. So all we need to do is change the Type value in the status condition. The conditions are replaced because the merge key for conditions is the Type value.
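The merge-by-Type behavior described here can be sketched as below. This is a simplified model and not the controller's patching code: the `Condition` struct, `setCondition`, and the `Healthy` type name are hypothetical, used only to show why a condition with the same Type is overwritten while one with a different Type survives.

```go
package main

import "fmt"

// Condition is a hypothetical, trimmed-down status condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// setCondition merges c into conds using Type as the merge key: an existing
// condition with the same Type is replaced, otherwise c is appended.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

func main() {
	var conds []Condition

	// The failed health check is recorded on the Ready condition...
	conds = setCondition(conds, Condition{Type: "Ready", Status: "False", Reason: "HealthCheckFailed"})
	// ...and is lost as soon as the next reconcile marks Ready as progressing.
	conds = setCondition(conds, Condition{Type: "Ready", Status: "Unknown", Reason: "Progressing"})
	// A condition with its own Type (name assumed here) is not overwritten by Ready updates.
	conds = setCondition(conds, Condition{Type: "Healthy", Status: "False", Reason: "HealthCheckFailed"})

	for _, c := range conds {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
}
```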

The ready condition is reset by progressing and that’s ok, we need to create a dedicated condition for health checks.

Looking into controller-runtime, if you return an error from a reconcile call, the result is ignored. This appears to be the code that does the error handling:
https://github.com/kubernetes-sigs/controller-runtime/blob/9e78e653228851684b6b9cdabe5aae8559fe3722/pkg/internal/controller/controller.go#L297

I am not sure if it has always been like this or if it is a change introduced in the latest version. Either way we need to fix KC so that it actually waits the interval time.
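The behavior described in this comment can be modeled with a small decision function. This is a simplified sketch written for this thread, not controller-runtime's source: `Result` mirrors only the two fields discussed here, and `decide` and `errHealthCheck` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Result mirrors the two fields of a reconcile result that matter here.
type Result struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// errHealthCheck is a hypothetical error standing in for a health check timeout.
var errHealthCheck = errors.New("health check timed out")

// decide models how the returned (Result, error) pair is handled:
// a non-nil error takes precedence and the Result is not consulted.
func decide(res Result, err error) string {
	switch {
	case err != nil:
		return "rate-limited requeue (Result ignored)"
	case res.RequeueAfter > 0:
		return "requeue after " + res.RequeueAfter.String()
	case res.Requeue:
		return "rate-limited requeue"
	default:
		return "done"
	}
}

func main() {
	interval := Result{RequeueAfter: 5 * time.Minute}

	// Returning the error: the interval is never used.
	fmt.Println(decide(interval, errHealthCheck))
	// Returning nil, as in the diff above: the interval is honored.
	fmt.Println(decide(interval, nil))
}
```

Under this model, honoring the interval on failure means recording the error through status and events and returning a nil error, which matches the direction of the change in the diff.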

Fixed in #256, please rebase.

Signed-off-by: Philip Laine <philip.laine@gmail.com>

phillebaba · 2021-02-02T22:19:44Z

It's better to merge this PR, which adds the test to avoid regressions, and then I can verify that #191 is not present in a later PR. There is still the issue that the health passed event is sent immediately when the Kustomization is reconciled, even if the health check does not pass.

stefanprodan · 2021-02-03T06:33:03Z

> There is still the issue that the health passed event is sent immediately when the Kustomization is reconciled, even if the health check does not pass.

I don't see how that is possible; on a health check error we return early:
https://github.com/fluxcd/kustomize-controller/blob/main/controllers/kustomization_controller.go#L695

stefanprodan · 2022-01-05T08:16:45Z

I've added various tests for the events over the last few months, closing this.

phillebaba force-pushed the fix/event-tests branch 2 times, most recently from f4ad401 to c23f84e Compare January 24, 2021 23:04

phillebaba changed the title ~~Add tests for Kustomize events~~ WIP: Add tests for Kustomize events Jan 24, 2021

phillebaba changed the title ~~WIP: Add tests for Kustomize events~~ Add tests for Kustomize events Jan 24, 2021

phillebaba marked this pull request as draft January 24, 2021 23:05

phillebaba force-pushed the fix/event-tests branch from c23f84e to 7670ef3 Compare January 24, 2021 23:09

stefanprodan changed the title ~~Add tests for Kustomize events~~ Add tests for reconciliation events Jan 25, 2021

phillebaba force-pushed the fix/event-tests branch from 7670ef3 to ce622e2 Compare January 26, 2021 17:21

phillebaba force-pushed the fix/event-tests branch from ce622e2 to 89b1318 Compare February 2, 2021 22:17

Add tests for health check events

e55f0e5

Signed-off-by: Philip Laine <philip.laine@gmail.com>

phillebaba force-pushed the fix/event-tests branch from 89b1318 to e55f0e5 Compare February 2, 2021 22:18

phillebaba marked this pull request as ready for review February 2, 2021 22:18

stefanprodan closed this Jan 5, 2022

stefanprodan deleted the fix/event-tests branch January 5, 2022 08:16
