Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not retry updates on conflict #217

Merged
merged 4 commits into from
Jul 27, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 11 additions & 3 deletions pkg/reconciler/internal/updater/updater.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ package updater

import (
"context"
"k8s.io/apimachinery/pkg/api/errors"

"helm.sh/helm/v3/pkg/release"
corev1 "k8s.io/api/core/v1"
Expand Down Expand Up @@ -53,13 +54,20 @@ func (u *Updater) UpdateStatus(fs ...UpdateStatusFunc) {
u.updateStatusFuncs = append(u.updateStatusFuncs, fs...)
}

func isRetryableUpdateError(err error) bool {
return !errors.IsConflict(err) && !errors.IsNotFound(err)
}

func (u *Updater) Apply(ctx context.Context, obj *unstructured.Unstructured) error {
backoff := retry.DefaultRetry

// Always update the status first. During uninstall, if
// we remove the finalizer, updating the status will fail
// because the object and its status will be garbage-collected
if err := retry.RetryOnConflict(backoff, func() error {
// because the object and its status will be garbage-collected.
//
// In case of a Conflict error, the update cannot be retried,
// and the reconciliation loop needs to be restarted anew.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit awkward, because these two new comment lines applies to both retry.OnError calls. However the above 3 lines only apply to the first retry.OnError call.

I can suggest two solutions:

  1. move these two lines to a separate comment above this one. However this takes it further away from the second OnError call, which is not great.
  2. create a helper function such as:
func retryOnRetryableUpdateError(backoff wait.Backoff, fn func() error) {
  return retry.OnError(backoff, isRetryableUpdateError, fn)
}

and then we could put this comment on this function, which would be the most appropriate place. We could optionally also inline isRetryableUpdateError in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good :)

if err := retry.OnError(backoff, isRetryableUpdateError, func() error {
st := statusFor(obj)
needsStatusUpdate := false
for _, f := range u.updateStatusFuncs {
Expand All @@ -78,7 +86,7 @@ func (u *Updater) Apply(ctx context.Context, obj *unstructured.Unstructured) err
return err
}

if err := retry.RetryOnConflict(backoff, func() error {
if err := retry.OnError(backoff, isRetryableUpdateError, func() error {
needsUpdate := false
for _, f := range u.updateFuncs {
needsUpdate = f(obj) || needsUpdate
Expand Down
54 changes: 30 additions & 24 deletions pkg/reconciler/reconciler.go
Original file line number Diff line number Diff line change
Expand Up @@ -521,8 +521,12 @@ func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (res ctrl.
return ctrl.Result{}, err
}

shouldUpdate := true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introducing a new variable here that is ready by the defer closure (and not easy to grok straight away), I think it might be cleaner to introduce a method on the updater such as SkipAllUpdates() or DoNotApply() that would:

  • cause Apply(...) to just return nil without doing anything, and
  • (optional) cause the updater to panic() on further Ensure... calls to make sure that SkipUpdates is closely followed by a return from Reconcile()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot 💯

u := updater.New(r.client)
defer func() {
if !shouldUpdate {
return
}
applyErr := u.Apply(ctx, obj)
if err == nil && !apierrors.IsNotFound(applyErr) {
err = applyErr
Expand Down Expand Up @@ -567,8 +571,11 @@ func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (res ctrl.
u.UpdateStatus(updater.EnsureCondition(conditions.Initialized(corev1.ConditionTrue, "", "")))

if obj.GetDeletionTimestamp() != nil {
err := r.handleDeletion(ctx, actionClient, obj, log)
return ctrl.Result{}, err
if err := r.handleDeletion(ctx, actionClient, obj, log); err != nil {
return ctrl.Result{}, err
everettraven marked this conversation as resolved.
Show resolved Hide resolved
}
shouldUpdate = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly different from the case described in the PR description, right?
I'm guessing the problem here was that on deletions both the uninstallUpdater and the regular u updater would fire, and the latter was both useless and would cause errors since the resource would typically be already gone after deleting the finalizer?
If I got this right, please mention this in the PR description.

return ctrl.Result{}, nil
}

vals, err := r.getValues(ctx, obj)
Expand Down Expand Up @@ -660,28 +667,27 @@ const (
)

func (r *Reconciler) handleDeletion(ctx context.Context, actionClient helmclient.ActionInterface, obj *unstructured.Unstructured, log logr.Logger) error {
if !controllerutil.ContainsFinalizer(obj, uninstallFinalizer) {
log.Info("Resource is terminated, skipping reconciliation")
return nil
}

// Use defer in a closure so that it executes before we wait for
// the deletion of the CR. This might seem unnecessary since we're
// applying changes to the CR after is has a deletion timestamp.
// However, if uninstall fails, the finalizer will not be removed
// and we need to be able to update the conditions on the CR to
// indicate that the uninstall failed.
if err := func() (err error) {
uninstallUpdater := updater.New(r.client)
defer func() {
applyErr := uninstallUpdater.Apply(ctx, obj)
if err == nil {
err = applyErr
}
}()
return r.doUninstall(actionClient, &uninstallUpdater, obj, log)
}(); err != nil {
return err
if controllerutil.ContainsFinalizer(obj, uninstallFinalizer) {
// Use defer in a closure so that it executes before we wait for
// the deletion of the CR. This might seem unnecessary since we're
// applying changes to the CR after is has a deletion timestamp.
// However, if uninstall fails, the finalizer will not be removed
// and we need to be able to update the conditions on the CR to
// indicate that the uninstall failed.
if err := func() (err error) {
uninstallUpdater := updater.New(r.client)
defer func() {
applyErr := uninstallUpdater.Apply(ctx, obj)
if err == nil {
err = applyErr
}
}()
return r.doUninstall(actionClient, &uninstallUpdater, obj, log)
}(); err != nil {
return err
}
} else {
log.Info("Resource is already terminated, skipping deletion")
everettraven marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log.Info("Resource is already terminated, skipping deletion")
log.Info("Resource is already terminated, skipping deletion.")

}

// Since the client is hitting a cache, waiting for the
Expand Down