Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor controller with pkg runtime helpers #342

Merged
merged 3 commits into from
Jan 13, 2022

Conversation

phillebaba
Copy link
Member

@phillebaba phillebaba commented Oct 19, 2021

Fixes #322

  • HelmRelease should implement the interfaces from conditions, and use this package to work with conditions.
  • HelmReleaseReconciler should implement the helpers from controller, and use these to record Events and Metrics.
  • HelmReleaseReconciler should exit early without recording any status when .spec.suspend is defined.
  • The patch package should be implemented to (safely) patch resources without running into conflicts.
  • The kstatus Reconciling and Stalled conditions should be properly implemented to provide better signals.

@phillebaba phillebaba marked this pull request as draft October 19, 2021 08:48
@stefanprodan stefanprodan changed the title Reactor helm controller with pkg runtime helpers Refactor controller with pkg runtime helpers Oct 25, 2021
@phillebaba phillebaba force-pushed the feature/pkg-runtime-helpers branch 3 times, most recently from 2b1a3df to e265e9d Compare October 27, 2021 09:20
// +kubebuilder:validation:Enum=ChartVersion;Revision
// +kubebuilder:default:=ChartVersion
// +optional
ReconcileStrategy string `json:"reconcileStrategy,omitempty"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was removed from the HelmChart so I assume that it should be removed from here also.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that the answer is yes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed this comment for some reason :-S

No, it was introduced to the HelmChart resource in a later version than the reconcilers-dev branch. So you don't have access to the API against this version, but it will return.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok good, will keep this open and loop back when it becomes relevant.

),
conditions.WithNegativePolarityConditions(
v2.RemediatedCondition,
),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this is probably wrong, need to go back and read up on conditions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should contain a list of all conditions that have an influence on the end result of Ready. What defines if something is (or isn't) a negative polarity is if the condition highlights normal vs abnormal behavior, a condition type can't be both negative and positive (and should thus only be listed once).

Examples:

  • Ready == True is "normal behavior", as the condition is present and True if everything is operating as expected.
  • Stalled == True is "abnormal behavior" as the condition is present and True if an abnormality occurs, but removed from the resource once resolved.

RemediatedCondition is a negative polarity condition, as it being present indicates an error and should result in an inverse of the True and return a Ready == False.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So for now it looks right, but I will have to come back in a later PR and make changes to the different conditions.

@phillebaba
Copy link
Member Author

@hiddeco I am getting a hang of most things. Things seem to work in a full deployment now. The most confusing thing right now is the different conditions and the preferred methods. For example the differences between using released conditon and ready condition.

Need to work some of these things to make sure that we are not breaking existing functionality.

@phillebaba phillebaba changed the base branch from main to reconcilers-dev November 2, 2021 15:19
@phillebaba phillebaba marked this pull request as ready for review November 2, 2021 15:19
@phillebaba phillebaba force-pushed the feature/pkg-runtime-helpers branch 3 times, most recently from c581997 to 06415a3 Compare November 2, 2021 15:24
ValuesFile: template.Spec.ValuesFile,
Interval: template.GetInterval(hr.Spec.Interval),
ValuesFiles: template.Spec.ValuesFiles,
ValuesFile: template.Spec.ValuesFile,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Suspend and AccessFrom be added to here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduction of suspend would be an idea, but also raises a question: if a user decides to add both fields, the one of the chart won't be reconciled. Is this expected behavior, or an awkward result of design decisions?

For AccessFrom, @stefanprodan will likely be able to tell more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the suspend feature should be in its own PR to keep things simple. But yes the question about setting both fields is an important one. Is it due to legacy that we have both field options or is there some other reason behind this? My suggestion for now is to just throw an error if both are set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the controller generates the HelmChart, AccessFrom should be set to allow the namespace where the HelmRelease is. Sadly it's impossible to to restrict the access to a single namespace on Kubernetes < 1.21 since the kubernetes.io/metadata.name label was aded in 1.21. Given this I would set this to allow all.

r.recordReadiness(ctx, hr)
}
// TODO: Progressing is set immediately on reconcile so is it necessary to set it here?
hr, _ = v2.HelmReleaseAttempted(hr, revision, releaseRevision, valuesChecksum)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should hasNewState be used here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep most of the logic (that should be rewritten as well as noted in (an) other issue(s)) intact I would maintain it for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the easiest solution be to skip the status patching and only implement the failure counter reset?

controllers/helmrelease_controller.go Show resolved Hide resolved
@phillebaba phillebaba force-pushed the feature/pkg-runtime-helpers branch 2 times, most recently from 7b175ee to 451c234 Compare November 2, 2021 15:36
@hiddeco hiddeco self-assigned this Nov 5, 2021
controllers/helmrelease_controller.go Outdated Show resolved Hide resolved
ValuesFile: template.Spec.ValuesFile,
Interval: template.GetInterval(hr.Spec.Interval),
ValuesFiles: template.Spec.ValuesFiles,
ValuesFile: template.Spec.ValuesFile,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduction of suspend would be an idea, but also raises a question: if a user decides to add both fields, the one of the chart won't be reconciled. Is this expected behavior, or an awkward result of design decisions?

For AccessFrom, @stefanprodan will likely be able to tell more.

),
conditions.WithNegativePolarityConditions(
v2.RemediatedCondition,
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should contain a list of all conditions that have an influence on the end result of Ready. What defines if something is (or isn't) a negative polarity is if the condition highlights normal vs abnormal behavior, a condition type can't be both negative and positive (and should thus only be listed once).

Examples:

  • Ready == True is "normal behavior", as the condition is present and True if everything is operating as expected.
  • Stalled == True is "abnormal behavior" as the condition is present and True if an abnormality occurs, but removed from the resource once resolved.

RemediatedCondition is a negative polarity condition, as it being present indicates an error and should result in an inverse of the True and return a Ready == False.

controllers/helmrelease_controller.go Outdated Show resolved Hide resolved
r.recordReadiness(ctx, hr)
}
// TODO: Progressing is set immediately on reconcile so is it necessary to set it here?
hr, _ = v2.HelmReleaseAttempted(hr, revision, releaseRevision, valuesChecksum)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep most of the logic (that should be rewritten as well as noted in (an) other issue(s)) intact I would maintain it for now.

controllers/helmrelease_controller.go Show resolved Hide resolved
Signed-off-by: Philip Laine <philip.laine@xenit.se>
Signed-off-by: Philip Laine <philip.laine@xenit.se>
if reconcileErr != nil {
r.event(ctx, hr, hc.GetArtifact().Revision, events.EventSeverityError,
fmt.Sprintf("reconciliation failed: %s", reconcileErr.Error()))
if result, err := r.reconcileRelease(ctx, obj, chart, values); reconcileErr != nil {
Copy link
Contributor

@darkowlzz darkowlzz Dec 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been investigating the reason for the test failure in CI and found that because of reconcileErr here, the error from reconcileRelease() is always ignored. So the install-fail test doesn't gets expected result.

Suggested change
if result, err := r.reconcileRelease(ctx, obj, chart, values); reconcileErr != nil {
if result, err := r.reconcileRelease(ctx, obj, chart, values); err != nil {

Copy link
Contributor

@darkowlzz darkowlzz Dec 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not resolve the CI failures, there's more to it. Investigating...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should fix most of the CI failures, but I found some more conditionals related issues. Will comment them inline.

Copy link
Contributor

@darkowlzz darkowlzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried reproducing the e2e tests that are failing in the CI and left comments about what I did to make it work. Also left a few comments around the status conditions usage. Hope those comments will make it clear some of the conditionals usage patterns.

r.event(ctx, hr, hr.Status.LastAttemptedRevision, events.EventSeverityError, msg)
return v2.HelmReleaseNotReady(hr, v2.ArtifactFailedReason, msg), ctrl.Result{Requeue: true}, reconcileErr
obj.IncrementFailureCounter()
conditions.MarkFalse(obj, meta.ReadyCondition, v2.ArtifactFailedReason, "Chart reconcilliation failed: %s", reconcileErr.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since ready condition is calculated as a summary based on other conditions, it'd be better to mark ReleasedCondition false here. Setting ready condition here gets overwritten in the Reconcile() when ready condition is set as a summary of other conditions.

msg := fmt.Sprintf("HelmChart '%s/%s' is not ready", hc.GetNamespace(), hc.GetName())
r.event(ctx, hr, hr.Status.LastAttemptedRevision, events.EventSeverityInfo, msg)
obj.IncrementFailureCounter()
conditions.MarkFalse(obj, meta.ReadyCondition, v2.ArtifactFailedReason, msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this can also be ReleasedCondition marked as false.

if err := r.checkDependencies(ctx, obj); err != nil {
msg := fmt.Sprintf("dependencies do not meet ready condition (%s), retrying in %s", err.Error(), r.requeueDependency.String())
obj.IncrementFailureCounter()
conditions.MarkFalse(obj, meta.ReadyCondition, v2.GetHelmChartFailedReason, msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this can also be ReleasedCondition marked as false.

r.event(ctx, hr, hr.Status.LastAttemptedRevision, events.EventSeverityError, err.Error())
return v2.HelmReleaseNotReady(hr, v2.InitFailedReason, err.Error()), ctrl.Result{Requeue: true}, nil
obj.IncrementFailureCounter()
conditions.MarkFalse(obj, meta.ReadyCondition, v2.InitFailedReason, "could not get chart values: %s", err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this can also be ReleasedCondition marked as false.

r.event(ctx, hr, hr.Status.LastAttemptedRevision, events.EventSeverityError, err.Error())
return v2.HelmReleaseNotReady(hr, v2.ArtifactFailedReason, err.Error()), ctrl.Result{Requeue: true}, nil
obj.IncrementFailureCounter()
conditions.MarkFalse(obj, meta.ReadyCondition, v2.ArtifactFailedReason, "could not get chart artifact: %s", err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this can also be ReleasedCondition marked as false.

fmt.Sprintf("reconciliation failed: %s", reconcileErr.Error()))
if result, err := r.reconcileRelease(ctx, obj, chart, values); reconcileErr != nil {
obj.IncrementFailureCounter()
conditions.MarkFalse(obj, meta.ReadyCondition, v2.ArtifactFailedReason, "could not get chart artifact: %s", err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this can also be ReleasedCondition marked as false.

err = fmt.Errorf("install retries exhausted")
return v2.HelmReleaseNotReady(hr, released.Reason, err.Error()), err
conditions.MarkFalse(hr, meta.ReadyCondition, released.Reason, err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be RemediatedCondition.

@phillebaba phillebaba force-pushed the feature/pkg-runtime-helpers branch 4 times, most recently from d20927e to dfd9c2e Compare December 10, 2021 15:13
@phillebaba phillebaba force-pushed the feature/pkg-runtime-helpers branch 3 times, most recently from 5984c7c to 6e33965 Compare December 21, 2021 15:50
Signed-off-by: Philip Laine <philip.laine@xenit.se>
@phillebaba phillebaba merged commit f286591 into reconcilers-dev Jan 13, 2022
@phillebaba phillebaba deleted the feature/pkg-runtime-helpers branch January 13, 2022 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement pkg/runtime helpers
4 participants