Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Progress tracking actionset #1586

Merged
merged 11 commits into from
Aug 8, 2022
Merged

Progress tracking actionset #1586

merged 11 commits into from
Aug 8, 2022

Conversation

viveksinghggits
Copy link
Contributor

@viveksinghggits viveksinghggits commented Aug 5, 2022

Change Overview

Re-opens #1487, with #1576 fix.

Pull request type

Please check the type of change your PR introduces:

  • 🚧 Work in Progress
  • 🌈 Refactoring (no functional changes, no api changes)
  • 🐹 Trivial/Minor
  • 🐛 Bugfix
  • 🌻 Feature
  • 🗺️ Documentation
  • 🤖 Test

Issues

Test Plan

  • 💪 Manual
  • ⚡ Unit test
/kanister/pkg/progress (progress-tracking-actionset*) » go test                       
OK: 6 passed
PASS
ok  	github.com/kanisterio/kanister/pkg/progress	0.026s

Install Kanister on k8s version 1.19 and 1.21 to check if we are backup and restore an application.

kctx: kind-kind
~/work/opensource/kanister/pkg/progress (progress-tracking-actionset*) » k version 
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:01:33Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}

kctx: kind-kind
~/work/opensource/kanister/pkg/progress (progress-tracking-actionset*) » k get actionsets.cr.kanister.io -n kanister
NAME                         PROGRESS   LAST TRANSITION TIME   STATE
backup-22dtd                 100.00     2022-08-05T14:16:49Z   complete
backup-bl5w2                 100.00     2022-08-05T14:16:33Z   complete
restore-backup-22dtd-stq9j   100.00     2022-08-05T14:31:19Z   complete
kctx: kind-kind
~/work/opensource/kanister/examples/mysql (progress-tracking-actionset*) » k version 
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.16", GitCommit:"e37e4ab4cc8dcda84f1344dda47a97bb1927d074", GitTreeState:"clean", BuildDate:"2022-05-19T20:23:31Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.22) and server (1.19) exceeds the supported minor version skew of +/-1

kctx: kind-kind
~/work/opensource/kanister/examples/mysql (progress-tracking-actionset*) » k get actionsets.cr.kanister.io -n kanister 
NAME                         PROGRESS   LAST TRANSITION TIME   STATE
backup-jpldg                 100.00     2022-08-05T14:39:43Z   complete
restore-backup-jpldg-vjlvp   100.00     2022-08-05T14:40:04Z   complete

ihcsim and others added 9 commits June 16, 2022 07:40
Signed-off-by: Ivan Sim <ivan.sim@kasten.io>
Signed-off-by: Ivan Sim <ivan.sim@kasten.io>
Signed-off-by: Ivan Sim <ivan.sim@kasten.io>
Signed-off-by: Ivan Sim <ivan.sim@kasten.io>
Signed-off-by: Ivan Sim <ivan.sim@kasten.io>
Signed-off-by: Ivan Sim <ivan.sim@kasten.io>
In the k8s cluster < 1.20 we got into a problem while updating
actionset progress. Details can be found [here](#1576).
This tries to fix that.
@github-actions
Copy link
Contributor

github-actions bot commented Aug 5, 2022

Thanks for submitting this pull request 🎉. The team will review it soon and get back to you.

If you haven't already, please take a moment to review our project contributing guideline and Code of Conduct document.

@infraq infraq added this to In Progress in Kanister Aug 5, 2022
"github.com/kanisterio/kanister/pkg/log"
"github.com/kanisterio/kanister/pkg/reconcile"
"github.com/kanisterio/kanister/pkg/validate"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Third-party pkg imports should be in separate group

return updateActionSet(ctx, client, actionSet, progressPercent, now)
}

func weight(phase *crv1alpha1.BlueprintPhase) float64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] ideally function name should start with a verb. e.g calculateWeight or getWeight

return weightNormal
}

func longRunning(phase *crv1alpha1.BlueprintPhase) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] since the function returns bool - we can rename it to something - containsLongRunningFunc

return nil
}

func completedOrFailed(actionSet *crv1alpha1.ActionSet) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Since the func returns bool, we can rename this to something like - isCompletedOrFailed

"nextUpdateTime": time.Now().Add(pollDuration),
}
log.Error().WithError(err).Print("failed to update phase progress", fields)
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may run into an infinite loop if the client fails to update action progress every iteration. Instead of continue, should we return error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is to ignore intermediate, flakey errors, due to e.g., API server or client-go throttling. The final progress should always be calculated based on the actionset status. If the actionset status completed, progress should always be 100% despite of intermediate errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't lead to an infinite loop because either ctx will be done, or the completedOrFailed() function will return, based on the actionset status.

"namespace": actionSet.GetNamespace(),
"progress": progressPercent,
}
log.Debug().Print("updating action progress", fields)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also add actionset name (and may be phase name as well) to the logs so that we can differentiate between logs in case of multiple statements

Comment on lines +205 to +209
if err := reconcile.ActionSet(ctx, client.CrV1alpha1(), actionSet.GetNamespace(), actionSet.GetName(), updateFunc); err != nil {
return err
}

return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if err := reconcile.ActionSet(ctx, client.CrV1alpha1(), actionSet.GetNamespace(), actionSet.GetName(), updateFunc); err != nil {
return err
}
return nil
return reconcile.ActionSet(ctx, client.CrV1alpha1(), actionSet.GetNamespace(), actionSet.GetName(), updateFunc)

error check can be skipped

@@ -131,6 +132,16 @@ type ActionStatus struct {
DeferPhase Phase `json:"deferPhase,omitempty"`
}

// ActionProgress provides information on the progress of an action.
type ActionProgress struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also add the name of the current phase in progress?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

phase progress is planned for the next iteration. at this level, we just want to communicate the overall progress without bloating the API spec.

- Improve test a bit
- Add `omitempty` to another field of `actionProgress`
Kanister automation moved this from In Progress to Reviewer approved Aug 8, 2022
Copy link
Contributor

@pavannd1 pavannd1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe most of the changes in this PR are a part of the revert and were previously approved. My approval of this PR is purely focused on the intended bug fix. We can follow up on the other review comments in a separate PR.

Copy link
Contributor

@ihcsim ihcsim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viveksinghggits Thanks for working on this 👍. Since the previous PR has been reviewed quite a bit, we should keep changes to this PR to the minimal, unless more bugs are found. If possible, I want to avoid another round of review churns - that last PR took a month (or more?) to merge.

I suggest other important non-bug review items be added to a new GH issue, which we may or may not address when we start the phase progress work.

We should start finding ways to surface these kind of errors sooner in the dev cycle. Reverting PRs is costly and inefficient as they create churns and re-works.

@mergify mergify bot merged commit 34c76cd into master Aug 8, 2022
@mergify mergify bot deleted the progress-tracking-actionset branch August 8, 2022 20:26
Kanister automation moved this from Reviewer approved to Done Aug 8, 2022
r4rajat pushed a commit to r4rajat/kanister that referenced this pull request Aug 28, 2022
* Update controller to track weighted action progress

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Revert auto-changes by codegen

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Address Eugen's feedback

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Address Pavan's feedback

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Address Vivek's feedback

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Fix tests

Signed-off-by: Ivan Sim <ivan.sim@kasten.io>

* Fix a bug while updating lastTransitionTime in <1.20 cluster

In the k8s cluster < 1.20 we got into a problem while updating
actionset progress. Details can be found [here](kanisterio#1576).
This tries to fix that.

* Remove extra newline from codegen.sh

* Address review comments

- Improve test a bit
- Add `omitempty` to another field of `actionProgress`

Co-authored-by: Ivan Sim <ivan.sim@kasten.io>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging this pull request may close these issues.

[BUG] Kanister Controller cannot update Status on Kubernetes versions < v1.20
4 participants