
✨ Implement ManifestWorkReplicaSet RollOut strategy #259

Merged

Conversation

serngawy
Member

Summary

Related issue(s)

Fixes #

@serngawy
Member Author

/hold

Signed-off-by: melserngawy <melserng@redhat.com>
@codecov

codecov bot commented Oct 10, 2023

Codecov Report

Attention: 41 lines in your changes are missing coverage. Please review.

Comparison is base (5903140) 61.06% compared to head (c182f35) 61.61%.
Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #259      +/-   ##
==========================================
+ Coverage   61.06%   61.61%   +0.55%     
==========================================
  Files         129      132       +3     
  Lines       13771    13956     +185     
==========================================
+ Hits         8409     8599     +190     
- Misses       4588     4591       +3     
+ Partials      774      766       -8     
Flag Coverage Δ
unit 61.61% <75.30%> (+0.55%) ⬆️

Flags with carried forward coverage won't be shown.

Files Coverage Δ
...troller/manifestworkreplicaset_status_reconcile.go 86.66% <88.23%> (+4.84%) ⬆️
...setcontroller/manifestworkreplicaset_controller.go 47.87% <45.45%> (-0.33%) ⬇️
pkg/work/helper/helpers.go 66.89% <0.00%> (-2.10%) ⬇️
...troller/manifestworkreplicaset_deploy_reconcile.go 77.65% <80.35%> (+8.72%) ⬆️

... and 16 files with indirect coverage changes


@serngawy serngawy force-pushed the mwrSetRollout branch 4 times, most recently from a8b92f9 to 26cd588 Compare October 18, 2023 16:13
@serngawy serngawy changed the title ✨ (WIP) Implement ManifestWorkReplicaSet RollOut strategy ✨ Implement ManifestWorkReplicaSet RollOut strategy Oct 18, 2023
@serngawy
Member Author

/unhold

@serngawy
Member Author

/cc @qiujian16

@openshift-ci openshift-ci bot requested a review from qiujian16 October 18, 2023 18:35
@serngawy
Member Author

/cc @haoqing0110

@openshift-ci openshift-ci bot requested a review from haoqing0110 October 18, 2023 18:35

// Check for the expected manifestWorkReplicaSetSummary
mwrSetSummary := workapiv1alpha1.ManifestWorkReplicaSetSummary{
	Total: clsPerGroup + int(placement1.Status.NumberOfSelectedClusters),
Member

What if there are overlapping clusters in placement1 and placement2? Will the cluster number be counted twice in the summary?

Member Author

We use the mwrSet name to create the ManifestWork (mw), so if a cluster is selected by both placement1 and placement2, the following happens:

  1. For placement1, the cluster is selected and the mw is created with a label referencing placement1.
  2. The placement1 summary counts the cluster.
  3. For placement2, the cluster is also selected, and the mw's placement label is updated to reference placement2.
  4. The placement2 summary also counts the cluster (because it now belongs to placement2).

So yes, the cluster is counted in each placement's summary (a placement doesn't know whether the cluster is selected by another placement) AND it is counted twice in the mwrSet summary.

I'm not really sure what the logic should be: should we raise an error? Should we skip it? Or is it acceptable to count the cluster twice since it exists in both placements?
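
To make the double counting concrete, here is a minimal, runnable Go sketch (illustrative only, not code from this PR; the cluster names and the counting loop are made up) showing how a cluster selected by two placements inflates the aggregated Total while only one ManifestWork exists for it:

package main

import "fmt"

func main() {
	placement1Clusters := []string{"cluster1", "cluster2"}
	placement2Clusters := []string{"cluster2", "cluster3"} // cluster2 is selected by both placements

	total := 0
	manifestWorks := map[string]bool{} // one ManifestWork per cluster, keyed by cluster name

	for _, placement := range [][]string{placement1Clusters, placement2Clusters} {
		for _, cluster := range placement {
			total++                       // each placement's summary counts the cluster
			manifestWorks[cluster] = true // the ManifestWork is only re-labeled, not duplicated
		}
	}

	fmt.Printf("summary Total=%d, actual ManifestWorks=%d\n", total, len(manifestWorks))
	// Prints: summary Total=4, actual ManifestWorks=3
}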

Member
@haoqing0110 haoqing0110 Oct 20, 2023

My thought is that counting a cluster twice makes the meaning of Total confusing to the user, since the actual number of ManifestWorks is less than that.

For addons, one addon only belongs to one placement (the latter one), and that placement's rollout strategy is used for the rollout. It may not be the same as the mwrSet here; just for reference.

@qiujian16
Member

Sorry, I am stuck with other stuff this week; I will go through this PR early next week.

@@ -37,6 +37,10 @@ const (
// TODO move this to the api repo
ManifestWorkReplicaSetControllerNameLabelKey = "work.open-cluster-management.io/manifestworkreplicaset"

// ManifestWorkReplicaSetPlacementNameLabelKey is the label key on manifestwork to ref to the Placement that select
// the managedCluster on the manifestWorkReplicaSet's PlacementRef.
ManifestWorkReplicaSetPlacementNameLabelKey = "work.open-cluster-management.io/PlacementName"
Member

The label key should not have capital letters in it.

Member Author

Oops, done.
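
For reference, a lowercase form of the label key along the lines the review asks for could look like the line below; the exact value merged in this PR is not shown in the thread, so treat it as an assumption.

// Assumed lowercase form of the key; the final value merged in the PR is not shown above.
ManifestWorkReplicaSetPlacementNameLabelKey = "work.open-cluster-management.io/placementname"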

}
for _, mw := range manifestWorks {
	// Check if ManifestWorkTemplate changes, ManifestWork will need to be updated.
	if !equality.Semantic.DeepEqual(mwrSet.Spec.ManifestWorkTemplate, mw.Spec) {
Member

This check is not quite reliable, since there could be a diff on some fields with empty or nil values. It is better to use the work applier helper.

I think we should always apply, and check the returned updated value to know whether it was updated.

Member Author

Okay, I used workApplier.ManifestWorkEqual instead.

Member

Add a TODO here: we should have an interface in applier to check whether a mw should be applied.
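
A rough Go sketch of the "always apply, then check the returned updated value" pattern suggested in this thread follows; the Applier interface and the syncManifestWork function are illustrative assumptions, not the actual work applier helper used by the PR:

package example

import (
	"context"

	workv1 "open-cluster-management.io/api/work/v1"
)

// Applier is an illustrative interface, not the actual work applier helper:
// it always applies the desired ManifestWork and reports whether the object
// on the hub actually changed.
type Applier interface {
	Apply(ctx context.Context, required *workv1.ManifestWork) (applied *workv1.ManifestWork, updated bool, err error)
}

// syncManifestWork demonstrates the "always apply, then check the returned
// updated value" pattern instead of comparing specs with
// equality.Semantic.DeepEqual, which can report spurious diffs between empty
// and nil fields.
func syncManifestWork(ctx context.Context, applier Applier, required *workv1.ManifestWork) (bool, error) {
	_, updated, err := applier.Apply(ctx, required)
	if err != nil {
		return false, err
	}
	return updated, nil
}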

}
// applied and Progressing conditions return status as Progressing
if apimeta.IsStatusConditionTrue(manifestWork.Status.Conditions, workv1.WorkApplied) ||
	apimeta.IsStatusConditionTrue(manifestWork.Status.Conditions, workv1.WorkProgressing) {
Member

do we have a WorkProgressing status right now?

Member Author

I don't think so, unless you can point me to one. I checked the manifestWork controllers and it sounds like there is no WorkProgressing status defined yet. I added a comment to clarify that.

	clsRolloutStatus.Status = clusterv1alpha1.Succeeded
}
// Degraded condition return status as Failed
if apimeta.IsStatusConditionTrue(manifestWork.Status.Conditions, workv1.WorkDegraded) {
Member

There is no Degraded status defined yet. We should add a note here.

Member Author

Yes, same here: there is no Degraded status defined. I considered the Applied condition with a False status as Failed, and added comments to clarify that.
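
Putting the two condition threads together, the condition-to-rollout-status mapping roughly comes down to the Go sketch below; the rolloutStatus function, the Available check, and the branch structure are assumptions for illustration, while the condition types and rollout states come from the discussion above:

package example

import (
	apimeta "k8s.io/apimachinery/pkg/api/meta"
	clusterv1alpha1 "open-cluster-management.io/api/cluster/v1alpha1"
	workv1 "open-cluster-management.io/api/work/v1"
)

// rolloutStatus sketches the mapping discussed above; it is not the exact
// function from the PR.
func rolloutStatus(mw *workv1.ManifestWork) clusterv1alpha1.RolloutStatus {
	conditions := mw.Status.Conditions
	status := clusterv1alpha1.ToApply

	// Applied and Progressing conditions report the rollout as Progressing.
	// Note: no WorkProgressing condition is set by the current manifestWork
	// controllers, so in practice only Applied triggers this branch.
	if apimeta.IsStatusConditionTrue(conditions, workv1.WorkApplied) ||
		apimeta.IsStatusConditionTrue(conditions, workv1.WorkProgressing) {
		status = clusterv1alpha1.Progressing
	}

	// An Available work is treated as Succeeded (assumed: the excerpt shows the
	// Succeeded assignment but not the exact condition guarding it).
	if apimeta.IsStatusConditionTrue(conditions, workv1.WorkAvailable) {
		status = clusterv1alpha1.Succeeded
	}

	// A Degraded condition reports the rollout as Failed; since no Degraded
	// condition is defined yet, Applied=False is also treated as Failed.
	if apimeta.IsStatusConditionTrue(conditions, workv1.WorkDegraded) ||
		apimeta.IsStatusConditionFalse(conditions, workv1.WorkApplied) {
		status = clusterv1alpha1.Failed
	}
	return status
}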

Signed-off-by: melserngawy <melserng@redhat.com>
Member
@qiujian16 qiujian16 left a comment

/approve

@haoqing0110 would you take a final look at this?

Contributor

openshift-ci bot commented Nov 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qiujian16, serngawy

The full list of commands accepted by this bot can be found here.

The pull request process is described here


Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Nov 1, 2023
@haoqing0110
Member

LGTM. My only concern is the overlapping clusters being counted twice in the summary message. This could be a follow-up fix once we receive more feedback.

@serngawy
Member Author

serngawy commented Nov 1, 2023

> LGTM. My only concern is the overlapping clusters being counted twice in the summary message. This could be a follow-up fix once we receive more feedback.

Yes, we decided to keep it as it is for now (the manifestWork will be associated with the latest placement) because either behavior can be valid. We will wait for feedback to see whether any changes are necessary.

@qiujian16
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Nov 2, 2023
@openshift-ci openshift-ci bot merged commit 35680c3 into open-cluster-management-io:main Nov 2, 2023
13 checks passed