Add work probe mode #77

Merged


qiujian16
Member

Signed-off-by: Jian Qiu <jqiu@redhat.com>

#74

Signed-off-by: Jian Qiu <jqiu@redhat.com>
@openshift-ci
Contributor

openshift-ci bot commented Feb 23, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qiujian16

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@qiujian16 changed the title from "Add work probe mode" to "[WIP] Add work probe mode" on Feb 23, 2022
@qiujian16
Member Author

adding tests.

@qiujian16
Member Author

/assign @yue9944882

1. add more test cases
2. add deployment prober

Signed-off-by: Jian Qiu <jqiu@redhat.com>
pkg/agent/inteface.go (outdated)
}

func (d *DeploymentProber) HealthCheck(identifier workapiv1.ResourceIdentifier, result workapiv1.StatusFeedbackResult) error {
	if len(result.Values) == 0 {
Member

Returning a non-nil error means the health-check controller will retry with the default backoff strategy, which IIRC backs off by only milliseconds for the first five or so attempts, while the default interval for refreshing the feedback values is 30s. I think zero-length feedback can be reasonable when the workloads are applied for the first time (the controller may take a short while to process them), so the health controller should retry the health check with a "lazier" strategy, e.g. a constant 10-second interval. Otherwise the error log will be spammed.
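
To make the retry concern concrete, here is a minimal sketch that compares an exponential per-item backoff with a constant interval. The 5ms base delay and 1000s cap are assumptions about the default workqueue backoff, not values taken from this PR. The 30s refresh interval and the 10s "lazy" interval come from the comment above. All function and constant names are hypothetical.

```go
package main

import "time"

const (
	defaultBase     = 5 * time.Millisecond // assumed base delay of the default backoff
	defaultMax      = 1000 * time.Second   // assumed cap of the default backoff
	feedbackRefresh = 30 * time.Second     // feedback refresh interval quoted in the review
	lazyInterval    = 10 * time.Second     // constant interval suggested in the review
)

// exponentialDelay returns base doubled once per previous failure, capped at max.
func exponentialDelay(base, max time.Duration, failures int) time.Duration {
	d := base
	for i := 0; i < failures; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	return d
}

// defaultBackoff is the assumed exponential strategy.
func defaultBackoff(failures int) time.Duration {
	return exponentialDelay(defaultBase, defaultMax, failures)
}

// lazyBackoff waits the same interval regardless of the failure count.
func lazyBackoff(_ int) time.Duration {
	return lazyInterval
}

// retriesWithin counts how many retries a strategy schedules inside window,
// assuming every attempt fails immediately.
func retriesWithin(window time.Duration, delay func(failures int) time.Duration) int {
	var elapsed time.Duration
	n := 0
	for {
		elapsed += delay(n)
		if elapsed > window {
			return n
		}
		n++
	}
}
```

Under these assumptions, `retriesWithin(feedbackRefresh, defaultBackoff)` counts 12 failing retries before the next feedback refresh, while `retriesWithin(feedbackRefresh, lazyBackoff)` counts 3.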

Member Author

yeah, this is a good point.

Member Author

We check the length of the values in the controller: if no value is returned, the condition is set to unknown. I don't think we need a resync, since a work status update will trigger another check.
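
The behavior described in this reply could look roughly like the sketch below. It uses simplified local stand-ins rather than the real `workapiv1` types, and the names `availability`, `readyReplicasCheck`, and `errNotReady` are hypothetical, so treat it as an illustration of the idea and not as the code merged in this PR.

```go
package main

import "errors"

// FeedbackValue and StatusFeedbackResult are simplified stand-ins for the
// work API types; the real types carry more fields.
type FeedbackValue struct {
	Name  string
	Value int64
}

type StatusFeedbackResult struct {
	Values []FeedbackValue
}

type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

var errNotReady = errors.New("no ready replicas reported")

// readyReplicasCheck is a hypothetical health check for a deployment.
func readyReplicasCheck(result StatusFeedbackResult) error {
	for _, v := range result.Values {
		if v.Name == "ReadyReplicas" && v.Value >= 1 {
			return nil
		}
	}
	return errNotReady
}

// availability maps a feedback result to a condition status. Empty feedback
// yields Unknown without returning an error, so nothing is requeued; the
// next work status update is expected to trigger a fresh evaluation.
func availability(result StatusFeedbackResult, check func(StatusFeedbackResult) error) (ConditionStatus, string) {
	if len(result.Values) == 0 {
		return ConditionUnknown, "no status feedback values reported yet"
	}
	if err := check(result); err != nil {
		return ConditionFalse, err.Error()
	}
	return ConditionTrue, "probe succeeded"
}
```

Because the empty case never reaches `check`, a freshly applied workload does not produce error logs or retries in this sketch.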

	return probeFields
}

func (d *DeploymentProber) HealthCheck(identifier workapiv1.ResourceIdentifier, result workapiv1.StatusFeedbackResult) error {
Member

The per-resource conditions include a condition named "StatusFeedbackSynced". What do you think about checking the overall condition before digging into the fields?

Member Author

There is a part that checks the condition before calling the check func: https://github.com/open-cluster-management-io/addon-framework/pull/77/files#diff-6cbce2e82f874dcc6e37cfb90ab66b42db275761808cca2abfd5c8b294d67e11R172. I will add a comment on that.
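
As a rough illustration of the ordering discussed in this thread, the sketch below checks the "StatusFeedbackSynced" condition first and only then inspects the feedback fields. The types are simplified local stand-ins for the work API, and `isFeedbackSynced`, `probe`, and the "ReadyReplicas" field name are assumptions made for the example, not the code linked above.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition, FeedbackValue and ManifestStatus are simplified stand-ins for
// the per-resource status reported by the work agent.
type Condition struct {
	Type   string
	Status string
}

type FeedbackValue struct {
	Name  string
	Value int64
}

type ManifestStatus struct {
	Conditions []Condition
	Values     []FeedbackValue
}

const statusFeedbackSynced = "StatusFeedbackSynced"

var errFeedbackNotSynced = errors.New("status feedback is not synced yet")

// isFeedbackSynced reports whether the StatusFeedbackSynced condition is True.
// A missing condition is treated as not synced.
func isFeedbackSynced(conds []Condition) bool {
	for _, c := range conds {
		if c.Type == statusFeedbackSynced {
			return c.Status == "True"
		}
	}
	return false
}

// probe checks the overall condition before looking at individual fields,
// so stale or partial feedback values are never interpreted.
func probe(m ManifestStatus) error {
	if !isFeedbackSynced(m.Conditions) {
		return errFeedbackNotSynced
	}
	for _, v := range m.Values {
		if v.Name != "ReadyReplicas" {
			continue
		}
		if v.Value >= 1 {
			return nil
		}
		return fmt.Errorf("readyReplicas is %d, want at least 1", v.Value)
	}
	return errors.New("readyReplicas is not reported in the feedback values")
}
```

Returning a distinct sentinel error for the unsynced case lets a caller tell "feedback not ready" apart from "workload unhealthy".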

Signed-off-by: Jian Qiu <jqiu@redhat.com>
Member

@yue9944882 left a comment

/lgtm

@openshift-ci bot added the lgtm label on Mar 8, 2022
@qiujian16 changed the title from "[WIP] Add work probe mode" to "Add work probe mode" on Mar 8, 2022
@openshift-merge-robot merged commit df7ea69 into open-cluster-management-io:main on Mar 8, 2022
3 participants