Add work probe mode #77
Signed-off-by: Jian Qiu <jqiu@redhat.com>
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: qiujian16
adding tests.
/assign @yue9944882
1. add more test cases
2. add deployment prober
Signed-off-by: Jian Qiu <jqiu@redhat.com>
}

func (d *DeploymentProber) HealthCheck(identifier workapiv1.ResourceIdentifier, result workapiv1.StatusFeedbackResult) error {
	if len(result.Values) == 0 {
Returning a non-nil error means the health-check controller retries with the default backoff strategy, which IIRC backs off by only milliseconds for the first five or so attempts, while the default interval for refreshing the feedback values is 30s. I think zero-length feedback can be reasonable when the workloads are applied for the first time (the controller may take a short while to process them), so the health controller should retry the health check with a "lazier" strategy, e.g. a constant 10-second interval. Otherwise the error log will be spammed.
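A minimal sketch of the "lazy" retry idea above, using only the standard library. The sentinel errors, the constants, and `requeueDelay` are all hypothetical names introduced for illustration and are not part of addon-framework; the backoff numbers are illustrative rather than the controller's real defaults.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinel errors; neither exists in addon-framework.
var (
	// errNoFeedback means no feedback values have been reported yet.
	errNoFeedback = errors.New("no status feedback values reported yet")
	// errProbeFailed stands in for any genuine health-check failure.
	errProbeFailed = errors.New("probe failed")
)

const (
	lazyRetryInterval = 10 * time.Second     // constant interval suggested in the review
	baseBackoff       = 5 * time.Millisecond // illustrative value
	maxBackoff        = 30 * time.Second     // illustrative value
)

// requeueDelay picks how long to wait before the next health check.
// attempt is the zero-based count of previous failed attempts.
func requeueDelay(err error, attempt int) time.Duration {
	switch {
	case err == nil:
		return 0 // healthy: nothing to retry
	case errors.Is(err, errNoFeedback):
		return lazyRetryInterval // feedback not there yet: retry lazily
	}
	if attempt < 0 {
		attempt = 0
	}
	if attempt > 12 { // baseBackoff<<13 already exceeds maxBackoff
		return maxBackoff
	}
	// Genuine failure: exponential backoff, capped at maxBackoff.
	if d := baseBackoff << uint(attempt); d < maxBackoff {
		return d
	}
	return maxBackoff
}

func main() {
	wrapped := fmt.Errorf("deployment foo: %w", errNoFeedback)
	fmt.Println(requeueDelay(wrapped, 0))        // 10s
	fmt.Println(requeueDelay(errProbeFailed, 3)) // 40ms
}
```

With a workqueue-based controller, the returned duration could be passed to a delayed requeue instead of the rate-limited one, so that missing feedback does not go through the millisecond-scale retries.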
yeah, this is a good point.
We check the length of the values in the controller. If no value is returned, we set the condition to unknown. I think we may not need a resync, since a work status update will trigger another check.
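As a rough sketch of that behavior: the snippet below maps empty feedback to an "Unknown" status instead of returning an error, so nothing is requeued. `FeedbackValue`, `ConditionStatus`, `checkReadyReplicas`, and `availableStatus` are simplified, hypothetical stand-ins, not the real `workapiv1` types or the controller code in this PR; the "ReadyReplicas" rule is an assumption for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// FeedbackValue is a simplified stand-in for one reported feedback value.
type FeedbackValue struct {
	Name  string
	Value string
}

// ConditionStatus mirrors the usual True/False/Unknown condition states.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

// checkReadyReplicas is a hypothetical check func: it reports healthy when
// the "ReadyReplicas" value is at least 1.
func checkReadyReplicas(values []FeedbackValue) error {
	for _, v := range values {
		if v.Name != "ReadyReplicas" {
			continue
		}
		n, err := strconv.Atoi(v.Value)
		if err != nil {
			return fmt.Errorf("parse ReadyReplicas %q: %w", v.Value, err)
		}
		if n < 1 {
			return fmt.Errorf("ready replicas is %d, want at least 1", n)
		}
		return nil
	}
	return errors.New("ReadyReplicas not found in feedback values")
}

// availableStatus turns feedback values into a condition status and message.
// Empty feedback yields Unknown rather than an error, so the caller has no
// reason to retry; the next work status update triggers a fresh check.
func availableStatus(values []FeedbackValue, check func([]FeedbackValue) error) (ConditionStatus, string) {
	if len(values) == 0 {
		return ConditionUnknown, "no status feedback values reported yet"
	}
	if err := check(values); err != nil {
		return ConditionFalse, err.Error()
	}
	return ConditionTrue, "probe succeeded"
}

func main() {
	status, msg := availableStatus(nil, checkReadyReplicas)
	fmt.Println(status, msg) // Unknown no status feedback values reported yet
}
```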
pkg/utils/probe_helper.go
Outdated
	return probeFields
}

func (d *DeploymentProber) HealthCheck(identifier workapiv1.ResourceIdentifier, result workapiv1.StatusFeedbackResult) error {
The per-resource conditions provide a status named "StatusFeedbackSynced". What do you think about checking that overall condition before digging into the fields?
There is already a part that checks the condition before calling the check func: https://github.com/open-cluster-management-io/addon-framework/pull/77/files#diff-6cbce2e82f874dcc6e37cfb90ab66b42db275761808cca2abfd5c8b294d67e11R172. I will add a comment on that.
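For readers following the thread, the condition-first ordering discussed here could look roughly like the sketch below. `Condition` and `isFeedbackSynced` are hypothetical, simplified stand-ins and not the code at the linked line; only the condition type name "StatusFeedbackSynced" comes from the discussion.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a per-resource status condition.
type Condition struct {
	Type   string
	Status string
}

// statusFeedbackSynced is the condition type named in the review.
const statusFeedbackSynced = "StatusFeedbackSynced"

// isFeedbackSynced reports whether the StatusFeedbackSynced condition is
// present and True. Callers would only inspect individual feedback fields
// when this returns true.
func isFeedbackSynced(conditions []Condition) bool {
	for _, c := range conditions {
		if c.Type == statusFeedbackSynced {
			return c.Status == "True"
		}
	}
	return false // condition absent: treat feedback as not yet synced
}

func main() {
	conds := []Condition{
		{Type: "Applied", Status: "True"},
		{Type: statusFeedbackSynced, Status: "False"},
	}
	if !isFeedbackSynced(conds) {
		fmt.Println("feedback not synced; skipping field checks")
	}
}
```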
Signed-off-by: Jian Qiu <jqiu@redhat.com>
/lgtm
Signed-off-by: Jian Qiu <jqiu@redhat.com>
#74