-
Notifications
You must be signed in to change notification settings - Fork 239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix metrics collector panic #949
fix metrics collector panic #949
Conversation
Skipping CI for Draft Pull Request. |
@@ -98,6 +99,11 @@ func (csi CSICollector) Collect(ch chan<- prometheus.Metric) { | |||
} | |||
|
|||
func execute(name string, c Collector, ch chan<- prometheus.Metric) { | |||
defer func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just stop collector when summary collector panic, Retry is pointless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The summary collector currently will not panic anymore after fix.
Abount this snippet, just in case any individual collector panic occurred, the execute
should recover from panic, allowing the metrics server to maintain accurate reporting of all other metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha
Some metrics are missing in stats summary of low-version kubelet
5156ac2
to
e8d5690
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mowangdk, tydra-wang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Cherry pick #949 fix kubelet stats summary collector panic
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
kubelet stats summary collector may panic in k8s 1.20 version.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: