OCPBUGS-12210: Prevent partially filled HPA behaviors from crashing kube-controller-manager #1876
Conversation
@jkyros: This pull request references Jira Issue OCPBUGS-12210, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@jkyros: the contents of this pull request could be automatically validated. The following commits are valid:
/jira refresh
@jkyros: This pull request references Jira Issue OCPBUGS-12210, which is invalid:
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/lgtm
/lgtm I am fine with this approach
the referenced upstream doesn't exist: 11358
Force-pushed from a98f50a to 0ef37f6
@jkyros: the contents of this pull request could be automatically validated. The following commits are valid:
Fixed, I'm a clown, I left out the 4, should have been
@@ -449,6 +450,20 @@ func Convert_v1_HorizontalPodAutoscaler_To_autoscaling_HorizontalPodAutoscaler(i
// drop round-tripping annotations after converting to internal
out.Annotations, _ = autoscaling.DropRoundTripHorizontalPodAutoscalerAnnotations(out.Annotations)

// Until kube 1.27 we're still storing autoscaling as v1, but the HPA controller is consuming it as v2. Behaviors are an annotation in v1 and can be partially
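To make "partially filled" concrete: a minimal, self-contained sketch (behaviorFromAnnotation is a hypothetical helper, not the actual conversion code) of how a ScaleDown-only value in the autoscaling.alpha.kubernetes.io/behavior annotation decodes into a behavior whose ScaleUp is nil:

package main

import (
	"encoding/json"
	"fmt"

	autoscalingv2 "k8s.io/api/autoscaling/v2"
)

// behaviorFromAnnotation is a hypothetical stand-in for the conversion step that
// decodes the behavior annotation off a v1 object. A partially-filled value (only
// ScaleDown, say) produces a struct whose ScaleUp pointer is left nil.
func behaviorFromAnnotation(val string) (*autoscalingv2.HorizontalPodAutoscalerBehavior, error) {
	var b autoscalingv2.HorizontalPodAutoscalerBehavior
	if err := json.Unmarshal([]byte(val), &b); err != nil {
		return nil, err
	}
	return &b, nil
}

func main() {
	// Only ScaleDown supplied, as in the crasher manifest later in this thread.
	raw := `{"ScaleDown":{"StabilizationWindowSeconds":600,"SelectPolicy":"Max","Policies":[{"type":"Pods","value":1,"periodSeconds":1}]}}`
	b, err := behaviorFromAnnotation(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("ScaleUp is nil:", b.ScaleUp == nil) // true: nothing has defaulted it yet
}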
what does this do to the annotations in question in a flow like
- write object to API/etcd, scaleup gets set
- read object as v1, do annotations make sense?
They get converted back out properly, but I assume as part of the round-trip to the "internal" version when the controller updates it as v2 (I didn't think it round-tripped v1 -> internal -> v1 on the way in on initial creation, but maybe I'm wrong and it does).
e.g. if I omit ScaleUp:
cat << EOF | oc create -f -
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: crasher
namespace: test
labels:
app: test
annotations:
autoscaling.alpha.kubernetes.io/behavior: '{"ScaleDown":{"StabilizationWindowSeconds":600,"SelectPolicy":"Max","Policies":[{"type":"Pods","value":1,"periodSeconds":1}]}}'
spec:
scaleTargetRef:
kind: Deployment
name: test
apiVersion: apps/v1
minReplicas: 8
maxReplicas: 25
targetCPUUtilizationPercentage: 120
EOF
When I ask for it back, ScaleUp in the autoscaling.alpha.kubernetes.io/behavior annotation has been filled in by the default value.
[jkyros@jkyros-thinkpadp1gen5 ocpbugs-12210]$ oc get hpa.v1.autoscaling -n test crasher -o yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
annotations:
autoscaling.alpha.kubernetes.io/behavior: '{"ScaleUp":{"StabilizationWindowSeconds":0,"SelectPolicy":"Max","Policies":[{"Type":"Pods","Value":4,"PeriodSeconds":15},{"Type":"Percent","Value":100,"PeriodSeconds":15}]},"ScaleDown":{"StabilizationWindowSeconds":600,"SelectPolicy":"Max","Policies":[{"Type":"Pods","Value":1,"PeriodSeconds":1}]}}'
autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2024-01-31T20:54:22Z","reason":"ScaleDownStabilized","message":"recent
recommendations were higher than current one, applying the highest recent recommendation"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2024-01-31T20:54:22Z","reason":"ValidMetricFound","message":"the
HPA was able to successfully calculate a replica count from cpu resource utilization
(percentage of request)"},{"type":"ScalingLimited","status":"False","lastTransitionTime":"2024-01-31T20:54:22Z","reason":"DesiredWithinRange","message":"the
desired count is within the acceptable range"}]'
autoscaling.alpha.kubernetes.io/current-metrics: '[{"type":"Resource","resource":{"name":"cpu","currentAverageUtilization":0,"currentAverageValue":"0"}}]'
converted.jkyros.io: autoscaling -> v1
creationTimestamp: "2024-01-31T20:54:20Z"
labels:
app: test
name: crasher
namespace: test
resourceVersion: "92141"
uid: 3b0646ec-ed98-4b2f-a989-8bc201d7a56f
spec:
maxReplicas: 25
minReplicas: 8
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: test
targetCPUUtilizationPercentage: 120
status:
currentCPUUtilizationPercentage: 0
currentReplicas: 8
desiredReplicas: 8
TL;DR yep the converters work, the annotations make sense
where does the crash happen? is it in conversion code? if the failure happens in the kube-controller-manager, I'd rather see a kube-controller-manager patch because the blast radius is much smaller and doesn't have any persistent effect in the cluster.
Force-pushed from 0ef37f6 to c302002
@jkyros: the contents of this pull request could be automatically validated. The following commits are valid:
The description in the autoscaling API for the HorizontalPodAutoscaler suggests that HorizontalPodAutoscalerSpec's Behavior field (and its ScaleUp and ScaleDown members) are optional, and that if they are not supplied, defaults will be used. That's true if the entire Behavior is nil, because we go through "normalizeDesiredReplicas" instead of "normalizeDesiredReplicasWithBehaviors", but if the structure is only partially supplied, leaving some members nil, it results in nil dereferences when we end up going through normalizeDesiredReplicasWithBehaviors. So we end up in a situation where:
- If Behavior is entirely absent (nil), we use defaults (good)
- If Behavior is partially specified, we panic (very bad)
- If stabilizationWindowSeconds is nil in either ScaleUp or ScaleDown, we panic (also very bad)
In general, this only happens with pre-v2 HPA objects because v2 does properly fill in the default values. This commit prevents the panic by using the defaulters to ensure that unpopulated fields in the behavior objects get filled in with their defaults before they are used, so they can safely be dereferenced by later code that performs calculations on them.
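As a rough illustration of the defaulting that commit message describes (this is not the actual patch; ensureBehaviorDefaults is a hypothetical helper, the scale-up defaults are copied from the round-tripped annotation shown earlier, and the scale-down stabilization window really comes from controller configuration):

package main

import (
	"fmt"

	autoscalingv2 "k8s.io/api/autoscaling/v2"
)

// ensureBehaviorDefaults sketches the shape of the fix: fill any nil pieces of a
// partially-specified behavior so that later code can dereference
// StabilizationWindowSeconds and iterate Policies without panicking.
func ensureBehaviorDefaults(b *autoscalingv2.HorizontalPodAutoscalerBehavior) {
	if b == nil {
		return // a fully-absent behavior takes the normalizeDesiredReplicas path and is fine
	}
	selectMax := autoscalingv2.MaxChangePolicySelect
	if b.ScaleUp == nil {
		b.ScaleUp = &autoscalingv2.HPAScalingRules{
			SelectPolicy: &selectMax,
			Policies: []autoscalingv2.HPAScalingPolicy{
				// Default scale-up policies, matching the round-tripped annotation above.
				{Type: autoscalingv2.PodsScalingPolicy, Value: 4, PeriodSeconds: 15},
				{Type: autoscalingv2.PercentScalingPolicy, Value: 100, PeriodSeconds: 15},
			},
		}
	}
	if b.ScaleDown == nil {
		b.ScaleDown = &autoscalingv2.HPAScalingRules{SelectPolicy: &selectMax}
	}
	// Even when a rules block was supplied, StabilizationWindowSeconds may still be nil.
	if b.ScaleUp.StabilizationWindowSeconds == nil {
		zero := int32(0)
		b.ScaleUp.StabilizationWindowSeconds = &zero
	}
	if b.ScaleDown.StabilizationWindowSeconds == nil {
		window := int32(300) // illustrative stand-in for the controller's configured default
		b.ScaleDown.StabilizationWindowSeconds = &window
	}
}

func main() {
	// Partially specified: ScaleDown present but empty, ScaleUp missing entirely.
	b := &autoscalingv2.HorizontalPodAutoscalerBehavior{ScaleDown: &autoscalingv2.HPAScalingRules{}}
	ensureBehaviorDefaults(b)
	fmt.Println(*b.ScaleUp.StabilizationWindowSeconds, *b.ScaleDown.StabilizationWindowSeconds) // safe to dereference now
}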
Force-pushed from c302002 to 0e0f66f
@jkyros: the contents of this pull request could be automatically validated. The following commits are valid:
Eads and I talked over Slack (thanks David). In summary, since the crash is in the controller, not the converter, we'd prefer to fix it in the controller to reduce the blast radius, since this fix is only for two old versions. I've retooled this PR such that:
This does come with the side effect that if the controller modifies the object, it gets written back out with the missing values filled in, but that seems better to me than refusing to scale until someone tediously fills them in?
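To make the controller-side placement concrete, here is a minimal sketch under the assumption that the guard runs right before the behavior-based normalization path (computeDesired and its defaulting are hypothetical stand-ins, not the actual controller code):

package main

import (
	"fmt"

	autoscalingv2 "k8s.io/api/autoscaling/v2"
)

// computeDesired is a heavily simplified stand-in for the controller step that picks
// between the two normalization paths named in the commit message. Only the nil
// handling matters here; everything else is stubbed out.
func computeDesired(behavior *autoscalingv2.HorizontalPodAutoscalerBehavior, desired int32) int32 {
	if behavior == nil {
		// normalizeDesiredReplicas path: no behavior fields to dereference, always safe.
		return desired
	}
	// normalizeDesiredReplicasWithBehaviors path: default anything it is about to
	// dereference, which is the controller-side guard described above.
	if behavior.ScaleUp == nil {
		behavior.ScaleUp = &autoscalingv2.HPAScalingRules{}
	}
	if behavior.ScaleUp.StabilizationWindowSeconds == nil {
		zero := int32(0)
		behavior.ScaleUp.StabilizationWindowSeconds = &zero
	}
	// ...same treatment for ScaleDown, then the real path does its calculations...
	_ = *behavior.ScaleUp.StabilizationWindowSeconds // would panic here without the guard
	return desired
}

func main() {
	// Only ScaleDown supplied, mirroring the crasher manifest earlier in the thread.
	b := &autoscalingv2.HorizontalPodAutoscalerBehavior{ScaleDown: &autoscalingv2.HPAScalingRules{}}
	fmt.Println("desired replicas:", computeDesired(b, 8))
}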
@jkyros: This pull request references Jira Issue OCPBUGS-12210, which is invalid:
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/test e2e-aws-ovn-serial
Agreed, thank you /approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: aravindhp, deads2k, jkyros, joelsmith. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/label backport-risk-assessed
@joelsmith: Can not set label backport-risk-assessed: Must be member in one of these teams: [openshift-staff-engineers] In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/label backport-risk-assessed
@weinliu I know this isn't an actual "cherry pick" since we're stuffing it straight into 4.13, and it's not in a part of the repo that we "own", but would you be able to take a look here for the
/label cherry-pick-approved
@weinliu: Can not set label cherry-pick-approved: Must be member in one of these teams: [openshift-staff-engineers] In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@sunilcio could you help?
/label cherry-pick-approved
Merged 8f85140 into openshift:release-4.13
@jkyros: Jira Issue OCPBUGS-12210: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-12210 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[ART PR BUILD NOTIFIER] This PR has been included in build openshift-enterprise-pod-container-v4.13.0-202402060538.p0.g8f85140.assembly.stream for distgit openshift-enterprise-pod.
Thanks everyone! I'm only going to take this back one more to 4.12 (since it's EUS):
@jkyros: #1876 failed to apply on top of branch "release-4.12":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Fix included in accepted release 4.13.0-0.nightly-2024-02-06-120750 |
The short version here is that: if the HPA behaviors are only partially filled in (e.g. ScaleUp but not ScaleDown, etc.) in kube < 1.27, it will send the kube-controller-manager into CrashLoopBackOff.
This PR:
- Defaults any nil behaviors when converting from v1 -> internal
Upstream details:
Here is a straightforward crasher (you might have to wait a little bit until the HPA touches it, but you should be able to see kube-controller-manager pods go into CrashLoopBackoff):
Fixes: OCPBUGS-12210