
Continuous HPA updates with CPU Utilization trigger #5821

Closed
uucloud opened this issue May 23, 2024 · 3 comments
Labels
bug (Something isn't working), stale (All issues that are marked as stale due to inactivity)

Comments

@uucloud

uucloud commented May 23, 2024

Report

When I submit a ScaledObject that includes both a CPU utilization trigger and other resource triggers, the KEDA operator may continuously update the HPA and never stop.

Expected Behavior

Only one update occurs

Actual Behavior

Continuous HPA updates

Steps to Reproduce the Problem

  1. Create a ScaledObject with both CPU utilization triggers and other resource triggers.
  2. Ensure the CPU utilization trigger is not the last one in the ScaledObject.
  3. Use a Kubernetes cluster with a version below 1.27 (e.g., 1.26).
  4. Observe the continuous triggering of "Found difference in the HPA spec according to ScaledObject" by the KEDA operator.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: uucloud-test-so
  namespace: default
spec:
  scaleTargetRef:
    name: uucloud-test  # your deployment
  minReplicaCount: 1
  maxReplicaCount: 2
  triggers:
    - metadata:
        value: "50"
      metricType: Utilization
      type: cpu
    - metadata:
        value: "50"
      metricType: Utilization
      type: memory
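The update loop can be watched from the command line. A sketch, assuming the example ScaledObject above, a default KEDA install in the `keda` namespace, and KEDA's `keda-hpa-<scaledobject-name>` HPA naming convention:

```shell
# Apply the example ScaledObject (save the manifest above as scaledobject.yaml).
kubectl apply -f scaledobject.yaml

# Watch the managed HPA: its resourceVersion keeps climbing as KEDA rewrites it.
kubectl get hpa keda-hpa-uucloud-test-so -n default -w

# Or follow the operator log for the repeated diff message.
kubectl logs -n keda deploy/keda-operator -f | grep "Found difference in the HPA spec"
```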

Logs from KEDA operator

No response

KEDA Version

2.12.1

Kubernetes Version

< 1.27

Scaler Details

Resource

Anything else?

This issue is fundamentally the same as the one encountered in kubernetes/kubernetes#74099. The root cause is that Kubernetes reorders spec.metrics: the HPA v1 conversion logic changes the position of the CPU utilization metric.

When creating or updating an HPA, the conversion logic in these segments of code (link1 and link2) converts the first CPU utilization metric into the HPA v1 representation (if there are multiple CPU utilization triggers, the others are lost) and stores the remaining metrics in annotations. When converting back from HPA v1 to HPA, it appends the CPU utilization metric to the end (link3).

This results in a situation where, if the ScaledObject has multiple resource triggers and one of them is a CPU utilization trigger, the final HPA will always have the CPU utilization trigger at the end. Additionally, if there are multiple CPU utilization triggers, only one will remain (though having multiple CPU utilization triggers in one HPA configuration might seem to have little practical value...). This causes the KEDA operator to continuously detect differences and persistently update the HPA.
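The round-trip described above can be simulated in a few lines of plain Go (a sketch of the conversion behavior, not KEDA's or Kubernetes' actual code; `metric` and `roundTripV1` are illustrative names):

```go
package main

import "fmt"

// metric stands in for an HPA v2 metric spec; only the name matters here.
type metric struct{ Name string }

// roundTripV1 simulates converting a v2 metrics list to autoscaling/v1
// and back: the first CPU utilization metric becomes the v1
// targetCPUUtilizationPercentage field (extra CPU metrics are dropped),
// the rest are stashed in an annotation, and converting back to v2
// appends the CPU metric at the end of the list.
func roundTripV1(metrics []metric) []metric {
	var cpu *metric
	var rest []metric
	for i, m := range metrics {
		if m.Name == "cpu" {
			if cpu == nil {
				cpu = &metrics[i] // first CPU metric becomes the v1 field
			}
			continue // any additional CPU metrics are lost
		}
		rest = append(rest, m)
	}
	if cpu != nil {
		rest = append(rest, *cpu) // CPU metric comes back at the end
	}
	return rest
}

func main() {
	want := []metric{{"cpu"}, {"memory"}} // order KEDA derives from the ScaledObject
	got := roundTripV1(want)
	fmt.Println(got) // [{memory} {cpu}] -- order differs from want
	// Every reconcile: KEDA writes [cpu memory], the API server stores
	// [memory cpu], KEDA diffs the two and updates again -> endless loop.
}
```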

In Kubernetes 1.27 and later, this issue is resolved because the autoscaling v1 schema is deprioritized behind v2, meaning it no longer defaults to converting to HPA v1. The relevant change is shown below:

diff --git a/pkg/apis/autoscaling/install/install.go b/pkg/apis/autoscaling/install/install.go
index 3740aee3155..424fc5ce85d 100644
--- a/pkg/apis/autoscaling/install/install.go
+++ b/pkg/apis/autoscaling/install/install.go
@@ -40,6 +40,5 @@ func Install(scheme *runtime.Scheme) {
        utilruntime.Must(v2.AddToScheme(scheme))
        utilruntime.Must(v2beta1.AddToScheme(scheme))
        utilruntime.Must(v1.AddToScheme(scheme))
-       // TODO: move v2 to the front of the list in 1.24
-       utilruntime.Must(scheme.SetVersionPriority(v1.SchemeGroupVersion, v2.SchemeGroupVersion, v2beta1.SchemeGroupVersion, v2beta2.SchemeGroupVersion))
+       utilruntime.Must(scheme.SetVersionPriority(v2.SchemeGroupVersion, v1.SchemeGroupVersion, v2beta1.SchemeGroupVersion, v2beta2.SchemeGroupVersion))
@JorTurFer
Member

Hello,
Thanks for reporting this. Just to understand the issue: this affects k8s 1.26 or below, and 1.27 has already fixed it, right? We currently only support >= 1.27 officially. Personally I have no problem with fixing this if it's easy, but I'd like to know @tomkerkhove's and @zroubalik's thoughts


stale bot commented Jul 26, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label on Jul 26, 2024

stale bot commented Aug 2, 2024

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Aug 2, 2024