
Karpenter constantly scales-up for a single pod #2821

Closed · jonathan-innis opened this issue Nov 8, 2022 · 4 comments
Assignee: jonathan-innis
Labels: bug (Something isn't working)

@jonathan-innis (Contributor) commented Nov 8, 2022

Version

Karpenter Version: v0.16.3

Expected Behavior

Karpenter should only provision a single node for the pod that needs to be scheduled on the cluster.

Actual Behavior

Karpenter provisions a node, but the pod never schedules to it, so Karpenter continually provisions new nodes every 20s.

Steps to Reproduce the Problem

Unclear what the repro steps are at this time.

Resource Specs and Logs

Deployment Spec

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  labels:
    imageTag: "7a3e144"
    app: app
    chart: apiHelm-0.1.0
    heritage: "Helm"
    release: "app"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      annotations:
        checksum/config: 95ad22c90ba988c45e8501e4be7583b090ba71aaa3dc7ff1c6517b472dd1f29f
        checksum/external-secrets: 5bf89530b7f9a34d97c64e8149f5ed441fe1a01ae708582c06a1a09c55ef81a8
      labels:
        app: app
        release: "app"
    spec:
      serviceAccountName: eng-18041-app
      containers:
      - name: app
        image: "*********.dkr.ecr.ap-southeast-1.amazonaws.com/app:7a3e144"
        imagePullPolicy: Always
        command:
        - "dumb-init"
        - "bundle"
        - "exec"
        - "rails"
        - "s"
        - "-b"
        - "[::]"
        envFrom:
        - configMapRef:
            name: "app-config"
        - secretRef:
            name: app-parameter
        ports:
        - name: http
          containerPort: 3000
          protocol: TCP
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 500m
            memory: 512Mi
        readinessProbe:
          httpGet:
            path: /health_check
            port: http
          initialDelaySeconds: 0
          periodSeconds: 1
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
      nodeSelector:
        kubernetes.io/arch: "amd64"
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: "kubernetes.io/hostname"
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: app

[Four screenshots attached: Nov 3, 2022 1:06 PM, 1:53 PM (x2); Nov 4, 2022 3:06 AM]

jonathan-innis added the bug (Something isn't working) label on Nov 8, 2022
jonathan-innis self-assigned this on Nov 8, 2022
@ellistarn (Contributor) commented Nov 8, 2022

Can we add an integration test that enforces this? A runaway-scaling test? Maybe a separate suite?

@jonathan-innis (Contributor, Author) commented:

> Can we add an integration test that enforces this? A runaway-scaling test? Maybe a separate suite?

This fits into the category of chaos/failure testing to me. I was planning to create a chaos-testing Describe block in E2ETesting.

@jonathan-innis (Contributor, Author) commented:

I was able to reproduce a hypothesis for the runaway scaling issue on Karpenter v0.16.3 by continually tainting the node after launch with the following script:

# Continually re-taint every Karpenter-provisioned node so the pending pod can never schedule to it.
# --no-headers keeps the column header out of the pipeline; --overwrite makes repeated tainting idempotent.
while true
do
    kubectl get nodes --no-headers --selector karpenter.sh/provisioner-name | cut -d " " -f 1 | xargs -I "{}" kubectl taint node {} special=true:NoExecute --overwrite
    sleep 1
done

This issue was resolved by #2614, which removed the stabilization window that would have prevented empty node removal during the infinite scale-up. This change will be released as part of v0.19.0 and should mitigate the issue.

Provisioners that were using ttlSecondsAfterEmpty would not have been impacted by this runaway scale-up, since ttlSecondsAfterEmpty involves no stabilization window when nodes are considered for deletion.

It's worth noting that without enabling either ttlSecondsAfterEmpty or consolidation.enabled, this issue persists for nodes that receive taints Karpenter is unaware of after node launch.
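
For reference, a minimal sketch of a v1alpha5 Provisioner with empty-node cleanup enabled; the provisioner name and TTL value are placeholders for illustration, not values taken from this cluster:

# Hypothetical Provisioner showing the two cleanup options mentioned above.
# ttlSecondsAfterEmpty and consolidation are mutually exclusive; either one
# lets Karpenter reclaim the empty nodes it launches during the runaway scale-up.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default                # placeholder name
spec:
  requirements:
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
  ttlSecondsAfterEmpty: 30     # remove nodes that stay empty for 30 seconds
  # consolidation:
  #   enabled: true            # alternative to ttlSecondsAfterEmpty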

@jonathan-innis (Contributor, Author) commented:

Tracked down the issue: the problem is that max-pods == 110 is set in the userData, but Karpenter is unaware of that value. Karpenter does its scheduling calculations based on ENI_LIMITED_POD_DENSITY, which for the instance type it launches (t4g.small) is 11 (see https://karpenter.sh/v0.18.1/aws/instance-types/#t4gsmall). That pod count is used to calculate kube-reserved for the node being launched, so Karpenter assumes the node will have a larger allocatable capacity than it actually does (Bottlerocket calculates allocatable based on the max-pods value of 110). The node therefore comes up with less allocatable capacity than Karpenter expected, the pending pod still does not fit, and Karpenter launches yet another node, repeating indefinitely.

The solution here is to migrate the max-pods and cluster-dns-ip settings from the userData to spec.kubeletConfiguration on the Provisioner so that Karpenter is aware of them.
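
For illustration, a minimal sketch of what that Provisioner change could look like, assuming the Karpenter version in use exposes maxPods and clusterDNS under spec.kubeletConfiguration; the cluster DNS IP and provisioner name below are placeholders rather than values from this cluster:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default                   # placeholder name
spec:
  kubeletConfiguration:
    maxPods: 110                  # previously passed as max-pods in userData
    clusterDNS: ["10.100.0.10"]   # placeholder value; previously cluster-dns-ip in userData
  requirements:
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]

With these values declared on the Provisioner, Karpenter's kube-reserved and allocatable calculations use the same pod density the node actually boots with, so the launched capacity matches the scheduling simulation and the repeated scale-up stops.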
