
karpenter scale up new node with EFS #5691

Closed
Mieszko96 opened this issue Feb 19, 2024 · 7 comments
Labels
question Further information is requested

Comments

@Mieszko96

Description

Observed Behavior:
We have a cluster with 1 node at its full pod limit (58/58), but I believe the same problem would occur with more nodes.
We are installing additional pods that use EFS volumes, but the pods fail because of a race condition with the EFS CSI driver on new nodes; related ticket: kubernetes-sigs/aws-efs-csi-driver#1069

The EFS CSI driver already documents a workaround for this bug using a node startup taint:
https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/docs/README.md#configure-node-startup-taint

So I added this taint to the Karpenter NodePool:

      taints:
        - key: efs.csi.aws.com/agent-not-ready
          effect: NoExecute

With this taint in place, Karpenter will not scale up a new node: the pending pod does not tolerate the taint, so Karpenter never nominates a new node for it.

The pod just waits with a FailedScheduling event (screenshot).

Expected Behavior:
Some way to trigger creation of an extra node before the maximum number of pods is reached, or some other idea.

Reproduction Steps (Please include YAML):

Versions:

  • Chart Version:
  • Kubernetes Version (kubectl version):
@Mieszko96 Mieszko96 added bug Something isn't working needs-triage Issues that need to be triaged labels Feb 19, 2024
@jmdeal
Contributor

jmdeal commented Feb 19, 2024

You should use a startupTaint here; that lets Karpenter know that the taint will eventually be removed once the node is ready for pods to schedule. An example with a Cilium taint: example.
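For reference, a minimal sketch of what this could look like in a karpenter.sh/v1beta1 NodePool (the pool name below is hypothetical; the taint key follows the EFS docs linked above):

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default   # hypothetical name
spec:
  template:
    spec:
      # startupTaints tell Karpenter the taint is temporary: it still nominates
      # pending pods to the new node and expects the taint to be removed later
      # (here, by the EFS CSI driver once it is ready on the node).
      startupTaints:
        - key: efs.csi.aws.com/agent-not-ready
          effect: NoExecute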

@jmdeal jmdeal added question Further information is requested and removed bug Something isn't working needs-triage Issues that need to be triaged labels Feb 19, 2024
@Mieszko96
Author

@jmdeal

Hey, I tested using a startupTaint as well, forgot to mention it.

With:

startupTaints:
  - key: efs.csi.aws.com/agent-not-ready
    effect: NoExecute

it is not working, but I think that part is more of a ticket for EFS.
It was spawning a node, but the pod that requires EFS was still not working (some race condition).

So the question remains whether there is a way to spawn a node before the pod limit is reached.

@jmdeal
Contributor

jmdeal commented Feb 19, 2024

Was the pod being scheduled? Was the EFS driver removing the startupTaint? If there are any more details about this race you could give, hopefully we can definitively say whether it's on Karpenter or the efs-csi-driver.

@Mieszko96
Author

"Was the EFS driver removing the startupTaint?"

i assume yes, cuz it is set in nodeclaim, and right now node don't have any taints.
karpenter version v0.32.1
efs helmchart 2.5.5

kubectl get pods

update-permissionx-a0c6988c   update-permissionx-a0c6988c-rb-58d85c4d66-8g476                0/1     Init:CrashLoopBackOff   7 (4m27s ago)   19m

describe

  Warning  FailedScheduling  20m                   default-scheduler  0/1 nodes are available: 1 Too many pods. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod..
  Normal   Nominated         20m                   karpenter          Pod should schedule on: nodeclaim/hxps-66ae7155-6992-40d7-a887-ae5a746c5c25-dev-mseries-btnkm
  Normal   Scheduled         19m                   default-scheduler  Successfully assigned update-permissionx-a0c6988c/update-permissionx-a0c6988c-rb-58d85c4d66-8g476 to ip-10-1-45-204.ec2.internal

kubectl get nodes -o json | jq '.items[].spec.taints'
null
null

kubectl get nodeclaim hxps-66ae7155-6992-40d7-a887-ae5a746c5c25-dev-mseries-btnkm -o yaml

apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
metadata:
  annotations:
    karpenter.k8s.aws/ec2nodeclass-hash: "9341502479897301960"
    karpenter.k8s.aws/tagged: "true"
    karpenter.sh/managed-by: hxps-66ae7155-6992-40d7-a887-ae5a746c5c25-dev
    karpenter.sh/nodepool-hash: "5584896928084822447"
  creationTimestamp: "2024-02-19T23:06:20Z"
  finalizers:
  - karpenter.sh/termination
  generateName: hxps-66ae7155-6992-40d7-a887-ae5a746c5c25-dev-mseries-
  generation: 1
  labels:
    karpenter.k8s.aws/instance-category: m
    karpenter.k8s.aws/instance-cpu: "4"
    karpenter.k8s.aws/instance-encryption-in-transit-supported: "true"
    karpenter.k8s.aws/instance-family: m6a
    karpenter.k8s.aws/instance-generation: "6"
    karpenter.k8s.aws/instance-hypervisor: nitro
    karpenter.k8s.aws/instance-memory: "16384"
    karpenter.k8s.aws/instance-network-bandwidth: "1562"
    karpenter.k8s.aws/instance-size: xlarge
    karpenter.sh/capacity-type: on-demand
    karpenter.sh/nodepool: hxps-66ae7155-6992-40d7-a887-ae5a746c5c25-dev-mseries
    kubernetes.io/arch: amd64
    kubernetes.io/os: linux
    node.kubernetes.io/instance-type: m6a.xlarge
    topology.kubernetes.io/region: us-east-1
    topology.kubernetes.io/zone: us-east-1c
  name: hxps-66ae7155-6992-40d7-a887-ae5a746c5c25-dev-mseries-btnkm
  ownerReferences:
  - apiVersion: karpenter.sh/v1beta1
    blockOwnerDeletion: true
    kind: NodePool
    name: hxps-66ae7155-6992-40d7-a887-ae5a746c5c25-dev-mseries
    uid: b48361b2-a4b3-4f01-8901-6c38150366cb
  resourceVersion: "67460512"
  uid: 873e0425-43d3-4d5a-9982-2bfae8078e1c
spec:
  kubelet:
    maxPods: 58
  nodeClassRef:
    apiVersion: karpenter.k8s.aws/v1beta1
    kind: EC2NodeClass
    name: bottlerocket
  requirements:
  - key: karpenter.k8s.aws/instance-size
    operator: NotIn
    values:
    - large
    - medium
    - micro
    - nano
    - small
  - key: karpenter.sh/capacity-type
    operator: In
    values:
    - on-demand
  - key: karpenter.sh/nodepool
    operator: In
    values:
    - hxps-66ae7155-6992-40d7-a887-ae5a746c5c25-dev-mseries
  - key: karpenter.k8s.aws/instance-family
    operator: In
    values:
    - m6a
    - m7a
  - key: node.kubernetes.io/instance-type
    operator: In
    values:
    - m6a.12xlarge
    - m6a.16xlarge
    - m6a.24xlarge
    - m6a.2xlarge
    - m6a.32xlarge
    - m6a.48xlarge
    - m6a.4xlarge
    - m6a.8xlarge
    - m6a.metal
    - m6a.xlarge
    - m7a.12xlarge
    - m7a.16xlarge
    - m7a.24xlarge
    - m7a.2xlarge
    - m7a.32xlarge
    - m7a.48xlarge
    - m7a.4xlarge
    - m7a.8xlarge
    - m7a.metal-48xl
    - m7a.xlarge
  resources:
    requests:
      cpu: 210m
      memory: 240Mi
      pods: "7"
  startupTaints:
  - effect: NoExecute
    key: efs.csi.aws.com/agent-not-ready
status:
  allocatable:
    cpu: 3920m
    ephemeral-storage: 26Gi
    memory: 14162Mi
    pods: "58"
    vpc.amazonaws.com/pod-eni: "18"
  capacity:
    cpu: "4"
    ephemeral-storage: 30Gi
    memory: 15155Mi
    pods: "58"
    vpc.amazonaws.com/pod-eni: "18"
  conditions:
  - lastTransitionTime: "2024-02-19T23:06:55Z"
    status: "True"
    type: Initialized
  - lastTransitionTime: "2024-02-19T23:06:23Z"
    status: "True"
    type: Launched
  - lastTransitionTime: "2024-02-19T23:06:55Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2024-02-19T23:06:40Z"
    status: "True"
    type: Registered
  imageID: ami-0c15e6a43cf8f3491
  nodeName: ip-10-1-45-204.ec2.internal
  providerID: aws:///us-east-1c/i-07da998b4f276732b

@jmdeal
Contributor

jmdeal commented Feb 19, 2024

If a node is being provisioned and the pod is being scheduled, I'm going to wager the problem is on the EFS side. Without pod logs it's of course impossible to tell for sure; it may be good to take this up with the EFS folks.

@Mieszko96
Author

@jmdeal
I made a small bash script to see how the taints were changing on a brand new node,

and it seems that EFS/Karpenter removed this taint too early.
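For context, a minimal sketch of the kind of polling loop used here (hypothetical reconstruction; the actual script was not posted, and kubectl plus jq are assumed to be available):

#!/usr/bin/env bash
# Print an iteration counter, then the taints of every node, so the moment
# the efs.csi.aws.com/agent-not-ready taint disappears is visible in the
# numbered output below.
i=1
while true; do
  echo "$i"
  kubectl get nodes -o json | jq '.items[].spec.taints'
  i=$((i + 1))
  sleep 5
done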

1
[
  {
    "effect": "NoExecute",
    "key": "efs.csi.aws.com/agent-not-ready"
  },
  {
    "effect": "NoSchedule",
    "key": "node.cloudprovider.kubernetes.io/uninitialized",
    "value": "true"
  },
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/not-ready"
  }
]
null
2
[
  {
    "effect": "NoExecute",
    "key": "efs.csi.aws.com/agent-not-ready"
  },
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/not-ready"
  }
]
null
3
[
  {
    "effect": "NoExecute",
    "key": "efs.csi.aws.com/agent-not-ready"
  },
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/not-ready"
  },
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "timeAdded": "2024-02-20T13:58:55Z"
  }
]
null
4
[
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/not-ready"
  },
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "timeAdded": "2024-02-20T13:58:55Z"
  }
]
null
5
[
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/not-ready"
  },
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "timeAdded": "2024-02-20T13:58:55Z"
  }
]
null
6
[
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/not-ready"
  },
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "timeAdded": "2024-02-20T13:58:55Z"
  }
]
null
7
[
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/not-ready"
  },
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "timeAdded": "2024-02-20T13:58:55Z"
  }
]
null
8
[
  {
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/not-ready"
  },
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "timeAdded": "2024-02-20T13:58:55Z"
  }
]
null
9
[
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "timeAdded": "2024-02-20T13:58:55Z"
  }
]
null
10
[
  {
    "effect": "NoExecute",
    "key": "node.kubernetes.io/not-ready",
    "timeAdded": "2024-02-20T13:58:55Z"
  }
]
null
11
null

@jmdeal
Contributor

jmdeal commented Feb 20, 2024

Karpenter is not responsible for removing the taint, so at this point it seems like it would be an aws-efs-csi-driver issue. I see you already opened an issue there; I'm going to close this one out for the time being. Feel free to reopen if something points it back to Karpenter.

@jmdeal jmdeal closed this as completed Feb 20, 2024