Add support for Elastic Fabric Adapter (EFA) to Karpenter #3127

Closed
iankouls-aws opened this issue Dec 30, 2022 · 5 comments · Fixed by #5068
Labels: api (Issues that require API changes), feature (New feature or request), v1 (Issues requiring resolution by the v1 milestone)

Comments

@iankouls-aws

Problem Statement:

Some workloads (distributed training, simulations, HPC applications) require high-performance networking on AWS, which is provided by instances enabled with Elastic Fabric Adapter (EFA). The specific instance types are documented here. Currently, Karpenter does not recognize EFA resource requests or limits specified in Kubernetes manifests such as the one below.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              vpc.amazonaws.com/efa: 1
            limits:
              vpc.amazonaws.com/efa: 1

When such a manifest is applied to a Karpenter-enabled cluster, the Karpenter controller produces an error like the following:

ERROR   controller.provisioning Could not schedule pod, incompatible with provisioner "default", no instance type satisfied resources {"pods":"1","vpc.amazonaws.com/efa":"1"} and requirements karpenter.k8s.aws/instance-generation Exists >2, karpenter.sh/provisioner-name In [default], karpenter.sh/capacity-type In [on-demand spot], kubernetes.io/os In [linux], kubernetes.io/arch In [amd64], karpenter.k8s.aws/instance-category In [c m r]   {"commit": "f290d37-dirty", "pod": "default/inflate-fd9bc9f9b-xmkq2"}
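
To illustrate why the scheduler reports "no instance type satisfied resources", here is a minimal sketch of a resource-fit check. It is not Karpenter's actual code: the catalogue, the instance type names, and the capacity numbers are hypothetical, and the only assumption is that a resource an instance type does not advertise is treated as zero.

```python
from typing import Dict, List

EFA = "vpc.amazonaws.com/efa"

# Hypothetical catalogue: instance type name -> allocatable resources.
# Names and numbers are illustrative only, not real EC2 or Karpenter data.
CATALOGUE: Dict[str, Dict[str, int]] = {
    "type-without-efa": {"pods": 58},
    "type-with-efa": {"pods": 110, EFA: 1},
}


def fits(requests: Dict[str, int], allocatable: Dict[str, int]) -> bool:
    """Return True if every requested resource is available in sufficient quantity.

    A resource that the instance type does not advertise counts as zero.
    """
    return all(allocatable.get(name, 0) >= qty for name, qty in requests.items())


def candidates(requests: Dict[str, int], catalogue: Dict[str, Dict[str, int]]) -> List[str]:
    """Return the sorted names of instance types that satisfy the requests."""
    return sorted(name for name, alloc in catalogue.items() if fits(requests, alloc))


# The resources from the error message above.
pod_requests = {"pods": 1, EFA: 1}

# A catalogue that knows nothing about EFA: the resource is simply absent.
efa_unaware = {
    name: {res: qty for res, qty in alloc.items() if res != EFA}
    for name, alloc in CATALOGUE.items()
}

print(candidates(pod_requests, efa_unaware))  # no type fits
print(candidates(pod_requests, CATALOGUE))    # only the EFA-enabled type fits
```

In this sketch the pod becomes schedulable as soon as at least one instance type advertises the `vpc.amazonaws.com/efa` resource, which is the behavior the feature request below asks for.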

Feature Request:

Add the capability for Karpenter to recognize the vpc.amazonaws.com/efa resource and to identify and provision a suitable EC2 instance type with EFA enabled.

@njtran
Contributor

njtran commented Jan 3, 2023

IIUC, this falls under the scope of custom resources, which should be handled with #2390. @jonathan-innis can you confirm this?

@sftim
Contributor

sftim commented Jan 3, 2023

We could also look to add early support for Dynamic Resource Allocation. See https://kubernetes.io/blog/2022/12/15/dynamic-resource-allocation/

EFAs are an interesting bit of (virtual) hardware because the OS-bypass networking can only happen within the same subnet. That then could mean that the scheduler needs to be aware of that limitation in order to place Pods appropriately.

There are some other considerations, such as what security group to use for the EFAs. I could imagine a large cluster having more than one kind of EFA, perhaps each different kind is associated with a different security group.
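
As a rough illustration of the same-subnet constraint, the sketch below picks a subnet whose nodes together have enough free EFA devices for a whole job. Everything here is an assumption made for the example: the `Node` type, the subnet identifiers, and the rule of one EFA device per pod. It does not reflect how Karpenter or kube-scheduler actually place pods.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Node(NamedTuple):
    name: str
    subnet: str    # hypothetical subnet identifier
    free_efa: int  # EFA devices not yet allocated on this node


def pick_subnet(nodes: List[Node], pods_needed: int) -> Optional[str]:
    """Return a subnet that can host all pods of a job, or None.

    Assumes each pod needs exactly one EFA device and that all pods of the
    job must land in the same subnet for OS-bypass traffic to work.
    """
    free_by_subnet: Dict[str, int] = defaultdict(int)
    for node in nodes:
        free_by_subnet[node.subnet] += node.free_efa
    # Iterate in sorted order so the result is deterministic.
    for subnet in sorted(free_by_subnet):
        if free_by_subnet[subnet] >= pods_needed:
            return subnet
    return None


nodes = [
    Node("node-a", "subnet-1", 1),
    Node("node-b", "subnet-2", 1),
    Node("node-c", "subnet-2", 1),
]

print(pick_subnet(nodes, 2))  # a two-pod job fits in subnet-2
print(pick_subnet(nodes, 3))  # three devices exist, but not in one subnet
```

The second call shows the point of the constraint: the cluster has three free EFA devices in total, yet a three-pod job cannot be placed because no single subnet holds all of them.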

jonathan-innis added the feature (New feature or request), needs-design (Design required), and api (Issues that require API changes) labels Jan 11, 2023
@bwagner5
Contributor

bwagner5 commented Feb 9, 2023

> EFAs are an interesting bit of (virtual) hardware because the OS-bypass networking can only happen within the same subnet. That then could mean that the scheduler needs to be aware of that limitation in order to place Pods appropriately.
>
> There are some other considerations, such as what security group to use for the EFAs. I could imagine a large cluster having more than one kind of EFA, perhaps each different kind is associated with a different security group.

As a first step, it might be okay to leave the single-subnet setup to the user. They could configure a single provisioner with EFA support. Placement groups would also need to be set up, so it may be convenient to configure both at the Provisioner level and then target that provisioner. Security groups would also be set up at the provisioner level as normal.

@iankouls-aws
Author

Until this feature is implemented, a temporary workaround is documented here:
https://github.com/aws-samples/aws-do-eks/tree/main/Container-Root/eks/deployment/karpenter#how-to-use-kaprenter-with-efa
This is by no means a design for EFA support in Karpenter. It uses a custom launch template and mounts the EFA device in the pod, which requires privileged mode.
The goal of this feature should be for pods to be able to specify vpc.amazonaws.com/efa: 1 resources and for Karpenter to understand the request and add EC2 instances that are appropriate for the pods AND have EFA enabled.

@bredr

bredr commented Jul 10, 2023

This feature is really important for making distributed training on EKS viable, and its absence is currently a bottleneck for us in training large models. It's especially hard to get the launch template working well in combination with mounting NVMe disks.

billrayburn added the v1 (Issues requiring resolution by the v1 milestone) label Sep 27, 2023
billrayburn removed the needs-design (Design required) label Nov 22, 2023

8 participants