Use kustomize for deployment
supported overlays:
- no-webhook - deploy without webhook
- certmanager - deploy with webhook to k8s cluster where
certmanager is available
- openshift - deploy with webhook to the Openshift cluster

Signed-off-by: Yury Kulazhenkov <ykulazhenkov@nvidia.com>
ykulazhenkov committed Sep 26, 2023
1 parent bf17352 commit 4f86ec0
Showing 32 changed files with 689 additions and 304 deletions.
33 changes: 30 additions & 3 deletions README.md
@@ -32,6 +32,16 @@ NVIDIA IPAM plugin consists of 3 main components:
A Kubernetes (K8s) controller that watches IPPool CRs in a predefined namespace.
It then assigns each node, via the IPPool status, a cluster-unique range of IPs from the defined IP pools.

#### Validation webhook

ipam-controller implements a validation webhook for the IPPool resource.
The webhook can prevent the creation of IPPool resources with invalid configurations.
A supported X.509 certificate management system must be available in the cluster to enable the webhook.
Currently supported systems are [cert-manager](https://cert-manager.io/) and
[OpenShift certificate management](https://docs.openshift.com/container-platform/4.13/security/certificates/service-serving-certificate.html).

Activation of the validation webhook is optional. Check the [Deployment](#deployment) section for details.

### ipam-node

The daemon is responsible for:
@@ -331,11 +341,28 @@ interface should have two IP addresses: one IPv4 and one IPv6. (default: network

### Deploy IPAM plugin

> _NOTE:_ This command will deploy latest dev build with default configuration
> _NOTE:_ These commands deploy the latest dev build with the default configuration

The plugin can be deployed with kustomize.

Supported overlays are:

`no-webhook` - deploy without the webhook

```shell
kubectl kustomize https://github.com/mellanox/nvidia-k8s-ipam/deploy/overlays/no-webhook?ref=main | kubectl apply -f -
```

`certmanager` - deploy with the webhook to a Kubernetes cluster where cert-manager is available

```shell
kubectl kustomize https://github.com/mellanox/nvidia-k8s-ipam/deploy/overlays/certmanager?ref=main | kubectl apply -f -
```

`openshift` - deploy with the webhook to an OpenShift cluster

```shell
kubectl apply -f https://raw.githubusercontent.com/Mellanox/nvidia-k8s-ipam/main/deploy/crds/nv-ipam.nvidia.com_ippools.yaml
kubectl apply -f https://raw.githubusercontent.com/Mellanox/nvidia-k8s-ipam/main/deploy/nv-ipam.yaml
kubectl kustomize https://github.com/mellanox/nvidia-k8s-ipam/deploy/overlays/openshift?ref=main | kubectl apply -f -
```
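
Because the commands above track `main` and the default `latest` image, a reproducible install may want a pinned image tag. One possible approach, shown here only as a sketch, is a small local overlay that uses one of the upstream overlays as a remote base and overrides the tag with the kustomize `images` transformer. The file below is hypothetical and not part of the repository, and the tag `v0.0.0` is a placeholder rather than a real release.

```yaml
# kustomization.yaml (hypothetical local overlay, not part of this repository)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  # Remote base: one of the supported overlays listed above
  - https://github.com/mellanox/nvidia-k8s-ipam/deploy/overlays/no-webhook?ref=main

images:
  # Override the default :latest tag; v0.0.0 is a placeholder value
  - name: ghcr.io/mellanox/nvidia-k8s-ipam
    newTag: v0.0.0
```

Saved in an otherwise empty directory, such an overlay could be applied with `kubectl apply -k <directory>`.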

### Create IPPool CR
25 changes: 25 additions & 0 deletions deploy/manifests/certmanager/certificate.yaml
@@ -0,0 +1,25 @@
# The following manifests contain a self-signed issuer CR and a certificate CR.
# More documentation can be found at https://docs.cert-manager.io
# WARNING: Targets CertManager v1.0. Check https://cert-manager.io/docs/installation/upgrading/ for breaking changes.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned-issuer
  namespace: system
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: serving-cert  # this name should match the one that appears in kustomizeconfig.yaml
  namespace: system
spec:
  # $(SERVICE_NAME) and $(SERVICE_NAMESPACE) will be substituted by kustomize
  dnsNames:
  - $(SERVICE_NAME).$(SERVICE_NAMESPACE).svc
  - $(SERVICE_NAME).$(SERVICE_NAMESPACE).svc.cluster.local
  issuerRef:
    kind: Issuer
    name: selfsigned-issuer
  secretName: nv-ipam-webhook-server-cert # this secret will not be prefixed, since it's not managed by kustomize
5 changes: 5 additions & 0 deletions deploy/manifests/certmanager/kustomization.yaml
@@ -0,0 +1,5 @@
resources:
- certificate.yaml

configurations:
- kustomizeconfig.yaml
16 changes: 16 additions & 0 deletions deploy/manifests/certmanager/kustomizeconfig.yaml
@@ -0,0 +1,16 @@
# This configuration is for teaching kustomize how to update name ref and var substitution
nameReference:
- kind: Issuer
  group: cert-manager.io
  fieldSpecs:
  - kind: Certificate
    group: cert-manager.io
    path: spec/issuerRef/name

varReference:
- kind: Certificate
  group: cert-manager.io
  path: spec/commonName
- kind: Certificate
  group: cert-manager.io
  path: spec/dnsNames
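
The `varReference` entries only tell kustomize which Certificate fields may contain `$(VAR)` placeholders. The values for `$(SERVICE_NAME)` and `$(SERVICE_NAMESPACE)` have to be defined through `vars` in a kustomization that includes these manifests. A minimal sketch in the style of the kubebuilder scaffolding follows. The Service name `webhook-service` and the idea that this excerpt lives in an overlay `kustomization.yaml` are assumptions, since the overlay files are not shown here.

```yaml
# Hypothetical excerpt from an overlay kustomization.yaml
vars:
  # Resolves $(SERVICE_NAMESPACE) from the Service's metadata.namespace
  - name: SERVICE_NAMESPACE
    objref:
      kind: Service
      version: v1
      name: webhook-service   # assumed Service name
    fieldref:
      fieldpath: metadata.namespace
  # Resolves $(SERVICE_NAME); with no fieldref, kustomize reads metadata.name
  - name: SERVICE_NAME
    objref:
      kind: Service
      version: v1
      name: webhook-service   # assumed Service name
```

Recent kustomize releases mark `vars` as deprecated in favor of `replacements`, so this pattern may need revisiting over time.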
99 changes: 99 additions & 0 deletions deploy/manifests/controller/deployment.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
kind: Deployment
apiVersion: apps/v1
metadata:
  name: controller
  namespace: system
  annotations:
    kubernetes.io/description: |
      This deployment launches the nv-ipam controller for nv-ipam.
spec:
  strategy:
    type: RollingUpdate
  replicas: 1
  selector:
    matchLabels:
      name: controller
  template:
    metadata:
      labels:
        name: controller
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: controller
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: name
                      operator: In
                      values:
                        - controller
                topologyKey: "kubernetes.io/hostname"
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: In
                    values:
                      - ""
            - weight: 1
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/control-plane
                    operator: In
                    values:
                      - ""
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      containers:
        - name: controller
          image: ghcr.io/mellanox/nvidia-k8s-ipam:latest
          imagePullPolicy: IfNotPresent
          command: [ "/ipam-controller" ]
          args:
            - --leader-elect=true
            - --leader-elect-namespace=$(POD_NAMESPACE)
            - --ippools-namespace=$(POD_NAMESPACE)
          env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - "ALL"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8081
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8081
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            requests:
              cpu: 100m
              memory: 300Mi
          ports:
            - containerPort: 9443
              name: webhook-server
              protocol: TCP
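
The base Deployment exposes the `webhook-server` port but mounts no serving certificate, so an overlay that enables the webhook presumably patches one in. The strategic merge patch below is a sketch of how that could look, not the patch used by this commit. The file name is hypothetical, and the mount path is the controller-runtime default certificate directory, which ipam-controller is only assumed to use. The Secret name is taken from `certificate.yaml`.

```yaml
# webhook_patch.yaml (hypothetical file name, illustrative only)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller
  namespace: system
spec:
  template:
    spec:
      containers:
        - name: controller
          volumeMounts:
            - name: cert
              # Assumed path: the controller-runtime default cert directory
              mountPath: /tmp/k8s-webhook-server/serving-certs
              readOnly: true
      volumes:
        - name: cert
          secret:
            # Secret written by the Certificate in certificate.yaml
            secretName: nv-ipam-webhook-server-cert
            defaultMode: 420
```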
5 changes: 5 additions & 0 deletions deploy/manifests/controller/kustomization.yaml
@@ -0,0 +1,5 @@
resources:
- deployment.yaml
- role.yaml
- role_binding.yaml
- service_account.yaml
60 changes: 60 additions & 0 deletions deploy/manifests/controller/role.yaml
@@ -0,0 +1,60 @@
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: controller
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
      - list
      - watch
      - delete
  - apiGroups:
      - nv-ipam.nvidia.com
    resources:
      - ippools
    verbs:
      - get
      - list
      - watch
      - create
  - apiGroups:
      - nv-ipam.nvidia.com
    resources:
      - ippools/status
    verbs:
      - get
      - update
      - patch
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - list
      - watch
      - create
      - update
      - patch
      - delete
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
12 changes: 12 additions & 0 deletions deploy/manifests/controller/role_binding.yaml
@@ -0,0 +1,12 @@
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: controller
subjects:
- kind: ServiceAccount
  name: controller
  namespace: system
5 changes: 5 additions & 0 deletions deploy/manifests/controller/service_account.yaml
@@ -0,0 +1,5 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: controller
  namespace: system
90 changes: 90 additions & 0 deletions deploy/manifests/node/daemonset.yaml
@@ -0,0 +1,90 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-ds
  namespace: system
  labels:
    tier: node
    app: nv-ipam-node
    name: nv-ipam-node
spec:
  selector:
    matchLabels:
      name: nv-ipam-node
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        tier: node
        app: nv-ipam-node
        name: nv-ipam-node
    spec:
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      serviceAccountName: node
      containers:
        - name: node
          image: ghcr.io/mellanox/nvidia-k8s-ipam:latest
          imagePullPolicy: IfNotPresent
          env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            privileged: true
          command: [ "/ipam-node" ]
          args:
            - --node-name=$(NODE_NAME)
            - --v=1 # log level for ipam-node
            - --logging-format=json
            - --bind-address=unix:///var/lib/cni/nv-ipam/daemon.sock
            - --store-file=/var/lib/cni/nv-ipam/store
            - --cni-daemon-socket=unix:///var/lib/cni/nv-ipam/daemon.sock
            - --cni-daemon-call-timeout=5 # 5 seconds
            - --cni-bin-dir=/opt/cni/bin
            - --cni-conf-dir=/etc/cni/net.d/nv-ipam.d
            - --cni-log-file=/var/log/nv-ipam-cni.log
            - --cni-log-level=info # log level for shim CNI
            - --ippools-namespace=$(POD_NAMESPACE)
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "300m"
              memory: "300Mi"
          volumeMounts:
            - name: cnibin
              mountPath: /opt/cni/bin
            - name: cniconf
              mountPath: /etc/cni/net.d/nv-ipam.d
            - name: daemonstate
              mountPath: /var/lib/cni/nv-ipam/
      terminationGracePeriodSeconds: 10
      volumes:
        - name: cnibin
          hostPath:
            path: /opt/cni/bin
            type: DirectoryOrCreate
        - name: cniconf
          hostPath:
            path: /etc/cni/net.d/nv-ipam.d
            type: DirectoryOrCreate
        - name: daemonstate
          hostPath:
            path: /var/lib/cni/nv-ipam/
            type: DirectoryOrCreate
5 changes: 5 additions & 0 deletions deploy/manifests/node/kustomization.yaml
@@ -0,0 +1,5 @@
resources:
- daemonset.yaml
- role.yaml
- role_binding.yaml
- service_account.yaml