
CDI: two sriovnodepolicy configs (8 VFs) but only four deviceNodes #576

Closed
cyclinder opened this issue Jul 19, 2024 · 4 comments · Fixed by #583
Labels
bug Something isn't working

Comments

@cyclinder
Contributor

cyclinder commented Jul 19, 2024

What happened?

I have two SriovNetworkNodePolicy configs (4 VFs each, 8 VFs in total), but the CDI spec file on the node contains only four device entries: the four VFs of policy1 (0000:04:00.2 to 0000:04:00.5) are missing. See below:

root@controller-node-1:/home/cyclinder/sriov# kubectl get sriovnetworknodepolicies.sriovnetwork.openshift.io -A -o wide
NAMESPACE     NAME      AGE
kube-system   policy1   43m
kube-system   policy2   7m44s
root@controller-node-1:/home/cyclinder/sriov# kubectl get sriovnetworknodestates.sriovnetwork.openshift.io -n kube-system -o yaml
apiVersion: v1
items:
- apiVersion: sriovnetwork.openshift.io/v1
  kind: SriovNetworkNodeState
  metadata:
    annotations:
      sriovnetwork.openshift.io/current-state: Idle
      sriovnetwork.openshift.io/desired-state: Idle
    creationTimestamp: "2024-07-16T03:50:05Z"
    generation: 5
    name: worker-node-1
    namespace: kube-system
    ownerReferences:
    - apiVersion: sriovnetwork.openshift.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: SriovOperatorConfig
      name: default
      uid: 2427ae73-ef95-4f57-aa85-c681ff9a48bb
    resourceVersion: "40147316"
    uid: 67f59ed6-85a3-4913-a1bf-3697dd008310
  spec:
    interfaces:
    - name: enp4s0f0np0
      numVfs: 4
      pciAddress: "0000:04:00.0"
      vfGroups:
      - isRdma: true
        policyName: policy1
        resourceName: rdma_resource
        vfRange: 0-3
    - name: enp4s0f1np1
      numVfs: 4
      pciAddress: "0000:04:00.1"
      vfGroups:
      - isRdma: true
        policyName: policy2
        resourceName: rdma_resource1
        vfRange: 0-3
  status:
    interfaces:
    - Vfs:
      - deviceID: "1018"
        driver: mlx5_core
        mac: e6:bc:60:22:14:6c
        mtu: 1500
        name: enp4s0f0v0
        pciAddress: "0000:04:00.2"
        vendor: 15b3
        vfID: 0
      - deviceID: "1018"
        driver: mlx5_core
        mac: a2:d7:89:ad:5d:b7
        mtu: 1500
        name: enp4s0f0v1
        pciAddress: "0000:04:00.3"
        vendor: 15b3
        vfID: 1
      - deviceID: "1018"
        driver: mlx5_core
        mac: d2:0b:3f:c9:ab:a4
        mtu: 1500
        name: enp4s0f0v2
        pciAddress: "0000:04:00.4"
        vendor: 15b3
        vfID: 2
      - deviceID: "1018"
        driver: mlx5_core
        mac: 4e:37:ab:b2:68:d7
        mtu: 1500
        name: enp4s0f0v3
        pciAddress: "0000:04:00.5"
        vendor: 15b3
        vfID: 3
      deviceID: "1017"
      driver: mlx5_core
      eSwitchMode: legacy
      linkSpeed: 25000 Mb/s
      linkType: ETH
      mac: 04:3f:72:d0:d2:b2
      mtu: 1500
      name: enp4s0f0np0
      numVfs: 4
      pciAddress: "0000:04:00.0"
      totalvfs: 4
      vendor: 15b3
    - Vfs:
      - deviceID: "1018"
        driver: mlx5_core
        mac: 3e:3a:7f:af:11:99
        mtu: 1500
        name: enp4s0f1v0
        pciAddress: "0000:04:00.6"
        vendor: 15b3
        vfID: 0
      - deviceID: "1018"
        driver: mlx5_core
        mac: 6e:c1:0e:52:ea:d8
        mtu: 1500
        name: enp4s0f1v1
        pciAddress: "0000:04:00.7"
        vendor: 15b3
        vfID: 1
      - deviceID: "1018"
        driver: mlx5_core
        mac: 8e:c8:1d:fc:69:0d
        mtu: 1500
        name: enp4s0f1v2
        pciAddress: "0000:04:01.0"
        vendor: 15b3
        vfID: 2
      - deviceID: "1018"
        driver: mlx5_core
        mac: 52:4c:5c:b1:1d:44
        mtu: 1500
        name: enp4s0f1v3
        pciAddress: "0000:04:01.1"
        vendor: 15b3
        vfID: 3
      deviceID: "1017"
      driver: mlx5_core
      eSwitchMode: legacy
      linkSpeed: 10000 Mb/s
      linkType: ETH
      mac: 04:3f:72:d0:d2:b3
      mtu: 1500
      name: enp4s0f1np1
      numVfs: 4
      pciAddress: "0000:04:00.1"
      totalvfs: 4
      vendor: 15b3
    syncStatus: Succeeded
kind: List
metadata:
  resourceVersion: ""
root@worker-node-1:~# cat /var/run/cdi/sriov-dp-spidernet.io.yaml
cdiVersion: 0.5.0
containerEdits: {}
devices:
- containerEdits:
    deviceNodes:
    - hostPath: /dev/infiniband/issm6
      path: /dev/infiniband/issm6
      permissions: rw
    - hostPath: /dev/infiniband/umad6
      path: /dev/infiniband/umad6
      permissions: rw
    - hostPath: /dev/infiniband/uverbs6
      path: /dev/infiniband/uverbs6
      permissions: rw
    - hostPath: /dev/infiniband/rdma_cm
      path: /dev/infiniband/rdma_cm
      permissions: rw
  name: "0000:04:00.6"
- containerEdits:
    deviceNodes:
    - hostPath: /dev/infiniband/issm7
      path: /dev/infiniband/issm7
      permissions: rw
    - hostPath: /dev/infiniband/umad7
      path: /dev/infiniband/umad7
      permissions: rw
    - hostPath: /dev/infiniband/uverbs7
      path: /dev/infiniband/uverbs7
      permissions: rw
    - hostPath: /dev/infiniband/rdma_cm
      path: /dev/infiniband/rdma_cm
      permissions: rw
  name: "0000:04:00.7"
- containerEdits:
    deviceNodes:
    - hostPath: /dev/infiniband/issm8
      path: /dev/infiniband/issm8
      permissions: rw
    - hostPath: /dev/infiniband/umad8
      path: /dev/infiniband/umad8
      permissions: rw
    - hostPath: /dev/infiniband/uverbs8
      path: /dev/infiniband/uverbs8
      permissions: rw
    - hostPath: /dev/infiniband/rdma_cm
      path: /dev/infiniband/rdma_cm
      permissions: rw
  name: "0000:04:01.0"
- containerEdits:
    deviceNodes:
    - hostPath: /dev/infiniband/issm9
      path: /dev/infiniband/issm9
      permissions: rw
    - hostPath: /dev/infiniband/umad9
      path: /dev/infiniband/umad9
      permissions: rw
    - hostPath: /dev/infiniband/uverbs9
      path: /dev/infiniband/uverbs9
      permissions: rw
    - hostPath: /dev/infiniband/rdma_cm
      path: /dev/infiniband/rdma_cm
      permissions: rw
  name: "0000:04:01.1"
kind: spidernet.io/net-pci

What did you expect to happen?

What are the minimal steps needed to reproduce the bug?

Anything else we need to know?

k8snetworkplumbingwg/sriov-network-operator#735

Component Versions

Please fill in the below table with the version numbers of components used.

| Component | Version |
| --- | --- |
| SR-IOV Network Device Plugin | |
| SR-IOV CNI Plugin | |
| Multus | |
| Kubernetes | |
| OS | |

Config Files

Config file locations may be config dependent.

Device pool config file location (Try '/etc/pcidp/config.json')
Multus config (Try '/etc/cni/multus/net.d')
CNI config (Try '/etc/cni/net.d/')
Kubernetes deployment type ( Bare Metal, Kubeadm etc.)
Kubeconfig file
SR-IOV Network Custom Resource Definition

Logs

SR-IOV Network Device Plugin Logs (use kubectl logs $PODNAME)
Multus logs (If enabled. Try '/var/log/multus.log' )
Kubelet logs (journalctl -u kubelet)
@adrianchiris
Contributor

Please see the discussion in [1].

[1] k8snetworkplumbingwg/sriov-network-operator#735

adrianchiris added the bug label on Jul 21, 2024
@souleb
Contributor

souleb commented Aug 4, 2024

When use-cdi is enabled, a cdiSpec is created for each resourcePool on every call to ListWatch of the gRPC server. The cdiSpec is then written to DefaultDynamicDir+cdiSpecPrefix+resourcePrefix, which expands to /var/run/cdi/sriov-dp-nvidia.com.yaml. The function that writes the cdiSpec does an atomic write, i.e. it writes to a temp file and then renames it to the target name. This conflicts with our desire to write all specs to the same file: each pool's write replaces the whole file, so only the last pool's devices remain.

In order to fix this, we should either generate a unique file name for each resourcePool using GenerateNameForTransientSpec, or create a shared in-memory cache that handles writes to the local file.
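For illustration, here is a minimal sketch of the first option, assuming each pool gets its own file named sriov-dp-&lt;resourcePrefix&gt;-&lt;poolName&gt;.yaml in the CDI directory; the writePoolSpec helper and the naming scheme are hypothetical, not existing plugin code, and the write keeps the same temp-file-then-rename pattern:

```go
// Hypothetical sketch: one CDI spec file per resource pool, written atomically.
// Neither the file-name scheme nor writePoolSpec exist in the plugin today.
package cdiwriter

import (
	"fmt"
	"os"
	"path/filepath"
)

// writePoolSpec atomically writes the serialized CDI spec of a single resource
// pool to its own file, e.g. /var/run/cdi/sriov-dp-spidernet.io-rdma_resource1.yaml,
// so that updating one pool never replaces another pool's spec.
func writePoolSpec(cdiDir, resourcePrefix, poolName string, specYAML []byte) error {
	target := filepath.Join(cdiDir, fmt.Sprintf("sriov-dp-%s-%s.yaml", resourcePrefix, poolName))

	// Same atomic pattern as today: temp file in the target directory, then rename.
	tmp, err := os.CreateTemp(cdiDir, ".sriov-dp-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // best effort; a no-op after a successful rename

	if _, err := tmp.Write(specYAML); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), target)
}
```

With one file per pool, an atomic write only ever replaces that pool's own spec, so policy1 and policy2 no longer clobber each other's device entries.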

@cyclinder
Contributor Author

I'm interested in this; I can help fix it later.

@adrianchiris
Contributor

In order to fix this, we should either generate a unique file name for each resourcePool

What needs to be kept in mind is that the device plugin resource configuration may change; in that case, "old" CDI files need to be deleted.
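A rough sketch of that cleanup, continuing the hypothetical per-pool naming above (cleanupStaleSpecs and its arguments are assumptions, not existing plugin code):

```go
// Hypothetical sketch: drop per-pool spec files whose pool is no longer
// configured. cleanupStaleSpecs is not existing plugin code.
package cdiwriter

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// cleanupStaleSpecs removes sriov-dp-<resourcePrefix>-<pool>.yaml files in
// cdiDir whose pool name is not in the current set of resource pools.
func cleanupStaleSpecs(cdiDir, resourcePrefix string, currentPools map[string]bool) error {
	prefix := fmt.Sprintf("sriov-dp-%s-", resourcePrefix)
	files, err := filepath.Glob(filepath.Join(cdiDir, prefix+"*.yaml"))
	if err != nil {
		return err
	}
	for _, f := range files {
		pool := strings.TrimSuffix(strings.TrimPrefix(filepath.Base(f), prefix), ".yaml")
		if !currentPools[pool] {
			if err := os.Remove(f); err != nil && !os.IsNotExist(err) {
				return err
			}
		}
	}
	return nil
}
```

Such a cleanup would run whenever the device plugin (re)loads its resource configuration, so spec files of removed or renamed pools do not linger under /var/run/cdi.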
