
The CustomResourceDefinition "clusterpolicies.nvidia.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes #194

Closed
d-m opened this issue May 25, 2021 · 2 comments

d-m commented May 25, 2021

I get the following error when applying the custom resource definition for clusterpolicy objects:

$ kubectl apply -f https://raw.githubusercontent.com/NVIDIA/gpu-operator/master/deployments/gpu-operator/crds/nvidia.com_clusterpolicies_crd.yaml
The CustomResourceDefinition "clusterpolicies.nvidia.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes                                                                                                                                    

After some research, it seems this is due to the last-applied-configuration annotation that kubectl apply places on the CRD resource (see kubernetes-sigs/kubebuilder#1140 (comment)). I verified that kubectl create works; however, I'm concerned that updating the CRD with kubectl replace going forward may cause issues with deployed ClusterPolicy objects by deleting and recreating the CRD.
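
For reference, here is a minimal sketch of the two workarounds, assuming a kubectl and cluster version recent enough to support server-side apply (which does not write the client-side last-applied-configuration annotation):

# Workaround 1: create the CRD, which skips the last-applied-configuration annotation
$ kubectl create -f https://raw.githubusercontent.com/NVIDIA/gpu-operator/master/deployments/gpu-operator/crds/nvidia.com_clusterpolicies_crd.yaml

# Workaround 2: server-side apply, which tracks field ownership on the server
# instead of storing the full object in an annotation
$ kubectl apply --server-side -f https://raw.githubusercontent.com/NVIDIA/gpu-operator/master/deployments/gpu-operator/crds/nvidia.com_clusterpolicies_crd.yaml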

@shivamerla (Contributor) commented

@d-m Since the ClusterPolicy spec handles the creation of eight Daemonsets, the size of the CR has become huge. Yes, the limitation with the last-applied-configuration annotation will break upgrades. Since all Daemonset values need to be configurable, I'm not sure we can overcome this with the single CRD we have. For example, the driver spec is below:

driver:
  enabled: true
  repository: nvcr.io/nvidia
  image: driver
  version: "460.73.01"
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env: []
  tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
  nodeSelector:
    nvidia.com/gpu.deploy.driver: "true"
  affinity: {}
  podSecurityContext: {}
  securityContext:
    privileged: true
    seLinuxOptions:
      level: "s0"
  resources: {}
  # private mirror repository configuration
  repoConfig:
    configMapName: ""
    destinationDir: ""
  # vGPU licensing configuration
  licensingConfig:
    configMapName: ""
  priorityClassName: system-node-critical

With the limitation on the max size of the CR (which ends up in the last-applied-configuration annotation), it would make sense to split each Daemonset config into a separate CRD (i.e. NvidiaDriver, NvidiaDevicePlugin, NvidiaDCGMExporter, NvidiaGPUFeatureDiscovery, NvidiaMIGManager, NvidiaContainerToolkit, NvidiaValidator), with individual CRs controlling the configuration for each Daemonset we deploy.
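
For illustration only, a rough sketch of what a split-out NvidiaDriver CR could look like if the driver section above became its own CRD; the apiVersion, kind, and field layout here are assumptions based on the proposal, not an existing API:

# hypothetical example sketched from the proposal above, not an existing API
apiVersion: nvidia.com/v1
kind: NvidiaDriver
metadata:
  name: default
spec:
  repository: nvcr.io/nvidia
  image: driver
  version: "460.73.01"
  imagePullPolicy: IfNotPresent
  nodeSelector:
    nvidia.com/gpu.deploy.driver: "true"
  tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule

Each such CR would stay small, so the last-applied-configuration annotation written by kubectl apply would remain well under the 262144-byte limit.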

Currently we don't support upgrades of ClusterPolicy types, so uninstall and reinstall is always recommended. We are looking to support upgrades in future releases, so this will be a design discussion for us.

shivamerla self-assigned this May 25, 2021
@shivamerla (Contributor) commented

fixed with: f839a70
