Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cluster Local Model CR #3839

Merged
merged 39 commits into from
Sep 17, 2024
Merged

Conversation

greenmoon55
Copy link
Contributor

@greenmoon55 greenmoon55 commented Aug 6, 2024

What this PR does / why we need it:
This PR introduces a cluster model cache CR and a node group CR. With these two PRs, you can define a model that would be downloaded to multiple nodes which are defined by a node group CR (node selectors).

Example Model Cache CR

apiVersion: serving.kserve.io/v1alpha1
kind: ClusterLocalModel
metadata:
  name: iris
spec:
  modelSize: 1Gi
  nodeGroups:
    - gpu1
  sourceModelUri: gs://kfserving-examples/models/sklearn/1.0/model

NodeGroup CR

apiVersion: serving.kserve.io/v1alpha1
kind: LocalModelNodeGroup 
metadata:
  name: gpu1
spec:
  persistentVolumeSpec:
    accessModes:
    - ReadWriteOnce
    volumeMode: Filesystem
    capacity:
      storage: 2Gi
    hostPath:
      path: /models
      type: ""
    persistentVolumeReclaimPolicy: Delete
    storageClassName: standard
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node.kubernetes.io/instance-type
            operator: In
            values:
              - gpu1
  persistentVolumeClaimSpec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 2Gi
    storageClassName: standard
    volumeMode: Filesystem
    volumeName: blah

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • Test A

  • Test B

  • Logs

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Release note:


Re-running failed tests

  • /rerun-all - rerun all failed workflows.
  • /rerun-workflow <workflow name> - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
@greenmoon55 greenmoon55 force-pushed the cluster-model-cache-cr-new branch from ca5d830 to 072109d Compare August 6, 2024 17:32
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
@greenmoon55 greenmoon55 force-pushed the cluster-model-cache-cr-new branch from 69eafde to 6ad65f0 Compare August 6, 2024 19:35
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

type CachedModelStatus struct {
OverallStatus OverallStatus `json:"overallStatus,omitempty"`
NodeStatus map[string]NodeStatus `json:"nodeStatus,omitempty"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

example: nodeX: Downloading

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest NodeModelStatus


// How many nodes have the model available
// +optional
ModelCopies *ModelCopies `json:"copies,omitempty"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for example: Suceessful: 10 total: 20

// How many nodes have the model available
// +optional
ModelCopies *ModelCopies `json:"copies,omitempty"`
InferenceServices []NamespacedName `json:"infereneceServices,omitempty"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ns1: isvc1, ns2: isvc2 uses this cached model

// Container spec for the storage initializer init container

// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="StorageUri is immutable"
StorageUri string `json:"storageUri" validate:"required"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full match vs match prefix like "storageUriPrefix"

// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="StorageUri is immutable"
StorageUri string `json:"storageUri" validate:"required"`
ModelSize resource.Quantity `json:"modelSize" validate:"required"`
NodeGroups []string `json:"nodeGroups" validate:"required"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one model can be cached on multiple nodegroups (gpu-1, gpu-2, gpu-3)

NodeGroups []string `json:"nodeGroups" validate:"required"`
// only local is supported for now
StorageType StorageType `json:"storageType" validate:"required"`
CleanupPolicy CleanupPolicy `json:"cleanupPolicy" validate:"required"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we also need a "download policy" that expects the model is on the node without creating jobs. Just for special maintenances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove this field?


// StorageContainerSpec defines the container spec for the storage initializer init container, and the protocols it supports.
// +k8s:openapi-gen=true
type ClusterCachedModelSpec struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this is cluster scope only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still prefer the name ClusterLocalModel as the implementation of this CR is based on caching models locally.

// ModelCacheNodeGroupSpec defines a group of nodes for to download the model to.
// +k8s:openapi-gen=true
type ModelCacheNodeGroupSpec struct {
StorageLimit resource.Quantity `json:"storageLimit"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the storageLimit on the PersistentVolume instead of this field. However that field is immutable which is less flexible.

// +k8s:openapi-gen=true
type ModelCacheNodeGroupSpec struct {
StorageLimit resource.Quantity `json:"storageLimit"`
NodeSelector map[string]string `json:"nodeSelector"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @sukumargaonkar. Two approaches here.
Approach 1 - It's up to the user to make sure they use the correct label selector so two node groups never have overlaps.
Approach 2 - Require a specific label set on nodes. e.g. kserve-nodegroup-name: xyz
Otherwise if we allow overlapping in nodegroups. storage size calculation would be tricky.

NodeSelector map[string]string `json:"nodeSelector"`
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="persistentVolume is immutable"
// PV spec template
PersistentVolume corev1.PersistentVolume `json:"persistentVolume"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will always create PVs for now. In the future, users might be able to leverage PersistentVolumeClaim with a custom CSI driver which makes sure different PVs map to the same disk location.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future we may have a namespace-scoped CR for NFS backed PV which does not share across namespaces. For non-local disks, sharing across namespaces is tricky as you would need different PVs and the volume plugin need to make sure they map to the same storage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can we make it namespace scoped? Now the PV/PVC is defined on the node group CR but this probably never should be made namespace scoped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another namespace-scoped model cache CR that doesn't use nodegroup cr.

// only local is supported for now
StorageType StorageType `json:"storageType" validate:"required"`
CleanupPolicy CleanupPolicy `json:"cleanupPolicy" validate:"required"`
}
Copy link
Contributor Author

@greenmoon55 greenmoon55 Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Envvars for auth can be put in a configmap and added to the pod spec.

envFrom:
  - configMapRef:
      name: <config-file>
    - name: AWS_ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          key: access-key
          name: name

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
@greenmoon55 greenmoon55 mentioned this pull request Aug 14, 2024
9 tasks

// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="StorageUri is immutable"
// Original StorageUri
StorageUri string `json:"storageUri" validate:"required"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest renaming to SourceModelUri

// group of nodes to cache the model on.
NodeGroups []string `json:"nodeGroups" validate:"required"`
// only local is supported for now
StorageType StorageType `json:"storageType" validate:"required"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this CR is very specifically designed for local model cache, for other type of storage like AWS EBS fields like NodeGroup, cleanup policy would not apply I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about temporarily disabling deletion jobs for some reason, say we want to delete and then recreate the CR but keeping the files on the disk.

)

// CleanupPolicy enum
// +kubebuilder:validation:Enum="";DeleteModel;Ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the other possible values? this sounds like a flag

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I can't think of other values for now.

@@ -32,6 +32,7 @@ type StorageContainerSpec struct {

// List of URI formats that this container supports
SupportedUriFormats []SupportedUriFormat `json:"supportedUriFormats" validate:"required"`
UseCase UseCase `json:"useCase" validate:"required"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest workloadType: InitContainer or Job

package v1alpha1

type CachedModelStatus struct {
OverallStatus OverallStatus `json:"overallStatus,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just call status?

}

// +k8s:openapi-gen=true
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@yuzisun yuzisun Aug 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced with +kubebuilder:object:root=true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow we still call deepcopy-gen and removing it would cause an error while generating swagger files.

}

// +k8s:openapi-gen=true
// +kubebuilder:object:root=true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need this line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

type ClusterLocalModelSpec struct {
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="StorageUri is immutable"
// Original StorageUri
SourceModelUri string `json:"sourceModelUri" validate:"required"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we need to consider supporting multiple source modelUris

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make it a list of models here or support multiple local model crs for one inference service.

@@ -0,0 +1,24 @@
/*
Copyright 2021 The KServe Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Copyright 2021 The KServe Authors.
Copyright 2024 The KServe Authors.

@@ -0,0 +1,60 @@
/*
Copyright 2023 The KServe Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Copyright 2023 The KServe Authors.
Copyright 2024 The KServe Authors.

Comment on lines +31 to +33
// +k8s:openapi-gen=true
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +genclient
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// +k8s:openapi-gen=true
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +genclient

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
}

// +k8s:openapi-gen=true
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow we still call deepcopy-gen and removing it would cause an error while generating swagger files.

type ClusterLocalModelSpec struct {
// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="StorageUri is immutable"
// Original StorageUri
SourceModelUri string `json:"sourceModelUri" validate:"required"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make it a list of models here or support multiple local model crs for one inference service.

InferenceServices []NamespacedName `json:"inferenceServices,omitempty"`
}

type NamespacedName struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining a new struct here because I got encountered struct field "Namespace" without JSON tag in type "NamespacedName" if I use types.NamespacedName.

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
}

// +k8s:openapi-gen=true
// +kubebuilder:object:root=true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

type NodeStatus string

// NodeStatus Enum values
const (
Copy link
Contributor Author

@greenmoon55 greenmoon55 Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the daemonset in the future, Node status should be ModelExists or ModelDoesNotExist reported by the daemonset. For other status, we can look at individual jobs?
It still makes sense if we do not go down the daemonset approach.

…ntainers.yaml

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
@greenmoon55
Copy link
Contributor Author

/rerun

@greenmoon55
Copy link
Contributor Author

/rerun-all

@greenmoon55 greenmoon55 force-pushed the cluster-model-cache-cr-new branch from 4a78624 to 531ae53 Compare September 12, 2024 18:20
Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
@greenmoon55 greenmoon55 force-pushed the cluster-model-cache-cr-new branch from 531ae53 to c6fef73 Compare September 12, 2024 18:29

type ClusterLocalModelStatus struct {
// Status of the model on a node, like NodeDownloaded or NodeNotReady
NodeStatus map[string]NodeStatus `json:"nodeStatus,omitempty"`
Copy link
Member

@yuzisun yuzisun Sep 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can create a LocalModelNodeGroup controller and create the LocalModelNode resource per node, in this way we can spawn the model reconciler daemonset and update the status on each LocalModelNode.

@yuzisun
Copy link
Member

yuzisun commented Sep 17, 2024

/lgtm
/approve

@yuzisun
Copy link
Member

yuzisun commented Sep 17, 2024

Discussed with @greenmoon55 to create a follow up PR to add a LocalModelNode custom resource to track the model status for each node in the group.

@yuzisun yuzisun merged commit acfc887 into kserve:master Sep 17, 2024
57 checks passed
hustxiayang pushed a commit to hustxiayang/kserve that referenced this pull request Sep 20, 2024
* new model cache cr

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* update crd

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix genereted python tests

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix test failure

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Make nodegroup a list field in model cache cr

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* fix lint

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* minor updates to model cache cr

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Add usecase field to cluster storage container

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix test failures

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Change variable name

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix lint

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix default storage container cr

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* fix defualt.yaml

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Remove storagelimit field from node group

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix python code

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Change some fields

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Rename crd

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix lint error in python test files

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Rename CR

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Add status to local model node group

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Add missing node status

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Remove files related to ClusterLocalNodeGroup

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Add default value for workload type

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Fix StorageContainerSpec WorkloadType default value

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* nodegroups -> nodegroup

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Add comments

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Add back storageLimit

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

* Update charts/kserve-crd/templates/serving.kserve.io_clusterstoragecontainers.yaml

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>

---------

Signed-off-by: Jin Dong <greenmoon55@users.noreply.github.com>
@greenmoon55 greenmoon55 deleted the cluster-model-cache-cr-new branch December 23, 2024 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants