Resources and autoscaling fields in BentoDeployment CRD meaning. #3822
There are |
Replies: 1 comment 1 reply
Hi @dzhelonkin - `spec.resources` is the resource allocation for the API servers, essentially the pod spec for running the user's service API code, whereas `spec.runners[].resources` is the pod spec for the Runners used in the service. E.g. if you have a model that uses a GPU, you should put the GPU resource under `spec.runners[].resources`. The same applies to the `autoscaling` field. Note that when you deploy a BentoDeployment CR with Yatai, the Runners and the API server are distributed across different pods, which lets each of them scale and use resources more efficiently.