Improve kserve protocol version handling #2957
Conversation
(force-pushed from d10ac9f to 93ffaa8)
I currently don't know why the lint check on the changed files failed: I installed the pre-commit hooks and, just to be sure, I manually ran them on the files I changed, and they were all deemed clean.
(force-pushed from 93ffaa8 to 3992bc4)
@sgaist Can you please mention in the PR what the current issue is and what this PR is addressing?
(force-pushed from 5675c03 to effb601)
@agunapal Description fixed! Sorry for the wrong content; I had it open in parallel with the ticket and thought I had updated the description before saving.
@agunapal I just realized that in order for the test to run properly, kserve must be installed. What would you recommend going forward? Isolate the test and create a dedicated flow for it?
Yes, I'm not sure if we want to do this. Can you please check if you can add a test here: https://github.com/pytorch/serve/blob/master/kubernetes/kserve/tests/scripts/test_mnist.sh
I think the overhead is not worth it. After having fiddled a bit more with it, I see a few options. Number one has the downside of using an outdated version of KServe; the options are listed in ascending order in terms of resource/time impact. Which one would you like to go with?
Thanks for looking into this. FYI, we are making a release soon, so we will review this after the release. Also curious: if this is resolved in the latest version of kserve, can't you use that?
The current implementation retrieves the protocol to use from the "PROTOCOL_VERSION" environment variable; however, there is no default value, which triggers an error when served through KServe because the base class performs a protocol check in the predict method that fails with None. The default protocol now uses the same value as the base class.
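The fix described in this commit message can be sketched as follows; `resolve_protocol_version` and the set of accepted values are illustrative assumptions rather than the wrapper's actual code (the identifiers mirror KServe's `PredictorProtocol` enum):

```python
import os

# Assumed protocol identifiers, mirroring KServe's PredictorProtocol enum.
DEFAULT_PROTOCOL = "v1"
VALID_PROTOCOLS = {"v1", "v2", "grpc-v2"}


def resolve_protocol_version() -> str:
    """Read PROTOCOL_VERSION from the environment, falling back to a default.

    Returning a concrete default instead of None avoids the failure inside
    the base class's predict() protocol check described above.
    """
    protocol = os.environ.get("PROTOCOL_VERSION") or DEFAULT_PROTOCOL
    if protocol not in VALID_PROTOCOLS:
        raise ValueError(f"Invalid protocol version: {protocol!r}")
    return protocol
```

With this, an unset or empty `PROTOCOL_VERSION` yields the default rather than propagating `None` into the server.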
This makes the wrapper easier to test, besides making it possible to change where the configuration file is looked up.
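A configurable lookup path could look like the sketch below; the function name and the `TS_CONFIG_PATH` override are hypothetical, though the default path matches the `--ts-config` value visible in the pod logs later in this thread:

```python
import os

# Default matches the path the KServe deployment passes via --ts-config.
DEFAULT_CONFIG_PATH = "/mnt/models/config/config.properties"


def get_config_path(path=None):
    """Resolve the config.properties location.

    An explicit argument wins, then an environment override, then the
    deployment default. Making this injectable is what lets tests point
    the wrapper at a temporary file instead of the mounted model store.
    """
    return path or os.environ.get("TS_CONFIG_PATH", DEFAULT_CONFIG_PATH)
```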
In case None is passed, keep the default value set in the base `__init__`. This makes TorchserveModel behave the same way as the base class.
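The None handling can be illustrated with a minimal sketch; `BaseModel` is a stand-in for the kserve base class, whose exact signature is an assumption here:

```python
class BaseModel:
    """Stand-in for the kserve base class, which sets its own default."""

    def __init__(self, name, protocol="v1"):
        self.name = name
        self.protocol = protocol


class TorchserveModel(BaseModel):
    def __init__(self, name, predictor_protocol=None):
        if predictor_protocol is None:
            # Do not forward None: let the base __init__ keep its own
            # default, so this class behaves the same as the base class.
            super().__init__(name)
        else:
            super().__init__(name, protocol=predictor_protocol)
```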
It now validates that the protocol version is passed through properly, as well as the failure when an invalid protocol is given.
The overhead required to run the tests outweighs their benefits. The thing to keep in mind is that the fix allows running models through KServe deployments prior to 0.11.1.
(force-pushed from 378e1c3 to 15336a2)
KServe 0.11.0 run for protocol v2 prior to this PR. Check status of the pod:

```
NAME                                                             READY   STATUS    RESTARTS   AGE
torchserve-mnist-v2-predictor-00001-deployment-9569d6f8b-8w6pb   1/2     Running   0          2m
```

Abridged `kubectl describe` output for the pod:

```
Init Containers:
  storage-initializer:
    Image:  kserve/storage-initializer:v0.11.0
    Args:   gs://kfserving-examples/models/torchserve/image_classifier/v2 /mnt/models
    State:  Terminated (Reason: Completed, Exit Code: 0)
    Ready:  True
Containers:
  kserve-container:
    Image:  index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2
    Args:   torchserve --start --model-store=/mnt/models/model-store --ts-config=/mnt/models/config/config.properties
    State:  Running
    Ready:  True
    Environment:
      TS_SERVICE_ENVELOPE: kservev2
  queue-proxy:
    Image:      gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f
    State:      Running
    Ready:      False
    Readiness:  http-get http://:8012/ delay=0s timeout=1s period=10s #success=1 #failure=3
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Events:
  Warning  Unhealthy  2s (x2 over 4s)  kubelet  Readiness probe failed: Get "http://10.244.0.17:8012/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
```

Wait for inference service to be ready.

KServe 0.11.0 run for default protocol prior to this PR. MNIST KServe V1 test begin. Deploying the cluster (`inferenceservice.serving.kserve.io/torchserve created`), waiting for the pod to come up, then checking its status:

```
NAME                                                    READY   STATUS    RESTARTS   AGE
torchserve-predictor-00001-deployment-c8b8c4595-mxtpf   1/2     Running   0          115s
```

The abridged `kubectl describe` output for this pod shows the same pattern as the v2 run: the storage-initializer (sourced from gs://kfserving-examples/models/torchserve/image_classifier/v1) completes, the kserve-container (with `TS_SERVICE_ENVELOPE: kserve`) reports Ready True, while the queue-proxy stays Ready False and the pod never becomes Ready.

I did two runs to show that the behaviour occurs for both protocols. While I can update the version of KServe, I think many people creating models to be used by TorchServe do not handle the underlying infrastructure they will run on, so fixing the wrapper will have a broader impact.
LGTM
* fix(kserve): ensure there's a default protocol configured. The current implementation retrieves the protocol to use from the "PROTOCOL_VERSION" environment variable; however, there is no default value, which triggers an error when served through KServe because the base class performs a protocol check in the predict method that fails with None. The default protocol uses the same value as the base class.
* fix(kserve): ensure the protocol version configured is a valid value
* feat(kserve): make configuration file path configurable. This makes the wrapper easier to test, besides making it possible to change where the file is looked up.
* test: add KServe wrapper test
* test(kserve_wrapper): add protobuf code generation
* fix(kserve): add None handling to the wrapper. In case None is passed, keep the default value set in the base `__init__`. This makes TorchserveModel behave the same way as the base class.
* refactor(test): rewrite wrapper test to be more complete. It now validates that the protocol version is passed through properly, as well as the failure when an invalid protocol is given.
* test: remove kserve pytest tests. The overhead required to run the tests outweighs their benefits. The thing to keep in mind is that the fix allows running models through KServe deployments prior to 0.11.1.
Description
This PR ensures that a default value for the KServe protocol is passed forward to the `ModelServer` class, and that the class itself uses a default value if instantiated directly.

With the current implementation, if the `PROTOCOL_VERSION` environment variable is not set, the `protocol` property of the `ModelServer` instance will be unset as well, and this triggers an exception when data is returned after processing, as the base class does a check on its value.

Fixes #2956
Type of change
Feature/Issue validation/testing
A new test has been added to ensure that:
* when the `PROTOCOL_VERSION` environment variable is not set, a default is used
* when the `PROTOCOL_VERSION` environment variable is set to a valid value, it's indeed that one that is used
* when the `PROTOCOL_VERSION` environment variable is set to an invalid value, the wrapper fails to start
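The behaviours above can be sketched as plain test functions; `resolve_protocol_version` is a hypothetical stand-in for the wrapper's protocol handling, not the actual entry point of the kserve wrapper:

```python
import os


def resolve_protocol_version():
    # Stand-in for the wrapper logic under test (assumed behaviour).
    protocol = os.environ.get("PROTOCOL_VERSION") or "v1"
    if protocol not in {"v1", "v2", "grpc-v2"}:
        raise ValueError(f"Invalid protocol version: {protocol!r}")
    return protocol


def test_default_when_env_unset():
    os.environ.pop("PROTOCOL_VERSION", None)
    assert resolve_protocol_version() == "v1"


def test_valid_value_is_used():
    os.environ["PROTOCOL_VERSION"] = "v2"
    assert resolve_protocol_version() == "v2"


def test_invalid_value_fails():
    os.environ["PROTOCOL_VERSION"] = "bogus"
    try:
        resolve_protocol_version()
        raise AssertionError("expected the wrapper to reject an invalid protocol")
    except ValueError:
        pass
```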