Added huggingfaceserver rock and tests #103

Merged on Jan 30, 2025 (61 commits)

BON4 (Contributor) commented Dec 19, 2024

Description


  • Add rockcraft.yaml
  • Add tests
  • Add tox.ini

This is a re-opened request for #91.

Logs from tests

juju status after all tests passed:

Model     Controller          Cloud/Region        Version  SLA          Timestamp
kubeflow  microk8s-localhost  microk8s/localhost  3.6.1    unsupported  00:59:28Z

App                      Version                Status   Scale  Charm                    Channel          Rev  Address         Exposed  Message
grafana-agent-k8s        0.40.4                 blocked      1  grafana-agent-k8s        latest/stable     80  10.152.183.167  no       Missing ['grafana-cloud-config']|['logging-consumer'] for logging-provider; ['grafana-cloud-config']|['send-remote-wr...
istio-ingressgateway                            active       1  istio-gateway            1.16/stable     1005  10.152.183.244  no       
istio-pilot                                     active       1  istio-pilot              1.16/stable      662  10.152.183.243  no       
knative-operator                                active       1  knative-operator         latest/edge      542  10.152.183.230  no       
knative-serving                                 active       1  knative-serving          latest/edge      572  10.152.183.100  no       
kserve-controller                               blocked      1  kserve-controller                           0  10.152.183.123  no       Cannot parse a config-defined images list from config '{' - thisconfig input will be ignored.
metacontroller-operator                         active       1  metacontroller-operator  latest/edge      401  10.152.183.89   no       
minio                    res:oci-image@1755999  active       1  minio                    ckf-1.7/stable   214  10.152.183.68   no       
resource-dispatcher                             active       1  resource-dispatcher      latest/edge      236  10.152.183.181  no       

Unit                        Workload  Agent  Address      Ports          Message
grafana-agent-k8s/0*        blocked   idle   10.1.172.40                 Missing ['grafana-cloud-config']|['logging-consumer'] for logging-provider; ['grafana-cloud-config']|['send-remote-wr...
istio-ingressgateway/0*     active    idle   10.1.172.35                 
istio-pilot/0*              active    idle   10.1.172.34                 
knative-operator/0*         active    idle   10.1.172.46                 
knative-serving/0*          active    idle   10.1.172.47                 
kserve-controller/0*        blocked   idle   10.1.172.38                 Cannot parse a config-defined images list from config '{' - thisconfig input will be ignored.
metacontroller-operator/0*  active    idle   10.1.172.60                 
minio/0*                    active    idle   10.1.172.59  9000-9001/TCP  
resource-dispatcher/0*      active    idle   10.1.172.62                 

Logs of the passing integration tests, run with tox -vve integration -- --model kubeflow --keep-models (only the final portion is included because of the PR message length limit):

---------------------------------------------------------------------------------------------- live log teardown -----------------------------------------------------------------------------------------------
INFO     pytest_operator.plugin:plugin.py:903 Model status:

Model     Controller          Cloud/Region        Version  SLA          Timestamp
kubeflow  microk8s-localhost  microk8s/localhost  3.6.1    unsupported  00:51:28Z

App                      Version                Status   Scale  Charm                    Channel          Rev  Address         Exposed  Message
grafana-agent-k8s        0.40.4                 blocked      1  grafana-agent-k8s        latest/stable     80  10.152.183.167  no       Missing ['grafana-cloud-config']|['logging-consumer'] for logging-provider; ['grafana-cloud-config']|['send-remote-wr...
istio-ingressgateway                            active       1  istio-gateway            1.16/stable     1005  10.152.183.244  no       
istio-pilot                                     active       1  istio-pilot              1.16/stable      662  10.152.183.243  no       
knative-operator                                active       1  knative-operator         latest/edge      542  10.152.183.230  no       
knative-serving                                 active       1  knative-serving          latest/edge      572  10.152.183.100  no       
kserve-controller                               blocked      1  kserve-controller                           0  10.152.183.123  no       Cannot parse a config-defined images list from config '{' - thisconfig input will be ignored.
metacontroller-operator                         active       1  metacontroller-operator  latest/edge      401  10.152.183.89   no       
minio                    res:oci-image@1755999  active       1  minio                    ckf-1.7/stable   214  10.152.183.68   no       
resource-dispatcher                             active       1  resource-dispatcher      latest/edge      236  10.152.183.181  no       

Unit                        Workload  Agent  Address      Ports          Message
grafana-agent-k8s/0*        blocked   idle   10.1.172.40                 Missing ['grafana-cloud-config']|['logging-consumer'] for logging-provider; ['grafana-cloud-config']|['send-remote-wr...
istio-ingressgateway/0*     active    idle   10.1.172.35                 
istio-pilot/0*              active    idle   10.1.172.34                 
knative-operator/0*         active    idle   10.1.172.46                 
knative-serving/0*          active    idle   10.1.172.47                 
kserve-controller/0*        blocked   idle   10.1.172.38                 Cannot parse a config-defined images list from config '{' - thisconfig input will be ignored.
metacontroller-operator/0*  active    idle   10.1.172.60                 
minio/0*                    active    idle   10.1.172.59  9000-9001/TCP  
resource-dispatcher/0*      active    idle   10.1.172.62                 

INFO     pytest_operator.plugin:plugin.py:909 Juju error logs:

unit-kserve-controller-0: 00:32:03 ERROR unit.kserve-controller/0.juju-log Failed to handle <InstallEvent via KServeControllerCharm/on/install[1]> with error: Please relate to istio-pilot:gateway-info
unit-kserve-controller-0: 00:46:30 ERROR unit.kserve-controller/0.juju-log object-storage:6: Failed to handle <RelationChangedEvent via KServeControllerCharm/on/object_storage_relation_changed[151]> with error: Waiting for object-storage relation data
unit-kserve-controller-0: 00:51:08 ERROR unit.kserve-controller/0.juju-log Failed to handle <ConfigChangedEvent via KServeControllerCharm/on/config_changed[206]> with error: Cannot parse a config-defined images list from config '{' - thisconfig input will be ignored.

INFO     pytest_operator.plugin:plugin.py:991 Forgetting model main...
INFO     httpx:_client.py:1038 HTTP Request: DELETE https://127.0.0.1:16443/api/v1/namespaces/test-namespace-resource-dispatcher "HTTP/1.1 200 OK"


======================================================================================= 18 passed in 2862.29s (0:47:42) ========================================================================================
integration: 2864218 I exit 0 (2863.82 seconds) /home/ubuntu/kserve-operators/charms/kserve-controller> pytest -v --tb native --ignore=/home/ubuntu/kserve-operators/charms/kserve-controller/tests/unit --log-cli-level=INFO -s --model kubeflow --keep-models pid=237295 [tox/execute/api.py:286]
  integration: OK (2863.91=setup[0.09]+cmd[2863.82] seconds)
  congratulations :) (2864.02 seconds)

BON4 changed the title from "Huggingfaceserver" to "Added huggingfaceserver rock and tests" on Dec 20, 2024
misohu (Member) left a comment

BON4 (Contributor, Author) commented Jan 10, 2025

@misohu Added the missing vllm pip package and the cuda-12-1 runtime.

BON4 requested a review from misohu on January 13, 2025 12:35
BON4 requested a review from misohu on January 20, 2025 23:46
misohu (Member) left a comment

There is a problem with the rock when I try to run it with Docker. The container is stuck in a restart loop (that is expected), but when I open a shell in the container and check the logs:

docker exec -ti d7b09f4eb8c5 bash
pebble logs

I am getting

_daemon_@d7b09f4eb8c5:/$ pebble logs
2025-01-27T10:59:37.671Z [huggingfaceserver] : No module named huggingfaceserver
2025-01-27T10:59:38.223Z [huggingfaceserver] : No module named huggingfaceserver
2025-01-27T10:59:39.332Z [huggingfaceserver] : No module named huggingfaceserver
2025-01-27T10:59:41.514Z [huggingfaceserver] : No module named huggingfaceserver
2025-01-27T10:59:45.927Z [huggingfaceserver] : No module named huggingfaceserver
2025-01-27T10:59:53.972Z [huggingfaceserver] : No module named huggingfaceserver
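
The gaps between the repeated log lines above roughly double each time, which is consistent with the service being restarted under an exponential backoff. A minimal sketch for measuring those gaps, assuming the timestamp layout shown above; the helper name restart_gaps is hypothetical and is not part of Pebble:

```python
from datetime import datetime

# Log lines copied from the pebble logs output above.
LOG_LINES = [
    "2025-01-27T10:59:37.671Z [huggingfaceserver] : No module named huggingfaceserver",
    "2025-01-27T10:59:38.223Z [huggingfaceserver] : No module named huggingfaceserver",
    "2025-01-27T10:59:39.332Z [huggingfaceserver] : No module named huggingfaceserver",
    "2025-01-27T10:59:41.514Z [huggingfaceserver] : No module named huggingfaceserver",
    "2025-01-27T10:59:45.927Z [huggingfaceserver] : No module named huggingfaceserver",
    "2025-01-27T10:59:53.972Z [huggingfaceserver] : No module named huggingfaceserver",
]


def restart_gaps(lines):
    """Return the seconds elapsed between consecutive log lines."""
    # The timestamp is the first space-separated field of each line.
    stamps = [
        datetime.strptime(line.split(" ", 1)[0], "%Y-%m-%dT%H:%M:%S.%fZ")
        for line in lines
    ]
    return [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]


gaps = restart_gaps(LOG_LINES)
print(gaps)  # [0.552, 1.109, 2.182, 4.413, 8.045]
```

Each gap is about twice the previous one, so the service fails almost immediately on every start and the delay comes from the restart policy rather than from the workload itself.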

BON4 (Contributor, Author) commented Jan 27, 2025

@misohu The "No module named huggingfaceserver" issue is fixed by adding the environment variable PYTHONPATH: "/usr/local/lib/python3.10/dist-packages".
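
For context on why an environment variable is enough: Python adds each PYTHONPATH entry to its module search path at startup, so a package installed outside the interpreter's default locations becomes importable once its directory is listed. A minimal sketch of that behavior, using a throwaway module in a temporary directory rather than the real huggingfaceserver package; the names module_visible, demo and fake_hf_module are hypothetical:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def module_visible(module_name, pythonpath=None):
    """Return True if a fresh interpreter can locate module_name."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from the default search path
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    # The child exits with 0 when the module can be found, 1 otherwise.
    check = (
        "import importlib.util, sys; "
        f"sys.exit(0 if importlib.util.find_spec({module_name!r}) else 1)"
    )
    result = subprocess.run([sys.executable, "-c", check], env=env)
    return result.returncode == 0


def demo():
    """Check a stand-in module without and with PYTHONPATH set."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a package living outside the default search path.
        Path(tmp, "fake_hf_module.py").write_text("VALUE = 1\n")
        before = module_visible("fake_hf_module")
        after = module_visible("fake_hf_module", pythonpath=tmp)
    return before, after
```

Calling demo() reports whether the stand-in module is found before and after the directory is added. The same kind of check could be pointed at the dist-packages path named above, assuming it is run inside the rock.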

BON4 requested a review from misohu on January 27, 2025 13:39
misohu (Member) commented Jan 29, 2025

I was able to test this rock with the kserve operator by running an InferenceService for Google's BERT with the Hugging Face serving runtime. The test is now part of the PR canonical/kserve-operators#298, and that test also ran this rock.

misohu (Member) commented Jan 30, 2025

We agreed to merge this rock based on manual tests.

misohu merged commit c9c9c80 into canonical:main on Jan 30, 2025
7 checks passed