add certificate doc #326
Signed-off-by: jooho <jlee@redhat.com>
mkdocs.yml
Outdated
@@ -88,6 +88,8 @@ nav:
- Inference Observability:
  - Prometheus Metrics: modelserving/observability/prometheus_metrics.md
  - Grafana Dashboards: modelserving/observability/grafana_dashboards.md
- Certiricate:
@Jooho Can we add this under the Model Storage section?
Also, there is a typo.
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Jooho, yuzisun
* add certificate doc

Signed-off-by: jooho <jlee@redhat.com>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400 committer agriffith50 <agriffith50@bloomberg.net> 1716219052 -0400 parent 2257489 author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400 committer agriffith50 <agriffith50@bloomberg.net> 1716218313 -0400 parent 2257489 author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400 committer agriffith50 <agriffith50@bloomberg.net> 1716217744 -0400 Add TorchServe Huggingface accelerate example (kserve#304) * Add LLM example for huggingface accelerate Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Add inputs Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update storage uri Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Add to LLM runtime to index Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Dan Sun <dsun20@bloomberg.net> 0.11 release blog (kserve#310) * Add 0.11 release blog Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update blog Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Add vllm runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Add vllm example doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update blog link Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Add vLLM intro Signed-off-by: Dan Sun <dsun20@bloomberg.net> * add python runtime open inference protocol tutorials Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix warning Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Add warning Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Address comments Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Dan Sun <dsun20@bloomberg.net> Fix torchserve llm example link Signed-off-by: Dan Sun <dsun20@bloomberg.net> Fixed formatting in get_started (kserve#319) Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com> clarify prometheus annotation (kserve#316) Signed-off-by: JuHyung-Son <sonju0427@gmail.com> Document servingruntime constraint introduced by kserve/kserve#3181 (kserve#320) * Document 
serving runtime constraint introduced by kserve/kserve#3181 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Set content type for predict/explainer curl requests Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Update docs/modelserving/servingruntimes.md Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Add kubeflow summit 2023 Jooho's presentation link (kserve#325) add kubeflow summit 2023 Jooho's presentation link Signed-off-by: jooho <jlee@redhat.com> docs: Add one related presentations from Kubeflow Summit 2023 (kserve#327) * docs: Add two new related presentations from Kubeflow Summit 2023Update presentations.md Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> * Update presentations.md Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> --------- Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Added example for torchserve grpc v1 and v2. (kserve#307) * Added example for torchserve grpc v1 and v2. Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Schema order changed. Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * corrected v2 REST input. Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Updated grpc-v2 protocolVersion. 
Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Update README.md * Update README.md * Update README.md --------- Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Add link to release process doc in developer.md (kserve#330) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Update tranformer collocation docs for specifying storage uri (kserve#323) Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fix incorrect edit URL to docs (kserve#329) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Set resources for inferencegraph example (kserve#322) Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Fixes kserve#331 - broken link to AMD Inference Server (kserve#332) Tested locally with mkdocs serve Render KServe Python Runtime API doc with mkdoc (kserve#333) * Update KServe python sdk docs Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update serving runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Dan Sun <dsun20@bloomberg.net> Fix build: Install kserve for rendering the docstring (kserve#334) * Update KServe python sdk docs Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Install kserve sdk for mkdocstring Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Dan Sun <dsun20@bloomberg.net> Onnx docs update (kserve#275) * Updated Onnx example. Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Reverting sklearn doc update as there is a separate PR Signed-off-by: andyi2it <andrews.arokiam@ideas2it.com> * Added new schema in onnx example. Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * protocolVersion and old schema updated with onnx example. 
Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> --------- Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Signed-off-by: andyi2it <andrews.arokiam@ideas2it.com> Standardized schema order (kserve#318) * Standardized schema's order. Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> * Fix v2 spec for torch serve --------- Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Update link to Slack instructions Signed-off-by: Yuan (Terry) Tang <terrytangyuan@gmail.com> Update README.md (kserve#344) Fix incorrect storage uri prefix Signed-off-by: zoramt <zoramthanga@yahoo.com> Added steps to delete model-store-pod (kserve#343) Signed-off-by: murata.yu <murata.yu@jp.fujitsu.com> Update README.md Signed-off-by: Dan Sun <dsun20@bloomberg.net> Add documentation for modelcars (kserve#337) * Add documentation for modelcars, introduced in 0.12 as experimental feature Signed-off-by: Roland Huß <rhuss@redhat.com> * added some references to this feature Signed-off-by: Roland Huß <rhuss@redhat.com> --------- Signed-off-by: Roland Huß <rhuss@redhat.com> add certificate doc (kserve#326) * add certificate doc Signed-off-by: jooho <jlee@redhat.com> * Update mkdocs.yml Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: jooho <jlee@redhat.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> docs: fix the emoji deprecation message and invalid file name (kserve#348) Signed-off-by: Peter Jausovec <peter.jausovec@solo.io> Add documentation for GCS (kserve#351) * Add documentation for GCS Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com> * Update mkdocs to include GCS Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com> * Fix formatting Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com> --------- Signed-off-by: tjandy98 
<3953059+tjandy98@users.noreply.github.com> Add ModelRegistry custom storage intializer example (kserve#346) * Add ModelRegistry custom storage intializer example Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com> * Update docs/modelserving/storage/storagecontainers.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com> --------- Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Updated docs for autoscaling on gpu. (kserve#328) Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com> Update version matrix for 0.12 (kserve#353) * Update version matrix for 0.12 Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update kubernetes_deployment.md Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update notes for gRPC issues Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update kserve install Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update kubernetes_deployment.md Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Dan Sun <dsun20@bloomberg.net> docs: update kserve resource yaml file (kserve#356) fix docs Signed-off-by: Niels ten Boom <nielstenboom@gmail.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update serving runtime version for 0.12 release and add some notes (kserve#354) * Fix few bugs, add quick install failure note and update docs for release 0.12.0 Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Add warning about control plane namespaces Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> * Resolve comments Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> --------- Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Signed-off-by: agriffith50 
<agriffith50@bloomberg.net> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Helm installation commands in get started guide (kserve#358) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update README.md (kserve#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <zoramthanga@yahoo.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Huggingface Serving Runtime example with Llama2 (kserve#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix review comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * add linking Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update mkdocs.yml Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update triton doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update adopters.md (kserve#361) Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Point users to vLLM production server (kserve#362) The vLLM teams states that the 
[`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <pierre@dotprod.ai> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> initial draft of kserve release blog Signed-off-by: agriffith50 <agriffith50@bloomberg.net> change title Signed-off-by: agriffith50 <agriffith50@bloomberg.net> resolving comments Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> update comment Signed-off-by: agriffith50 <agriffith50@bloomberg.net> update for vllm comment Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add more info about completions endpoints Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add hf img Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu 
<gavrish.prabhu@nutanix.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add new kserve img Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update future plan and other changes Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Huggingface Serving Runtime example with Llama2 (kserve#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix review comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * add linking Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update mkdocs.yml Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update triton doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update huggingface triton yaml Signed-off-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update blog link Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add triton 
huggingface reference Signed-off-by: agriffith50 <agriffith50@bloomberg.net> resolve merge Signed-off-by: agriffith50 <agriffith50@bloomberg.net> docs: update kserve resource yaml file (kserve#356) fix docs Signed-off-by: Niels ten Boom <nielstenboom@gmail.com> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Helm installation commands in get started guide (kserve#358) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update README.md (kserve#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <zoramthanga@yahoo.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Huggingface Serving Runtime example with Llama2 (kserve#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix review comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * add linking Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update mkdocs.yml Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update triton doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix Hugging Face Signed-off-by: Dan Sun 
<dsun20@bloomberg.net> --------- Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update adopters.md (kserve#361) Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Point users to vLLM production server (kserve#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <pierre@dotprod.ai> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> initial draft of kserve release blog Signed-off-by: agriffith50 <agriffith50@bloomberg.net> change title Signed-off-by: agriffith50 <agriffith50@bloomberg.net> resolving comments Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> update comment Signed-off-by: agriffith50 <agriffith50@bloomberg.net> update for vllm comment Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add hf img Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Sample requests update in HuggingFace runtime with vLLM 
support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add new kserve img Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update future plan and other changes Add Huggingface Serving Runtime example with Llama2 (kserve#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix review comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * add linking Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update mkdocs.yml Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update triton doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Update huggingface triton yaml Signed-off-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update blog link Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add triton huggingface reference 
Signed-off-by: agriffith50 <agriffith50@bloomberg.net> resolve merge Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Add Helm installation commands in get started guide (kserve#358) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Update README.md (kserve#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <zoramthanga@yahoo.com> Update adopters.md (kserve#361) Point users to vLLM production server (kserve#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <pierre@dotprod.ai> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Update huggingface triton yaml Signed-off-by: Dan Sun <dsun20@bloomberg.net>
their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <pierre@dotprod.ai> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> initial draft of kserve release blog Signed-off-by: agriffith50 <agriffith50@bloomberg.net> change title Signed-off-by: agriffith50 <agriffith50@bloomberg.net> resolving comments Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> update comment Signed-off-by: agriffith50 <agriffith50@bloomberg.net> update for vllm comment Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add more info about completions endpoints Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add hf img Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add new kserve img Signed-off-by: agriffith50 <agriffith50@bloomberg.net> 
Update future plan and other changes Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Huggingface Serving Runtime example with Llama2 (#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix review comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * add linking Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update mkdocs.yml Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update triton doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update huggingface triton yaml Signed-off-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update blog link Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add triton huggingface reference Signed-off-by: agriffith50 <agriffith50@bloomberg.net> resolve merge Signed-off-by: agriffith50 <agriffith50@bloomberg.net> docs: update kserve resource 
yaml file (#356) fix docs Signed-off-by: Niels ten Boom <nielstenboom@gmail.com> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Helm installation commands in get started guide (#358) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update README.md (#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <zoramthanga@yahoo.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Huggingface Serving Runtime example with Llama2 (#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix review comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * add linking Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update mkdocs.yml Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update triton doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 
<agriffith50@bloomberg.net> Update adopters.md (#361) Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Point users to vLLM production server (#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <pierre@dotprod.ai> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> initial draft of kserve release blog Signed-off-by: agriffith50 <agriffith50@bloomberg.net> change title Signed-off-by: agriffith50 <agriffith50@bloomberg.net> resolving comments Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> update comment Signed-off-by: agriffith50 <agriffith50@bloomberg.net> update for vllm comment Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add hf img Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> add new kserve img Signed-off-by: agriffith50 
<agriffith50@bloomberg.net> Update future plan and other changes Add Huggingface Serving Runtime example with Llama2 (#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Fix examples Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix review comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * add linking Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * fix comments Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update mkdocs.yml Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Update triton doc Signed-off-by: Dan Sun <dsun20@bloomberg.net> * Fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix newline Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix Hugging Face Signed-off-by: Dan Sun <dsun20@bloomberg.net> --------- Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Update huggingface triton yaml Signed-off-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Update blog link Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add triton huggingface reference Signed-off-by: agriffith50 <agriffith50@bloomberg.net> resolve merge Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Revert "Add 
Helm installation commands in get started guide" This reverts commit bc90c25. Add Helm installation commands in get started guide (#358) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Update README.md (#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <zoramthanga@yahoo.com> Update adopters.md (#361) Point users to vLLM production server (#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <pierre@dotprod.ai> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com> Update huggingface triton yaml Signed-off-by: Dan Sun <dsun20@bloomberg.net> * fix merge Signed-off-by: agriffith50 <agriffith50@bloomberg.net> * fix more merge issue Signed-off-by: agriffith50 <agriffith50@bloomberg.net> * Move up the diagram Signed-off-by: agriffith50 <agriffith50@bloomberg.net> * fix flag naming Signed-off-by: agriffith50 <agriffith50@bloomberg.net> * update slack Signed-off-by: agriffith50 <agriffith50@bloomberg.net> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang <terrytangyuan@gmail.com> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang <terrytangyuan@gmail.com> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang 
<terrytangyuan@gmail.com> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> * fix Hugging Face Signed-off-by: agriffith50 <agriffith50@bloomberg.net> --------- Signed-off-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net> Signed-off-by: agriffith50 <agriffith50@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
"Fixes #issue-number" or "Add description of the problem this PR solves"
Proposed Changes