Add ModelRegistry custom storage initializer example #346

Merged: 2 commits merged into kserve:main from kserve_3343_csc_example on Mar 24, 2024

lampajr (Contributor) commented Feb 26, 2024

Fixes kserve/kserve#3343

Proposed Changes

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

netlify bot commented Feb 26, 2024

Deploy Preview for elastic-nobel-0aef7a ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 326d9a4 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/elastic-nobel-0aef7a/deploys/65f8434ea83e5a0008bf341e |
| 😎 Deploy Preview | https://deploy-preview-346--elastic-nobel-0aef7a.netlify.app |
lampajr (Contributor, Author) commented Feb 26, 2024

There are still a couple of open points on my side:

  • I am referencing this external repo I created to showcase the extension/integration to avoid polluting the website with some (useless?) details. I'd be happy to transfer the example into a dedicated repository, if you have any, containing some examples.
  • I hope the details I added are clear enough to be a valuable addition for other users as well

@lampajr lampajr marked this pull request as ready for review February 26, 2024 17:30
@oss-prow-bot oss-prow-bot bot requested a review from theofpa February 26, 2024 17:30
terrytangyuan (Member) left a comment

Thank you for the PR!

!!! note
    The KServe controller takes care of injecting your container image and invoking it with the proper arguments.

A more concrete example can be found [here](https://github.com/lampajr/model-registry-storage-initializer), where the storage initializer queries an existing `model registry` service to retrieve the original location of the model that the user requested to deploy.
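To make the flow described above concrete, here is a minimal sketch of what such an initializer's entrypoint might look like. It assumes the container is invoked with a source URI followed by a destination path, and that the URI has the form `model-registry://<model>/<version>`; the scheme constant, the `parse_registry_uri` and `run` helpers, and the `lookup` callback standing in for the model registry query are all hypothetical names for illustration, not taken from the linked repository.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urlparse

# Hypothetical URI scheme, chosen for this sketch only.
SCHEME = "model-registry"

# A lookup takes (model name, optional version) and returns the model's
# original storage location. In a real initializer this would be a call
# to the model registry service; here it is injected by the caller.
Lookup = Callable[[str, Optional[str]], str]


def parse_registry_uri(uri: str) -> Tuple[str, Optional[str]]:
    """Split model-registry://<model>/<version> into (model, version)."""
    parsed = urlparse(uri)
    if parsed.scheme != SCHEME or not parsed.netloc:
        raise ValueError(f"unsupported storage URI: {uri!r}")
    # The version segment is optional in this sketch.
    version = parsed.path.strip("/") or None
    return parsed.netloc, version


def run(argv: List[str], lookup: Lookup) -> Tuple[str, str]:
    """Resolve the source URI in argv to (original location, destination)."""
    if len(argv) != 3:
        raise SystemExit("usage: initializer <source-uri> <destination-path>")
    source_uri, destination = argv[1], argv[2]
    model, version = parse_registry_uri(source_uri)
    location = lookup(model, version)
    # A real initializer would now download `location` into `destination`.
    return location, destination
```

An image entrypoint could call `run(sys.argv, lookup)` with a `lookup` that talks to the registry; the download step is deliberately left out because it depends on where the registry says the model lives.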
A project member commented:

I don't have a preference on the repo location but we definitely want to keep this doc concise.

  apiVersion: "serving.kserve.io/v1alpha1"
  kind: ClusterStorageContainer
  metadata:
    name: abc
  spec:
    container:
      name: storage-initializer
-     image: abc/custom-storage-initializer:latest
+     image: abc/model-registry-storage-initializer:latest
A project member commented:

Do we intend to show the Kubeflow model registry example here?

lampajr (Contributor, Author) replied:

There is no tagged and published version of the kf/model-registry-initializer yet, so I think for now we could keep this generic by using abc/model-registry-storage-initializer:latest.

The source code of that initializer lives at https://github.com/kubeflow/model-registry/tree/main/csi. As soon as an image is tagged, we could change the image here as well. WDYT?

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>
yuzisun (Member) commented Mar 24, 2024

Thanks @lampajr !!

/lgtm
/approve


oss-prow-bot bot commented Mar 24, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lampajr, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@oss-prow-bot oss-prow-bot bot merged commit 144d603 into kserve:main Mar 24, 2024
6 checks passed
@lampajr lampajr deleted the kserve_3343_csc_example branch March 25, 2024 15:39
alexagriffith pushed a commit to alexagriffith/website that referenced this pull request May 20, 2024
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716219052 -0400

parent 2257489
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716218313 -0400

parent 2257489
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716217744 -0400

Add TorchServe Huggingface accelerate example (kserve#304)

* Add LLM example for huggingface accelerate

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add inputs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update storage uri

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add to LLM runtime to index

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

0.11 release blog (kserve#310)

* Add 0.11 release blog

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update blog

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vllm example doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update blog link

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vLLM intro

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* add python runtime open inference protocol tutorials

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix warning

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add warning

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Address comments

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fix torchserve llm example link

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fixed formatting in get_started (kserve#319)

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

clarify prometheus annotation (kserve#316)

Signed-off-by: JuHyung-Son <sonju0427@gmail.com>

Document servingruntime constraint introduced by kserve/kserve#3181 (kserve#320)

* Document serving runtime constraint introduced by kserve/kserve#3181

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Set content type for predict/explainer curl requests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update docs/modelserving/servingruntimes.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Add kubeflow summit 2023 Jooho's presentation link (kserve#325)

add kubeflow summit 2023 Jooho's presentation link

Signed-off-by: jooho <jlee@redhat.com>

docs: Add one related presentations from Kubeflow Summit 2023 (kserve#327)

* docs: Add two new related presentations from Kubeflow Summit 2023Update presentations.md

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Update presentations.md

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Added example for torchserve grpc v1 and v2. (kserve#307)

* Added example for torchserve grpc v1 and v2.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Schema order changed.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* corrected v2 REST input.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Updated grpc-v2 protocolVersion.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Update README.md

* Update README.md

* Update README.md

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Add link to release process doc in developer.md (kserve#330)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Update tranformer collocation docs for specifying storage uri (kserve#323)

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix incorrect edit URL to docs (kserve#329)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Set resources for inferencegraph example (kserve#322)

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fixes kserve#331 - broken link to AMD Inference Server (kserve#332)

Tested locally with mkdocs serve

Render KServe Python Runtime API doc with mkdoc (kserve#333)

* Update KServe python sdk docs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update serving runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fix build: Install kserve for rendering the docstring (kserve#334)

* Update KServe python sdk docs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Install kserve sdk for mkdocstring

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Onnx docs update (kserve#275)

* Updated Onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Reverting sklearn doc update as there is a separate PR

Signed-off-by: andyi2it <andrews.arokiam@ideas2it.com>

* Added new schema in onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* protocolVersion and old schema updated with onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Signed-off-by: andyi2it <andrews.arokiam@ideas2it.com>

Standardized schema order (kserve#318)

* Standardized schema's order.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Fix v2 spec for torch serve

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Update link to Slack instructions

Signed-off-by: Yuan (Terry) Tang <terrytangyuan@gmail.com>

Update README.md (kserve#344)

Fix incorrect storage uri prefix

Signed-off-by: zoramt <zoramthanga@yahoo.com>

Added steps to delete model-store-pod (kserve#343)

Signed-off-by: murata.yu <murata.yu@jp.fujitsu.com>

Update README.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Add documentation for modelcars (kserve#337)

* Add documentation for modelcars, introduced in 0.12 as experimental feature

Signed-off-by: Roland Huß <rhuss@redhat.com>

* added some references to this feature

Signed-off-by: Roland Huß <rhuss@redhat.com>

---------

Signed-off-by: Roland Huß <rhuss@redhat.com>

add certificate doc (kserve#326)

* add certificate doc

Signed-off-by: jooho <jlee@redhat.com>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

docs: fix the emoji deprecation message and invalid file name (kserve#348)

Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>

Add documentation for GCS (kserve#351)

* Add documentation for GCS

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Update mkdocs to include GCS

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Fix formatting

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

---------

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

Add ModelRegistry custom storage intializer example (kserve#346)

* Add ModelRegistry custom storage intializer example

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

* Update docs/modelserving/storage/storagecontainers.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

---------

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Updated docs for autoscaling on gpu. (kserve#328)

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

Update version matrix for 0.12 (kserve#353)

* Update version matrix for 0.12

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kubernetes_deployment.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update notes for gRPC issues

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kserve install

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kubernetes_deployment.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

docs: update kserve resource yaml file (kserve#356)

fix docs

Signed-off-by: Niels ten Boom <nielstenboom@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update serving runtime version for 0.12 release and add some notes (kserve#354)

* Fix few bugs, add quick install failure note and update docs for release 0.12.0

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add warning about control plane namespaces

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Resolve comments

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Revert "Add Helm installation commands in get started guide"

This reverts commit bc90c25.

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide (kserve#358)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update README.md (kserve#359)

Fix broken link to Ray doc on fractional GPU allocation.

Signed-off-by: zoramt <zoramthanga@yahoo.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Huggingface Serving Runtime example with Llama2 (kserve#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update adopters.md (kserve#361)

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Point users to vLLM production server (kserve#362)

The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead.

So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers.

Signed-off-by: Pierre Dulac <pierre@dotprod.ai>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

initial draft of kserve release blog

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

change title

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolving comments

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update for vllm comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add more info about completions endpoints

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add hf img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (kserve#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add new kserve img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update future plan and other changes

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update huggingface triton yaml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update blog link

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add triton huggingface reference

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolve merge

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

oss-prow-bot bot pushed a commit that referenced this pull request May 24, 2024

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

docs: fix the emoji deprecation message and invalid file name (#348)

Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>

Add documentation for GCS (#351)

* Add documentation for GCS

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Update mkdocs to include GCS

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Fix formatting

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

---------

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

Add ModelRegistry custom storage initializer example (#346)

* Add ModelRegistry custom storage initializer example

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

* Update docs/modelserving/storage/storagecontainers.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

---------

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Updated docs for autoscaling on gpu. (#328)

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

Update version matrix for 0.12 (#353)

* Update version matrix for 0.12

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kubernetes_deployment.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update notes for gRPC issues

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kserve install

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kubernetes_deployment.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

docs: update kserve resource yaml file (#356)

fix docs

Signed-off-by: Niels ten Boom <nielstenboom@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update serving runtime version for 0.12 release and add some notes (#354)

* Fix few bugs, add quick install failure note and update docs for release 0.12.0

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add warning about control plane namespaces

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Resolve comments

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Revert "Add Helm installation commands in get started guide"

This reverts commit bc90c25.

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide (#358)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update README.md (#359)

Fix broken link to Ray doc on fractional GPU allocation.

Signed-off-by: zoramt <zoramthanga@yahoo.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Huggingface Serving Runtime example with Llama2 (#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update adopters.md (#361)

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Point users to vLLM production server (#362)

The vLLM team states that [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) only demonstrates usage of their AsyncEngine; for production use they point users to `vllm.entrypoints.openai.api_server` instead.

So, I think this should be the entrypoint used in the KServe documentation too, to avoid confusing newcomers.

Signed-off-by: Pierre Dulac <pierre@dotprod.ai>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

initial draft of kserve release blog

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

change title

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolving comments

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update for vllm comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add more info about completions endpoints

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add hf img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add new kserve img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update future plan and other changes

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Huggingface Serving Runtime example with Llama2 (#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update huggingface triton yaml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update blog link

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add triton huggingface reference

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolve merge

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

docs: update kserve resource yaml file (#356)

fix docs

Signed-off-by: Niels ten Boom <nielstenboom@gmail.com>

Add Helm installation commands in get started guide

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Revert "Add Helm installation commands in get started guide"

This reverts commit bc90c25.

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide (#358)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update README.md (#359)

Fix broken link to Ray doc on fractional GPU allocation.

Signed-off-by: zoramt <zoramthanga@yahoo.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Huggingface Serving Runtime example with Llama2 (#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update adopters.md (#361)

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Point users to vLLM production server (#362)

The vLLM team states that [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) only demonstrates usage of their AsyncEngine; for production use they point users to `vllm.entrypoints.openai.api_server` instead.

So, I think this should be the entrypoint used in the KServe documentation too, to avoid confusing newcomers.

Signed-off-by: Pierre Dulac <pierre@dotprod.ai>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

initial draft of kserve release blog

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

change title

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolving comments

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update for vllm comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add hf img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add new kserve img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update future plan and other changes

Add Huggingface Serving Runtime example with Llama2 (#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

Update huggingface triton yaml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update blog link

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add triton huggingface reference

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolve merge

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Revert "Add Helm installation commands in get started guide"

This reverts commit bc90c25.

Add Helm installation commands in get started guide (#358)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Update README.md (#359)

Fix broken link to Ray doc on fractional GPU allocation.

Signed-off-by: zoramt <zoramthanga@yahoo.com>

Update adopters.md (#361)

Point users to vLLM production server (#362)

The vLLM team states that [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) only demonstrates usage of their AsyncEngine; for production use they point users to `vllm.entrypoints.openai.api_server` instead.

So, I think this should be the entrypoint used in the KServe documentation too, to avoid confusing newcomers.

Signed-off-by: Pierre Dulac <pierre@dotprod.ai>

Sample requests update in HuggingFace runtime with vLLM support (#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

Update huggingface triton yaml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix merge

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* fix more merge issue

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* Move up the diagram

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* fix flag naming

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* update slack

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* fix Hugging Face

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
This pull request was closed.
Successfully merging this pull request may close these issues.

Explore usage of ClusterStorageContainer and Model Registry