add certificate doc #326

Merged
merged 2 commits into from
Mar 17, 2024

Conversation

Jooho
Contributor

@Jooho Jooho commented Dec 8, 2023

"Fixes #issue-number" or "Add description of the problem this PR solves"

Proposed Changes

Signed-off-by: jooho <jlee@redhat.com>

netlify bot commented Dec 8, 2023

Deploy Preview for elastic-nobel-0aef7a ready!

Name Link
🔨 Latest commit 3e95404
🔍 Latest deploy log https://app.netlify.com/sites/elastic-nobel-0aef7a/deploys/65d14253a96a090008cc6441
😎 Deploy Preview https://deploy-preview-326--elastic-nobel-0aef7a.netlify.app

mkdocs.yml Outdated
@@ -88,6 +88,8 @@ nav:
- Inference Observability:
- Prometheus Metrics: modelserving/observability/prometheus_metrics.md
- Grafana Dashboards: modelserving/observability/grafana_dashboards.md
- Certiricate:
Member

@Jooho Can we add this under the Model Storage section?

Member

also there is a typo

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
@yuzisun
Member

yuzisun commented Mar 17, 2024

/lgtm
/approve


oss-prow-bot bot commented Mar 17, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jooho, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@oss-prow-bot oss-prow-bot bot merged commit bbce239 into kserve:main Mar 17, 2024
6 checks passed
tjandy98 pushed a commit to tjandy98/website that referenced this pull request Mar 19, 2024
* add certificate doc

Signed-off-by: jooho <jlee@redhat.com>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
alexagriffith pushed a commit to alexagriffith/website that referenced this pull request May 20, 2024
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716219052 -0400

parent 2257489
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716218313 -0400

parent 2257489
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716217744 -0400

Add TorchServe Huggingface accelerate example (kserve#304)

* Add LLM example for huggingface accelerate

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add inputs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update storage uri

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add to LLM runtime to index

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

0.11 release blog (kserve#310)

* Add 0.11 release blog

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update blog

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vllm example doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update blog link

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vLLM intro

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* add python runtime open inference protocol tutorials

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix warning

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add warning

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Address comments

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fix torchserve llm example link

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fixed formatting in get_started (kserve#319)

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

clarify prometheus annotation (kserve#316)

Signed-off-by: JuHyung-Son <sonju0427@gmail.com>

Document servingruntime constraint introduced by kserve/kserve#3181 (kserve#320)

* Document serving runtime constraint introduced by kserve/kserve#3181

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Set content type for predict/explainer curl requests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update docs/modelserving/servingruntimes.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Add kubeflow summit 2023 Jooho's presentation link (kserve#325)

add kubeflow summit 2023 Jooho's presentation link

Signed-off-by: jooho <jlee@redhat.com>

docs: Add one related presentations from Kubeflow Summit 2023 (kserve#327)

* docs: Add two new related presentations from Kubeflow Summit 2023Update presentations.md

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Update presentations.md

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Added example for torchserve grpc v1 and v2. (kserve#307)

* Added example for torchserve grpc v1 and v2.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Schema order changed.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* corrected v2 REST input.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Updated grpc-v2 protocolVersion.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Update README.md

* Update README.md

* Update README.md

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Add link to release process doc in developer.md (kserve#330)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Update tranformer collocation docs for specifying storage uri (kserve#323)

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix incorrect edit URL to docs (kserve#329)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Set resources for inferencegraph example (kserve#322)

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fixes kserve#331 - broken link to AMD Inference Server (kserve#332)

Tested locally with mkdocs serve

Render KServe Python Runtime API doc with mkdoc (kserve#333)

* Update KServe python sdk docs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update serving runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fix build: Install kserve for rendering the docstring (kserve#334)

* Update KServe python sdk docs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Install kserve sdk for mkdocstring

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Onnx docs update (kserve#275)

* Updated Onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Reverting sklearn doc update as there is a separate PR

Signed-off-by: andyi2it <andrews.arokiam@ideas2it.com>

* Added new schema in onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* protocolVersion and old schema updated with onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Signed-off-by: andyi2it <andrews.arokiam@ideas2it.com>

Standardized schema order (kserve#318)

* Standardized schema's order.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Fix v2 spec for torch serve

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Update link to Slack instructions

Signed-off-by: Yuan (Terry) Tang <terrytangyuan@gmail.com>

Update README.md (kserve#344)

Fix incorrect storage uri prefix

Signed-off-by: zoramt <zoramthanga@yahoo.com>

Added steps to delete model-store-pod (kserve#343)

Signed-off-by: murata.yu <murata.yu@jp.fujitsu.com>

Update README.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Add documentation for modelcars (kserve#337)

* Add documentation for modelcars, introduced in 0.12 as experimental feature

Signed-off-by: Roland Huß <rhuss@redhat.com>

* added some references to this feature

Signed-off-by: Roland Huß <rhuss@redhat.com>

---------

Signed-off-by: Roland Huß <rhuss@redhat.com>

add certificate doc (kserve#326)

* add certificate doc

Signed-off-by: jooho <jlee@redhat.com>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

docs: fix the emoji deprecation message and invalid file name (kserve#348)

Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>

Add documentation for GCS (kserve#351)

* Add documentation for GCS

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Update mkdocs to include GCS

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Fix formatting

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

---------

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

Add ModelRegistry custom storage intializer example (kserve#346)

* Add ModelRegistry custom storage intializer example

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

* Update docs/modelserving/storage/storagecontainers.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

---------

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Updated docs for autoscaling on gpu. (kserve#328)

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

Update version matrix for 0.12 (kserve#353)

* Update version matrix for 0.12

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kubernetes_deployment.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update notes for gRPC issues

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kserve install

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kubernetes_deployment.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

docs: update kserve resource yaml file (kserve#356)

fix docs

Signed-off-by: Niels ten Boom <nielstenboom@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update serving runtime version for 0.12 release and add some notes (kserve#354)

* Fix few bugs, add quick install failure note and update docs for release 0.12.0

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add warning about control plane namespaces

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Resolve comments

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Revert "Add Helm installation commands in get started guide"

This reverts commit bc90c25.

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide (kserve#358)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update README.md (kserve#359)

Fix broken link to Ray doc on fractional GPU allocation.

Signed-off-by: zoramt <zoramthanga@yahoo.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Huggingface Serving Runtime example with Llama2 (kserve#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update adopters.md (kserve#361)

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Point users to vLLM production server (kserve#362)

The vLLM team states that [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) only demonstrates usage of their AsyncEngine; for production use they point users to `vllm.entrypoints.openai.api_server` instead.

So I think this should be the entrypoint used in the KServe documentation too, to avoid confusing newcomers.

Signed-off-by: Pierre Dulac <pierre@dotprod.ai>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

initial draft of kserve release blog

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

change title

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolving comments

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update for vllm comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add more info about completions endpoints

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add hf img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (kserve#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add new kserve img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update future plan and other changes

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Huggingface Serving Runtime example with Llama2 (kserve#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (kserve#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update huggingface triton yaml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update blog link

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add triton huggingface reference

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolve merge

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

docs: update kserve resource yaml file (kserve#356)

fix docs

Signed-off-by: Niels ten Boom <nielstenboom@gmail.com>

Add Helm installation commands in get started guide

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Revert "Add Helm installation commands in get started guide"

This reverts commit bc90c25.

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide (kserve#358)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update README.md (kserve#359)

Fix broken link to Ray doc on fractional GPU allocation.

Signed-off-by: zoramt <zoramthanga@yahoo.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Huggingface Serving Runtime example with Llama2 (kserve#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update adopters.md (kserve#361)

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Point users to vLLM production server (kserve#362)

The vLLM team states that [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) only demonstrates usage of their AsyncEngine; for production use they point users to `vllm.entrypoints.openai.api_server` instead.

So I think this should be the entrypoint used in the KServe documentation too, to avoid confusing newcomers.

Signed-off-by: Pierre Dulac <pierre@dotprod.ai>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

initial draft of kserve release blog

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

change title

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolving comments

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update for vllm comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add hf img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (kserve#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add new kserve img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update future plan and other changes

Add Huggingface Serving Runtime example with Llama2 (kserve#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (kserve#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

Update huggingface triton yaml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update blog link

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add triton huggingface reference

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolve merge

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Revert "Add Helm installation commands in get started guide"

This reverts commit bc90c25.

Add Helm installation commands in get started guide (kserve#358)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Update README.md (kserve#359)

Fix broken link to Ray doc on fractional GPU allocation.

Signed-off-by: zoramt <zoramthanga@yahoo.com>

Update adopters.md (kserve#361)

Point users to vLLM production server (kserve#362)

The vLLM team states that [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) only demonstrates usage of their AsyncEngine; for production use they point users to `vllm.entrypoints.openai.api_server` instead.

So I think this should be the entrypoint used in the KServe documentation too, to avoid confusing newcomers.

Signed-off-by: Pierre Dulac <pierre@dotprod.ai>

Sample requests update in HuggingFace runtime with vLLM support (kserve#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

Update huggingface triton yaml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
oss-prow-bot bot pushed a commit that referenced this pull request May 24, 2024
* parent 2257489
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716219052 -0400

parent 2257489
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716218313 -0400

parent 2257489
author Dan Sun <dsun20@bloomberg.net> 1698039744 -0400
committer agriffith50 <agriffith50@bloomberg.net> 1716217744 -0400

Add TorchServe Huggingface accelerate example (#304)

* Add LLM example for huggingface accelerate

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add inputs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update storage uri

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add to LLM runtime to index

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

0.11 release blog (#310)

* Add 0.11 release blog

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update blog

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vllm example doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update blog link

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add vLLM intro

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* add python runtime open inference protocol tutorials

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix warning

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Add warning

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Address comments

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fix torchserve llm example link

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fixed formatting in get_started (#319)

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

clarify prometheus annotation (#316)

Signed-off-by: JuHyung-Son <sonju0427@gmail.com>

Document servingruntime constraint introduced by kserve/kserve#3181 (#320)

* Document serving runtime constraint introduced by kserve/kserve#3181

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Set content type for predict/explainer curl requests

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Update docs/modelserving/servingruntimes.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Add kubeflow summit 2023 Jooho's presentation link (#325)

add kubeflow summit 2023 Jooho's presentation link

Signed-off-by: jooho <jlee@redhat.com>

docs: Add one related presentations from Kubeflow Summit 2023 (#327)

* docs: Add two new related presentations from Kubeflow Summit 2023Update presentations.md

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Update presentations.md

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Added example for torchserve grpc v1 and v2. (#307)

* Added example for torchserve grpc v1 and v2.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Schema order changed.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* corrected v2 REST input.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Updated grpc-v2 protocolVersion.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Update README.md

* Update README.md

* Update README.md

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Add link to release process doc in developer.md (#330)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Update tranformer collocation docs for specifying storage uri (#323)

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fix incorrect edit URL to docs (#329)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

Set resources for inferencegraph example (#322)

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

Fixes #331 - broken link to AMD Inference Server (#332)

Tested locally with mkdocs serve

Render KServe Python Runtime API doc with mkdoc (#333)

* Update KServe python sdk docs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update serving runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Fix build: Install kserve for rendering the docstring (#334)

* Update KServe python sdk docs

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Install kserve sdk for mkdocstring

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Onnx docs update (#275)

* Updated Onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Reverting sklearn doc update as there is a separate PR

Signed-off-by: andyi2it <andrews.arokiam@ideas2it.com>

* Added new schema in onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* protocolVersion and old schema updated with onnx example.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Signed-off-by: andyi2it <andrews.arokiam@ideas2it.com>

Standardized schema order (#318)

* Standardized schema's order.

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

* Fix v2 spec for torch serve

---------

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Update link to Slack instructions

Signed-off-by: Yuan (Terry) Tang <terrytangyuan@gmail.com>

Update README.md (#344)

Fix incorrect storage uri prefix

Signed-off-by: zoramt <zoramthanga@yahoo.com>

Added steps to delete model-store-pod (#343)

Signed-off-by: murata.yu <murata.yu@jp.fujitsu.com>

Update README.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

Add documentation for modelcars (#337)

* Add documentation for modelcars, introduced in 0.12 as an experimental feature

Signed-off-by: Roland Huß <rhuss@redhat.com>

* added some references to this feature

Signed-off-by: Roland Huß <rhuss@redhat.com>

---------

Signed-off-by: Roland Huß <rhuss@redhat.com>

add certificate doc (#326)

* add certificate doc

Signed-off-by: jooho <jlee@redhat.com>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: jooho <jlee@redhat.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

docs: fix the emoji deprecation message and invalid file name (#348)

Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>

Add documentation for GCS (#351)

* Add documentation for GCS

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Update mkdocs to include GCS

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

* Fix formatting

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

---------

Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>

Add ModelRegistry custom storage initializer example (#346)

* Add ModelRegistry custom storage initializer example

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

* Update docs/modelserving/storage/storagecontainers.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>

---------

Signed-off-by: Andrea Lamparelli <a.lamparelli95@gmail.com>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Updated docs for autoscaling on gpu. (#328)

Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>

Update version matrix for 0.12 (#353)

* Update version matrix for 0.12

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kubernetes_deployment.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update notes for gRPC issues

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kserve install

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update kubernetes_deployment.md

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

docs: update kserve resource yaml file (#356)

fix docs

Signed-off-by: Niels ten Boom <nielstenboom@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update serving runtime version for 0.12 release and add some notes (#354)

* Fix a few bugs, add quick install failure note and update docs for release 0.12.0

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Add warning about control plane namespaces

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

* Resolve comments

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>

---------

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Revert "Add Helm installation commands in get started guide"

This reverts commit bc90c25.

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Helm installation commands in get started guide (#358)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update README.md (#359)

Fix broken link to Ray doc on fractional GPU allocation.

Signed-off-by: zoramt <zoramthanga@yahoo.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add Huggingface Serving Runtime example with Llama2 (#345)

* Add Huggingface Serving Runtime example with Llama2

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Fix examples

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix review comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* add linking

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* fix comments

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>

* Update huggingface vllm runtime doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update mkdocs.yml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Update triton doc

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* Fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix newline

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

* fix Hugging Face

Signed-off-by: Dan Sun <dsun20@bloomberg.net>

---------

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update adopters.md (#361)

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Point users to vLLM production server (#362)

The vLLM team states that [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) only demonstrates usage of their AsyncEngine; for production use, they point users to `vllm.entrypoints.openai.api_server` instead.

So I think this should be the entrypoint used in the KServe documentation too, to avoid confusing newcomers.

Signed-off-by: Pierre Dulac <pierre@dotprod.ai>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

initial draft of kserve release blog

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

change title

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolving comments

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

update for vllm comment

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add more info about completions endpoints

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add hf img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Sample requests update in HuggingFace runtime with vLLM support (#364)

Update Sample requests for HF runtime

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

add new kserve img

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update future plan and other changes

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update huggingface triton yaml

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Update blog link

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

Add triton huggingface reference

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

resolve merge

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* fix merge

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* fix more merge issue

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* Move up the diagram

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* fix flag naming

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* update slack

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md

Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

* fix Hugging Face

Signed-off-by: agriffith50 <agriffith50@bloomberg.net>

---------

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
Signed-off-by: agriffith50 <agriffith50@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>