chore: Update readme #41

Merged 1 commit on May 25, 2023
README.md: 13 changes (7 additions, 6 deletions)
@@ -16,7 +16,7 @@ Modelz LLM is an inference server that facilitates the utilization of open source
- **OpenAI compatible API**: Modelz LLM provides an OpenAI compatible API for LLMs, which means you can use the OpenAI python SDK to interact with the model.
- **Self-hosted**: Modelz LLM can be easily deployed on either local or cloud-based environments.
- **Open source LLMs**: Modelz LLM supports open source LLMs, such as FastChat, LLaMA, and ChatGLM.
- **Modelz integration**: Modelz LLM can be easily integrated with [Modelz](https://docs.modelz.ai), which is a serverless inference platform for LLMs and other foundation models.
- **Cloud native**: We provide Docker images for different LLMs, which can be easily deployed on Kubernetes or other cloud-based environments (e.g. [Modelz](https://docs.modelz.ai)).

## Quick Start

@@ -33,23 +33,24 @@ pip install git+https://github.com/tensorchord/modelz-llm.git[gpu]
Please first start the self-hosted API server by following the instructions:

```diff
-export MODELZ_MODEL="THUDM/chatglm-6b-int4"
-modelz-llm -m MODELZ_MODEL
+modelz-llm -m "THUDM/chatglm-6b-int4"
```
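Once the server is running, you can sanity-check it over plain HTTP. A minimal sketch with `curl`; the port (8000) and the endpoint path are assumptions here, not confirmed by this diff:

```bash
# Assumptions: the server listens on localhost:8000 and exposes an
# OpenAI-style chat completions route at /chat/completions.
curl http://localhost:8000/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "THUDM/chatglm-6b-int4",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```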

Currently, we support the following models:

| Model Name | Huggingface Model | Docker Image |
| ---------- | ----------- | ---------------- |
| Vicuna 7B Delta V1.1 | `lmsys/vicuna-7b-delta-v1.1` | [modelzai/llm-vicuna-7b](https://hub.docker.com/repository/docker/modelzai/llm-vicuna-7b/general) |
| LLaMA 7B | `decapoda-research/llama-7b-hf` | [modelzai/llm-llama-7b](https://hub.docker.com/repository/docker/modelzai/llm-llama-7b/general) |
| ChatGLM 6B INT4 | `THUDM/chatglm-6b-int4` | [modelzai/llm-chatglm-6b-int4](https://hub.docker.com/repository/docker/modelzai/llm-chatglm-6b-int4/general) |
| ChatGLM 6B | `THUDM/chatglm-6b` | [modelzai/llm-chatglm-6b](https://hub.docker.com/repository/docker/modelzai/llm-chatglm-6b/general) |
| Bloomz 560M | `bigscience/bloomz-560m` | |
| Bloomz 1.7B | `bigscience/bloomz-1b7` | |
| Bloomz 3B | `bigscience/bloomz-3b` | |
| Bloomz 7.1B | `bigscience/bloomz-7b1` | |

<!-- | FastChat T5 3B V1.0 | `lmsys/fastchat-t5-3b-v1.0` | `lmsys/fastchat-t5-3b-v1.0` | -->
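The prebuilt images in the table above can be launched directly with Docker. A minimal sketch; the container port and the GPU flag are assumptions, so adjust them to your setup:

```bash
# Assumptions: the image starts the server on port 8000 and can use
# a GPU through the NVIDIA container runtime.
docker run --gpus all -p 8000:8000 modelzai/llm-chatglm-6b-int4
```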

You can also set the `MODELZ_MODEL` environment variable to specify the model and tokenizer.
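A minimal sketch, assuming `modelz-llm` falls back to `MODELZ_MODEL` when `-m` is omitted:

```bash
# Assumption: when -m is omitted, modelz-llm reads MODELZ_MODEL.
export MODELZ_MODEL="bigscience/bloomz-560m"
modelz-llm
```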

### Use OpenAI python SDK

Then you can use the OpenAI python SDK to interact with the model:
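A minimal sketch with the (pre-1.0) OpenAI Python SDK; the base URL and the placeholder API key are assumptions here, since the exact values depend on how you deployed the server:

```python
import openai

# Assumptions: the self-hosted server listens on localhost:8000 and
# does not validate the API key.
openai.api_base = "http://localhost:8000"
openai.api_key = "any"

chat_completion = openai.ChatCompletion.create(
    model="THUDM/chatglm-6b-int4",
    messages=[{"role": "user", "content": "What can you do?"}],
)
print(chat_completion.choices[0].message.content)
```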