Releases: bentoml/OpenLLM
v0.4.39
Installation
pip install openllm==0.4.39
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.39
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.39 start HuggingFaceH4/zephyr-7b-beta
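Once a server is running (OpenLLM serves on port 3000 by default and exposes an OpenAI-compatible API), you can send it a chat-completions request. A minimal sketch of composing such a request payload; the endpoint URL, model name, and `max_tokens` value here are illustrative assumptions, not taken from this release's notes:

```python
import json

# Compose a request body for the OpenAI-compatible
# /v1/chat/completions endpoint (port 3000 by default).
url = "http://localhost:3000/v1/chat/completions"
payload = {
    "model": "HuggingFaceH4/zephyr-7b-beta",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}
body = json.dumps(payload).encode()
print(body.decode())

# With a server running, the request could be sent with urllib:
# from urllib.request import Request, urlopen
# req = Request(url, data=body,
#               headers={"Content-Type": "application/json"})
# print(urlopen(req).read().decode())
```

The request itself is left commented out since it requires a live server; any OpenAI-compatible client (e.g. the `openai` Python package pointed at `http://localhost:3000/v1`) should work the same way.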
Find more information about this release in the CHANGELOG.md
Full Changelog: v0.4.38...v0.4.39
v0.4.38
Installation
pip install openllm==0.4.38
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.38
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.38 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(mixtral): correct chat templates to remove additional spacing by @aarnphm in #774
- fix(cli): correct set arguments for `openllm import` and `openllm build` by @aarnphm in #775
- fix(mixtral): setup hack atm to load weights from pt specifically instead of safetensors by @aarnphm in #776
Full Changelog: v0.4.37...v0.4.38
v0.4.37
Installation
pip install openllm==0.4.37
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.37
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.37 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(mixtral): correct support for mixtral by @aarnphm in #772
- chore: running all script when installation by @aarnphm in #773
Full Changelog: v0.4.36...v0.4.37
v0.4.36
Mixtral support
Adds support for Mixtral on BentoCloud with vLLM and all required dependencies.
Bentos built with openllm now default to Python 3.11 to support this change.
Installation
pip install openllm==0.4.36
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.36
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.36 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(openai): supports echo by @aarnphm in #760
- fix(openai): logprobs when echo is enabled by @aarnphm in #761
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in #767
- chore(deps): bump docker/metadata-action from 5.2.0 to 5.3.0 by @dependabot in #766
- chore(deps): bump actions/setup-python from 4.7.1 to 5.0.0 by @dependabot in #765
- chore(deps): bump taiki-e/install-action from 2.21.26 to 2.22.0 by @dependabot in #764
- chore(deps): bump aquasecurity/trivy-action from 0.14.0 to 0.16.0 by @dependabot in #763
- chore(deps): bump github/codeql-action from 2.22.8 to 2.22.9 by @dependabot in #762
- feat: mixtral support by @aarnphm in #770
Full Changelog: v0.4.35...v0.4.36
v0.4.35
Installation
pip install openllm==0.4.35
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.35
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.35 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(deps): bump pypa/gh-action-pypi-publish from 1.8.10 to 1.8.11 by @dependabot in #749
- chore(deps): bump docker/metadata-action from 5.0.0 to 5.2.0 by @dependabot in #751
- chore(deps): bump taiki-e/install-action from 2.21.19 to 2.21.26 by @dependabot in #750
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in #753
- fix(logprobs): explicitly set logprobs=None by @aarnphm in #757
Full Changelog: v0.4.34...v0.4.35
v0.4.34
Installation
pip install openllm==0.4.34
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.34
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.34 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(models): Support qwen by @yansheng105 in #742
New Contributors
- @yansheng105 made their first contribution in #742
Full Changelog: v0.4.33...v0.4.34
v0.4.33
Installation
pip install openllm==0.4.33
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.33
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.33 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: v0.4.32...v0.4.33
v0.4.32
Installation
pip install openllm==0.4.32
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.32
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.32 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(deps): bump taiki-e/install-action from 2.21.17 to 2.21.19 by @dependabot in #735
- chore(deps): bump github/codeql-action from 2.22.7 to 2.22.8 by @dependabot in #734
- chore: revert back previous backend support PyTorch by @aarnphm in #739
Full Changelog: v0.4.31...v0.4.32
v0.4.31
Installation
pip install openllm==0.4.31
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.31
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.31 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: v0.4.30...v0.4.31
v0.4.30
Installation
pip install openllm==0.4.30
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.30
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.30 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: v0.4.29...v0.4.30