
Releases: bentoml/BentoML

v1.2.0a0

09 Jan 07:01
777301d
Pre-release

What's Changed

Full Changelog: v1.1.11...v1.2.0a0

BentoML - v1.1.11

28 Dec 21:36
40694cf

Bug fixes

  • Fixed streaming of long payloads on remote runners. Streaming responses now always yield text and follow the SSE protocol. SSE utilities are also provided:
import bentoml
from bentoml.io import JSON, SSE, Text

class MyRunnable(bentoml.Runnable):
    @bentoml.Runnable.method()
    def streaming(self, text):
        yield "data: 1\n\n"
        yield "data: 12222222222222222222222222222\n\n"

runner = bentoml.Runner(MyRunnable)

svc = bentoml.Service("service", runners=[runner])

@svc.api(input=Text(), output=JSON())
async def infer(text):
    result = 0
    # Consume the runner's SSE stream and sum the integer payloads
    async for it in runner.streaming.async_stream(text):
        payload = SSE.from_iterator(it)
        result += int(payload.data)
    return result

What's Changed

New Contributors

Full Changelog: v1.1.10...v1.1.11

BentoML - v1.1.10

20 Nov 04:07
fa27883

Released a patch that sets the upper bound cattrs<23.2, since cattrs 23.2 breaks our whole serialisation process, both upstream and downstream.

What's Changed

New Contributors

Full Changelog: v1.1.9...v1.1.10

BentoML - v1.1.9

09 Nov 17:48
a59750c
  • Import Hugging Face Transformers models: the bentoml.transformers.import_model API imports pretrained Transformers models directly from the Hugging Face Hub. It registers a model in the BentoML model store without loading it into memory. The first argument is the model name in the BentoML store, and the second is the model_id on the Hugging Face Hub.
import bentoml

bentomodel = bentoml.transformers.import_model("zephyr-7b-beta", "HuggingFaceH4/zephyr-7b-beta")
  • Standardized on nvidia-ml-py: BentoML now uses the official nvidia-ml-py package instead of pynvml to avoid conflicts with other packages.
  • Define environment variables in configuration: within bentoml_configuration.yaml, values of the form ${ENV_VAR} are expanded at runtime to the value of the corresponding environment variable. Note that only string values are supported.
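For example, a configuration value can reference an environment variable. A hypothetical fragment (the `api_server.ssl.certfile` key and the variable name are chosen for illustration):

```yaml
# bentoml_configuration.yaml
api_server:
  ssl:
    # ${BENTOML_SSL_CERTFILE} is replaced at runtime with the value of
    # the BENTOML_SSL_CERTFILE environment variable (string values only)
    certfile: ${BENTOML_SSL_CERTFILE}
```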

What's Changed

New Contributors

Full Changelog: v1.1.7...v1.1.9

BentoML - v1.1.8

08 Nov 23:43

What's Changed

New Contributors

Full Changelog: v1.1.7...v1.1.8

BentoML - v1.1.7

12 Oct 18:24
1e8902a

What's Changed

Updated OTEL dependencies to 0.41b0 to address a CVE affecting 0.39b0.

General documentation and client updates.

New Contributors

Full Changelog: v1.1.6...v1.1.7

BentoML - v1.1.6

08 Sep 05:23
c1504bd

What's Changed

New Contributors

Full Changelog: v1.1.5...v1.1.6

BentoML - v1.1.5

08 Sep 05:15
ca6eca5

What's Changed

New Contributors

Full Changelog: v1.1.4...v1.1.5

BentoML - v1.1.4

30 Aug 01:17
7a83d99

🍱 To better support LLM serving through response streaming, we are proud to introduce experimental server-sent events (SSE) streaming support in this release of BentoML v1.1.4 and OpenLLM v0.2.27. See an example service definition for SSE streaming with Llama2.

  • Added response streaming through SSE to the bentoml.io.Text IO Descriptor type.
  • Added async generator support to both API Server and Runner to yield incremental text responses.
  • Added native SSE streaming support to ☁️ BentoCloud.

🦾 OpenLLM added token streaming capabilities to support streaming responses from LLMs.

  • Added /v1/generate_stream endpoint for streaming responses from LLMs.

    curl -N -X 'POST' 'http://0.0.0.0:3000/v1/generate_stream' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
      "prompt": "### Instruction:\n What is the definition of time (200 words essay)?\n\n### Response:",
      "llm_config": {
        "use_llama2_prompt": false,
        "max_new_tokens": 4096,
        "early_stopping": false,
        "num_beams": 1,
        "num_beam_groups": 1,
        "use_cache": true,
        "temperature": 0.89,
        "top_k": 50,
        "top_p": 0.76,
        "typical_p": 1,
        "epsilon_cutoff": 0,
        "eta_cutoff": 0,
        "diversity_penalty": 0,
        "repetition_penalty": 1,
        "encoder_repetition_penalty": 1,
        "length_penalty": 1,
        "no_repeat_ngram_size": 0,
        "renormalize_logits": false,
        "remove_invalid_values": false,
        "num_return_sequences": 1,
        "output_attentions": false,
        "output_hidden_states": false,
        "output_scores": false,
        "encoder_no_repeat_ngram_size": 0,
        "n": 1,
        "best_of": 1,
        "presence_penalty": 0.5,
        "frequency_penalty": 0,
        "use_beam_search": false,
        "ignore_eos": false
      },
      "adapter_name": null
    }'
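The endpoint responds with server-sent events: each chunk arrives as a `data:` block terminated by a blank line. As a rough illustration of how a client could reassemble such a stream (plain Python, following the SSE wire format; not OpenLLM's actual client code):

```python
def parse_sse(stream_text: str) -> list[str]:
    """Parse a server-sent-events payload into a list of data strings.

    Simplified sketch: events are separated by a blank line, and each
    event's "data:" lines are joined with a newline, per the SSE format.
    """
    events = []
    for block in stream_text.split("\n\n"):
        data_lines = [
            line[len("data:"):].lstrip(" ")
            for line in block.split("\n")
            if line.startswith("data:")
        ]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

print(parse_sse("data: Hello\n\ndata: world\n\n"))  # ['Hello', 'world']
```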

What's Changed

New Contributors

Full Changelog: v1.1.3...v1.1.4

BentoML - v1.1.2

22 Aug 02:46
a2ead21

Patch release

BentoML now provides a new diffusers integration, bentoml.diffusers_simple.

This introduces two integrations, for the stable_diffusion and stable_diffusion_xl models.

import bentoml

# Create a Runner for a Stable Diffusion model
runner = bentoml.diffusers_simple.stable_diffusion.create_runner("CompVis/stable-diffusion-v1-4")

# Create a Runner for a Stable Diffusion XL model
runner_xl = bentoml.diffusers_simple.stable_diffusion_xl.create_runner("stabilityai/stable-diffusion-xl-base-1.0")

General bug fixes and documentation improvements

What's Changed

New Contributors

  • @EgShes made their first contribution in #4102
  • @zhangwm404 made their first contribution in #4108

Full Changelog: v1.1.1...v1.1.2