v2.18.0
⭐ Highlights
Here’s a quick overview of what’s new in 2.18.0:
- 🐳 Support for models in OCI registries (includes Ollama)
- 🌋 Support for llama.cpp with Vulkan (container images only for now)
- 🗣️ The transcription endpoint can now also translate with `translate`
- ⚙️ Adds `repeat_last_n` and `properties_order` as model configurations
- ⬆️ CUDA 12.5 upgrade: we are now tracking the latest CUDA version (12.5)
- 💎 Gemma 2 model support!
🐋 Support for OCI Images and Ollama Models
You can now specify models using the `oci://` and `ollama://` prefixes in your YAML config files. Here’s an example for Ollama models:

```yaml
parameters:
  model: ollama://...
```
Start the Ollama model directly with:

```bash
local-ai run ollama://gemma:2b
```

Or download only the model with:

```bash
local-ai models install ollama://gemma:2b
```
For standard OCI images, use the `oci://` prefix. To build a compatible container image, you can use `docker`, for example. Your Dockerfile should look like this:
```Dockerfile
FROM scratch
COPY ./my_gguf_file.gguf /
```
You can also use this approach to store other model types (for instance, safetensors files for Stable Diffusion) and YAML config files!
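As a rough end-to-end sketch (the registry path and tag below are placeholders, not an official example), you would build and push the image, then reference it from a model YAML file with the `oci://` prefix:

```bash
# Build an image that contains only the model file, then push it
# (registry and tag are hypothetical; substitute your own)
docker build -t quay.io/example/my-gguf-model:latest .
docker push quay.io/example/my-gguf-model:latest
```

```yaml
parameters:
  model: oci://quay.io/example/my-gguf-model:latest
```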
🌋 Vulkan Support for Llama.cpp
We’ve introduced Vulkan support for llama.cpp! Check out our new image tags `latest-vulkan-ffmpeg-core` and `v2.18.0-vulkan-ffmpeg-core`.
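To try it out, you can run one of the Vulkan images directly. A minimal sketch, assuming the `localai/localai` Docker Hub repository and a Linux host where the GPU is exposed via `/dev/dri` (adjust both for your setup):

```bash
# Run the Vulkan-enabled image on the default port;
# --device /dev/dri is a typical way to expose the GPU for Vulkan on Linux
docker run -p 8080:8080 --device /dev/dri localai/localai:v2.18.0-vulkan-ffmpeg-core
```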
🗣️ Transcription and Translation
Our transcription endpoint now supports translation! Simply add `translate: true` to your transcription requests to translate the transcription to English.
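For example, a request could look like the following. This is a hedged sketch: the `/v1/audio/transcriptions` endpoint and the `file`/`model` form fields follow the OpenAI-compatible API, while passing `translate` as a form field and the `whisper-1` model name are assumptions for illustration:

```bash
# Transcribe audio.wav and ask for the result to be translated to English
curl http://localhost:8080/v1/audio/transcriptions \
  -F file=@audio.wav \
  -F model=whisper-1 \
  -F translate=true
```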
⚙️ Enhanced Model Configuration
We’ve added new configuration options, `repeat_last_n` and `properties_order`, to give you more control. Here’s how you can set them up in your model YAML file:
```yaml
# Force JSON to return properties in the specified order
function:
  grammar:
    properties_order: "name,arguments"
```
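With that configuration, the grammar constrains generated function-call JSON so that keys appear in the order you listed. An illustrative (not literal) output:

```json
{"name": "get_weather", "arguments": {"location": "Rome"}}
```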
And for setting `repeat_last_n` (specific to llama.cpp), which controls how many of the most recent tokens are considered when applying the repetition penalty:

```yaml
parameters:
  repeat_last_n: 64
```
💎 Gemma 2!
Google has just dropped Gemma 2 models (blog post here), and you can already install and run them in LocalAI with:

```bash
local-ai run gemma-2-27b-it
local-ai run gemma-2-9b-it
```
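Once the model is running, you can query it through the OpenAI-compatible API as usual. A minimal sketch, assuming the default port and that the model name matches the gallery entry:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma-2-9b-it", "messages": [{"role": "user", "content": "Hello!"}]}'
```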
What's Changed
Bug fixes 🐛
- fix(install.sh): correctly handle systemd service installation by @mudler in #2627
- fix(worker): use dynaload for single binaries by @mudler in #2620
- fix(install.sh): fix version typo by @mudler in #2645
- fix(install.sh): move ARCH detection so it works also for mac by @mudler in #2646
- fix(cli): remove duplicate alias by @mudler in #2654
Exciting New Features 🎉
- feat: Upgrade to CUDA 12.5 by @reneleonhardt in #2601
- feat(oci): support OCI images and Ollama models by @mudler in #2628
- feat(whisper): add translate option by @mudler in #2649
- feat(vulkan): add vulkan support to the llama.cpp backend by @mudler in #2648
- feat(ui): allow to select between all the available models in the chat by @mudler in #2657
- feat(build): only build llama.cpp relevant targets by @mudler in #2659
- feat(options): add `repeat_last_n` by @mudler in #2660
- feat(grammar): expose `properties_order` by @mudler in #2662
🧠 Models
- models(gallery): add l3-umbral-mind-rp-v1.0-8b-iq-imatrix by @mudler in #2608
- models(gallery): ⬆️ update checksum by @localai-bot in #2607
- models(gallery): add llama-3-sec-chat by @mudler in #2611
- models(gallery): add llama-3-cursedstock-v1.8-8b-iq-imatrix by @mudler in #2612
- models(gallery): add llama3-8b-darkidol-1.1-iq-imatrix by @mudler in #2613
- models(gallery): add magnum-72b-v1 by @mudler in #2614
- models(gallery): add qwen2-1.5b-ita by @mudler in #2615
- models(gallery): add hermes-2-theta-llama-3-70b by @mudler in #2626
- models(gallery): ⬆️ update checksum by @localai-bot in #2630
- models(gallery): add dark-idol-1.2 by @mudler in #2663
- models(gallery): add einstein v7 qwen2 by @mudler in #2664
- models(gallery): add arcee-spark by @mudler in #2665
- models(gallery): add gemma2-9b-it and gemma2-27b-it by @mudler in #2670
📖 Documentation and examples
- docs: update to include installer and update advanced YAML options by @mudler in #2631
- feat(swagger): update swagger by @localai-bot in #2651
- feat(swagger): update swagger by @localai-bot in #2666
- telegram-bot example: Update LocalAI version (fixes #2638) by @greygoo in #2640
👒 Dependencies
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #2605
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2606
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2617
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2629
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2632
- deps(llama.cpp): bump to latest, update build variables by @mudler in #2669
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2652
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2671
Other Changes
- ci: bump parallel jobs by @mudler in #2633
- chore: fix go.mod module by @sozercan in #2635
- rf: centralize base64 image handling and secscan cleanup by @dave-gray101 in #2595
- refactor: gallery inconsistencies by @mudler in #2647
New Contributors
Full Changelog: v2.17.1...v2.18.0