koboldcpp-1.66.1

@LostRuins LostRuins released this 24 May 10:33
· 1126 commits to concedo since this release

Phi guess that's the way the cookie crumbles edition

  • NEW: Added custom SD LoRA support! Specify it with --sdlora and set the LoRA multiplier with --sdloramult. Note that SD LoRAs can only be used when loading in 16-bit (e.g. with the .safetensors model) and will not work on quantized models, so they are incompatible with --sdquant.
  • NEW: Added custom SD VAE support, which can be specified in the Image Gen tab of the GUI launcher, or using --sdvae [vae_file.safetensors]
  • NEW: Added in-built support for TAE SD for SD1.5 and SDXL. This is a very small VAE replacement that can be used if a model has a broken VAE; it also runs faster than the regular VAE. To use it, select the "Fix Bad VAE" checkbox or use the flag --sdvaeauto
    • Note: Do not combine the above new flags with --sdconfig, which is deprecated and should no longer be used.
  • NEW: Added experimental support for Rep Pen Slope. This is not a true slope, but the end result is that it applies a slightly reduced rep pen to older tokens within the rep pen range, scaled by the slope value. Setting rep pen slope to 1 negates this effect. For compatibility reasons, rep pen slope defaults to 1 if unspecified (same behavior as before).
  • NEW: You can now specify an http/https URL to a GGUF file when passing the --model parameter, or in the model selector UI. KoboldCpp will attempt to download the model file into your current working directory, and automatically load it when the download is done.
  • Disabled UI launcher scaling on macOS due to display issues. Please report any further scaling issues.
  • Improved EOT token handling, fixed a bug in token speed calculations.
  • Default thread count will not exceed 8 unless overridden; this helps mitigate E-core issues.
  • Merged improvements and fixes from upstream, including new Phi support and Vulkan fixes from @0cc4m
  • Updated Kobold Lite:
    • Now attempts to function correctly if hosted on a subdirectory URL path (e.g. using a reverse proxy); if that fails, it falls back to the root URL.
    • Changed default chatmode player name from "You" to "User", which solves some wonky phrasing issues.
    • Added viewport width controls in settings, including horizontal fullscreen.
    • Minor bugfixes for markdown
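
As a rough illustration of the URL-based model loading mentioned in the list above, the Python sketch below shows one way a --model value could be recognised as an http/https URL and mapped to a file in the current working directory. The function names (is_model_url, local_target_for, fetch_model) are hypothetical, and the use of urllib for the download is an assumption; these notes do not describe how KoboldCpp actually performs the download.

```python
import os
import urllib.request
from urllib.parse import urlparse


def is_model_url(model_arg: str) -> bool:
    """Return True if the --model value looks like an http/https URL."""
    parsed = urlparse(model_arg)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def local_target_for(model_arg: str, workdir: str) -> str:
    """Map a --model value to the local path that would be loaded.

    URLs map to a file of the same name inside workdir; anything else is
    treated as a local path and returned unchanged.
    """
    if not is_model_url(model_arg):
        return model_arg
    filename = os.path.basename(urlparse(model_arg).path)
    if not filename:
        raise ValueError("URL does not end in a file name")
    return os.path.join(workdir, filename)


def fetch_model(model_arg: str) -> str:
    """Download the model if needed (assumed mechanism) and return its path."""
    target = local_target_for(model_arg, os.getcwd())
    if is_model_url(model_arg) and not os.path.exists(target):
        urllib.request.urlretrieve(model_arg, target)
    return target
```

Only fetch_model touches the network; the two helpers are pure functions, so the path mapping can be checked without downloading anything.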

Fix for 1.66.1 - Fixed quant tools makefile, fixed sd seed parsing, updated lite

To use, download and run koboldcpp.exe, which is a one-file PyInstaller executable.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster).
If you're using Linux, select the appropriate Linux binary file instead (not exe).
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once loaded, you can connect like this (or use the full KoboldAI client):
http://localhost:5001
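
Besides opening that address in a browser, a script can talk to the same port. The sketch below is a minimal, hedged example: the /api/v1/generate path and the payload field names (prompt, max_length, rep_pen, rep_pen_range, rep_pen_slope) are assumed from the KoboldAI-style API and are not stated in these notes, so check them against the API documentation of your build. The default values are arbitrary placeholders.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # address given in the release notes


def build_generate_request(prompt, max_length=80, rep_pen=1.1,
                           rep_pen_range=320, rep_pen_slope=1.0):
    """Build (but do not send) a POST request for text generation.

    The endpoint path and field names are assumptions, see the note above.
    rep_pen_slope=1.0 matches the default described in these notes.
    """
    payload = {
        "prompt": prompt,
        "max_length": max_length,
        "rep_pen": rep_pen,
        "rep_pen_range": rep_pen_range,
        "rep_pen_slope": rep_pen_slope,
    }
    return urllib.request.Request(
        BASE_URL + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request to a running instance and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # Assumed response shape: {"results": [{"text": "..."}]}
    return body["results"][0]["text"]
```

Calling generate("Once upon a time", rep_pen_slope=0.7) requires a loaded model and a running server; build_generate_request can be inspected on its own without one.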

For more information, be sure to run the program from command line with the --help flag.
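
For scripted launches, the new image generation flags from this release can be assembled programmatically. The following is a small sketch, not an official launcher: build_sd_args is a hypothetical helper, the file names are placeholders, and the only rule it enforces is the one stated above (that --sdlora does not work together with --sdquant). Whether a flag takes a value is inferred from the wording of these notes; confirm with --help.

```python
def build_sd_args(sd_lora=None, sd_lora_mult=None, sd_vae=None,
                  sd_vae_auto=False, sd_quant=False):
    """Return a list of command-line arguments for the SD options.

    Raises ValueError when a LoRA is requested together with quantization,
    since the notes say SD LoRAs do not work on quantized models.
    """
    if sd_lora and sd_quant:
        raise ValueError("--sdlora cannot be combined with --sdquant")

    args = []
    if sd_lora:
        args += ["--sdlora", sd_lora]
        if sd_lora_mult is not None:
            args += ["--sdloramult", str(sd_lora_mult)]
    if sd_vae:
        args += ["--sdvae", sd_vae]
    if sd_vae_auto:
        args.append("--sdvaeauto")  # same effect as the "Fix Bad VAE" checkbox
    if sd_quant:
        args.append("--sdquant")
    return args
```

The returned list can be appended to the rest of your launch command (executable, model flags and so on) and passed to subprocess.run as a single argument list.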