v1.10.0 - 2025-01-04
- Update command-r to 2024-08
- Sort models alphabetically before output
- Add qwen2.5 7b 14b 32b
- Add mistral small 22b
- Add phi 3.5 mini 3b
- Update mistral 7b to 0.3
- Replace mistral 7b with ministral 8b
- Remove old llama 2 model from table
v1.9.0 - 2024-12-20
- Update nvidia/cuda to 12.6.3
- Switch to CMake as the llama.cpp Makefile build is deprecated
v1.8.0 - 2024-08-17
- Set llama 3.1 as default llama 3 target
- Add more detailed command line usage
- Add llama 3.1 8b model
- Add phi 3 medium and gemma 2 9b and 27b
- Update nvidia/cuda to 12.5.1
- Add pattern rule to download models
- Move shell into functions
- Simplify adding new models
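The download step this release describes could be sketched as a small shell function of the kind those Makefile targets might call; the function name, `models/` directory, and `MODEL_BASE_URL` variable are assumptions, not the project's actual names.

```shell
#!/bin/sh
# Sketch: idempotent model download helper (hypothetical names throughout).
# A Makefile pattern rule could invoke this once per model file.
download_model() {
  # $1: model file name; skip the download if the file is already present
  if [ ! -f "models/$1" ]; then
    curl -fL --create-dirs -o "models/$1" "$MODEL_BASE_URL/$1"
  fi
}
```

Keeping the existence check in one place means every new model only needs a new target name, which matches the "simplify adding new models" note above.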
v1.7.1 - 2024-06-13
- Copy over new binary names
v1.7.0 - 2024-06-12
- Update nvidia/cuda to 12.5.0
- Add libomp to clang build
- Add libgomp to cuda build
- Replace deprecated LLAMA_CUBLAS build flag
v1.6.1 - 2024-05-21
- Include new models as makefile targets (again)
v1.6.0 - 2024-05-20
- Add llama 3 8B and phi 3 mini
v1.5.1 - 2024-04-15
- Include new models as makefile targets
v1.5.0 - 2024-04-15
- Add command-r 35b model
- Add starling 7b beta model
- Sort by LMSYS leaderboard Elo score
- Update nvidia/cuda to 12.4.1
v1.4.0 - 2024-03-06
- Use clang 16 instead of gcc in cpu version
v1.3.1 - 2024-01-06
- Move entrypoint to bottom of Dockerfile
- Suppress error output when the nvidia-smi command is missing
v1.3.0 - 2024-01-03
- Improve docker sudo detection in Makefile
- Move download to docker-entrypoint.sh
- Rename target to llama 2 to match the download name
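The sudo detection mentioned above can be sketched in plain shell (a Makefile would typically wrap this in `$(shell ...)`); treat the exact probe as an assumption, not the project's actual check.

```shell
#!/bin/sh
# Sketch: use plain `docker` when the daemon is reachable as the current
# user, and fall back to `sudo docker` otherwise (assumed detection logic).
docker_cmd() {
  if docker info >/dev/null 2>&1; then
    echo "docker"
  else
    echo "sudo docker"
  fi
}
```

Probing `docker info` rather than the group list catches rootless setups and remote daemons as well.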
v1.2.1 - 2023-12-27
- Ensure build stages are named
v1.2.0 - 2023-12-25
- Automatically build and run gpu or cpu version
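A minimal sketch of the gpu-or-cpu auto-selection, assuming a working nvidia-smi is the signal; the redirects also silence the "command not found" noise that v1.3.1 later addresses. Function and target names are placeholders.

```shell
#!/bin/sh
# Sketch: choose the gpu build when nvidia-smi exists and runs cleanly,
# otherwise fall back to cpu; redirects suppress a missing command's output.
detect_target() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo gpu
  else
    echo cpu
  fi
}
detect_target
```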
v1.1.0 - 2023-12-22
- Convert env vars to command line args
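One way to read the conversion above: an entrypoint that maps optional environment variables onto a flag list. `MODEL` and `CONTEXT` are hypothetical variable names; `-m` and `-c` are llama.cpp's model-path and context-size flags.

```shell
#!/bin/sh
# Sketch: translate optional env vars into a flag list (hypothetical
# variable names; -m and -c are llama.cpp's model and context flags).
build_args() {
  set --
  [ -n "$MODEL" ] && set -- "$@" -m "$MODEL"
  [ -n "$CONTEXT" ] && set -- "$@" -c "$CONTEXT"
  printf '%s\n' "$*"
}
```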
v1.0.0 - 2023-12-20
- Do not add downloaded models to git
- Run llama.cpp with GPU enabled Docker Compose
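The GPU-enabled Compose setup from the initial release could look like the following device-reservation fragment from the Compose specification; the service and image names are placeholders, not the project's actual values.

```yaml
# Sketch: GPU access via Compose device reservations (placeholder names)
services:
  llama:
    image: llama-cpp-gpu
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```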