feat: support metal GPU acceleration (#3)
* feat: support metal gpu acceleration

* fix: fix ci

* fix: fix ci

* fix: fix cuda dependencies

* chore: tune ruff

* chore: update readme and gifs

* chore: update gifs
umbertogriffo authored May 22, 2024
1 parent 915c805 commit 011fa5a
Showing 10 changed files with 286 additions and 70 deletions.
7 changes: 6 additions & 1 deletion .github/workflows/ci.yaml
@@ -37,12 +37,17 @@ jobs:
id: llama-cpp-version
run: echo "llama-cpp-version=$(cat version/llama_cpp)" >> "$GITHUB_OUTPUT"

- name: Get ctransformers version
id: ctransformers-version
run: echo "ctransformers-version=$(cat version/ctransformers)" >> "$GITHUB_OUTPUT"

# Installing dependencies and llama-cpp-python without NVIDIA CUDA acceleration.
- name: Setup environment
run: |
poetry lock --check
poetry install --no-root --no-ansi
. .venv/bin/activate && pip3 install llama-cpp-python~=${{ steps.llama-cpp-version.outputs.llama-cpp-version }}
. .venv/bin/activate && pip3 install llama-cpp-python==${{ steps.llama-cpp-version.outputs.llama-cpp-version }}
. .venv/bin/activate && pip3 install ctransformers==${{ steps.ctransformers-version.outputs.ctransformers-version }}
- name: Run tests
run: |
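Note on the version pinning in this hunk: the workflow reads each version from a file under `version/`, and the install step appears to move from a compatible-release specifier (`~=`) to an exact pin (`==`). A minimal Python sketch of the same idea follows. The helper name `pinned_requirement` and the placeholder version `0.0.0` are assumptions for illustration and are not part of the commit.

```python
import tempfile
from pathlib import Path


def pinned_requirement(package: str, version_file: Path) -> str:
    """Hypothetical helper: build an exact-pin requirement from a version file."""
    # Strip whitespace so a trailing newline in the file does not leak into the spec.
    version = version_file.read_text(encoding="utf-8").strip()
    if not version:
        raise ValueError(f"{version_file} is empty")
    return f"{package}=={version}"


if __name__ == "__main__":
    # Demo with a throwaway file; "0.0.0" is a placeholder, not the project's real pin.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "llama_cpp"
        path.write_text("0.0.0\n", encoding="utf-8")
        print(pinned_requirement("llama-cpp-python", path))
```

With an exact pin, CI installs the version recorded in the file, which is the same form the Makefile install targets use.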
3 changes: 3 additions & 0 deletions .gitignore
@@ -68,6 +68,9 @@ instance/
# Scrapy stuff:
.scrapy

# Ruff stuff:
.ruff_cache

# Sphinx documentation
docs/_build/

26 changes: 20 additions & 6 deletions Makefile
@@ -1,25 +1,39 @@
.PHONY: check install setup update test clean

file=version/llama_cpp
llama_cpp_version=`cat $(file)`
llama_cpp_file=version/llama_cpp
llama_cpp_version=`cat $(llama_cpp_file)`

ctransformers_file=version/ctransformers
ctransformers_version=`cat $(ctransformers_file)`

check:
which pip3
which python3

install:
install_cuda:
echo "Installing..."
mkdir -p .venv
poetry config virtualenvs.in-project true
poetry install --no-root --no-ansi
echo "Installing llama-cpp-python with pip to get NVIDIA CUDA acceleration"
poetry install --extras "cuda-acceleration" --no-root --no-ansi
echo "Installing llama-cpp-python and ctransformers with pip to get NVIDIA CUDA acceleration"
. .venv/bin/activate && CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip3 install llama-cpp-python==$(llama_cpp_version)
. .venv/bin/activate && pip3 install ctransformers[cuda]==$(ctransformers_version)

install_metal:
echo "Installing..."
mkdir -p .venv
poetry config virtualenvs.in-project true
poetry install --no-root --no-ansi
echo "Installing llama-cpp-python and ctransformers with pip to get Metal GPU acceleration for macOS systems only (it doesn't install CUDA dependencies)"
. .venv/bin/activate && CMAKE_ARGS="-DLLAMA_METAL=on" pip3 install llama-cpp-python==$(llama_cpp_version)
. .venv/bin/activate && CT_METAL=1 pip install ctransformers==$(ctransformers_version) --no-binary ctransformers

install_pre_commit:
poetry run pre-commit install
poetry run pre-commit install --hook-type pre-commit

setup: install install_pre_commit
setup_cuda: install_cuda install_pre_commit
setup_metal: install_metal install_pre_commit

update:
poetry lock --no-update
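For the pip steps, the `install_cuda` and `install_metal` targets differ in the environment variables, package extras and pip flags they pass. The sketch below restates that difference as data, with the flags copied from the hunk above. `install_plan`, the `Step` alias and the placeholder versions are hypothetical, and nothing here actually runs pip.

```python
from typing import Dict, List, Tuple

# An install step is (extra environment variables, pip argv).
Step = Tuple[Dict[str, str], List[str]]

# CMake flags copied from the Makefile hunk.
CMAKE_ARGS = {
    "cuda": "-DLLAMA_CUBLAS=on",
    "metal": "-DLLAMA_METAL=on",
}


def install_plan(backend: str, llama_cpp_version: str, ctransformers_version: str) -> List[Step]:
    """Hypothetical helper: describe the pip steps for one backend without running them."""
    if backend not in CMAKE_ARGS:
        raise ValueError(f"unknown backend: {backend!r}")

    # CMAKE_ARGS selects the GPU backend when llama-cpp-python is built.
    llama_step: Step = (
        {"CMAKE_ARGS": CMAKE_ARGS[backend]},
        ["pip3", "install", f"llama-cpp-python=={llama_cpp_version}"],
    )

    if backend == "cuda":
        ct_step: Step = ({}, ["pip3", "install", f"ctransformers[cuda]=={ctransformers_version}"])
    else:
        # The Metal target sets CT_METAL=1 and forces a source build of ctransformers.
        ct_step = (
            {"CT_METAL": "1"},
            ["pip", "install", f"ctransformers=={ctransformers_version}", "--no-binary", "ctransformers"],
        )
    return [llama_step, ct_step]


if __name__ == "__main__":
    # Placeholder versions; the real ones live in version/llama_cpp and version/ctransformers.
    for env, argv in install_plan("metal", "0.0.0", "0.0.0"):
        prefix = " ".join(f"{key}={value}" for key, value in env.items())
        print(f"{prefix} {' '.join(argv)}".strip())
```

Keeping the backend differences in one table makes it easier to see that only the CUDA path needs the `cuda-acceleration` extras and the `ctransformers[cuda]` extra.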
26 changes: 15 additions & 11 deletions README.md
@@ -4,11 +4,14 @@
[![Code style: Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)

> [!IMPORTANT]
> Disclaimer: The code has been tested on `Ubuntu 22.04.2 LTS` running on a Lenovo Legion 5 Pro
> with twenty `12th Gen Intel® Core™ i7-12700H` and an `NVIDIA GeForce RTX 3060`.
> Disclaimer:
> The code has been tested on
> * `Ubuntu 22.04.2 LTS` running on a Lenovo Legion 5 Pro with twenty `12th Gen Intel® Core™ i7-12700H` and an `NVIDIA GeForce RTX 3060`.
> * `MacOS Sonoma 14.3.1` running on a MacBook Pro M1 (2020).
>
> If you are using another Operating System or different hardware, and you can't load the models, please
> take a look either at the official CTransformers's GitHub [issue](https://github.com/marella/ctransformers/issues).
> or at the official Llama Cpp Python's GitHub [issue](https://github.com/abetlen/llama-cpp-python/issues)
> take a look either at the official Llama Cpp Python's GitHub [issue](https://github.com/abetlen/llama-cpp-python/issues).
> or at the official CTransformers's GitHub [issue](https://github.com/marella/ctransformers/issues)
> [!WARNING]
> Note: it's important to note that the large language model sometimes generates hallucinations or false information.
@@ -32,9 +35,8 @@
## Introduction

This project combines the power of [CTransformers](https://github.com/marella/ctransformers), [Lama.cpp](https://github.com/abetlen/llama-cpp-python),
[LangChain](https://python.langchain.com/docs/get_started/introduction.html) (only used for document chunking and
querying the Vector Database, and we plan to eliminate it entirely), [Chroma](https://github.com/chroma-core/chroma) and
[Streamlit](https://discuss.streamlit.io/) to build:
[LangChain](https://python.langchain.com/docs/get_started/introduction.html) (only used for document chunking and querying the Vector Database, and we plan to eliminate it entirely),
[Chroma](https://github.com/chroma-core/chroma) and [Streamlit](https://discuss.streamlit.io/) to build:
* a Conversation-aware Chatbot (ChatGPT like experience).
* a RAG (Retrieval-augmented generation) ChatBot.

@@ -65,7 +67,7 @@ To deal with context overflows, we implemented two approaches:
## Prerequisites

* Python 3.10+
* GPU supporting CUDA 12 and up.
* GPU supporting CUDA 12 and up
* Poetry 1.7.0
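
A quick preflight check for the Python and Poetry prerequisites can be sketched as below. `missing_prerequisites` is a hypothetical helper: it only checks the interpreter version and that a `poetry` executable is on `PATH`, and it does not verify the Poetry version or GPU support.

```python
import shutil
import sys
from typing import Callable, List, Optional, Sequence


def missing_prerequisites(
    version_info: Sequence[int] = tuple(sys.version_info[:3]),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Hypothetical helper: list unmet prerequisites (empty list means all good)."""
    problems: List[str] = []
    # Compare only (major, minor) against the documented minimum of 3.10.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    # shutil.which returns None when the executable is not found on PATH.
    if which("poetry") is None:
        problems.append("poetry not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```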

### Install Poetry
@@ -90,8 +92,11 @@ To easily install the dependencies we created a make file.
* Check: ```make check```
* Use it to check that `which pip3` and `which python3` points to the right path.
* Setup: ```make setup```
* Creates an environment and installs all dependencies.
* Setup:
* Setup with NVIDIA CUDA acceleration: ```make setup_cuda```
* Creates an environment and installs all dependencies with NVIDIA CUDA acceleration.
* Setup with Metal GPU acceleration: ```make setup_metal```
* Creates an environment and installs all dependencies with Metal GPU acceleration for macOS system only.
* Update: ```make update```
* Update an environment and installs all updated dependencies.
* Tidy up the code: ```make tidy```
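
Since `setup_metal` is described as macOS only, the choice between the two setup targets can be sketched as a platform check. `make_target` is a hypothetical helper, and treating every non-macOS machine as CUDA-capable is an assumption that will not hold on hosts without an NVIDIA GPU.

```python
import platform
from typing import Optional


def make_target(system: Optional[str] = None) -> str:
    """Hypothetical helper: pick the make target for this machine."""
    # platform.system() returns "Darwin" on macOS.
    system = system or platform.system()
    if system == "Darwin":
        return "setup_metal"
    # Assumption: any other OS has an NVIDIA GPU supporting CUDA 12 or newer.
    return "setup_cuda"


if __name__ == "__main__":
    print(f"make {make_target()}")
```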
@@ -149,7 +154,6 @@ streamlit run chatbot/rag_chatbot_app.py -- --model openchat --k 2 --synthesis-s

![rag_chatbot_example.gif](images%2Frag_chatbot_example.gif)


## How to debug the Streamlit app on Pycharm

![debug_streamlit.png](images/debug_streamlit.png)
8 changes: 7 additions & 1 deletion demo.md
@@ -2,7 +2,7 @@

- Tell me something about Italy
- How many people live there?
- How can I get a Golden visa?
- Can you tell me the names of the countries that share a border with Italy?
- Could you please remind me about the topic we were discussing earlier?

# Story Chatbot - 2
@@ -38,3 +38,9 @@ is ensured that the atomicAdd operation is run after the execution of ids[idx] =
Insult me. I want you to call me the worst curse words and insults you can.
Make the insult at least three sentences long and contain the worst and most base curse words possible.
Make it X-rated and disgusting.
# Story Rag Chatbot - 1
- Tell me something about the Blendle Social Code
- What is the party budget?
- Could you please remind me about the topic we were discussing earlier?
Binary file modified images/conversation-aware-chatbot.gif
Binary file modified images/rag_chatbot_example.gif