01. Getting Started
To get started, make sure you have the following installed on your system:

- Python 3.x (preferably 3.11) with pip
  - Do NOT install Python from the Microsoft Store! This will cause issues with pip.
  - Alternatively, you can use miniconda if it's already present on your system.
> **Note:** You can install Miniconda3 on your system, which gives you the benefit of having both Python and conda!
> **Warning:** CUDA and ROCm aren't prerequisites because torch can install them for you. However, if this doesn't work (e.g. a "DLL load failed" error), install the CUDA toolkit or ROCm on your system.
> **Warning:** On Windows, you may sometimes see an error saying that VS Build Tools needs to be installed. This means a package doesn't ship a prebuilt wheel for your Python version. You can install VS Build Tools 17.8 and build the wheel locally. In addition, please open an issue stating which dependency is building a wheel.
1. Clone this repository to your machine:

   `git clone https://github.com/theroyallab/tabbyAPI`

2. Navigate to the project directory:

   `cd tabbyAPI`

3. Run the appropriate start script (`start.bat` for Windows and `start.sh` for Linux).
   - Follow the on-screen instructions and select the correct GPU library.
   - Assuming the prerequisites are installed and can be located, a virtual environment will be created for you and dependencies will be installed.
4. The API should start with no model loaded.
> **Note:** TabbyAPI has recently switched to using pyproject.toml. These instructions may look different than before.
1. Follow steps 1-2 in the For Beginners section.
2. Create a Python environment through venv:

   `python -m venv venv`

3. Activate the venv:
   - On Windows: `.\venv\Scripts\activate`
   - On Linux: `source venv/bin/activate`
4. Install the pyproject features based on your system:
   - CUDA 12.x: `pip install -U .[cu121]`
   - ROCm 5.6: `pip install -U .[amd]`
5. Start the API by either:
   - Running `start.bat`/`start.sh`. The script will check if you're in a conda environment and skip venv checks.
   - Running `python main.py`. This won't automatically upgrade your dependencies.
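The environment setup above can be sketched end-to-end on Linux. This is a minimal illustration only; it stops before the GPU-specific `pip install`, since that step depends on your hardware:

```shell
# Create and activate a virtual environment, then confirm it's active.
python3 -m venv venv
. venv/bin/activate

# The interpreter should now resolve inside the venv directory.
python -c 'import sys; print(sys.prefix)'

# From here you'd run the GPU-specific install, e.g. pip install -U .[cu121]
deactivate
```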
Loading just the bare API may not be your optimal use case. Therefore, a `config.yml` exists to tune initial launch parameters and other configuration options.

A `config.yml` file is only required for overriding project defaults. If you're okay with the defaults, you don't need a config file!

If you do want a config file, copy `config_sample.yml` to `config.yml`. All the fields are commented, so make sure to read the descriptions and comment out or remove fields that you don't need.
In addition, if you want to manually set the API keys, copy `api_keys_sample.yml` to `api_keys.yml` and fill in the fields. However, doing this is less secure; autogenerated keys should be used instead.
You can also access the configuration parameters under 2. Configuration in this wiki!
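As an illustration, a minimal override might look like the fragment below. The field names here are assumptions based on `config_sample.yml` at the time of writing; always check your local sample file, since options change between versions:

```yaml
# Illustrative override only — copy config_sample.yml and keep what you need.
network:
  host: 127.0.0.1   # bind address
  port: 5000        # API port

model:
  model_dir: models # folder to scan for models
```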
- Take a look at the usage docs
- Get started with community projects: find loaders, UIs, and more created by the wider AI community. Any OAI-compatible client is also supported.
There are a couple of ways to update TabbyAPI:

1. Update scripts - inside the `update_scripts` folder, you can run the following scripts:
   - `update_deps`: updates dependencies to their latest versions.
   - `update_deps_and_pull`: updates dependencies and pulls the latest commit of the GitHub repository.

   These scripts exit after running their respective tasks. To start TabbyAPI, run `start.bat` or `start.sh`.
2. Manual - install the pyproject features and update dependencies depending on your GPU:
   - `pip install -U .[cu121]` = CUDA 12.x
   - `pip install -U .[amd]` = ROCm 6.0

   If you don't want to update dependencies that come from wheels (torch, exllamav2, and flash attention 2), use `pip install .` or pass the `--nowheel` flag when invoking the start scripts.
> **Warning:** These instructions are meant for advanced users.
> **Important:** If you're installing a custom Exllamav2 wheel, make sure to use `pip install .` when updating! Otherwise, each update will overwrite your custom exllamav2 version.
**Note:**

- TabbyAPI enforces the latest Exllamav2 version for compatibility purposes.
- Any upgrade using a pyproject GPU lib feature will overwrite your installed wheel.
  - To fix this, change the feature in `pyproject.toml` locally, create an issue or PR, or reinstall your version of exllamav2 after each upgrade.

Here are ways to install exllamav2:
- From a wheel/release (recommended)
  - Find the version that corresponds to your CUDA and Python version. For example, a wheel with `cu121` and `cp311` corresponds to CUDA 12.1 and Python 3.11.
- From pip: `pip install exllamav2`
  - This is a JIT-compiled extension, which means the initial launch of TabbyAPI will take some time. The build may also fail due to improper environment configuration.
- From source
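To make the wheel naming convention concrete, here's a small sketch that picks apart a made-up wheel filename into its Python and CUDA tags (the filename is hypothetical; real ones are listed on the exllamav2 release page):

```python
# Split a wheel filename into its tags. The filename below is invented
# purely to illustrate the cu/cp naming scheme.
wheel = "exllamav2-0.0.11+cu121-cp311-cp311-linux_x86_64.whl"

stem = wheel.removesuffix(".whl")
_, version, py_tag, abi_tag, plat_tag = stem.split("-")

cuda_tag = version.split("+")[1]  # "cu121" -> built against CUDA 12.1
print(py_tag, cuda_tag)          # py_tag "cp311" -> CPython 3.11
```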
These are short-form instructions for other methods that users can use to install TabbyAPI.
> **Warning:** Using methods other than venv may not play nice with the startup scripts. Using these methods indicates that you're an advanced user and know what you're doing.
1. Install Miniconda3 with Python 3.11 as your base Python.
2. Create a new conda environment:

   `conda create -n tabbyAPI python=3.11`

3. Activate the conda environment:

   `conda activate tabbyAPI`

4. Install optional dependencies if they aren't present:
   - CUDA 12 via `conda install -c "nvidia/label/cuda-12.2.2" cuda`
   - Git via `conda install -k git`
5. Clone TabbyAPI via `git clone https://github.com/theroyallab/tabbyAPI`
6. Continue installation steps from:
   - For Beginners - Step 3. The start scripts detect if you're in a conda environment and skip the venv check.
   - For Advanced Users - Step 3
- Install Docker and docker compose from the [docs](https://docs.docker.com/compose/install/)
- Install the Nvidia container compatibility layer:
  - For Linux: Nvidia container toolkit
  - For Windows: CUDA Toolkit on WSL
- Clone TabbyAPI via `git clone https://github.com/theroyallab/tabbyAPI`
- Enter the tabbyAPI directory via `cd tabbyAPI`
- Optional: set up a config.yml or api_tokens.yml (configuration)
- Update the volume mount section in the `docker/docker-compose.yml` file:

  ```yaml
  volumes:
    # - /path/to/models:/app/models # Change me
    # - /path/to/config.yml:/app/config.yml # Change me
    # - /path/to/api_tokens.yml:/app/api_tokens.yml # Change me
  ```
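For instance, with models stored under `/home/user/models`, the uncommented mounts could look like the fragment below (the host paths are placeholders; substitute your own):

```yaml
volumes:
  - /home/user/models:/app/models
  - /home/user/tabby/config.yml:/app/config.yml
  - /home/user/tabby/api_tokens.yml:/app/api_tokens.yml
```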
- Optional: if you'd like to build the Docker image from source, follow the instructions in `docker/docker-compose.yml`:

  ```yaml
  # Uncomment this to build a docker image from source
  #build:
  #  context: ..
  #  dockerfile: ./docker/Dockerfile

  # Comment this to build a docker image from source
  image: ghcr.io/theroyallab/tabbyapi:latest
  ```
- Run `docker compose -f docker/docker-compose.yml up` to build (or pull) the image and start the server.