05. FAQ
Brian Dashore edited this page Sep 1, 2024 · 2 revisions
- What OS is supported?
  - Windows and Linux
- I'm confused, how do I do anything with this API?
  - That's okay! Not everyone is an AI mastermind on their first try. The start scripts and config.yml aim to guide new users to the right configuration. The Usage page explains how the API works. Community Projects contain UIs that help interact with TabbyAPI via API endpoints. The Discord Server is also a place to ask questions, but please be nice.
- How do I interface with the API?
- What does TabbyAPI run?
  - TabbyAPI uses Exllamav2 as a powerful and fast backend for model inference, loading, etc. Therefore, the following types of models are supported:
    - Exl2 (Highly recommended)
    - GPTQ
    - FP16 (using Exllamav2's loader)
- Exllamav2 may error with the following exception:
  ImportError: DLL load failed while importing exllamav2_ext: The specified module could not be found.
  - First, check that the installed wheel matches your Python version and CUDA version. Also make sure you're in a venv or conda environment.
  - If those prerequisites are correct, the torch cache may need to be cleared. This is due to a mismatching exllamav2_ext.
    - On Windows: Find the cache at C:\Users\<User>\AppData\Local\torch_extensions\torch_extensions\Cache, where <User> is your Windows username.
    - On Linux: Find the cache at ~/.cache/torch_extensions
    - Look for any folders named exllamav2_ext in the python subdirectories and delete them.
    - Restart TabbyAPI and launching should work again.
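
The cache cleanup steps above can be sketched as a small Python script. This is a minimal sketch, not part of TabbyAPI: the function names (`default_cache_root`, `in_virtual_env`, `find_stale_extensions`, `clear_stale_extensions`) are hypothetical helpers made up for illustration. It assumes the cache lives at the locations listed in this FAQ, and that the Windows `LOCALAPPDATA` variable points at `C:\Users\<User>\AppData\Local`. It defaults to a dry run that only prints what it would delete.

```python
import os
import shutil
import sys
from pathlib import Path


def default_cache_root() -> Path:
    """Return the torch extensions cache location named in this FAQ."""
    if os.name == "nt":
        # Windows: C:\Users\<User>\AppData\Local\torch_extensions\torch_extensions\Cache
        local_app_data = os.environ.get("LOCALAPPDATA")
        base = Path(local_app_data) if local_app_data else Path.home() / "AppData" / "Local"
        return base / "torch_extensions" / "torch_extensions" / "Cache"
    # Linux: ~/.cache/torch_extensions
    return Path.home() / ".cache" / "torch_extensions"


def in_virtual_env() -> bool:
    """True when running inside a venv or a conda environment."""
    return sys.prefix != sys.base_prefix or "CONDA_PREFIX" in os.environ


def find_stale_extensions(cache_root: Path, name: str = "exllamav2_ext") -> list[Path]:
    """List every directory called `name` anywhere under cache_root."""
    if not cache_root.is_dir():
        return []
    return sorted(p for p in cache_root.rglob(name) if p.is_dir())


def clear_stale_extensions(cache_root: Path, dry_run: bool = True) -> list[Path]:
    """Delete the matching folders, or only report them when dry_run is True."""
    found = find_stale_extensions(cache_root)
    if not dry_run:
        for path in found:
            # A parent match may already have removed a nested match.
            if path.exists():
                shutil.rmtree(path)
    return found


if __name__ == "__main__":
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version}, virtual env: {in_virtual_env()}")
    for path in clear_stale_extensions(default_cache_root(), dry_run=True):
        print(f"would delete: {path}")
```

Run it once as-is to review the list, then call `clear_stale_extensions(default_cache_root(), dry_run=False)` to actually remove the folders before restarting TabbyAPI.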