Seamlessly deploy LLMs locally on your Jetson Nano: run small models such as TinyLlama and interact with them through a web UI.
Run the following commands for a quickstart:

```bash
git clone https://github.com/n-vlahovic/edge-llm
cd edge-llm
make init
```

or just

```bash
make init
```

if you have already cloned the repository and changed into its directory.
The WebUI is available under `http://<jetson-ip>:<WEBUI_PORT>` and the LLM backend under `http://<jetson-ip>:<LLM_PORT>`.
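As a quick sanity check, you can query the backend's model-listing endpoint from any machine on the same network (the host and port below are example values; substitute your own):

```bash
# List the models known to the Ollama backend
curl http://jetson-nano.local:8000/api/tags
```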
You can now proceed to logging in.
- Requirements
- Quickstart
- Updating Docker
- Creating an ENV File
- Building the Services
- Pulling Models
- Logging into the Web-UI
## Requirements

- NVidia Jetson Nano
- RAM >= 4GB
- Storage >= 32GB (>= 64GB preferred)
- JetPack SDK
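If you are unsure whether your board meets these requirements, the following commands (run on the Jetson itself) give a quick overview; the release file path assumes a standard JetPack install:

```bash
free -h                      # total RAM
df -h /                      # available storage on the root filesystem
cat /etc/nv_tegra_release    # L4T / JetPack release installed on the board
```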
## Updating Docker

If you installed JetPack following the official getting-started docs, your `docker` and `docker-compose` versions may be somewhat dated.

Running `make docker-update` will update `docker` and `docker-compose` by invoking the `./scripts/update_docker.py` script, which performs the following steps (a manual equivalent is sketched after the list):

- Checks the installed `docker` version
- Removes the existing `docker` and `docker-compose` installations
- Adds Docker's official GPG key
- Adds the repository to Apt sources
- Installs the Docker packages
- Enables the `docker` service
- Adds the current `$USER` to the `docker` user group (a reboot will be necessary)
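For reference, the script roughly automates the standard installation steps from Docker's Ubuntu documentation; a manual equivalent would look something like the following (this is a sketch based on those docs, not a copy of the script):

```bash
# Add Docker's official GPG key and Apt repository
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install the Docker packages, enable the service and add $USER to the docker group
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl enable docker
sudo usermod -aG docker "$USER"   # takes effect after a reboot / re-login
```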
## Creating an ENV File

The `docker-compose.yml` file abstracts certain parameters by leveraging `.env` files. An `.env` file with the following variables needs to be created:

- `LLM_PORT=<int>`: The exposed port (on the Jetson Nano, not the container) of the LLM backend.
- `WEBUI_PORT=<int>`: The exposed port (on the Jetson Nano, not the container) of the LLM WebUI.
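For reference, the resulting `.env` file could look like this (the port values are only examples):

```
LLM_PORT=8000
WEBUI_PORT=3000
```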
Here is a possible way to create the `.env` file:

```bash
python3 scripts/check_dotenv.py
```

or

```bash
echo 'LLM_PORT=8000' >> .env && echo 'WEBUI_PORT=3000' >> .env
```
## Building the Services

To build the services, simply run `make build`, which runs `docker compose up --build -d`.

To stop or kill the services, you can run `make <stop|kill>`, which runs `docker compose <stop|down -v>` respectively.
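To check that the containers came up correctly, the usual Docker Compose commands apply (the service names shown depend on `docker-compose.yml`; `ollama` is the container name used elsewhere in this README):

```bash
docker compose ps            # both services should be listed as running
docker compose logs -f       # follow the logs of all services
docker logs -f ollama        # or follow only the LLM backend
```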
## Pulling Models

Our setup uses Ollama as the backend; the models available out of the box are listed in the [Ollama model library](https://ollama.com/library).

Models can be pulled as such:

```bash
python3 scripts/pull_model.py -m <model>
```

or alternatively

```bash
docker exec -it ollama ollama pull <model>
```

Given the compute and memory limitations of our platform, we should aim to deploy small models, e.g. `tinyllama`.
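For example, to pull `tinyllama` and send it a test prompt through Ollama's HTTP API (run on the Jetson itself, assuming `LLM_PORT=8000`):

```bash
python3 scripts/pull_model.py -m tinyllama

# Quick smoke test against the backend's generate endpoint
curl http://localhost:8000/api/generate -d '{
  "model": "tinyllama",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```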
## Logging into the Web-UI

Using any device connected to the same network as your Jetson Nano, navigate to the URL `http://<HOST>:<WEBUI_PORT>` (the `HOST` can typically be the hostname of your Jetson Nano, e.g. `jetson-nano.local`, or its IP address, which can be determined via `ip a`).
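If you are unsure which host to use, the following commands run on the Jetson Nano itself will tell you (the interface names are typical defaults and may differ on your setup):

```bash
hostname            # e.g. jetson-nano, typically reachable as jetson-nano.local via mDNS
ip -4 addr show     # look for the "inet" entry of eth0 (Ethernet) or wlan0 (Wi-Fi)
```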
You should be prompted with a login screen.

If you didn't create an account yet, create one via the Sign up link. Please note that this creates a local account. For persistence, the data is saved via Docker volumes to the filesystem in the `./ollama-webui` directory (which is excluded from `git`).
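Since all account and chat data lives in that directory, a simple way to back it up (or move it to another board) is to archive it while the services are stopped:

```bash
make stop
tar czf ollama-webui-backup.tar.gz ./ollama-webui
```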