Deploy Ollama with a model bundled in the Docker container on Koyeb
Learn more about Koyeb · Explore the documentation · Discover our tutorials
Koyeb is a developer-friendly serverless platform for deploying apps globally. No ops, no servers, no infrastructure management.
This repository shows how to deploy an Ollama instance, with a model bundled in the Docker container, to Koyeb. The Dockerfile can be configured through environment variables to make deployment and configuration more straightforward. By default, the image deploys Ollama with the `gemma2:2b` model, but this is configurable using the `MODEL_NAME` environment variable.
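The repository's actual Dockerfile is not reproduced here, but a minimal sketch of the bundling pattern might look like the following. The `ollama/ollama` base image and the build steps are assumptions for illustration, not the repo's exact contents:

```dockerfile
# Sketch: bake a model into the image at build time (assumed layout).
FROM ollama/ollama:latest

# Model to bundle; overridable at build time and read at runtime.
ARG MODEL_NAME=gemma2:2b
ENV MODEL_NAME=${MODEL_NAME}

# Start the server briefly during the build so `ollama pull`
# can download the model weights into an image layer.
RUN ollama serve & sleep 5 && ollama pull ${MODEL_NAME}
```

Bundling the weights at build time means the service does not need to download the model on every cold start.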
Follow the steps below to deploy an Ollama instance to your Koyeb account.
To use this repository, you need:
- A Koyeb account to build the Dockerfile and deploy it to the platform. If you don't already have an account, you can sign up for free.
The fastest way to deploy an Ollama instance is to click the Deploy to Koyeb button below.
Clicking on this button brings you to the Koyeb App creation page with most of the settings pre-configured to launch this application. You will need to configure the following environment variables:
- `MODEL_NAME`: Set this to the name of the model you wish to use, as given on the Ollama site. You can check which models Ollama supports to find out more. Click the copy icon next to the model name on its Ollama library page to copy the appropriate value. If not provided, the `gemma2:2b` model will be deployed.
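Ollama model names follow a `model:tag` convention, so the variable typically looks like one of the following (illustrative values; check the Ollama library for current model names):

```
MODEL_NAME=gemma2:2b
MODEL_NAME=llama3.2:1b
```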
To modify this application example, you will need to fork this repository. Check out the fork and deploy instructions.
If you want to customize and enhance this application, you need to fork this repository.
If you used the Deploy to Koyeb button, you can simply link your service to your forked repository to be able to push changes. Alternatively, you can manually create the application as described below.
On the Koyeb Control Panel, on the Overview tab, click the Create Web Service button to begin.
- Select GitHub as the deployment method.
- Choose the repository containing your application code.
- Expand the Environment variables section and click Bulk edit to configure new environment variables. Paste the following variable definitions in the box:

  ```
  MODEL_NAME=
  ```

  Fill out the values as described in the previous section.
- In the Instance section, select the GPU category and choose RTX-4000-SFF-ADA.
- Click Deploy.
The repository will be pulled, built, and deployed on Koyeb. Once the deployment is complete, it will be accessible using the Koyeb subdomain for your service.
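Once the service is live, you can talk to Ollama's standard REST API over the service's public URL. The sketch below builds a request to the `/api/generate` endpoint using only the Python standard library; `<your-app>.koyeb.app` is a placeholder for your actual Koyeb subdomain:

```python
import json
import urllib.request

# Placeholder: replace <your-app> with your Koyeb service subdomain.
OLLAMA_URL = "https://<your-app>.koyeb.app"

def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        OLLAMA_URL + "/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_generate_request("gemma2:2b", "Why is the sky blue?")
print(req.full_url)  # → https://<your-app>.koyeb.app/api/generate

# Once the service is deployed, uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

Setting `"stream": False` asks Ollama to return the whole completion in a single JSON object rather than a stream of chunks.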
If you have any questions, ideas, or suggestions regarding this application sample, feel free to open an issue or fork this repository and open a pull request.