Ollama & OpenWebUI Docker Setup
Ollama with Nvidia GPU
Ollama makes it easy to get up and running with large language models locally. To run Ollama using an Nvidia GPU, follow these steps:
Step 1: Install the NVIDIA Container Toolkit
Install with Apt
- Configure the repository:

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update

- Install the NVIDIA Container Toolkit packages:
sudo apt-get install -y nvidia-container-toolkit
Install with Yum or Dnf
- Configure the repository:

curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
  | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo

- Install the NVIDIA Container Toolkit packages:
sudo yum install -y nvidia-container-toolkit
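With the NVIDIA driver already installed on the host, nvidia-smi should list your GPUs; it is a quick sanity check before configuring Docker:

nvidia-smi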
Step 2: Configure Docker to Use the Nvidia Driver
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
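To confirm Docker can now reach the GPUs, you can run a throwaway container; the toolkit injects the driver utilities, so the same GPU table should appear inside it:

docker run --rm --gpus all ubuntu nvidia-smi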
Step 3: Start the Container
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --restart always --name ollama ollama/ollama
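If you want to verify the server is reachable before pulling any models, the Ollama HTTP API answers on the mapped port (11434 here):

curl http://localhost:11434/api/version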
Running Multiple Instances with Specific GPUs
You can run multiple instances of the Ollama server and assign specific GPUs to each instance. On my server, I have four Nvidia 3090 GPUs, which I use as described below:
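The commands below attach both containers to a user-defined Docker network named ollama-network; if you have not created it yet, do so first:

docker network create ollama-network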
Ollama Server for GPUs 0 and 1
docker run -d --gpus '"device=0,1"' -v ollama:/root/.ollama -p 11435:11434 --restart always --name ollama1 --network ollama-network ollama/ollama
Ollama Server for GPUs 2 and 3
docker run -d --gpus '"device=2,3"' -v ollama:/root/.ollama -p 11436:11434 --restart always --name ollama2 --network ollama-network ollama/ollama
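To check that each instance only sees its assigned GPUs, you can run nvidia-smi inside each container; each should report two devices:

docker exec -it ollama1 nvidia-smi
docker exec -it ollama2 nvidia-smi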
Running Models Locally
Once the container is up and running, you can run models inside it with:
docker exec -it ollama ollama run llama3.1
docker exec -it ollama ollama run llama3.1:70b
docker exec -it ollama ollama run qwen2.5-coder:1.5b
docker exec -it ollama ollama run deepseek-v2
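Besides the interactive CLI, the same models can be called through Ollama's REST API on port 11434; this sketch assumes llama3.1 has already been pulled by one of the commands above:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?"
}'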
Try Different Models
Explore more models available in the Ollama library at https://ollama.com/library.
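You can also download a model ahead of time and list what is already stored in the ollama volume; the model name below is just an example:

docker exec -it ollama ollama pull mistral
docker exec -it ollama ollama list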
OpenWebUI Installation
To install and run OpenWebUI, use the following command:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
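Once the container is running, the UI is served at http://localhost:3000 and, with the host-gateway mapping above, it should find the Ollama server on port 11434. If you instead want it to talk to one of the GPU-pinned instances from earlier, you can point it there with the OLLAMA_BASE_URL environment variable (a sketch, assuming the port mapping used above):

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11435 \
  -v open-webui:/app/backend/data --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main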