This guide explains how to run the RagCode infrastructure (Qdrant + Ollama) using Docker, while leveraging your existing local Ollama models.
- Isolation: Keeps your system clean.
- Consistency: Ensures you run the exact version required.
- Integration: Easy to orchestrate with Qdrant via `docker-compose`.
We have configured `docker-compose.yml` to map your local Ollama models (`~/.ollama`) into the container. This means:
- You don't need to re-download models.
- Models downloaded inside Docker appear on your host.
- You avoid duplicating model files, saving significant disk space.
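The model sharing described above comes down to a single volume mapping. A sketch of the relevant parts of `docker-compose.yml` (service names, image tags, and the Qdrant port are illustrative; your file may differ):

```yaml
services:
  ollama:
    image: ollama/ollama
    container_name: ragcode-ollama
    ports:
      - "11434:11434"
    volumes:
      # Share the host model store with the container, both directions.
      - ~/.ollama:/root/.ollama
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
```

Because `/root/.ollama` is where the official image stores models, both the host and the container read and write the same files.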
Prerequisites:
- Docker & Docker Compose installed.
- For GPU support (recommended): NVIDIA Container Toolkit installed.
- Existing models in `~/.ollama` (optional, but recommended).
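A quick pre-flight check for the prerequisites above. The command names are the usual defaults; adjust for your setup (for example, Docker Compose v2 is invoked as `docker compose` and has no separate `docker-compose` binary):

```shell
#!/usr/bin/env sh
# Report whether each required tool is on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok:      $1"
  else
    echo "MISSING: $1"
  fi
}

check_cmd docker
check_cmd docker-compose
check_cmd nvidia-smi   # only needed for GPU support
```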
- Start the stack:

  ```bash
  docker-compose up -d
  ```
- Verify Ollama is running:

  ```bash
  docker logs ragcode-ollama
  ```
- Check available models (inside the container):

  ```bash
  docker exec -it ragcode-ollama ollama list
  ```

  You should see all your locally downloaded models here!
- Pull a new model (if needed):

  ```bash
  docker exec -it ragcode-ollama ollama pull phi3:medium
  ```
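If you run `ollama` commands in the container often, a small wrapper function saves typing. It assumes the container name `ragcode-ollama` from the steps above; add it to your shell profile if you find it useful:

```shell
# Convenience wrapper: `oll list`, `oll pull phi3:medium`, etc.
# Forwards all arguments to the ollama CLI inside the container.
oll() {
  docker exec -it ragcode-ollama ollama "$@"
}
```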
"Error: could not connect to ollama"
- Ensure port `11434` is not already in use by a local Ollama instance.
- Stop your local Ollama before starting the container: `systemctl stop ollama` or `pkill ollama`.
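To check for the port conflict before starting the container, a small bash snippet works without extra tools. It uses bash's `/dev/tcp` feature (bash-specific; the port number assumes the default mapping):

```shell
#!/usr/bin/env bash
# Return 0 if something is listening on the given local port, 1 otherwise.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null && { exec 3>&-; return 0; }
  return 1
}

if port_in_use 11434; then
  echo "Port 11434 is taken - stop the local Ollama first."
else
  echo "Port 11434 is free - safe to start the container."
fi
```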
GPU not working
- If you don't have an NVIDIA GPU or the NVIDIA Container Toolkit, remove the `deploy` section from `docker-compose.yml` to run in CPU-only mode (slower).
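For reference, the block to delete sits inside the Ollama service definition and typically uses the standard Compose GPU reservation syntax (your file may differ slightly):

```yaml
# Remove this whole block for CPU-only mode.
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: all
          capabilities: [gpu]
```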