Commit 99ab849

Himmat Rai committed
GPU pass through container ollama service added
1 parent c334161 commit 99ab849

2 files changed

Lines changed: 15 additions & 4 deletions

agent/app/agentic_loop.py

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 
 load_dotenv()
 
-OLLAMA_HOST = "http://localhost:11434"
+OLLAMA_HOST = "http://ollama:11434"
 OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3.2:3b")
 
 

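The hunk above replaces one hard-coded host with another, so the script now resolves only where the `ollama` service name is reachable, which in this repository means the compose network. A minimal sketch of one alternative, assuming a hypothetical `OLLAMA_BASE_URL` environment variable that is not part of this commit, reads the URL from the environment and falls back to the committed value. A separate name is used here because the compose file already sets `OLLAMA_HOST=0.0.0.0` as the server bind address for the ollama service.

```python
import os

# Default matches the value this commit hard-codes: the compose service name.
DEFAULT_OLLAMA_URL = "http://ollama:11434"


def resolve_ollama_host(env=None):
    """Return the Ollama base URL, preferring an override from the environment.

    OLLAMA_BASE_URL is a hypothetical variable name chosen for this sketch;
    it is not defined anywhere in the commit.
    """
    env = os.environ if env is None else env
    # Treat an empty value like an unset one, then drop any trailing slash.
    return (env.get("OLLAMA_BASE_URL") or DEFAULT_OLLAMA_URL).rstrip("/")


OLLAMA_HOST = resolve_ollama_host()
```

With this in place, running the agent outside Docker would only need `OLLAMA_BASE_URL=http://localhost:11434` in the environment or in `.env`, and the container default stays unchanged.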
docker-compose.yaml

Lines changed: 14 additions & 3 deletions
@@ -4,8 +4,6 @@ services:
     container_name: ollama
     ports:
       - 11434:11434
-    # expose:
-    #   - 11434
     volumes:
       - ollama_data:/root/.ollama
       - ./scripts/start_ollama.sh:/start_ollama.sh:ro
@@ -20,8 +18,21 @@ services:
       - .env
     environment:
       - OLLAMA_HOST=0.0.0.0
+      # - CUDA_VISIBLE_DEVICES=0 # Prioritizes GPU 0 exclusively for container
+      - OLLAMA_CONTEXT_LENGTH=2048 # Safe for 4GB
+      # - OLLAMA_FLASH_ATTENTION=false # Avoids allocation crashes
+      - OLLAMA_NUM_PARALLEL=1 # Single model load
+      - OLLAMA_MAX_LOADED_MODELS=1
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              device_ids: ["0"] # Lock to GPU 0
+              capabilities: [gpu]
     networks:
       - local_code_network
+
     entrypoint: ["/start_ollama.sh"]
 
   agent:
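
The "Safe for 4GB" comment on `OLLAMA_CONTEXT_LENGTH=2048` can be sanity-checked with a rough KV-cache estimate. The sketch below is back-of-envelope arithmetic, not a measurement. The architecture numbers (28 layers, 8 key/value heads, head dimension 128) are assumptions about the default `llama3.2:3b` model, and a 16-bit cache is assumed. Model weights and compute buffers are not included.

```python
def kv_cache_bytes(context_length, num_parallel=1, n_layers=28,
                   n_kv_heads=8, head_dim=128, bytes_per_value=2):
    """Estimate KV-cache size in bytes.

    The layer/head defaults are assumed values for llama3.2:3b and
    bytes_per_value=2 assumes a 16-bit cache; adjust for other models.
    """
    # One key vector and one value vector are stored per layer per token.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    # Each parallel slot gets its own context window.
    return per_token * context_length * num_parallel


MIB = 1024 ** 2
estimate_mib = kv_cache_bytes(context_length=2048, num_parallel=1) / MIB
print(f"Estimated KV cache: {estimate_mib:.0f} MiB")
```

Under these assumptions the estimate scales linearly with both `OLLAMA_CONTEXT_LENGTH` and `OLLAMA_NUM_PARALLEL`, which is consistent with the commit pinning both values on a small GPU.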
@@ -58,4 +69,4 @@ volumes:
 
 networks:
   local_code_network:
-    driver: bridge
+    driver: bridge
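
One way to confirm that the `deploy.resources.reservations.devices` block actually gives the container GPU access is to ask Ollama where a loaded model resides. The sketch below is a minimal check, assuming port 11434 is published to the host as in the compose file and that a model has already been loaded. It reads the `size` and `size_vram` fields from Ollama's `GET /api/ps` response. The helper names are illustrative and do not come from the commit.

```python
import json
import urllib.request


def vram_fraction(model):
    """Fraction of a loaded model's bytes that reside in GPU memory."""
    size = model.get("size", 0)
    return model.get("size_vram", 0) / size if size else 0.0


def running_models(base_url="http://localhost:11434", timeout=5):
    """Fetch the list of currently loaded models from GET /api/ps."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=timeout) as resp:
        return json.load(resp).get("models", [])


def report(models):
    """Format one summary line per loaded model."""
    return [f"{m.get('name', '?')}: {vram_fraction(m):.0%} in VRAM" for m in models]


# Usage from the host, after a model has been loaded:
#   for line in report(running_models()):
#       print(line)
```

A value near 0% would suggest the model fell back to CPU, in which case the NVIDIA Container Toolkit installation and the `device_ids` setting are the first things to check.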