Commit 1bada62

Add RAG + Ollama quick reference guide
Introduces docs/RAG_QUICK_REFERENCE.md with setup instructions, architecture overview, troubleshooting, deployment options, and migration notes for the RAG + Ollama AI system. This guide provides concise steps for local/private AI deployment using Xenova embeddings, Supabase pgvector, and Ollama models.
1 parent 313eb28 commit 1bada62

1 file changed

Lines changed: 273 additions & 0 deletions

docs/RAG_QUICK_REFERENCE.md

@@ -0,0 +1,273 @@
# 🚀 RAG + Ollama Quick Reference

## ✅ What We Use Now

**Single AI Architecture: RAG + Ollama**

- **Embeddings**: Xenova/all-MiniLM-L6-v2 (384 dimensions)
- **Vector DB**: Supabase pgvector
- **AI Model**: Ollama (deepseek-r1:7b or compatible)
- **Cost**: $0 for AI (infrastructure costs only)
- **Privacy**: 100% local/private

---

## 🛠️ Quick Setup

### 1. Install Ollama

**Windows:**
```powershell
# Download from https://ollama.com
# Run installer
```

**macOS:**
```bash
brew install ollama
```

**Linux:**
```bash
curl -fsSL https://ollama.com/install.sh | sh
```

### 2. Pull Model

```bash
ollama pull deepseek-r1:7b
```

### 3. Verify

```bash
ollama list
# Should show: deepseek-r1:7b
```

### 4. Run Dev Server

```powershell
pnpm dev
# or
.\start-dev.ps1
```

---

## 🧠 How RAG Works

```
User: "How do I add a collaborator?"

1. Convert question to 384D vector

2. Search knowledge_base table (pgvector)

3. Find top 3 similar docs (similarity > 0.7)

4. Build context: [relevant docs] + user question

5. Send to Ollama (deepseek-r1:7b)

6. AI responds with context-aware answer
```

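
Steps 2 to 4 above can be sketched in a few lines of TypeScript. This is a minimal in-memory illustration, not the code in `lib/services/rag-service.ts`: in the real system the similarity search runs inside Postgres via pgvector. The names `KnowledgeDoc`, `cosineSimilarity`, `retrieve`, and `buildPrompt` are hypothetical, and the prompt wording is an assumption; only the top-3 limit and the 0.7 threshold come from the flow above.

```typescript
// Hypothetical document shape; mirrors a subset of the knowledge_base columns.
interface KnowledgeDoc {
  title: string;
  content: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Steps 2-3: keep the k most similar docs whose similarity exceeds the threshold.
function retrieve(
  query: number[],
  docs: KnowledgeDoc[],
  k = 3,
  threshold = 0.7
): KnowledgeDoc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
    .filter((hit) => hit.score > threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((hit) => hit.doc);
}

// Step 4: prepend the retrieved docs to the user's question.
// With no matching docs, the question is sent unchanged (plain Ollama mode).
function buildPrompt(question: string, context: KnowledgeDoc[]): string {
  if (context.length === 0) return question;
  const block = context
    .map((doc) => `### ${doc.title}\n${doc.content}`)
    .join("\n\n");
  return `Use the following documentation to answer.\n\n${block}\n\nQuestion: ${question}`;
}
```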
---

## 📊 Status Indicators

| Indicator | Meaning |
|-----------|---------|
| 🟣 Purple pulse + "RAG-Enhanced" | RAG context found & used |
| 🟢 Green pulse + "Ollama" | Ollama running without RAG |
| Error message | Ollama not running |

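
The table above amounts to a small decision rule. The sketch below shows one way it could be expressed; `ChatStatus` and `chatStatus` are hypothetical names, and the actual chat UI in `app/dashboard/ai-tools/page.tsx` may derive its state differently.

```typescript
// Hypothetical status type covering the three rows of the table.
type ChatStatus =
  | { kind: "rag"; label: "RAG-Enhanced"; color: "purple" }
  | { kind: "ollama"; label: "Ollama"; color: "green" }
  | { kind: "error"; label: string };

// Maps "is Ollama reachable?" and "how many context docs were used?" to an indicator.
function chatStatus(ollamaReachable: boolean, contextDocsUsed: number): ChatStatus {
  if (!ollamaReachable) return { kind: "error", label: "Ollama not running" };
  if (contextDocsUsed > 0) return { kind: "rag", label: "RAG-Enhanced", color: "purple" };
  return { kind: "ollama", label: "Ollama", color: "green" };
}
```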
---

## 🔧 Environment Variables

```env
# Required for AI
OLLAMA_URL=http://localhost:11434
OLLAMA_MODEL=deepseek-r1:7b

# Required for RAG
NEXT_PUBLIC_SUPABASE_URL=your-url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-key
```

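
Since all four variables are marked as required, it can help to fail fast at startup when one is missing. The following is a sketch only: `loadAiConfig` is a hypothetical helper, not something this commit adds. In a Node or Next.js server context it could be called as `loadAiConfig(process.env)`.

```typescript
// The four variables listed above.
const REQUIRED = [
  "OLLAMA_URL",
  "OLLAMA_MODEL",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
] as const;

type RequiredKey = (typeof REQUIRED)[number];

// Returns the required values, or throws one error naming every missing variable.
function loadAiConfig(
  env: Record<string, string | undefined>
): Record<RequiredKey, string> {
  const missing = REQUIRED.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as Record<RequiredKey, string>;
  for (const key of REQUIRED) {
    config[key] = env[key] as string;
  }
  return config;
}
```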
---

## 🗄️ Database Setup

```sql
-- Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;

-- Create knowledge base table
CREATE TABLE knowledge_base (
  -- gen_random_uuid() is built into PostgreSQL 13+;
  -- uuid_generate_v4() would additionally require the uuid-ossp extension
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content TEXT NOT NULL,
  embedding vector(384), -- must match the 384-dimension embedding model
  metadata JSONB,
  category TEXT,
  source TEXT,
  title TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Add vector similarity index (cosine distance).
-- ivfflat derives its lists from existing rows, so recall is better
-- when the index is created after the documents have been indexed.
CREATE INDEX ON knowledge_base
  USING ivfflat (embedding vector_cosine_ops);
```

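
Because the `embedding` column is declared as `vector(384)`, rows with any other dimension are rejected by Postgres. A small guard on the application side can catch this earlier. The sketch below is illustrative: `toPgVector` is a hypothetical helper that validates an embedding and serializes it to pgvector's text form (`[x1,x2,...]`).

```typescript
// Dimension of Xenova/all-MiniLM-L6-v2 embeddings and of the embedding column.
const EMBEDDING_DIMENSIONS = 384;

// Validates an embedding and serializes it to pgvector's text input format.
function toPgVector(embedding: number[]): string {
  if (embedding.length !== EMBEDDING_DIMENSIONS) {
    throw new Error(
      `expected ${EMBEDDING_DIMENSIONS} dimensions, got ${embedding.length}`
    );
  }
  if (!embedding.every((x) => Number.isFinite(x))) {
    throw new Error("embedding contains a non-finite value");
  }
  return `[${embedding.join(",")}]`;
}
```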
---

## 📚 Index Documentation

```bash
# Index all docs for RAG
node scripts/index-knowledge.js

# This creates embeddings for:
# - README.md
# - docs/*.md
# - Code comments
# - Feature descriptions
```

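
Before a Markdown file can be embedded, it is usually split into smaller chunks. How `scripts/index-knowledge.js` does this is not shown here, so the following is only a sketch of one common approach, splitting at headings. `Chunk` and `chunkMarkdown` are hypothetical names. Lines inside fenced code blocks are not treated as headings, which matters for shell comments that start with `#`.

```typescript
// Hypothetical chunk shape; fields mirror the title/content/source columns.
interface Chunk {
  title: string;
  content: string;
  source: string;
}

// Built programmatically to avoid a literal fence marker inside this example.
const FENCE = "`".repeat(3);

// Splits Markdown into one chunk per heading, ignoring "#" lines inside code fences.
function chunkMarkdown(markdown: string, source: string): Chunk[] {
  const chunks: Chunk[] = [];
  let title = source;
  let lines: string[] = [];
  let inFence = false;

  const flush = (): void => {
    const content = lines.join("\n").trim();
    if (content.length > 0) chunks.push({ title, content, source });
    lines = [];
  };

  for (const line of markdown.split("\n")) {
    if (line.trimStart().startsWith(FENCE)) inFence = !inFence;
    const heading = inFence ? null : /^#{1,6}\s+(.*)$/.exec(line);
    if (heading) {
      flush();
      title = (heading[1] ?? "").trim();
    } else {
      lines.push(line);
    }
  }
  flush();
  return chunks;
}
```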
---

## 🐛 Troubleshooting

### "AI not responding"

```bash
# 1. Check if Ollama is running (this command fails if the server is down)
ollama list

# 2. Make sure the model is downloaded
ollama pull deepseek-r1:7b

# 3. Test Ollama directly
ollama run deepseek-r1:7b
# Type: "Hello"
# Should respond
# Type: /bye to exit

# 4. Check environment variables
# (prints nothing if the variable is only set in your env file and not exported in this shell)
echo $OLLAMA_URL  # Should be http://localhost:11434
```

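
The first two checks above can also be done programmatically through Ollama's `GET /api/tags` endpoint, which lists the locally available models. The sketch below is an assumption about how such a check could look, not code from this commit: `hasModel`, `checkOllama`, and the injected `getJson` function are hypothetical. Injecting the fetcher keeps the sketch free of network calls; in practice it could wrap `fetch(url).then((r) => r.json())`.

```typescript
// Only the field of the /api/tags response that this sketch uses.
interface TagsResponse {
  models: { name: string }[];
}

// Ollama lists untagged models with a ":latest" suffix, so normalize before comparing.
function hasModel(tags: TagsResponse, model: string): boolean {
  const wanted = model.includes(":") ? model : `${model}:latest`;
  return tags.models.some((m) => m.name === wanted);
}

// Minimal fetch-like signature (hypothetical) so the check can run without a network.
type GetJson = (url: string) => Promise<unknown>;

async function checkOllama(
  baseUrl: string,
  model: string,
  getJson: GetJson
): Promise<"ok" | "model-missing" | "unreachable"> {
  try {
    const url = `${baseUrl.replace(/\/+$/, "")}/api/tags`;
    const tags = (await getJson(url)) as TagsResponse;
    return hasModel(tags, model) ? "ok" : "model-missing";
  } catch {
    // Any failure (connection refused, bad JSON) is reported as unreachable.
    return "unreachable";
  }
}
```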
### "RAG not working"

1. **Check Supabase connection**
   - Verify `NEXT_PUBLIC_SUPABASE_URL`
   - Verify `NEXT_PUBLIC_SUPABASE_ANON_KEY`

2. **Check knowledge_base table**
   ```sql
   SELECT COUNT(*) FROM knowledge_base;
   -- Should have documents indexed
   ```

3. **Re-index documentation**
   ```bash
   node scripts/index-knowledge.js
   ```

### "Slow responses"

- Use a smaller model: `ollama pull llama3.2` (3B instead of 7B), then set `OLLAMA_MODEL=llama3.2`
- Close other apps to free RAM
- Consider a GPU for faster inference

---

## 📈 Production Deployment

### Option A: Vercel + VPS

1. **Deploy Next.js to Vercel**
   ```bash
   vercel --prod
   ```

2. **Run Ollama on VPS** (DigitalOcean, AWS, etc.)
   ```bash
   # On VPS
   curl -fsSL https://ollama.com/install.sh | sh
   ollama pull deepseek-r1:7b

   # Set up reverse proxy (nginx)
   # Configure SSL
   # Ollama has no built-in authentication, so restrict access at the proxy
   ```

3. **Update Vercel env vars**
   ```env
   OLLAMA_URL=https://ai.yourdomain.com
   ```

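
With `OLLAMA_URL` pointing at the VPS, the server-side code can reach the model through Ollama's `POST /api/chat` endpoint. The sketch below only builds the request; `buildChatRequest` is a hypothetical helper, and how `app/api/chat/route.ts` actually issues the call is not shown in this guide. With `stream: false`, Ollama returns a single JSON object whose `message.content` field holds the answer.

```typescript
// Message shape accepted by Ollama's /api/chat endpoint.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical container for the pieces a fetch-style call would need.
interface OllamaChatRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// Builds a non-streaming chat request; tolerates a trailing slash in the base URL.
function buildChatRequest(
  ollamaUrl: string,
  model: string,
  messages: ChatMessage[]
): OllamaChatRequest {
  return {
    url: `${ollamaUrl.replace(/\/+$/, "")}/api/chat`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}
```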
### Option B: All Local (Development)

```env
OLLAMA_URL=http://localhost:11434
OLLAMA_MODEL=deepseek-r1:7b
```

---

## 📁 Key Files

| File | Purpose |
|------|---------|
| `app/api/chat/route.ts` | RAG-enhanced chat API |
| `lib/services/rag-service.ts` | RAG embedding & retrieval |
| `app/dashboard/ai-tools/page.tsx` | Chat UI |
| `scripts/index-knowledge.js` | Knowledge indexer |
| `docs/RAG_SYSTEM.md` | Full RAG documentation |
| `docs/OLLAMA_SETUP.md` | Ollama setup guide |

---

## ✅ Benefits Summary

| Aspect | Benefit |
|--------|---------|
| **Privacy** | Local inference, no third-party AI API calls |
| **Cost** | $0 for AI (only infrastructure) |
| **Accuracy** | Context from YOUR docs, not generic |
| **Offline** | Ollama inference works without internet (RAG retrieval needs a reachable Supabase instance) |
| **Unlimited** | No rate limits or quotas |
| **Fast** | Vector search typically <100ms |

---

## 🎯 What Changed (Migration)

### ❌ Removed
- DeepSeek API integration
- Google Gemini API integration
- Multi-provider fallback logic
- Cloud API environment variables

### ✅ Added
- RAG service with Xenova embeddings
- Supabase pgvector integration
- Knowledge base indexer
- Context-aware AI responses

---

## 📚 Learn More

- [RAG System Documentation](./RAG_SYSTEM.md)
- [Ollama Setup Guide](./OLLAMA_SETUP.md)
- [Production Deployment](./PRODUCTION_DEPLOYMENT_OLLAMA.md)
- [Migration Summary](./RAG_MIGRATION_SUMMARY.md)

---

**Last Updated:** January 17, 2026
**Architecture:** RAG + Ollama (100% Private)
