| 1 | +# 🚀 RAG + Ollama Quick Reference |
| 2 | + |
| 3 | +## ✅ What We Use Now |
| 4 | + |
| 5 | +**Single AI Architecture: RAG + Ollama** |
| 6 | + |
| 7 | +- **Embeddings**: Xenova/all-MiniLM-L6-v2 (384 dimensions) |
| 8 | +- **Vector DB**: Supabase pgvector |
| 9 | +- **AI Model**: Ollama (deepseek-r1:7b or compatible) |
| 10 | +- **Cost**: $0 for AI (infrastructure costs only) |
| 11 | +- **Privacy**: Inference runs locally; prompts are never sent to a third-party AI API |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +## 🛠️ Quick Setup |
| 16 | + |
| 17 | +### 1. Install Ollama |
| 18 | + |
| 19 | +**Windows:** |
| 20 | +```powershell |
| 21 | +# Download from https://ollama.com |
| 22 | +# Run installer |
| 23 | +``` |
| 24 | + |
| 25 | +**macOS:** |
| 26 | +```bash |
| 27 | +brew install ollama |
| 28 | +``` |
| 29 | + |
| 30 | +**Linux:** |
| 31 | +```bash |
| 32 | +curl -fsSL https://ollama.com/install.sh | sh |
| 33 | +``` |
| 34 | + |
| 35 | +### 2. Pull Model |
| 36 | + |
| 37 | +```bash |
| 38 | +ollama pull deepseek-r1:7b |
| 39 | +``` |
| 40 | + |
| 41 | +### 3. Verify |
| 42 | + |
| 43 | +```bash |
| 44 | +ollama list |
| 45 | +# Should show: deepseek-r1:7b |
| 46 | +``` |
| 47 | + |
| 48 | +### 4. Run Dev Server |
| 49 | + |
| 50 | +```powershell |
| 51 | +pnpm dev |
| 52 | +# or |
| 53 | +.\start-dev.ps1 |
| 54 | +``` |
| 55 | + |
| 56 | +--- |
| 57 | + |
| 58 | +## 🧠 How RAG Works |
| 59 | + |
| 60 | +``` |
| 61 | +User: "How do I add a collaborator?" |
| 62 | + ↓ |
| 63 | +1. Convert question to 384D vector |
| 64 | + ↓ |
| 65 | +2. Search knowledge_base table (pgvector) |
| 66 | + ↓ |
| 67 | +3. Find top 3 similar docs (similarity > 0.7) |
| 68 | + ↓ |
| 69 | +4. Build context: [relevant docs] + user question |
| 70 | + ↓ |
| 71 | +5. Send to Ollama (deepseek-r1:7b) |
| 72 | + ↓ |
| 73 | +6. AI responds with context-aware answer |
| 74 | +``` |
| 75 | + |
| 76 | +--- |
| 77 | + |
| 78 | +## 📊 Status Indicators |
| 79 | + |
| 80 | +| Indicator | Meaning | |
| 81 | +|-----------|---------| |
| 82 | +| 🟣 Purple pulse + "RAG-Enhanced" | RAG context found & used | |
| 83 | +| 🟢 Green pulse + "Ollama" | Ollama running without RAG | |
| 84 | +| Error message | Ollama not running | |
| 85 | + |
| 86 | +--- |
| 87 | + |
| 88 | +## 🔧 Environment Variables |
| 89 | + |
| 90 | +```env |
| 91 | +# Required for AI |
| 92 | +OLLAMA_URL=http://localhost:11434 |
| 93 | +OLLAMA_MODEL=deepseek-r1:7b |
| 94 | + |
| 95 | +# Required for RAG |
| 96 | +NEXT_PUBLIC_SUPABASE_URL=your-url |
| 97 | +NEXT_PUBLIC_SUPABASE_ANON_KEY=your-key |
| 98 | +``` |
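
A missing variable tends to surface later as a confusing runtime error, so it can help to validate all four at startup. The sketch below assumes a hypothetical helper named `loadAiConfig` and a hypothetical `AiConfig` shape; neither is taken from the project. Only the variable names come from the block above.

```typescript
// Hypothetical config shape; the field names are illustrative.
interface AiConfig {
  ollamaUrl: string;
  ollamaModel: string;
  supabaseUrl: string;
  supabaseAnonKey: string;
}

const REQUIRED_VARS = [
  "OLLAMA_URL",
  "OLLAMA_MODEL",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
] as const;

// Reads the four variables from an env-like object and fails fast,
// listing every missing name in a single error.
function loadAiConfig(env: Record<string, string | undefined>): AiConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    // Strip trailing slashes so `${ollamaUrl}/api/...` never doubles them.
    ollamaUrl: env.OLLAMA_URL!.replace(/\/+$/, ""),
    ollamaModel: env.OLLAMA_MODEL!,
    supabaseUrl: env.NEXT_PUBLIC_SUPABASE_URL!,
    supabaseAnonKey: env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
  };
}
```

In a Next.js server context this would be called once as `loadAiConfig(process.env)`.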
| 99 | + |
| 100 | +--- |
| 101 | + |
| 102 | +## 🗄️ Database Setup |
| 103 | + |
| 104 | +```sql |
| 105 | +-- Enable pgvector |
| 106 | +CREATE EXTENSION IF NOT EXISTS vector; |
| 107 | + |
| 108 | +-- Create knowledge base table |
| 109 | +CREATE TABLE knowledge_base ( |
| 110 | + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), |
| 111 | + content TEXT NOT NULL, |
| 112 | + embedding vector(384), |
| 113 | + metadata JSONB, |
| 114 | + category TEXT, |
| 115 | + source TEXT, |
| 116 | + title TEXT, |
| 117 | + created_at TIMESTAMPTZ DEFAULT NOW() |
| 118 | +); |
| 119 | + |
| 120 | +-- Add vector similarity index |
| 121 | +CREATE INDEX ON knowledge_base |
| 122 | +USING ivfflat (embedding vector_cosine_ops); |
| 123 | +``` |
| 124 | + |
| 125 | +--- |
| 126 | + |
| 127 | +## 📚 Index Documentation |
| 128 | + |
| 129 | +```bash |
| 130 | +# Index all docs for RAG |
| 131 | +node scripts/index-knowledge.js |
| 132 | + |
| 133 | +# This creates embeddings for: |
| 134 | +# - README.md |
| 135 | +# - docs/*.md |
| 136 | +# - Code comments |
| 137 | +# - Feature descriptions |
| 138 | +``` |
| 139 | + |
| 140 | +--- |
| 141 | + |
| 142 | +## 🐛 Troubleshooting |
| 143 | + |
| 144 | +### "AI not responding" |
| 145 | + |
| 146 | +```bash |
| 147 | +# 1. Check if Ollama is running |
| 148 | +ollama list |
| 149 | + |
| 150 | +# 2. Verify model is downloaded |
| 151 | +ollama pull deepseek-r1:7b |
| 152 | + |
| 153 | +# 3. Test Ollama directly |
| 154 | +ollama run deepseek-r1:7b |
| 155 | +# Type: "Hello" |
| 156 | +# Should respond |
| 157 | +# Type: /bye to exit |
| 158 | + |
| 159 | +# 4. Check environment variables |
| 160 | +echo $OLLAMA_URL # Should be http://localhost:11434 |
| 161 | +``` |
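
The first two manual checks above can also be run from application code. The sketch below queries Ollama's `GET /api/tags` endpoint, which lists locally available models, and distinguishes "server unreachable" from "model not pulled". The function name `checkOllama`, the `FetchLike` type, and the three status strings are hypothetical; the fetch function is passed in so the check can be exercised without a running server.

```typescript
// Minimal subset of the fetch API that this check relies on.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

type OllamaStatus = "ok" | "model-missing" | "unreachable";

// Asks the Ollama server which models it has and looks for `model`.
async function checkOllama(
  baseUrl: string,
  model: string,
  fetchImpl: FetchLike,
): Promise<OllamaStatus> {
  try {
    const res = await fetchImpl(`${baseUrl}/api/tags`);
    if (!res.ok) return "unreachable";
    // /api/tags responds with { models: [{ name: "deepseek-r1:7b", ... }] }
    const body = (await res.json()) as { models?: { name: string }[] };
    const names = (body.models ?? []).map((m) => m.name);
    return names.includes(model) ? "ok" : "model-missing";
  } catch (_err) {
    // Network errors (connection refused, DNS failure) land here.
    return "unreachable";
  }
}
```

On a runtime with a global `fetch`, a call would look like `await checkOllama("http://localhost:11434", "deepseek-r1:7b", fetch)`. A result of `"model-missing"` points to step 2 (`ollama pull`), while `"unreachable"` points to step 1.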
| 162 | + |
| 163 | +### "RAG not working" |
| 164 | + |
| 165 | +1. **Check Supabase connection** |
| 166 | + - Verify `NEXT_PUBLIC_SUPABASE_URL` |
| 167 | + - Verify `NEXT_PUBLIC_SUPABASE_ANON_KEY` |
| 168 | + |
| 169 | +2. **Check knowledge_base table** |
| 170 | + ```sql |
| 171 | + SELECT COUNT(*) FROM knowledge_base; |
| 172 | + -- Should have documents indexed |
| 173 | + ``` |
| 174 | + |
| 175 | +3. **Re-index documentation** |
| 176 | + ```bash |
| 177 | + node scripts/index-knowledge.js |
| 178 | + ``` |
| 179 | + |
| 180 | +### "Slow responses" |
| 181 | + |
| 182 | +- Use a smaller model: `ollama pull llama3.2` (3B instead of 7B), then set `OLLAMA_MODEL=llama3.2` |
| 183 | +- Close other apps to free RAM |
| 184 | +- Consider GPU for faster inference |
| 185 | + |
| 186 | +--- |
| 187 | + |
| 188 | +## 📈 Production Deployment |
| 189 | + |
| 190 | +### Option A: Vercel + VPS |
| 191 | + |
| 192 | +1. **Deploy Next.js to Vercel** |
| 193 | + ```bash |
| 194 | + vercel --prod |
| 195 | + ``` |
| 196 | + |
| 197 | +2. **Run Ollama on VPS** (DigitalOcean, AWS, etc.) |
| 198 | + ```bash |
| 199 | + # On VPS |
| 200 | + curl -fsSL https://ollama.com/install.sh | sh |
| 201 | + ollama pull deepseek-r1:7b |
| 202 | + |
| 203 | + # Set up reverse proxy (nginx) |
| 204 | + # Configure SSL |
| 205 | + ``` |
| 206 | + |
| 207 | +3. **Update Vercel env vars** |
| 208 | + ```env |
| 209 | + OLLAMA_URL=https://ai.yourdomain.com |
| 210 | + ``` |
| 211 | + |
| 212 | +### Option B: All Local (Development) |
| 213 | + |
| 214 | +```env |
| 215 | +OLLAMA_URL=http://localhost:11434 |
| 216 | +OLLAMA_MODEL=deepseek-r1:7b |
| 217 | +``` |
| 218 | + |
| 219 | +--- |
| 220 | + |
| 221 | +## 📁 Key Files |
| 222 | + |
| 223 | +| File | Purpose | |
| 224 | +|------|---------| |
| 225 | +| `app/api/chat/route.ts` | RAG-enhanced chat API | |
| 226 | +| `lib/services/rag-service.ts` | RAG embedding & retrieval | |
| 227 | +| `app/dashboard/ai-tools/page.tsx` | Chat UI | |
| 228 | +| `scripts/index-knowledge.js` | Knowledge indexer | |
| 229 | +| `docs/RAG_SYSTEM.md` | Full RAG documentation | |
| 230 | +| `docs/OLLAMA_SETUP.md` | Ollama setup guide | |
| 231 | + |
| 232 | +--- |
| 233 | + |
| 234 | +## ✅ Benefits Summary |
| 235 | + |
| 236 | +| Aspect | Benefit | |
| 237 | +|--------|---------| |
| 238 | +| **Privacy** | Local inference, no external AI API calls |
| 239 | +| **Cost** | $0 for AI (only infrastructure) | |
| 240 | +| **Accuracy** | Context from YOUR docs, not generic | |
| 241 | +| **Offline** | Local inference works without internet (retrieval needs the Supabase database to be reachable) |
| 242 | +| **Unlimited** | No rate limits or quotas | |
| 243 | +| **Fast** | Vector search <100ms | |
| 244 | + |
| 245 | +--- |
| 246 | + |
| 247 | +## 🎯 What Changed (Migration) |
| 248 | + |
| 249 | +### ❌ Removed |
| 250 | +- DeepSeek API integration |
| 251 | +- Google Gemini API integration |
| 252 | +- Multi-provider fallback logic |
| 253 | +- Cloud API environment variables |
| 254 | + |
| 255 | +### ✅ Added |
| 256 | +- RAG service with Xenova embeddings |
| 257 | +- Supabase pgvector integration |
| 258 | +- Knowledge base indexer |
| 259 | +- Context-aware AI responses |
| 260 | + |
| 261 | +--- |
| 262 | + |
| 263 | +## 📚 Learn More |
| 264 | + |
| 265 | +- [RAG System Documentation](./RAG_SYSTEM.md) |
| 266 | +- [Ollama Setup Guide](./OLLAMA_SETUP.md) |
| 267 | +- [Production Deployment](./PRODUCTION_DEPLOYMENT_OLLAMA.md) |
| 268 | +- [Migration Summary](./RAG_MIGRATION_SUMMARY.md) |
| 269 | + |
| 270 | +--- |
| 271 | + |
| 272 | +**Last Updated:** January 17, 2026 |
| 273 | +**Architecture:** RAG + Ollama (100% Private) |