| Component | Responsibility | Technology |
|---|---|---|
| Frontend | User interaction, real-time display | Next.js, React |
| API Layer | Request routing, validation, streaming | FastAPI, Uvicorn |
| RAG Engine | Retrieval, chunking, embedding | ChromaDB, FAISS |
| LLM Client | Text generation, guardrails | Ollama, qwen3 |
| Eval System | Quality measurement | Groq, LLM-as-Judge |

- Monolithic but Modular: A single deployment with clear module boundaries
- Local-First: All processing runs on the user's machine
- Graceful Degradation: FAISS is optional; the system works without it
- Streaming by Default: Responses are streamed to improve perceived performance
- Dual Storage: ChromaDB for simplicity, FAISS for scale
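
The graceful-degradation and dual-storage principles can be illustrated with a minimal sketch. It assumes the optional dependency is importable as `faiss` and that ChromaDB is the always-available fallback; the helper names `module_available` and `choose_vector_backend` are hypothetical and are not taken from the project's code.

```python
import importlib.util
from typing import Callable


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def choose_vector_backend(
    prefer_faiss: bool = True,
    is_available: Callable[[str], bool] = module_available,
) -> str:
    """Pick a vector store backend, falling back when FAISS is missing.

    Hypothetical helper: returns "faiss" only when it is both preferred
    and installed; otherwise returns "chromadb" (assumed always present).
    """
    if prefer_faiss and is_available("faiss"):
        return "faiss"
    return "chromadb"


if __name__ == "__main__":
    # Prints whichever backend this environment would use.
    print(choose_vector_backend())
```

Passing the availability check in as a parameter keeps the fallback logic testable on machines with or without FAISS installed.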



