CodeRAG is an intelligent codebase context engine for AI coding agents. It creates a semantic vector database (RAG) from source code, documentation, and project backlog, then exposes it as MCP tools that give AI agents deep understanding of the entire codebase.
```mermaid
flowchart LR
    subgraph Sources
        S1[Git Repos]
        S2[Backlog<br/>ADO / Jira / ClickUp]
        S3[Docs<br/>Confluence / SharePoint / MD]
    end
    subgraph Ingestion["Ingestion Pipeline"]
        P[Tree-sitter<br/>AST Parser]
        C[AST Chunker]
        E[NL Enrichment<br/>Ollama]
    end
    subgraph Storage["Storage Layer"]
        V[LanceDB / Qdrant<br/>Vector Store]
        B[MiniSearch<br/>BM25 Index]
        G[Dependency<br/>Graph]
    end
    subgraph Retrieval["Retrieval Engine"]
        H[Hybrid Search<br/>+ RRF]
        X[Graph Expansion<br/>+ Re-ranking]
        T[Token Budget<br/>Optimizer]
    end
    subgraph Interface["Agent Interface"]
        M[MCP Server<br/>6 Tools]
        R[REST API]
        VS[VS Code<br/>Extension]
        VW[Web Viewer]
    end
    S1 & S2 & S3 --> P --> C --> E --> V & B & G
    V & B & G --> H --> X --> T
    T --> M & R & VS & VW
```
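The Graph Expansion stage in the diagram widens the initial search hits with related chunks from the dependency graph, and the Dependency Graph page lists BFS expansion as the technique. The following is a minimal sketch of what a breadth-first expansion could look like, assuming the graph is a plain adjacency map. The `DependencyGraph` type, the `expandWithNeighbors` function, and the file names are hypothetical illustrations, not CodeRAG's actual API.

```typescript
// Hypothetical adjacency map: node ID -> IDs of directly related nodes.
type DependencyGraph = Map<string, string[]>;

// Breadth-first expansion: start from the seed hits and add neighbors
// level by level, stopping after `maxDepth` hops.
function expandWithNeighbors(
  graph: DependencyGraph,
  seeds: string[],
  maxDepth: number,
): string[] {
  const visited = new Set<string>(seeds);
  let frontier: string[] = Array.from(visited);

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const neighbor of graph.get(id) ?? []) {
        if (!visited.has(neighbor)) {
          visited.add(neighbor);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }

  // A Set keeps insertion order, so seeds come first, then nearer nodes.
  return Array.from(visited);
}

// Illustrative data only.
const graph: DependencyGraph = new Map<string, string[]>([
  ["auth.ts", ["token.ts", "user.ts"]],
  ["token.ts", ["crypto.ts"]],
  ["user.ts", ["db.ts"]],
]);

const expanded = expandWithNeighbors(graph, ["auth.ts"], 1);
console.log(expanded);
```

Capping the depth keeps the expansion bounded, which matters because the expanded set still has to fit the token budget applied in the next stage.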
| Page | Description |
|---|---|
| Installation | Prerequisites, setup, Ollama models |
| Quick Start | First index + search in 5 minutes |
| Configuration | Full .coderag.yaml reference |

| Page | Description |
|---|---|
| Overview | High-level architecture, tech stack, design principles |
| Ingestion Pipeline | Parse → Chunk → Enrich → Embed → Store |
| Retrieval Pipeline | Query → Analyze → Search → Expand → Budget |
| Hybrid Search | Vector + BM25 + Reciprocal Rank Fusion |
| Dependency Graph | Graph model, edges, BFS expansion |
| Design Decisions | ADR-style rationale for key decisions |
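The Hybrid Search page covers how vector and BM25 results are merged with Reciprocal Rank Fusion. As a rough illustration of the general technique: each ranked list contributes `1 / (k + rank)` to a document's fused score. The function name, the chunk IDs, and the constant `k = 60` (a commonly used default in the literature) are assumptions here, not values taken from CodeRAG.

```typescript
// A ranked list of chunk IDs, best match first.
type RankedList = string[];

interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion: sum 1 / (k + rank) over every list an ID appears in.
// `k` dampens the influence of top ranks; 60 is an assumed default.
function reciprocalRankFusion(lists: RankedList[], k = 60): FusedResult[] {
  const scores = new Map<string, number>();

  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }

  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Illustrative result lists from the two retrievers.
const vectorHits: RankedList = ["chunk-a", "chunk-b", "chunk-c"];
const bm25Hits: RankedList = ["chunk-b", "chunk-d", "chunk-a"];

const fused = reciprocalRankFusion([vectorHits, bm25Hits]);
console.log(fused.map((r) => r.id));
```

Because the fusion uses only ranks, it does not need the vector similarity scores and the BM25 scores to be on a comparable scale.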

| Package | NPM | Description |
|---|---|---|
| Core | @code-rag/core | Shared library: ingestion, embedding, retrieval, auth |
| CLI | @code-rag/cli | CLI tool: coderag init/index/search/serve/status/viewer |
| MCP Server | @code-rag/mcp-server | MCP server: stdio + SSE transport |
| API Server | @code-rag/api-server | Express REST API: team/cloud deployment |
| Viewer | @code-rag/viewer | Vite SPA: dashboard, search, graph, UMAP |
| VS Code Extension | code-rag-vscode | VS Code integration: search panel, auto-config |
| Benchmarks | n/a | Benchmark suite: precision, recall, MRR |

| Page | Description |
|---|---|
| MCP Tools | All 6 MCP tools with schemas and examples |
| REST API | All REST endpoints with request/response formats |
| Types | Core TypeScript types (Chunk, SearchResult, Config, ...) |
| Interfaces | Provider interfaces (EmbeddingProvider, VectorStore, ...) |

| Page | Description |
|---|---|
| Multi Repo | Multi-repository setup and cross-repo resolution |
| Backlog Integration | Azure DevOps, Jira, ClickUp integration |
| Cloud Deployment | API server, Docker, auth, RBAC, team storage |
| Embedding Providers | Ollama, Voyage, OpenAI — setup and comparison |
| Contributing | Development workflow, conventions, testing |

| Page | Description |
|---|---|
| Glossary | Key terms and definitions |
| Project History | Sprint timeline, milestones, stats |

> [!info] About this documentation
> This vault contains 27 interconnected pages covering the full CodeRAG system. Use Obsidian's graph view to explore relationships between concepts, or navigate via the links above.

> [!tip] Quick links
> - I want to use CodeRAG → Start with Installation → Quick Start
> - I want to understand how it works → Read Overview → Ingestion Pipeline → Retrieval Pipeline
> - I want to integrate with my AI agent → See MCP Tools or REST API
> - I want to contribute → Read Contributing → Design Decisions