
Commit 8a51003

Merge pull request #11 from doITmagic/dev
This release introduces a major upgrade to the embedding architecture and critical stability fixes. We have transitioned to a more powerful embedding model (bge-m3 family) and increased the context window to 1024 tokens, significantly improving semantic search accuracy for larger code blocks.

Key changes include:

- Upgraded Embedding Engine: Switched model architecture and increased capacity to 1024 tokens for deeper code understanding and better retrieval of complex functions.
- Production Stability: Fixed critical race conditions during background indexing and model migration.
- Intelligent Installer: Enhanced the installer to auto-detect environment configurations and handle updates more robustly (Zero-Config).
- Resilience: Improved error recovery for long-running worker processes.
2 parents f8ea5a5 + 644c08c commit 8a51003

43 files changed

Lines changed: 1305 additions & 225 deletions


.clauderules

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# RagCode MCP - Semantic Search Rules
+# This file is automatically managed by RagCode MCP.
+
+## ⚖️ The Golden Rule
+**For any information about the code (location, structure, logic, or usage), you MUST use RagCode MCP tools. Never guess code details from memory; always search the local index first.**
+
+## Available Tools
+- search_code: Primary entry point for semantic search.
+- get_function_details: Get full implementation of a function.
+- find_type_definition: Get struct/interface definitions.
+- list_package_exports: See what a module offers.
+- search_docs: Find project documentation.
+
+## Usage Guidelines
+- Always provide 'file_path' to tools to ensure they detect the correct project context.
+- Use 'hybrid_search' if looking for exact variable names or error messages.
+- If the tool says "workspace not indexed", use 'index_workspace' once.
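
As a rough illustration of the `file_path` guideline in this rules file, the sketch below builds the JSON-RPC 2.0 request that an MCP client sends for a `tools/call`. The envelope follows the MCP convention. The `query` argument name, the example path, and the helper `newSearchCall` are assumptions made for illustration and are not taken from this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the JSON-RPC 2.0 envelope that MCP uses for "tools/call".
type toolCall struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// callParams carries the tool name and its arguments.
type callParams struct {
	Name      string            `json:"name"`
	Arguments map[string]string `json:"arguments"`
}

// newSearchCall builds a search_code request. The "query" argument name is an
// assumption; "file_path" is the parameter the rules ask callers to provide.
func newSearchCall(id int, query, filePath string) toolCall {
	return toolCall{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params: callParams{
			Name: "search_code",
			Arguments: map[string]string{
				"query":     query,
				"file_path": filePath,
			},
		},
	}
}

func main() {
	// The path below is a made-up example of a file inside the workspace.
	req := newSearchCall(1, "where is the indexer started", "/home/me/project/cmd/main.go")
	out, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Passing a path inside the project on every call is what lets the server resolve the workspace without any extra configuration on the client side.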

.clinerules

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# RagCode MCP - Semantic Search Rules
+# This file is automatically managed by RagCode MCP.
+
+## ⚖️ The Golden Rule
+**For any information about the code (location, structure, logic, or usage), you MUST use RagCode MCP tools. Never guess code details from memory; always search the local index first.**
+
+## Available Tools
+- search_code: Primary entry point for semantic search.
+- get_function_details: Get full implementation of a function.
+- find_type_definition: Get struct/interface definitions.
+- list_package_exports: See what a module offers.
+- search_docs: Find project documentation.
+
+## Usage Guidelines
+- Always provide 'file_path' to tools to ensure they detect the correct project context.
+- Use 'hybrid_search' if looking for exact variable names or error messages.
+- If the tool says "workspace not indexed", use 'index_workspace' once.

.cursorrules

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# Cursor AI Rules - RagCode MCP
+
+## ⚖️ The Golden Rule
+**For any information about the codebase (structure, logic, or usage), you MUST use RagCode MCP tools.**
+Never guess code details from memory; always search the local index first using `search_code` or `get_function_details`.
+
+## Guidelines
+1. **Context First**: Always call `search_code` when starting a task to see where relevant logic exists.
+2. **Actual Code**: Use `get_function_details` to read the implementation of a function instead of assuming what it does.
+3. **Workspace Detection**: Always provide the current `file_path` to the tools so they can identify the correct project/workspace.
+4. **No Guesswork**: If you don't find something, index the workspace using `index_workspace` and search again.
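
The "No Guesswork" guideline describes a search, index, search-again flow. A minimal sketch of that control flow is shown here. The `searchFn` and `indexFn` types are hypothetical stand-ins for calls to the `search_code` and `index_workspace` tools, and `searchWithReindex` is an illustrative name, not part of RagCode.

```go
package main

import (
	"errors"
	"fmt"
)

// searchFn and indexFn stand in for calls to the search_code and
// index_workspace tools; both are hypothetical placeholders.
type searchFn func(query, filePath string) ([]string, error)
type indexFn func(filePath string) error

var errNoResults = errors.New("no results after indexing")

// searchWithReindex searches once, and if nothing is found it indexes the
// workspace a single time and searches again.
func searchWithReindex(search searchFn, index indexFn, query, filePath string) ([]string, error) {
	hits, err := search(query, filePath)
	if err != nil {
		return nil, err
	}
	if len(hits) > 0 {
		return hits, nil
	}
	if err := index(filePath); err != nil {
		return nil, err
	}
	hits, err = search(query, filePath)
	if err != nil {
		return nil, err
	}
	if len(hits) == 0 {
		return nil, errNoResults
	}
	return hits, nil
}

func main() {
	// Fake tools: the search only returns a hit once the workspace is indexed.
	indexed := false
	search := func(query, filePath string) ([]string, error) {
		if !indexed {
			return nil, nil
		}
		return []string{"internal/indexer/indexer.go"}, nil
	}
	index := func(filePath string) error {
		indexed = true
		return nil
	}
	hits, err := searchWithReindex(search, index, "indexer entry point", "/home/me/project/main.go")
	fmt.Println(hits, err)
}
```

Indexing at most once per miss keeps the fallback bounded, which matches the "use 'index_workspace' once" wording used in the other rules files.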

.github/copilot-instructions.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Copilot Instructions - RagCode MCP
+
+## ⚖️ The Golden Rule
+**For any information about the code (location, structure, logic, or usage), you MUST use RagCode MCP tools. Never guess code details from memory; always search the local index first.**
+
+## Project Overview
+RagCode is a Model Context Protocol (MCP) server that provides semantic code search (RAG) for local codebases using Ollama (embeddings) and Qdrant (vector storage). It supports multiple languages through a pluggable analyzer architecture.
+
+## Architecture & Patterns
+- **Core Components**:
+  - `Indexer`: Orchestrates analysis, embedding, and storage.
+  - `PathAnalyzer`: Interface for language-specific AST analysis (Go, PHP/Laravel, Python, HTML).
+  - `CodeChunk`: The canonical v2 data structure for all indexed code symbols (functions, types, files).
+  - `Workspace.Manager`: Handles multi-workspace multi-language isolation via language-specific Qdrant collections.
+- **Data Flow**: Tools -> `Workspace.Manager.DetectWorkspace` -> Language Detection -> Qdrant Collection (`ragcode-{id}-{lang}`) -> Search Results.
+- **Convention**: The project is migrating from `APIChunk` to `CodeChunk`. Always use `CodeChunk` for new features.
+
+## Developer Workflows
+- **Build/Install**: Use `go run ./cmd/install/main.go` to build binaries and configure local IDEs.
+- **Runtime Binaries**: Installed to `~/.local/share/ragcode/bin/` by default.
+- **Testing**: Use standard `go test ./...`. Use `t.TempDir()` for workspace/filesystem isolation.
+- **Logging**: MCP server logs to `mcp.log` next to the executable. Check `MCP_LOG_LEVEL=debug` for issues.
+
+## MCP Tools Usage
+- `search_code`: Use as the primary entry point for exploration. **Crucial**: Always provide the `file_path` parameter as it's used for workspace and language detection.
+- `index_workspace`: Triggered automatically on first query per workspace, but can be manually invoked for major changes.
+
+## Integration Points
+- **Ollama**: Requires `phi3:medium` (reasoning) and `mxbai-embed-large` (embeddings) by default.
+- **Qdrant**: Runs in Docker as `ragcode-qdrant` on port 6333.
+
+## Romania/Hungarian Support (Note)
+The project identifies as `rag-code-mcp`. Old configurations naming it `do-ai` or `coderag` are deprecated and paths must be updated to the new project structure in `github.com/doITmagic/rag-code-mcp`.
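
The data flow described in this file (file path, then language detection, then a `ragcode-{id}-{lang}` collection) can be sketched roughly as follows. This is not the project's implementation: `detectLanguage`, `collectionName`, the extension mapping, the lowercasing, and the example workspace id are all assumptions made for illustration. Only the collection name pattern and the list of supported languages come from the text above.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// detectLanguage guesses a language tag from the file extension.
// The mapping is an assumption; the real detection logic may differ.
func detectLanguage(filePath string) string {
	switch strings.ToLower(filepath.Ext(filePath)) {
	case ".go":
		return "go"
	case ".php":
		return "php"
	case ".py":
		return "python"
	case ".html", ".htm":
		return "html"
	default:
		return ""
	}
}

// collectionName formats a per-workspace, per-language collection name
// following the `ragcode-{id}-{lang}` pattern. Lowercasing is an assumption.
func collectionName(workspaceID, lang string) string {
	return fmt.Sprintf("ragcode-%s-%s", workspaceID, strings.ToLower(lang))
}

func main() {
	// "a1b2c3" is a made-up workspace id used only for this example.
	for _, p := range []string{"cmd/main.go", "app/Http/Kernel.php", "tools/index.py"} {
		fmt.Println(p, "->", collectionName("a1b2c3", detectLanguage(p)))
	}
}
```

Keeping one collection per workspace and language is what gives the isolation that `Workspace.Manager` is described as providing.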

.roomodes

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# RagCode MCP - Semantic Search Rules
+# This file is automatically managed by RagCode MCP.
+
+## ⚖️ The Golden Rule
+**For any information about the code (location, structure, logic, or usage), you MUST use RagCode MCP tools. Never guess code details from memory; always search the local index first.**
+
+## Available Tools
+- search_code: Primary entry point for semantic search.
+- get_function_details: Get full implementation of a function.
+- find_type_definition: Get struct/interface definitions.
+- list_package_exports: See what a module offers.
+- search_docs: Find project documentation.
+
+## Usage Guidelines
+- Always provide 'file_path' to tools to ensure they detect the correct project context.
+- Use 'hybrid_search' if looking for exact variable names or error messages.
+- If the tool says "workspace not indexed", use 'index_workspace' once.

.windsurfrules

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# Windsurf AI Rules - RagCode MCP
+
+## ⚖️ The Golden Rule
+**For any information about the codebase (structure, logic, or usage), you MUST use RagCode MCP tools.**
+Never guess code details from memory; always search the local index first using `search_code` or `get_function_details`.
+
+## Guidelines
+1. **Context First**: Always call `search_code` when starting a task to see where relevant logic exists.
+2. **Actual Code**: Use `get_function_details` to read the implementation of a function instead of assuming what it does.
+3. **Workspace Detection**: Always provide the current `file_path` to the tools so they can identify the correct project/workspace.
+4. **No Guesswork**: If you don't find something, index the workspace using `index_workspace` and search again.

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ First off, thank you for considering contributing to RagCode MCP! It's people li
 
 # Pull models
 ollama pull phi3:medium
-ollama pull nomic-embed-text
+ollama pull mxbai-embed-large
 ```
 
 4. **Run the server locally**

QUICKSTART.md

Lines changed: 2 additions & 2 deletions
@@ -50,7 +50,7 @@ Then configure your Windows IDE manually (e.g., Windsurf at `%USERPROFILE%\.code
 "env": {
   "OLLAMA_BASE_URL": "http://localhost:11434",
   "OLLAMA_MODEL": "phi3:medium",
-  "OLLAMA_EMBED": "nomic-embed-text",
+  "OLLAMA_EMBED": "mxbai-embed-large",
   "QDRANT_URL": "http://localhost:6333"
 },
 "disabled": false
@@ -116,7 +116,7 @@ That's it! The AI will use RagCode's semantic search to find relevant code.
 | Problem | Solution |
 |---------|----------|
 | "Could not connect to Qdrant" | Run `docker start ragcode-qdrant` |
-| "Ollama model not found" | Run `ollama pull phi3:medium && ollama pull nomic-embed-text` |
+| "Ollama model not found" | Run `ollama pull phi3:medium && ollama pull mxbai-embed-large` |
 | IDE doesn't see RagCode | Re-run `./ragcode-installer -skip-build` |
 
 For more help, see [README.md#troubleshooting](./README.md#-troubleshooting) or open an [Issue](https://github.com/doITmagic/rag-code-mcp/issues).
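
The `env` block in the QUICKSTART hunk names four variables. As a hedged sketch of how a Go process might read them with fallbacks, the example below uses the values shown in the diff as defaults. The `config` struct, `envOr`, and `loadConfig` are hypothetical names, and whether the real server applies these defaults is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// config holds the four settings named in the QUICKSTART env block.
type config struct {
	OllamaBaseURL string
	OllamaModel   string
	OllamaEmbed   string
	QdrantURL     string
}

// envOr returns the value of key, or fallback when it is unset or empty.
func envOr(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// loadConfig reads the environment, falling back to the values shown in the
// QUICKSTART diff. Treating them as defaults is an assumption of this sketch.
func loadConfig() config {
	return config{
		OllamaBaseURL: envOr("OLLAMA_BASE_URL", "http://localhost:11434"),
		OllamaModel:   envOr("OLLAMA_MODEL", "phi3:medium"),
		OllamaEmbed:   envOr("OLLAMA_EMBED", "mxbai-embed-large"),
		QdrantURL:     envOr("QDRANT_URL", "http://localhost:6333"),
	}
}

func main() {
	cfg := loadConfig()
	fmt.Printf("%+v\n", cfg)
}
```

Because this commit changes the embedding model name, any IDE configuration that still sets `OLLAMA_EMBED` to `nomic-embed-text` would override a default like the one above and should be updated by hand.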

README.md

Lines changed: 7 additions & 2 deletions
@@ -23,6 +23,11 @@ RagCode is a **Model Context Protocol (MCP) server** that instantly makes your p
 
 Built with the official [Model Context Protocol Go SDK](https://github.com/modelcontextprotocol/go-sdk), RagCode provides **9 powerful tools** to index, search, and analyze code, making it the ultimate solution for **AI-ready software development**.
 
+## ⚖️ The Golden Rule
+> **"FOR ANY INFORMATION ABOUT YOUR CODE (location, structure, logic, or usage), YOU MUST USE RAGCODE MCP TOOLS."**
+>
+> *By using semantic search instead of simple keyword lookups, your AI assistant gains true context, avoiding hallucinations and missing details even in massive legacy mono-repos.*
+
 ---
 
 ## ⚡ One-Command Installation
@@ -51,7 +56,7 @@ Invoke-WebRequest -Uri "https://github.com/doITmagic/rag-code-mcp/releases/lates
 **That's it!** The installer automatically:
 - ✅ Downloads and installs the `rag-code-mcp` binary
 - ✅ Sets up Ollama and Qdrant in Docker containers
-- ✅ Downloads required AI models (`phi3:medium`, `nomic-embed-text`)
+- ✅ Downloads required AI models (`phi3:medium`, `mxbai-embed-large`)
 - ✅ Configures your IDE (VS Code, Cursor, Windsurf, Claude Desktop)
 - ✅ Adds binaries to your PATH
 
@@ -219,7 +224,7 @@ RagCode works with all major AI-powered IDEs:
 | Component | Requirement | Notes |
 |-----------|-------------|-------|
 | **CPU** | 4 cores | For running Ollama models |
-| **RAM** | 16 GB | 8 GB for `phi3:medium`, 4 GB for `nomic-embed-text`, 4 GB system |
+| **RAM** | 16 GB | 8 GB for `phi3:medium`, 4 GB for `mxbai-embed-large`, 4 GB system |
 | **Disk** | 10 GB free | ~8 GB for models + 2 GB for data |
 | **OS** | Linux, macOS, Windows | Docker required for Qdrant |
 

bin/index-all

47.4 KB
Binary file not shown.
