Commit dd4a532

fix(performance): prevent OOM and system freezes during indexing
This addresses issues where indexing large files (e.g., barou.sql) caused the host system to freeze due to host CPU/GPU starvation and excessive GC pressure.

- Fix Ollama throttling bug in indexer service by correctly using a 150ms delay instead of 10ms.
- Prevent GC thrashing in treesitter parser by evaluating byte sizes instead of allocating strings for every AST node.
- Truncate massive leaf nodes (>8KB) to prevent crashing the Ollama embedding API.
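The parser-side changes described above can be illustrated with a minimal sketch. It is not the code from this commit: `nodeSpan`, `size`, and `truncateLeaf` are hypothetical names, and the only detail taken from the message is the 8KB limit. The sketch measures a node from its byte offsets instead of materializing a string, and cuts oversized leaves on a rune boundary.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxLeafBytes mirrors the 8KB limit named in the commit message.
const maxLeafBytes = 8 * 1024

// nodeSpan is a hypothetical stand-in for an AST node's byte offsets.
type nodeSpan struct {
	Start, End uint32
}

// size reports the node's length from its offsets, without building a string.
func (n nodeSpan) size() int {
	return int(n.End - n.Start)
}

// truncateLeaf returns the node's bytes, cut at or below maxBytes.
// The cut backs up to a rune start so a multi-byte character is never split.
func truncateLeaf(src []byte, n nodeSpan, maxBytes int) []byte {
	leaf := src[n.Start:n.End]
	if n.size() <= maxBytes {
		return leaf
	}
	cut := maxBytes
	for cut > 0 && !utf8.RuneStart(leaf[cut]) {
		cut--
	}
	return leaf[:cut]
}

func main() {
	// A 10KB leaf of ASCII bytes, larger than the limit.
	src := make([]byte, 10*1024)
	for i := range src {
		src[i] = 'a'
	}
	leaf := truncateLeaf(src, nodeSpan{Start: 0, End: uint32(len(src))}, maxLeafBytes)
	fmt.Println(len(leaf))
}
```

Returning a sub-slice of the source buffer avoids a copy per node, which is the kind of allocation the message identifies as the cause of GC thrashing.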
1 parent ea04411 commit dd4a532

5 files changed

Lines changed: 54 additions & 54 deletions

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 0 additions & 20 deletions
This file was deleted.

.github/copilot-instructions.md

Lines changed: 0 additions & 33 deletions
This file was deleted.

BUGS.md

Lines changed: 41 additions & 0 deletions
@@ -269,3 +269,44 @@ In `AnalyzePackage` in `pkg/parser/go/analyzer.go`, add iteration over `typ.Func
*Last updated: 2026-03-09 — BUG-001 fixed, BUG-003 added*

---

## BUG-004: AST Fallback Search and Indexer do not exclude unconfigured directories like `inspirations/`

**Status:** Open
**Date confirmed:** 2026-03-09
**Affected component:** `FallbackDirectSearch` (`internal/service/engine/engine_fallback_search.go`) and `IndexWorkspace` (`pkg/indexer/service.go`)
**Severity:** Medium. Irrelevant, old, or draft code pollutes semantic and fallback search results.

### Description

When performing a search that falls back to the AST (e.g. while `go` files are `processed: 0`), RAGCode can return results from the `inspirations/` directory (or other directories that should logically be ignored). This happens because the `filepath.WalkDir` callback relies entirely on the list of `excludePatterns` loaded from `config.Workspace.ExcludePatterns`, alongside a hardcoded check for names starting with `.`, `vendor`, and `node_modules`.

### Example

Searching for the processing of `state.json` via `rag_search` returned a fallback result pointing to:
`/home/razvan/go/src/github.com/doITmagic/rag-code-mcp/inspirations/rag-code-mcp/internal/workspace/state.go`
instead of the actual code in `pkg/indexer/state.go`.

### Root Cause
In both `internal/service/engine/engine_fallback_search.go` (lines 88-103) and `pkg/indexer/service.go` (lines 72-88), the exclusion logic is implemented manually:
```go
if d.IsDir() {
	name := d.Name()
	if strings.HasPrefix(name, ".") || name == "vendor" || name == "node_modules" {
		return filepath.SkipDir
	}
	for _, p := range excludePatterns {
		if name == p {
			return filepath.SkipDir
		}
	}
	return nil
}
```
If `inspirations` or other custom draft folders are not explicitly provided in the YAML config `exclude_patterns`, they are scanned by the fallback module and indexer. The system **does not automatically parse `.ragcodeignore` or `.gitignore`**, nor does it have a default ignore list for common draft/backup directories like `inspirations`.

### Proposed Fix
1. Ensure that `.gitignore` or `.ragcodeignore` files are parsed and respected during the `filepath.WalkDir` traversal.
2. Consider adding `inspirations` and `drafts` strings to the default hardcoded exclusions if they represent common anti-patterns for this specific repo, or automatically bundle `.gitignore` rules into the `excludePatterns` array at startup.
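One way the second proposal could look is sketched below, as an illustration rather than the project's implementation. `parseIgnoreNames`, `buildExcludeSet`, and `shouldSkipDir` are hypothetical helpers, and the parser deliberately handles only plain directory names: comments, negations, nested paths, and glob patterns are skipped, so this is not full `.gitignore` semantics.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseIgnoreNames extracts plain directory names from .gitignore-style text.
// Assumption: comments, negations, nested paths, and globs are ignored here.
func parseIgnoreNames(content string) []string {
	var names []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, "!") {
			continue
		}
		line = strings.Trim(line, "/")
		if line == "" || strings.ContainsAny(line, "*?[/") {
			continue
		}
		names = append(names, line)
	}
	return names
}

// buildExcludeSet merges configured patterns with names from an ignore file.
func buildExcludeSet(configured []string, ignoreContent string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, p := range configured {
		set[p] = struct{}{}
	}
	for _, n := range parseIgnoreNames(ignoreContent) {
		set[n] = struct{}{}
	}
	return set
}

// shouldSkipDir keeps the existing hardcoded check and adds the merged set.
func shouldSkipDir(name string, exclude map[string]struct{}) bool {
	if strings.HasPrefix(name, ".") || name == "vendor" || name == "node_modules" {
		return true
	}
	_, ok := exclude[name]
	return ok
}

func main() {
	set := buildExcludeSet([]string{"dist"}, "# drafts\ninspirations/\n*.log\n")
	fmt.Println(shouldSkipDir("inspirations", set), shouldSkipDir("pkg", set))
}
```

Inside the `filepath.WalkDir` callback, the directory branch would then reduce to returning `filepath.SkipDir` when `shouldSkipDir(d.Name(), set)` is true. A set lookup also replaces the per-directory loop over `excludePatterns`.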

TASKS.md

Lines changed: 12 additions & 0 deletions
@@ -147,3 +147,15 @@ Index the `.md` files in the workspace (README, guides, API docs) in the same
- [ ] **[P2]** `.txt`: split into paragraphs with `RecursiveCharacterSplitter`.
- [ ] **[P2]** `.json` / `.yaml`: flatten keys as text and index them as structured documentation.
- [ ] **[P2]** `.rst` / `.adoc`: convert to markdown, then apply standard chunking.

## Task 9: UX / Metrics Simplification & Indexing Priority

### Goal
- Simplify the indexing progress metrics visible to the AI (MCP output) to prevent confusion and encourage semantic tool usage.
- Prioritize indexing the project's majority language first (e.g., if it's a Go project with 177 files and 10 markdown files, index Go files before Docs).

### Subtasks
- [ ] **[P0]** Refactor `index_status.json` structure or the MCP response envelope to only expose `total_files` and `indexed_files` per language, dropping task-specific states like `changed` and `processed`.
- [ ] **[P0]** Provide an explicit `"status": "up_to_date"` string when `indexed_files == total_files` to build AI trust.
- [ ] **[P0]** In `internal/service/engine/engine.go` during `IndexWorkspace`, dynamically sort the `languages` slice (e.g. `docs`, `go`, etc.) descending based on their `fileCounts` value before entering the core processing loop. This guarantees the highest coverage language completes first.
- [ ] **[P1]** Ensure `IndexFiles` (incremental logic triggered on single file edits) safely patches `index_status.json` by delta (e.g., adding +1 to total/indexed) without resetting or zeroing out the existing statistics built by the main `IndexWorkspace` routine.
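
The P0 subtasks above can be sketched roughly as follows. This is an illustration under assumptions, not the planned implementation: `langStatus`, `sortByFileCount`, and `newLangStatus` are hypothetical names, and the `"indexing"` value for the in-progress state is invented here, since the task only specifies `"up_to_date"`.

```go
package main

import (
	"fmt"
	"sort"
)

// langStatus is a hypothetical per-language entry exposing only the two
// counters named in the task, plus the explicit status string.
type langStatus struct {
	TotalFiles   int    `json:"total_files"`
	IndexedFiles int    `json:"indexed_files"`
	Status       string `json:"status"`
}

// sortByFileCount orders languages by descending file count, in place.
// A stable sort keeps the original order for languages with equal counts.
func sortByFileCount(languages []string, fileCounts map[string]int) {
	sort.SliceStable(languages, func(i, j int) bool {
		return fileCounts[languages[i]] > fileCounts[languages[j]]
	})
}

// newLangStatus reports "up_to_date" only when every file is indexed.
// The "indexing" value is an assumption made for this sketch.
func newLangStatus(total, indexed int) langStatus {
	s := langStatus{TotalFiles: total, IndexedFiles: indexed, Status: "indexing"}
	if indexed == total {
		s.Status = "up_to_date"
	}
	return s
}

func main() {
	languages := []string{"docs", "go"}
	fileCounts := map[string]int{"docs": 10, "go": 177}
	sortByFileCount(languages, fileCounts)
	fmt.Println(languages)
	fmt.Println(newLangStatus(177, 177).Status)
}
```

Sorting once before the processing loop keeps the loop body unchanged; only the iteration order differs.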

cmd/rag-code-mcp/main.go

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ import (
 )

 var (
-	Version = "2.1.63"
+	Version = "2.1.65"
 	Commit = "none"
 	Date = "24.10.2025"
 )
