refactor: Replace progressStore with file-based IndexStatus #40
- Remove progressStore (preRegister, update, carry-over, flusher)
- Remove IndexingProgressSummary, BuildIndexingProgress, formatAge, buildIndexingMessage
- Remove auto-resume from SearchCode/HybridSearchCode (redundant with DetectContext)
- Remove resumeAttempts field from Engine
- Add IndexStatus/LangStatus/SaveIndexStatus/LoadIndexStatus in pkg/indexer/
- Indexer writes OnDisk/Changed/Processed via Progress callback
- Tools read status directly from .ragcode/index_status.json
- Fix TestDetectNoMarkers with AllowedRoots isolation
Pull request overview
Replaces the in-memory progressStore indexing progress mechanism (with preRegister, flusher goroutine, carry-over logic) with a simpler file-based IndexStatus system in pkg/indexer/index_status.go. MCP tools now read status directly from {workspaceRoot}/.ragcode/index_status.json via indexer.LoadIndexStatus(). This is a significant simplification that removes ~1000 lines of complex concurrency code.
Changes:
- New pkg/indexer/index_status.go with IndexStatus/LangStatus structs and SaveIndexStatus/LoadIndexStatus functions, replacing the old engine/index_progress.go with its progressStore, flusher goroutine, and deep-copy logic
- All MCP tools updated to use the new IndexingStatus field (backed by indexer.IndexStatus) instead of IndexingProgress (backed by IndexingProgressSummary), with the JSON tag preserved as "indexing_progress" for backward compatibility
- Removed auto-resume logic from SearchCode/HybridSearchCode and the resumeAttempts throttle, as workspace re-indexing is now triggered via DetectContext
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| pkg/indexer/index_status.go | New file: IndexStatus/LangStatus types, SaveIndexStatus, LoadIndexStatus |
| pkg/indexer/index_status_test.go | New file: round-trip and missing-file tests |
| internal/service/engine/engine.go | Remove progressStore/resumeAttempts, add GetIndexStatus, wire Progress callback with file-based status |
| internal/service/engine/index_progress.go | Deleted: old progressStore and all related types/functions |
| internal/service/engine/index_progress_test.go | Deleted: tests for removed progressStore |
| internal/service/tools/response.go | Replace IndexingProgressSummary with IndexingStatus, rename helper to ContextFromWorkspaceWithStatus |
| internal/service/tools/smart_search_pipeline.go | Use LoadIndexStatus, replace dynamic fallback note with static string |
| internal/service/tools/smart_search.go | Simplify indexing error messages, remove progress attachment |
| internal/service/tools/find_usages.go | Switch to GetIndexStatus/ContextFromWorkspaceWithStatus |
| internal/service/tools/call_hierarchy.go | Switch to GetIndexStatus/ContextFromWorkspaceWithStatus |
| internal/service/tools/list_package_exports.go | Switch to ContextFromWorkspaceWithStatus, remove idx-based status override |
| internal/service/tools/index_workspace.go | Use LoadIndexStatus directly |
| internal/service/tools/skills.go | Set IndexingStatus: nil |
| internal/service/tools/evaluate_ragcode.go | Remove unused wctxID/wctxRoot, set IndexingStatus: nil |
| internal/service/tools/read_file_context.go | Set IndexingStatus: nil |
| internal/service/tools/tests/health_metrics_test.go | Remove tests for deleted types/functions |
| internal/service/engine/engine_searchcode_test.go | Remove auto-resume test |
| internal/service/engine/engine_nonblocking_search_test.go | Remove progress.stop() cleanup |
| internal/service/engine/engine_fallback_search_test.go | Remove progress.stop() cleanup |
| internal/service/engine/engine_sticky_test.go | Remove progress.stop() cleanup |
| pkg/workspace/detector/detector_test.go | Isolate test to avoid picking up .ragcode markers from parent dirs |
| cmd/rag-code-mcp/main.go | Version bump to 2.1.63 |
The Progress callback received totalFiles = len(changedFiles), which only counts modified files needing re-indexing. This was incorrectly assigned to OnDisk, causing on_disk: 1 when only 1 file changed, despite 232 Go files and 655 docs on disk.
Fix:
- Call CountAllFiles() once before the language loop for real disk totals
- Pre-populate index_status.json with on_disk counts at indexing start
- Use diskTotal (pre-counted) for OnDisk, totalFiles for Changed
- Languages with 0 changed files now correctly show their disk totals
- Fix nil panic in ContextFromWorkspaceWithStatus when wctx is nil (#7)
- Fix indentation in smart_search_pipeline.go (#1)
- Use loaded idx instead of nil in call_hierarchy.go and find_usages.go (#3, #9)
- Add backward-compat comment on JSON tag mismatch (#6)
- Create fresh IndexStatus when LoadIndexStatus returns nil (#8)
- Populate Elapsed field at completed/failed transitions (#2)
- Throttle progress I/O writes to every 10 files (#4)
- Fix test cleanup for .ragcode dir in TempDir
- Removed the 'State' field ('starting', 'running', 'completed', 'failed') from IndexStatus entirely.
- This state was misleading for AI consumers, especially during incremental re-indexing (which reset state to 'starting' even if the index was 99% complete), causing AI agents to prematurely abandon tools.
- Simplified engine.go progress callbacks and terminal states to only log timestamps and errors, rather than a potentially confusing overall state keyword.
- Updated related tests to match the simplified struct.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 9 comments.
Comments suppressed due to low confidence (2)
internal/service/tools/call_hierarchy.go:136
- The check if idx != nil at line 133 is now semantically incorrect. With the old in-memory progressStore, GetIndexProgress returned nil when no indexing was active. With the new file-based system, GetIndexStatus reads from disk and returns non-nil even for a completed previous run (the file persists). This means the response will always be indexing_in_progress when no collections exist but a previous run left an index_status.json file, even if indexing completed hours ago.
This needs to check the actual state (e.g., EndedAt is empty and Error is empty) to determine if indexing is truly in progress. Without a State field on IndexStatus, you'd need something like: if idx != nil && idx.EndedAt == "" && idx.Error == "".
if idx != nil {
resp.Status = "indexing_in_progress"
resp.Data = map[string]any{"indexing": idx}
}
internal/service/tools/find_usages.go:105
- Same issue as in call_hierarchy.go: if idx != nil at line 103 will now be true even for a completed previous indexing run (the file persists on disk), incorrectly changing the status from indexing_required to indexing_in_progress. This check needs to verify that indexing is actually ongoing (e.g., idx.EndedAt == "").
if idx != nil {
resp.Status = "indexing_in_progress"
}
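The condition suggested in these two comments can be captured in one helper, so both tools share the same definition of "in progress". This is a sketch only: the trimmed IndexStatus struct and the indexingActive name are assumptions, and the extra StartedAt check is a defensive addition rather than something the code in this PR is known to do.

```go
package main

import "fmt"

// IndexStatus is a trimmed stand-in with only the fields the check needs.
type IndexStatus struct {
	StartedAt string
	EndedAt   string
	Error     string
}

// indexingActive reports whether the persisted status describes a run that
// has started but neither finished nor failed. A nil status means no file.
func indexingActive(s *IndexStatus) bool {
	return s != nil && s.StartedAt != "" && s.EndedAt == "" && s.Error == ""
}

func main() {
	running := &IndexStatus{StartedAt: "2026-03-09T10:00:00Z"}
	finished := &IndexStatus{StartedAt: "2026-03-09T10:00:00Z", EndedAt: "2026-03-09T10:05:00Z"}
	fmt.Println(indexingActive(nil), indexingActive(running), indexingActive(finished))
}
```

A status file left behind by a crashed process would still look active under this check, which is why the comments also mention consulting the in-memory job registry (ActiveIndexingJobs) as an alternative.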
BUG-001 (list_package_exports): normalize full import path to short package name before querying Qdrant. The index stores 'indexer', not 'github.com/doITmagic/rag-code-mcp/pkg/indexer'.
BUG-003 (Go parser): go/doc automatically moves constructor/loader functions (NewX, LoadX) that return *T from docPkg.Funcs into docPkg.Types[T].Funcs. The parser only iterated typ.Methods, so these functions were silently dropped and never written to the vector index.
Fix: add a typ.Funcs loop in AnalyzePackage() after the methods loop.
Affected symbols confirmed missing from Qdrant before fix: LoadIndexStatus, NewService, NewState, LoadState (pkg/indexer)
Tests: expanded analyzer_test.go to use real pkg/indexer code as fixture with expectations anchored to the Qdrant DB snapshot (25 points, 2026-03-09). Added regression tests for BUG-003, IsPublic correctness, signature accuracy, and line coverage.
Engine (BUG-004):
- StartIndexingAsync now queues recreate=true as pendingOverflow when a job is already running, instead of silently dropping the request
- Fix all flaky engine test cleanups: properly wait for background goroutines from BOTH engine instances with time.Sleep before TempDir removal
- Add tests: TestStartIndexingAsyncRecreateQueues/StartsImmediately
Python parser (treesitter.go):
- Add patchExceptAs workaround for gotreesitter v0.6.0 broken AST on except-as
- Extract module-level variables/constants (extractAssignment/extractAssignmentDirect)
- Extract class variables from class body blocks
- Extract function/method calls for Code Graph relations (rag_find_usages)
- Detect generators via nodeContainsType(yield)
- Parse metaclass= keyword arguments in class bases
- Refactor docstring extraction with stripDocstringQuotes helper
- Handle gotreesitter putting string nodes directly in blocks (no wrapper)
Python parser (extract.go):
- Refactor getIndentation to use tagged switch
Pull request overview
Copilot reviewed 28 out of 28 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (2)
internal/service/tools/call_hierarchy.go:136
- Same bug as in find_usages.go: idx is loaded from the persisted index_status.json file. Once any indexing run has completed, the file exists and idx != nil is always true, causing the response status to be incorrectly set to "indexing_in_progress" even when no indexing job is active.
The condition on line 133 should check whether indexing is actually in progress (e.g., via ActiveIndexingJobs() or checking idx.EndedAt == "") rather than just checking file existence.
idx := t.engine.GetIndexStatus(wctx.Root)
visited := make(map[string]bool)
rootNode := &CallNode{Name: symbolName}
// Try to find root symbol info
rootRes := t.findSymbolInfo(ctx, wctx.ID, symbolName)
if rootRes != nil {
rootNode.Type, _ = rootRes.Point.Payload["type"].(string)
rootNode.FilePath, _ = rootRes.Point.Payload["file_path"].(string)
rootNode.Package, _ = rootRes.Point.Payload["package"].(string)
} else {
// If nothing is indexed yet, ExactSearchPolyglot will return ErrNoCollectionsFound.
// Signal indexing status instead of returning an empty hierarchy.
_, sErr := t.engine.ExactSearchPolyglot(ctx, wctx.ID, map[string]interface{}{"name": symbolName}, 1)
var noCollections *engine.ErrNoCollectionsFound
if errors.As(sErr, &noCollections) {
resp := ToolResponse{
Status: "indexing_required",
Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete call hierarchy results.", wctx.Root),
Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
}
if idx != nil {
resp.Status = "indexing_in_progress"
resp.Data = map[string]any{"indexing": idx}
}
internal/service/tools/find_usages.go:105
- Bug: idx is now loaded from the persisted index_status.json file via GetIndexStatus(). Unlike the old GetIndexProgress() which returned non-nil only when an active in-memory job was running, the file persists after indexing completes. This means idx != nil will always be true once any indexing run has occurred, causing the status to incorrectly change from "indexing_required" to "indexing_in_progress" even when indexing finished long ago.
To fix: either check if an indexing job is currently active (via ActiveIndexingJobs() or indexingJobs.Load(wctx.ID)), or check a specific field on the status (e.g., s.EndedAt == "") before treating it as "in progress".
idx := t.engine.GetIndexStatus(wctx.Root)
allResults, err := t.engine.ExactSearchPolyglot(ctx, wctx.ID, filter, 100)
if err != nil {
var noCollections *engine.ErrNoCollectionsFound
if errors.As(err, &noCollections) {
resp := ToolResponse{
Status: "indexing_required",
Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete results.", wctx.Root),
Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
}
if idx != nil {
resp.Status = "indexing_in_progress"
}
This addresses issues where indexing large files (e.g., barou.sql) caused the host system to freeze due to host CPU/GPU starvation and excessive GC pressure.
- Fix Ollama throttling bug in indexer service by correctly using a 150ms delay instead of 10ms.
- Prevent GC thrashing in treesitter parser by evaluating byte sizes instead of allocating strings for every AST node.
- Truncate massive leaf nodes (>8KB) to prevent crashing the Ollama embedding API.
Export IsInvalidRoot from the watch package and apply it as a safety check at the very start of StartIndexingAsync, before any job registration or SaveIndexStatus call. This prevents accidental indexing of dangerous paths such as the user home directory (~), filesystem root (/), or /tmp, which would cause .ragcode/index_status.json to be written outside any real workspace.
- pkg/workspace/watch: isInvalidRoot → IsInvalidRoot (exported + docstring)
- internal/service/engine: guard added as first check in StartIndexingAsync
Critical fixes:
- Populate IndexingStatus in tool responses (was nil) for ListSkillsTool, InstallSkillTool, EvaluateRagCodeTool, ReadFileContextTool, SmartSearchTool, ListPackageExportsTool; use ContextFromWorkspaceWithStatus consistently
- fix(engine): preserve Languages map during incremental indexing in StartIndexingAsync (was overwriting with empty object)
- fix(engine): extract finalizeIndexStatus helper to eliminate duplicated EndedAt/Elapsed/Error finalization logic in success and error branches
- fix(engine): Progress callback: eliminate LoadIndexStatus (disk read + JSON unmarshal) on every tick; keep single *IndexStatus in-memory and only call SaveIndexStatus (atomic write) for disk flush every 10 files
- fix(indexer): SaveIndexStatus uses atomic write-to-temp-then-rename to prevent concurrent readers seeing partial JSON
Hidden from AI consumers:
- LangStatus.Changed field now json:"-": AI sees only on_disk and processed
Cleanup:
- smart_search_pipeline.go: fix extra blank lines and restore missing return statement after buildResponseMeta refactor
- treesitter.go: replace invalid issues/TBD link with descriptive comment
- watcher.go: clarify IsInvalidRoot doc comment (~ is not expanded by filepath.Clean; rejection is via os.UserHomeDir())
- BUGS.md: mark BUG-003 as Fixed (PR #40)
- SUGGESTIONS.md: translate to English, update with current State-field status
- analyzer_test.go: remove stale Qdrant DB snapshot references from comments
- extract.go: fix getIndentation break → return to exit for-loop
Tests:
- analyzer_test.go: relax exact line number assertions to > 0
- treesitter_test.go: add 7 new tests for patchExceptAs, call extraction (Code Graph), module-level vars/constants, class vars, IsGenerator
- treesitter.go: fix extractClassVarsFromBlock to handle assignment nodes placed directly in block without expression_statement wrapper
Pull request overview
Copilot reviewed 35 out of 36 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (2)
internal/service/tools/find_usages.go:105
- In the ErrNoCollectionsFound branch, resp.Status is switched to "indexing_in_progress" whenever idx != nil. With the new file-based IndexStatus, idx will be non-nil for any workspace that has ever written index_status.json (even if indexing already completed), so this can misreport "indexing_in_progress". Consider keying this on a real in-progress signal (e.g., idx.EndedAt == "" / idx.Error == "" or checking Engine.ActiveIndexingJobs for wctx.ID) instead of mere file existence.
idx := t.engine.GetIndexStatus(wctx.Root)
allResults, err := t.engine.ExactSearchPolyglot(ctx, wctx.ID, filter, 100)
if err != nil {
var noCollections *engine.ErrNoCollectionsFound
if errors.As(err, &noCollections) {
resp := ToolResponse{
Status: "indexing_required",
Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete results.", wctx.Root),
Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
}
if idx != nil {
resp.Status = "indexing_in_progress"
}
internal/service/tools/call_hierarchy.go:136
- The "indexing_in_progress" status is currently set whenever idx != nil, but IndexStatus will be non-nil for any workspace that has an index_status.json from a previous run. This can incorrectly report indexing as in progress when indexing is actually completed (or stale). Consider switching this condition to something that reflects an active run (e.g., idx.EndedAt == "" and idx.Error == "" / checking Engine.ActiveIndexingJobs for wctx.ID).
idx := t.engine.GetIndexStatus(wctx.Root)
visited := make(map[string]bool)
rootNode := &CallNode{Name: symbolName}
// Try to find root symbol info
rootRes := t.findSymbolInfo(ctx, wctx.ID, symbolName)
if rootRes != nil {
rootNode.Type, _ = rootRes.Point.Payload["type"].(string)
rootNode.FilePath, _ = rootRes.Point.Payload["file_path"].(string)
rootNode.Package, _ = rootRes.Point.Payload["package"].(string)
} else {
// If nothing is indexed yet, ExactSearchPolyglot will return ErrNoCollectionsFound.
// Signal indexing status instead of returning an empty hierarchy.
_, sErr := t.engine.ExactSearchPolyglot(ctx, wctx.ID, map[string]interface{}{"name": symbolName}, 1)
var noCollections *engine.ErrNoCollectionsFound
if errors.As(sErr, &noCollections) {
resp := ToolResponse{
Status: "indexing_required",
Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete call hierarchy results.", wctx.Root),
Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
}
if idx != nil {
resp.Status = "indexing_in_progress"
resp.Data = map[string]any{"indexing": idx}
}
Pull request overview
Copilot reviewed 36 out of 37 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (2)
internal/service/tools/find_usages.go:106
- idx != nil is treated as “indexing_in_progress”, but IndexStatus is loaded from disk and will remain non-nil even after indexing completed (EndedAt set) or failed. This can incorrectly label the workspace as indexing when ErrNoCollectionsFound occurs for other reasons (e.g., collections deleted). Consider deriving “in progress” from the status fields (e.g., StartedAt set AND EndedAt empty), and only then switch to indexing_in_progress / attach indexing data.
idx := t.engine.GetIndexStatus(wctx.Root)
allResults, err := t.engine.ExactSearchPolyglot(ctx, wctx.ID, filter, 100)
if err != nil {
var noCollections *engine.ErrNoCollectionsFound
if errors.As(err, &noCollections) {
resp := ToolResponse{
Status: "indexing_required",
Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete results.", wctx.Root),
Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
}
if idx != nil {
resp.Status = "indexing_in_progress"
}
return resp.JSON()
internal/service/tools/call_hierarchy.go:137
- The tool sets status=indexing_in_progress whenever an IndexStatus file exists (idx != nil), but IndexStatus persists after completion. In the ErrNoCollectionsFound branch this can misreport state if collections are missing for other reasons. Prefer a deterministic “in progress” check (e.g., StartedAt present and EndedAt empty) before reporting indexing_in_progress and returning indexing data.
idx := t.engine.GetIndexStatus(wctx.Root)
visited := make(map[string]bool)
rootNode := &CallNode{Name: symbolName}
// Try to find root symbol info
rootRes := t.findSymbolInfo(ctx, wctx.ID, symbolName)
if rootRes != nil {
rootNode.Type, _ = rootRes.Point.Payload["type"].(string)
rootNode.FilePath, _ = rootRes.Point.Payload["file_path"].(string)
rootNode.Package, _ = rootRes.Point.Payload["package"].(string)
} else {
// If nothing is indexed yet, ExactSearchPolyglot will return ErrNoCollectionsFound.
// Signal indexing status instead of returning an empty hierarchy.
_, sErr := t.engine.ExactSearchPolyglot(ctx, wctx.ID, map[string]interface{}{"name": symbolName}, 1)
var noCollections *engine.ErrNoCollectionsFound
if errors.As(sErr, &noCollections) {
resp := ToolResponse{
Status: "indexing_required",
Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete call hierarchy results.", wctx.Root),
Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
}
if idx != nil {
resp.Status = "indexing_in_progress"
resp.Data = map[string]any{"indexing": idx}
}
return resp.JSON()
…ct root/idMap in ResumeIndexingOnConnect
… refactor/indexing-progress
ResumeIndexingOnConnect and DetectContext auto-trigger could both call StartIndexingAsync for the same workspace simultaneously, bypassing the LoadOrStore dedup guard via a TOCTOU race window.
Changes:
- ResumeIndexingOnConnect now marks connectTriggered before StartIndexingAsync
- Removed redundant indexingJobs.Load check from DetectContext (TOCTOU)
- Changed 'go e.StartIndexingAsync(...)' to direct call (goroutine created internally)
Fixes system freeze when indexing large workspaces (~5000+ files).
Pull request overview
Copilot reviewed 36 out of 37 changed files in this pull request and generated 2 comments.
The VKCOM PHP parser AST already includes $ in Identifier.Value (e.g. "$role" not "role"), so adding another $ prefix resulted in $$role in method signatures and $$table in property signatures.
- buildMethodSignature: remove explicit "$" + prefix (line 663)
- convertToChunks: remove "$" from property Signature format (line 944)
Verified: all php parser tests pass, manual test on Laravel project confirms single $ in all signatures.
Three PHP parser improvements:
1. uses_type relations: PHP 'use' import statements now generate
uses_type relations on class chunks. This enables find_usages to
discover all classes importing a given type (e.g. find_usages('Lawyer')
finds all controllers with 'use App\Lawyer').
2. Route file extraction: PHP files in routes/ directories that yield
0 symbols from standard AST analysis now fall back to regex-based
Route::get/post/resource extraction. routes/web.php goes from
0 to 39 symbols.
3. Fix $$ double dollar: Remove extra $ prefix from parameter and
property signatures since VKCOM AST already includes $ in
Identifier.Value.
These file types are not documentation - they are code that was incorrectly classified as docs. Removing them from the docs parser:
- SQL: query language
- SH: shell scripts
- Svelte: frontend framework components
This reduces docs from 551 to ~49 files on the barou Laravel project, making the language sort put PHP/JS first and dramatically reducing indexing time for documentation. Updated tests to verify these extensions are no longer handled by docs.
Pull request overview
Copilot reviewed 44 out of 45 changed files in this pull request and generated 4 comments.
- Replace Unix socket and .pid lock files with TCP port binding (localhost:39000) for singleton enforcement.
- Update IsDaemonRunning, StartDaemon and StopDaemon to fetch process ID via HTTP /health.
- Remove tracking logic around pidfile and sockets.
- Recreate adapter and lifecycle tests to connect over loopback TCP instead of sockets.
- Update the rag-code-install graceful stop procedure to pull the daemon PID from the health endpoint.
- Introduce FrameworkEnricher interface in core PHP analyzer - Isolate Laravel and WordPress specific analysis into enricher.go - Resolve plugin overhead with blank imports on run/test files - Maintain lazy-loading decoupled structure to prevent import cycles
Pull request overview
Copilot reviewed 59 out of 62 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (2)
internal/service/tools/find_usages.go:105
- In the ErrNoCollectionsFound branch, the tool sets status="indexing_in_progress" whenever GetIndexStatus returns non-nil. With the new file-based IndexStatus, a non-nil status file can exist even when indexing is completed/failed, so this can incorrectly report “in progress”. Consider checking something like idx.EndedAt == "" && idx.StartedAt != "" (and/or idx.Error=="") before switching the status.
internal/service/tools/call_hierarchy.go:136
- Same issue as in FindUsagesTool: idx := GetIndexStatus(...) is used as a boolean to decide status="indexing_in_progress". Since IndexStatus persists after completion, this can mislabel a workspace as “in progress” even when it’s done. Gate this on an “active” condition (e.g., EndedAt == "").
idx := t.engine.GetIndexStatus(wctx.Root)
visited := make(map[string]bool)
rootNode := &CallNode{Name: symbolName}
// Try to find root symbol info
rootRes := t.findSymbolInfo(ctx, wctx.ID, symbolName)
if rootRes != nil {
rootNode.Type, _ = rootRes.Point.Payload["type"].(string)
rootNode.FilePath, _ = rootRes.Point.Payload["file_path"].(string)
rootNode.Package, _ = rootRes.Point.Payload["package"].(string)
} else {
// If nothing is indexed yet, ExactSearchPolyglot will return ErrNoCollectionsFound.
// Signal indexing status instead of returning an empty hierarchy.
_, sErr := t.engine.ExactSearchPolyglot(ctx, wctx.ID, map[string]interface{}{"name": symbolName}, 1)
var noCollections *engine.ErrNoCollectionsFound
if errors.As(sErr, &noCollections) {
resp := ToolResponse{
Status: "indexing_required",
Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete call hierarchy results.", wctx.Root),
Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
}
if idx != nil {
resp.Status = "indexing_in_progress"
resp.Data = map[string]any{"indexing": idx}
}
…emon
- Parsers: Introduced gotreesitter parser caching & explicit 'arenagc' draining. Arena memory is now freed after each file, fixing a severe memory leak.
- HTML/CSS: Dropped CSS/SCSS tracking in the HTML parser to avoid Tree-Sitter GLR explosions and extreme slowdowns during embedding.
- Indexer: Added strict ignoring of minified/vendored files (.min.js, .bundle.css, etc.) to skip massive auto-generated files.
- Indexer: Added watchdog and auto-recovery for Ollama embedded deadlocks.
- Daemon: Reverted to simple and stable 'Setsid' background daemon spawn pattern in lifecycle.go.
- Main: Removed unnecessary --fork-exec flag logic and bumped version.
…tter memory explosions
- Extracted CSS parsing from html/analyzer.go to a dedicated css/analyzer.go
- Replaces GLR AST generation with linear bracket-depth text scanning
- Caps huge CSS rule chunks to 8KB to prevent vector DB overload
- Removed old unused css_regex.go implementation
- Registered the new generic CSS parser globally in the daemon
Resolves Trello Task 1, Task 2, Task 3
Pull request overview
Copilot reviewed 75 out of 78 changed files in this pull request and generated 14 comments.
- **Python**: Complete native AST support
- **HTML & Markdown**: Structural documentation mappings
- **Generic Support**: CSS, JSON, YAML, Shell scripts, SQL
- **HTML & CSS**: HTML structural mappings, CSS/SCSS/SASS/LESS via tree-sitter
body {
  color: #fff;
}
No newline at end of file
 // ListenConfig configures the daemon's network listeners and lifecycle.
 type ListenConfig struct {
-	SocketPath string       // Unix domain socket path (required)
-	PIDPath    string       // PID file path (required)
-	Version    string       // Server version string
-	HTTPPort   int          // TCP port for optional HTTP listener (0 = disabled)
-	Handler    http.Handler // MCP handler (must handle /mcp)
-	OnReady    func()       // Called when daemon is ready to accept connections (optional)
+	Port    int          // TCP port for localhost listener
+	Version string       // Server version string
+	Handler http.Handler // MCP handler (must handle /mcp)
+	OnReady func()       // Called when daemon is ready to accept connections (optional)
 }

 // ListenAndServe starts the daemon listeners and blocks until ctx is cancelled
-// or SIGTERM/SIGINT is received. Cleans up socket and PID file on exit.
-//
-// It sets up two listeners:
-//  1. Unix domain socket at SocketPath (primary, for stdio adapters)
-//  2. TCP HTTP on HTTPPort (optional, for curl/debug/external agents, localhost only)
-//
-// Both serve the same handler mux with /health and the provided MCP handler.
+// or SIGTERM/SIGINT is received. It binds exclusively to a local TCP port to
+// guarantee it is a singleton, avoiding file locking issues.
 func ListenAndServe(ctx context.Context, cfg ListenConfig) error {
| **PHP** | [`/php`](./php/README.md) | Deep Laravel integration (Eloquent, Routes, Controllers) & WordPress. | ✅ Production |
| **HTML & CSS** | [`/html`](./html/README.md) | HTML semantic sectioning + CSS/SCSS/SASS/LESS via tree-sitter. | ✅ Production |
| **JavaScript** | [`/javascript`](./javascript/README.md) | React, Vue, & TypeScript support. | ✅ Production |
| **Docs** | [`/docs`](./docs/README.md) | Markdown, JSON, YAML, XML, TOML, reStructuredText. | ✅ Production |
 var fileErrs []string
 for _, path := range changedFiles {
 	fileNum := int(doneFiles.Load()) + 1
-	logger.Instance.Debug("[IDX] ws=%s lang=%s [%d/%d] %s (indexing...)",
+	logger.Instance.Info("[IDX] ws=%s lang=%s [%d/%d] %s (indexing...)",
 		wsName, opts.Language, fileNum, totalFiles, filepath.Base(path))

 	symCount, indexErr := s.IndexFile(ctx, collection, path, state)
 	if indexErr != nil {
 		logger.Instance.Warn("[IDX] ws=%s lang=%s ⚠️ %s: %v", wsName, opts.Language, filepath.Base(path), indexErr)
 		fileErrs = append(fileErrs, fmt.Sprintf("%s: %v", path, indexErr))
 	} else {
-		logger.Instance.Debug("[IDX] ws=%s lang=%s %s → %d symbol(s)", wsName, opts.Language, filepath.Base(path), symCount)
+		logger.Instance.Info("[IDX] ws=%s lang=%s %s → %d symbol(s)", wsName, opts.Language, filepath.Base(path), symCount)
// Analyzer implements chunk-based processing of CSS/SCSS/LESS files.
// Because it does not depend on Tree-sitter, it does not run out of memory even on huge bundles.
type Analyzer struct{}
// Profiling endpoints
mcpMux.HandleFunc("/debug/pprof/", pprof.Index)
mcpMux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
mcpMux.HandleFunc("/debug/pprof/profile", pprof.Profile)
mcpMux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
mcpMux.HandleFunc("/debug/pprof/trace", pprof.Trace)
@@ -0,0 +1 @@
+404 page not found
atIdx := strings.Index(argsStr, "@")
if atIdx > 0 {
	// Find the quoted string containing @
	for _, q := range []byte{'\'', '"'} {
		startQ := strings.IndexByte(argsStr[strings.Index(argsStr, string(q))+1:], q)
		_ = startQ
	}
func getFreePort() (int, error) {
	addr, err := net.ResolveTCPAddr("tcp", "localhost:0")
	if err != nil {
		return 0, err
	}
	l, err := net.ListenTCP("tcp", addr)
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}
Description
Replaces the old, complex progressStore indexing progress mechanism with a simple file-based IndexStatus system.

Problem
The previous system had two separate file walks (one for counting, one for indexing), an in-memory progressStore with preRegister/update/carry-over/flusher logic, and multiple intermediate structs (IndexingProgressSummary, LangProgressItem). This made progress reporting confusing, inaccurate, and hard to maintain.

Solution
- {workspaceRoot}/.ragcode/index_status.json: a simple JSON file written by the indexer during processing
- The Progress callback in pkg/indexer/service.go writes OnDisk, Changed, Processed counts per language directly to the status file
- Tools read the status via indexer.LoadIndexStatus(), with no intermediate transformations
- starting → running (first file processed) → completed/failed

What was removed
- progressStore (preRegister, update, carry-over, flusher goroutine)
- IndexingProgressSummary, LangProgressItem, BuildIndexingProgress, formatAge, buildIndexingMessage
- Auto-resume from SearchCode/HybridSearchCode (redundant with DetectContext which already triggers re-indexing on first tool call)
- resumeAttempts field from Engine struct

What was added
- pkg/indexer/index_status.go: IndexStatus, LangStatus, SaveIndexStatus, LoadIndexStatus
- pkg/indexer/index_status_test.go: round-trip and missing file tests
- engine.IndexWorkspace

Files changed (22 files, +208 / -1182 lines)
- engine/index_progress.go → pkg/indexer/index_status.go
- engine.go, response.go, smart_search.go, smart_search_pipeline.go, index_workspace.go, list_package_exports.go, find_usages.go, call_hierarchy.go, evaluate_ragcode.go, skills.go, read_file_context.go
- engine_searchcode_test.go, engine_*_test.go, health_metrics_test.go, detector_test.go

Type of change

Checklist:
- go fmt ./...
- go test ./... and they pass