refactor: Replace progressStore with file-based IndexStatus by doITmagic · Pull Request #40 · doITmagic/rag-code-mcp

doITmagic · 2026-03-08T23:04:52Z

Description

Replaces the old, complex progressStore indexing progress mechanism with a simple file-based IndexStatus system.

Problem

The previous system had two separate file walks (one for counting, one for indexing), an in-memory progressStore with preRegister/update/carry-over/flusher logic, and multiple intermediate structs (IndexingProgressSummary, LangProgressItem). This made progress reporting confusing, inaccurate, and hard to maintain.

Solution

Single source of truth: {workspaceRoot}/.ragcode/index_status.json — a simple JSON file written by the indexer during processing
Direct write from indexer: The Progress callback in pkg/indexer/service.go writes OnDisk, Changed, Processed counts per language directly to the status file
Direct read by tools: MCP tools read the status file via indexer.LoadIndexStatus() — no intermediate transformations
Lifecycle: starting → running (first file processed) → completed/failed
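
As an illustration of the design above, here is a minimal sketch of a file-based status store. The field names `OnDisk`, `Changed`, `Processed` and the path `.ragcode/index_status.json` come from this description. The JSON tags, the `State` field layout, the function signatures and the `demo` helper are assumptions; the real code in pkg/indexer/index_status.go may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// LangStatus holds per-language counters (names follow the PR description).
type LangStatus struct {
	OnDisk    int `json:"on_disk"`   // tag assumed
	Changed   int `json:"changed"`   // tag assumed
	Processed int `json:"processed"` // tag assumed
}

// IndexStatus is the whole status document. Layout is an assumption.
type IndexStatus struct {
	State     string                `json:"state"` // starting, running, completed, failed
	Languages map[string]LangStatus `json:"languages"`
}

func statusPath(root string) string {
	return filepath.Join(root, ".ragcode", "index_status.json")
}

// SaveIndexStatus writes the status as JSON under {root}/.ragcode/.
func SaveIndexStatus(root string, s *IndexStatus) error {
	if err := os.MkdirAll(filepath.Dir(statusPath(root)), 0o755); err != nil {
		return err
	}
	data, err := json.MarshalIndent(s, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(statusPath(root), data, 0o644)
}

// LoadIndexStatus returns nil when the file is missing or unreadable.
func LoadIndexStatus(root string) *IndexStatus {
	data, err := os.ReadFile(statusPath(root))
	if err != nil {
		return nil
	}
	var s IndexStatus
	if err := json.Unmarshal(data, &s); err != nil {
		return nil
	}
	return &s
}

// demo (hypothetical helper) round-trips a status through a temp workspace.
func demo() (*IndexStatus, error) {
	root, err := os.MkdirTemp("", "ragcode-demo-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	in := &IndexStatus{
		State:     "running",
		Languages: map[string]LangStatus{"go": {OnDisk: 232, Changed: 1, Processed: 1}},
	}
	if err := SaveIndexStatus(root, in); err != nil {
		return nil, err
	}
	return LoadIndexStatus(root), nil
}

func main() {
	got, err := demo()
	if err != nil || got == nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%+v\n", *got)
}
```

Returning nil for a missing file keeps the reader side simple: tools can treat "no file" as "never indexed" without handling an error.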

What was removed

progressStore (preRegister, update, carry-over, flusher goroutine)
IndexingProgressSummary, LangProgressItem, BuildIndexingProgress, formatAge, buildIndexingMessage
Auto-resume logic from SearchCode/HybridSearchCode (redundant with DetectContext which already triggers re-indexing on first tool call)
resumeAttempts field from Engine struct

What was added

pkg/indexer/index_status.go — IndexStatus, LangStatus, SaveIndexStatus, LoadIndexStatus
pkg/indexer/index_status_test.go — round-trip and missing file tests
Progress callback wiring in engine.IndexWorkspace

Files changed (22 files, +208 / -1182 lines)

Moved: engine/index_progress.go → pkg/indexer/index_status.go
Modified: engine.go, response.go, smart_search.go, smart_search_pipeline.go, index_workspace.go, list_package_exports.go, find_usages.go, call_hierarchy.go, evaluate_ragcode.go, skills.go, read_file_context.go
Tests updated: engine_searchcode_test.go, engine_*_test.go, health_metrics_test.go, detector_test.go

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update

Checklist:

I have performed a self-review of my own code
I have formatted my code with go fmt ./...
I have run tests go test ./... and they pass
I have verified integration with Ollama/Qdrant (if applicable)
I have updated the documentation accordingly

- Remove progressStore (preRegister, update, carry-over, flusher)
- Remove IndexingProgressSummary, BuildIndexingProgress, formatAge, buildIndexingMessage
- Remove auto-resume from SearchCode/HybridSearchCode (redundant with DetectContext)
- Remove resumeAttempts field from Engine
- Add IndexStatus/LangStatus/SaveIndexStatus/LoadIndexStatus in pkg/indexer/
- Indexer writes OnDisk/Changed/Processed via Progress callback
- Tools read status directly from .ragcode/index_status.json
- Fix TestDetectNoMarkers with AllowedRoots isolation

Copilot

Pull request overview

Replaces the in-memory progressStore indexing progress mechanism (with preRegister, flusher goroutine, carry-over logic) with a simpler file-based IndexStatus system in pkg/indexer/index_status.go. MCP tools now read status directly from {workspaceRoot}/.ragcode/index_status.json via indexer.LoadIndexStatus(). This is a significant simplification that removes ~1000 lines of complex concurrency code.

Changes:

New pkg/indexer/index_status.go with IndexStatus/LangStatus structs and SaveIndexStatus/LoadIndexStatus functions, replacing the old engine/index_progress.go with its progressStore, flusher goroutine, and deep-copy logic
All MCP tools updated to use the new IndexingStatus field (backed by indexer.IndexStatus) instead of IndexingProgress (backed by IndexingProgressSummary), with the JSON tag preserved as "indexing_progress" for backward compatibility
Removed auto-resume logic from SearchCode/HybridSearchCode and the resumeAttempts throttle, as workspace re-indexing is now triggered via DetectContext

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 9 comments.


| File | Description |
| --- | --- |
| `pkg/indexer/index_status.go` | New file: `IndexStatus`/`LangStatus` types, `SaveIndexStatus`, `LoadIndexStatus` |
| `pkg/indexer/index_status_test.go` | New file: round-trip and missing-file tests |
| `internal/service/engine/engine.go` | Remove `progressStore`/`resumeAttempts`, add `GetIndexStatus`, wire Progress callback with file-based status |
| `internal/service/engine/index_progress.go` | Deleted: old `progressStore` and all related types/functions |
| `internal/service/engine/index_progress_test.go` | Deleted: tests for removed `progressStore` |
| `internal/service/tools/response.go` | Replace `IndexingProgressSummary` with `IndexingStatus`, rename helper to `ContextFromWorkspaceWithStatus` |
| `internal/service/tools/smart_search_pipeline.go` | Use `LoadIndexStatus`, replace dynamic fallback note with static string |
| `internal/service/tools/smart_search.go` | Simplify indexing error messages, remove progress attachment |
| `internal/service/tools/find_usages.go` | Switch to `GetIndexStatus`/`ContextFromWorkspaceWithStatus` |
| `internal/service/tools/call_hierarchy.go` | Switch to `GetIndexStatus`/`ContextFromWorkspaceWithStatus` |
| `internal/service/tools/list_package_exports.go` | Switch to `ContextFromWorkspaceWithStatus`, remove `idx`-based status override |
| `internal/service/tools/index_workspace.go` | Use `LoadIndexStatus` directly |
| `internal/service/tools/skills.go` | Set `IndexingStatus: nil` |
| `internal/service/tools/evaluate_ragcode.go` | Remove unused `wctxID`/`wctxRoot`, set `IndexingStatus: nil` |
| `internal/service/tools/read_file_context.go` | Set `IndexingStatus: nil` |
| `internal/service/tools/tests/health_metrics_test.go` | Remove tests for deleted types/functions |
| `internal/service/engine/engine_searchcode_test.go` | Remove auto-resume test |
| `internal/service/engine/engine_nonblocking_search_test.go` | Remove `progress.stop()` cleanup |
| `internal/service/engine/engine_fallback_search_test.go` | Remove `progress.stop()` cleanup |
| `internal/service/engine/engine_sticky_test.go` | Remove `progress.stop()` cleanup |
| `pkg/workspace/detector/detector_test.go` | Isolate test to avoid picking up `.ragcode` markers from parent dirs |
| `cmd/rag-code-mcp/main.go` | Version bump to 2.1.63 |

The Progress callback received totalFiles = len(changedFiles), which only counts modified files needing re-indexing. This was incorrectly assigned to OnDisk, causing on_disk: 1 when only 1 file changed, despite 232 Go files and 655 docs on disk.

Fix:
- Call CountAllFiles() once before the language loop for real disk totals
- Pre-populate index_status.json with on_disk counts at indexing start
- Use diskTotal (pre-counted) for OnDisk, totalFiles for Changed
- Languages with 0 changed files now correctly show their disk totals

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

- Fix nil panic in ContextFromWorkspaceWithStatus when wctx is nil (#7)
- Fix indentation in smart_search_pipeline.go (#1)
- Use loaded idx instead of nil in call_hierarchy.go and find_usages.go (#3, #9)
- Add backward-compat comment on JSON tag mismatch (#6)
- Create fresh IndexStatus when LoadIndexStatus returns nil (#8)
- Populate Elapsed field at completed/failed transitions (#2)
- Throttle progress I/O writes to every 10 files (#4)
- Fix test cleanup for .ragcode dir in TempDir
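
The write throttling mentioned in this commit can be pictured roughly as follows. This is a sketch under assumptions: `throttledFlusher`, `Tick`, `Close` and `flushPoints` are hypothetical names, and the real Progress callback in engine.go is not shown in this PR page.

```go
package main

import "fmt"

// throttledFlusher counts processed files and invokes flush only on every
// Nth file, so disk writes stay bounded on large workspaces.
type throttledFlusher struct {
	every     int
	processed int
	flush     func(processed int)
}

// Tick records one processed file and flushes on multiples of `every`.
func (t *throttledFlusher) Tick() {
	t.processed++
	if t.processed%t.every == 0 {
		t.flush(t.processed)
	}
}

// Close writes the final count if the last Tick did not land on a boundary,
// so the status never ends on a stale number.
func (t *throttledFlusher) Close() {
	if t.processed%t.every != 0 {
		t.flush(t.processed)
	}
}

// flushPoints (hypothetical helper) returns the counts at which a flush
// would happen for a run of totalFiles files.
func flushPoints(totalFiles, every int) []int {
	var points []int
	t := &throttledFlusher{every: every, flush: func(n int) { points = append(points, n) }}
	for i := 0; i < totalFiles; i++ {
		t.Tick()
	}
	t.Close()
	return points
}

func main() {
	fmt.Println(flushPoints(25, 10)) // prints [10 20 25]
}
```

The final flush in `Close` matters: without it, a 25-file run would leave `processed: 20` in the status file.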

- Removed the 'State' field ('starting', 'running', 'completed', 'failed') from IndexStatus entirely.
- This state was misleading for AI consumers, especially during incremental re-indexing (which reset state to 'starting' even if the index was 99% complete), causing AI agents to prematurely abandon tools.
- Simplified engine.go progress callbacks and terminal states to only log timestamps and errors, rather than a potentially confusing overall state keyword.
- Updated related tests to match the simplified struct.

Copilot

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 9 comments.

Comments suppressed due to low confidence (2)

internal/service/tools/call_hierarchy.go:136

The check if idx != nil at line 133 is now semantically incorrect. With the old in-memory progressStore, GetIndexProgress returned nil when no indexing was active. With the new file-based system, GetIndexStatus reads from disk and returns non-nil even for a completed previous run (the file persists). This means the response will always be indexing_in_progress when no collections exist but a previous run left an index_status.json file — even if indexing completed hours ago.

This needs to check the actual state (e.g., EndedAt is empty and Error is empty) to determine if indexing is truly in progress. Without a State field on IndexStatus, you'd need something like: if idx != nil && idx.EndedAt == "" && idx.Error == "".

			if idx != nil {
				resp.Status = "indexing_in_progress"
				resp.Data = map[string]any{"indexing": idx}
			}

internal/service/tools/find_usages.go:105

Same issue as in call_hierarchy.go: if idx != nil at line 103 will now be true even for a completed previous indexing run (the file persists on disk), incorrectly changing the status from indexing_required to indexing_in_progress. This check needs to verify that indexing is actually ongoing (e.g., idx.EndedAt == "").

			if idx != nil {
				resp.Status = "indexing_in_progress"
			}
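
The check suggested in the two comments above could be factored into a small helper. A sketch, assuming `EndedAt` and `Error` are string fields as the comparisons with `""` imply; `indexingActive` and `statusFor` are hypothetical names, not functions from the PR.

```go
package main

import "fmt"

// IndexStatus mirrors only the fields the review comments mention.
// The real struct in pkg/indexer has more fields.
type IndexStatus struct {
	EndedAt string
	Error   string
}

// indexingActive reports whether the persisted status describes a run that
// has neither finished nor failed. A nil status means no file on disk.
func indexingActive(idx *IndexStatus) bool {
	return idx != nil && idx.EndedAt == "" && idx.Error == ""
}

// statusFor picks the tool response status for the ErrNoCollectionsFound branch.
func statusFor(idx *IndexStatus) string {
	if indexingActive(idx) {
		return "indexing_in_progress"
	}
	return "indexing_required"
}

func main() {
	fmt.Println(statusFor(nil))                                        // no status file
	fmt.Println(statusFor(&IndexStatus{}))                             // run still open
	fmt.Println(statusFor(&IndexStatus{EndedAt: "2026-03-09T10:00Z"})) // finished earlier
}
```

One limitation of a purely file-based check: a process that crashed before writing `EndedAt` would still look active, so an in-memory job check may be the more reliable signal.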

BUG-001 (list_package_exports): normalize full import path to short package name before querying Qdrant. The index stores 'indexer', not 'github.com/doITmagic/rag-code-mcp/pkg/indexer'.

BUG-003 (Go parser): go/doc automatically moves constructor/loader functions (NewX, LoadX) that return *T from docPkg.Funcs into docPkg.Types[T].Funcs. The parser only iterated typ.Methods, so these functions were silently dropped and never written to the vector index. Fix: add a typ.Funcs loop in AnalyzePackage() after the methods loop. Affected symbols confirmed missing from Qdrant before fix: LoadIndexStatus, NewService, NewState, LoadState (pkg/indexer)

Tests: expanded analyzer_test.go to use real pkg/indexer code as fixture with expectations anchored to the Qdrant DB snapshot (25 points, 2026-03-09). Added regression tests for BUG-003, IsPublic correctness, signature accuracy, and line coverage.
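
The go/doc behaviour behind BUG-003 can be reproduced in isolation. The sketch below is not the project's AnalyzePackage(); `collectSymbols` and the inline source are made up for illustration. It only shows that a function returning `*T` ends up under the type's `Funcs` rather than the package-level `Funcs`.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/doc"
	"go/parser"
	"go/token"
)

const src = `package indexer

// IndexStatus is a stand-in type.
type IndexStatus struct{}

// LoadIndexStatus returns *IndexStatus, so go/doc files it under the type.
func LoadIndexStatus(root string) *IndexStatus { return nil }

// Save is a method on the type.
func (s *IndexStatus) Save() error { return nil }

// Helper returns a predeclared type, so it stays in the package-level Funcs.
func Helper() int { return 0 }
`

// collectSymbols lists documented symbols. With includeTypeFuncs=false it
// mimics the buggy walk that only visits typ.Methods.
func collectSymbols(source string, includeTypeFuncs bool) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "status.go", source, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	pkg, err := doc.NewFromFiles(fset, []*ast.File{f}, "example.com/indexer")
	if err != nil {
		return nil, err
	}
	var names []string
	for _, fn := range pkg.Funcs {
		names = append(names, fn.Name)
	}
	for _, typ := range pkg.Types {
		names = append(names, typ.Name)
		for _, m := range typ.Methods {
			names = append(names, typ.Name+"."+m.Name)
		}
		if includeTypeFuncs {
			// The fix: constructors and loaders live here, not in pkg.Funcs.
			for _, fn := range typ.Funcs {
				names = append(names, fn.Name)
			}
		}
	}
	return names, nil
}

func main() {
	for _, include := range []bool{false, true} {
		names, err := collectSymbols(src, include)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(include, names)
	}
}
```

Running it with `includeTypeFuncs` set to false omits `LoadIndexStatus`, which matches the missing symbols reported in the commit message.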

Engine (BUG-004):
- StartIndexingAsync now queues recreate=true as pendingOverflow when a job is already running, instead of silently dropping the request
- Fix all flaky engine test cleanups: properly wait for background goroutines from BOTH engine instances with time.Sleep before TempDir removal
- Add tests: TestStartIndexingAsyncRecreateQueues/StartsImmediately

Python parser (treesitter.go):
- Add patchExceptAs workaround for gotreesitter v0.6.0 broken AST on except-as
- Extract module-level variables/constants (extractAssignment/extractAssignmentDirect)
- Extract class variables from class body blocks
- Extract function/method calls for Code Graph relations (rag_find_usages)
- Detect generators via nodeContainsType(yield)
- Parse metaclass= keyword arguments in class bases
- Refactor docstring extraction with stripDocstringQuotes helper
- Handle gotreesitter putting string nodes directly in blocks (no wrapper)

Python parser (extract.go):
- Refactor getIndentation to use tagged switch

Copilot

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (2)

internal/service/tools/call_hierarchy.go:136

Same bug as in find_usages.go: idx is loaded from the persisted index_status.json file. Once any indexing run has completed, the file exists and idx != nil is always true — causing the response status to be incorrectly set to "indexing_in_progress" even when no indexing job is active.

The condition on line 133 should check whether indexing is actually in progress (e.g., via ActiveIndexingJobs() or checking idx.EndedAt == "") rather than just checking file existence.

	idx := t.engine.GetIndexStatus(wctx.Root)

	visited := make(map[string]bool)

	rootNode := &CallNode{Name: symbolName}

	// Try to find root symbol info
	rootRes := t.findSymbolInfo(ctx, wctx.ID, symbolName)
	if rootRes != nil {
		rootNode.Type, _ = rootRes.Point.Payload["type"].(string)
		rootNode.FilePath, _ = rootRes.Point.Payload["file_path"].(string)
		rootNode.Package, _ = rootRes.Point.Payload["package"].(string)
	} else {
		// If nothing is indexed yet, ExactSearchPolyglot will return ErrNoCollectionsFound.
		// Signal indexing status instead of returning an empty hierarchy.
		_, sErr := t.engine.ExactSearchPolyglot(ctx, wctx.ID, map[string]interface{}{"name": symbolName}, 1)
		var noCollections *engine.ErrNoCollectionsFound
		if errors.As(sErr, &noCollections) {
			resp := ToolResponse{
				Status:  "indexing_required",
				Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete call hierarchy results.", wctx.Root),
				Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
			}
			if idx != nil {
				resp.Status = "indexing_in_progress"
				resp.Data = map[string]any{"indexing": idx}
			}

internal/service/tools/find_usages.go:105

Bug: idx is now loaded from the persisted index_status.json file via GetIndexStatus(). Unlike the old GetIndexProgress() which returned non-nil only when an active in-memory job was running, the file persists after indexing completes. This means idx != nil will always be true once any indexing run has occurred, causing the status to incorrectly change from "indexing_required" to "indexing_in_progress" even when indexing finished long ago.

To fix: either check if an indexing job is currently active (via ActiveIndexingJobs() or indexingJobs.Load(wctx.ID)), or check a specific field on the status (e.g., s.EndedAt == "") before treating it as "in progress".

	idx := t.engine.GetIndexStatus(wctx.Root)
	allResults, err := t.engine.ExactSearchPolyglot(ctx, wctx.ID, filter, 100)
	if err != nil {
		var noCollections *engine.ErrNoCollectionsFound
		if errors.As(err, &noCollections) {
			resp := ToolResponse{
				Status:  "indexing_required",
				Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete results.", wctx.Root),
				Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
			}
			if idx != nil {
				resp.Status = "indexing_in_progress"
			}

This addresses issues where indexing large files (e.g., barou.sql) caused the host system to freeze due to host CPU/GPU starvation and excessive GC pressure.

- Fix Ollama throttling bug in indexer service by correctly using a 150ms delay instead of 10ms.
- Prevent GC thrashing in treesitter parser by evaluating byte sizes instead of allocating strings for every AST node.
- Truncate massive leaf nodes (>8KB) to prevent crashing the Ollama embedding API.
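
The 8KB truncation in the last bullet might look something like this. It is only a sketch: `truncateLeaf` and `maxLeafBytes` are hypothetical names, and the commit does not say whether the real code cuts on a UTF-8 boundary as this version does.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxLeafBytes is the 8KB cap named in the commit message.
const maxLeafBytes = 8 * 1024

// truncateLeaf caps s at limit bytes. It checks len(s) first, so small nodes
// cost nothing, and it backs up to a rune boundary so the result stays valid
// UTF-8 for the embedding request.
func truncateLeaf(s string, limit int) string {
	if len(s) <= limit {
		return s
	}
	cut := limit
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	fmt.Println(truncateLeaf("ééé", 5)) // prints "éé": each é is 2 bytes, so the cut backs up to 4
	fmt.Println(len(truncateLeaf(string(make([]byte, 20000)), maxLeafBytes)))
}
```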

Export IsInvalidRoot from the watch package and apply it as a safety check at the very start of StartIndexingAsync, before any job registration or SaveIndexStatus call. This prevents accidental indexing of dangerous paths such as the user home directory (~), filesystem root (/), or /tmp, which would cause .ragcode/index_status.json to be written outside any real workspace.

- pkg/workspace/watch: isInvalidRoot → IsInvalidRoot (exported + docstring)
- internal/service/engine: guard added as first check in StartIndexingAsync
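
A guard of this kind could be written along these lines. The body of the real `IsInvalidRoot` is not shown in this PR page, so everything below is an assumed reimplementation covering only the three cases the commit names (home directory, filesystem root, temp directory).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// IsInvalidRoot is a hypothetical reimplementation, not the code from
// pkg/workspace/watch. It rejects paths that should never be a workspace.
func IsInvalidRoot(path string) bool {
	abs, err := filepath.Abs(path)
	if err != nil {
		return true // cannot resolve: refuse rather than guess
	}
	// The filesystem root is its own parent.
	if filepath.Dir(abs) == abs {
		return true
	}
	// Temp directories: the platform default plus the conventional /tmp.
	if abs == filepath.Clean(os.TempDir()) || abs == "/tmp" {
		return true
	}
	// "~" is not expanded by filepath.Clean, so compare with the real home dir.
	if home, err := os.UserHomeDir(); err == nil && abs == filepath.Clean(home) {
		return true
	}
	return false
}

func main() {
	for _, p := range []string{"/", "/tmp", "/srv/projects/demo"} {
		fmt.Printf("%-20s invalid=%v\n", p, IsInvalidRoot(p))
	}
}
```

Placing the call before job registration means a rejected path leaves no trace: no job entry and no status file.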

Copilot

Pull request overview

Copilot reviewed 34 out of 34 changed files in this pull request and generated 9 comments.

Critical fixes:
- Populate IndexingStatus in tool responses (was nil) for ListSkillsTool, InstallSkillTool, EvaluateRagCodeTool, ReadFileContextTool, SmartSearchTool, ListPackageExportsTool; use ContextFromWorkspaceWithStatus consistently
- fix(engine): preserve Languages map during incremental indexing in StartIndexingAsync (was overwriting with empty object)
- fix(engine): extract finalizeIndexStatus helper to eliminate duplicated EndedAt/Elapsed/Error finalization logic in success and error branches
- fix(engine): Progress callback: eliminate LoadIndexStatus (disk read + JSON unmarshal) on every tick; keep single *IndexStatus in-memory and only call SaveIndexStatus (atomic write) for disk flush every 10 files
- fix(indexer): SaveIndexStatus uses atomic write-to-temp-then-rename to prevent concurrent readers seeing partial JSON

Hidden from AI consumers:
- LangStatus.Changed field now json:"-", so AI sees only on_disk and processed

Cleanup:
- smart_search_pipeline.go: fix extra blank lines and restore missing return statement after buildResponseMeta refactor
- treesitter.go: replace invalid issues/TBD link with descriptive comment
- watcher.go: clarify IsInvalidRoot doc comment (~ is not expanded by filepath.Clean; rejection is via os.UserHomeDir())
- BUGS.md: mark BUG-003 as Fixed (PR #40)
- SUGGESTIONS.md: translate to English, update with current State-field status
- analyzer_test.go: remove stale Qdrant DB snapshot references from comments
- extract.go: fix getIndentation break → return to exit for-loop

Tests:
- analyzer_test.go: relax exact line number assertions to > 0
- treesitter_test.go: add 7 new tests for patchExceptAs, call extraction (Code Graph), module-level vars/constants, class vars, IsGenerator
- treesitter.go: fix extractClassVarsFromBlock to handle assignment nodes placed directly in block without expression_statement wrapper
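
The write-to-temp-then-rename pattern named under "Critical fixes" generally looks like the sketch below. `writeFileAtomic` and `roundTrip` are hypothetical names; how SaveIndexStatus actually implements it is not visible here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temp file in the same directory as path,
// then renames it over the target. Readers see either the old file or the
// new one, never a half-written JSON document.
func writeFileAtomic(path string, data []byte) error {
	// Same directory keeps the rename on one filesystem.
	tmp, err := os.CreateTemp(filepath.Dir(path), ".index_status-*.tmp")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer os.Remove(tmpName) // harmless after a successful rename

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush contents before publishing
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// roundTrip (hypothetical helper) writes payload atomically and reads it back.
func roundTrip(payload string) (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "index_status.json")
	if err := writeFileAtomic(target, []byte(payload)); err != nil {
		return "", err
	}
	data, err := os.ReadFile(target)
	return string(data), err
}

func main() {
	got, err := roundTrip(`{"languages":{"go":{"on_disk":232,"processed":10}}}`)
	fmt.Println(got, err)
}
```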

… not persisted

Copilot

Pull request overview

Copilot reviewed 35 out of 36 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

internal/service/tools/find_usages.go:105

In the ErrNoCollectionsFound branch, resp.Status is switched to "indexing_in_progress" whenever idx != nil. With the new file-based IndexStatus, idx will be non-nil for any workspace that has ever written index_status.json (even if indexing already completed), so this can misreport "indexing_in_progress". Consider keying this on a real in-progress signal (e.g., idx.EndedAt == "" / idx.Error == "" or checking Engine.ActiveIndexingJobs for wctx.ID) instead of mere file existence.


internal/service/tools/call_hierarchy.go:136

The "indexing_in_progress" status is currently set whenever idx != nil, but IndexStatus will be non-nil for any workspace that has an index_status.json from a previous run. This can incorrectly report indexing as in progress when indexing is actually completed (or stale). Consider switching this condition to something that reflects an active run (e.g., idx.EndedAt == "" and idx.Error == "" / checking Engine.ActiveIndexingJobs for wctx.ID).


Copilot

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

internal/service/tools/find_usages.go:106

idx != nil is treated as “indexing_in_progress”, but IndexStatus is loaded from disk and will remain non-nil even after indexing completed (EndedAt set) or failed. This can incorrectly label the workspace as indexing when ErrNoCollectionsFound occurs for other reasons (e.g., collections deleted). Consider deriving “in progress” from the status fields (e.g., StartedAt set AND EndedAt empty), and only then switch to indexing_in_progress / attach indexing data.

	idx := t.engine.GetIndexStatus(wctx.Root)
	allResults, err := t.engine.ExactSearchPolyglot(ctx, wctx.ID, filter, 100)
	if err != nil {
		var noCollections *engine.ErrNoCollectionsFound
		if errors.As(err, &noCollections) {
			resp := ToolResponse{
				Status:  "indexing_required",
				Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete results.", wctx.Root),
				Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
			}
			if idx != nil {
				resp.Status = "indexing_in_progress"
			}
			return resp.JSON()

internal/service/tools/call_hierarchy.go:137

The tool sets status=indexing_in_progress whenever an IndexStatus file exists (idx != nil), but IndexStatus persists after completion. In the ErrNoCollectionsFound branch this can misreport state if collections are missing for other reasons. Prefer a deterministic “in progress” check (e.g., StartedAt present and EndedAt empty) before reporting indexing_in_progress and returning indexing data.

	idx := t.engine.GetIndexStatus(wctx.Root)

	visited := make(map[string]bool)

	rootNode := &CallNode{Name: symbolName}

	// Try to find root symbol info
	rootRes := t.findSymbolInfo(ctx, wctx.ID, symbolName)
	if rootRes != nil {
		rootNode.Type, _ = rootRes.Point.Payload["type"].(string)
		rootNode.FilePath, _ = rootRes.Point.Payload["file_path"].(string)
		rootNode.Package, _ = rootRes.Point.Payload["package"].(string)
	} else {
		// If nothing is indexed yet, ExactSearchPolyglot will return ErrNoCollectionsFound.
		// Signal indexing status instead of returning an empty hierarchy.
		_, sErr := t.engine.ExactSearchPolyglot(ctx, wctx.ID, map[string]interface{}{"name": symbolName}, 1)
		var noCollections *engine.ErrNoCollectionsFound
		if errors.As(sErr, &noCollections) {
			resp := ToolResponse{
				Status:  "indexing_required",
				Message: fmt.Sprintf("⏳ Workspace '%s' is not indexed yet. Indexing is required for complete call hierarchy results.", wctx.Root),
				Context: ContextFromWorkspaceWithStatus(wctx, t.engine),
			}
			if idx != nil {
				resp.Status = "indexing_in_progress"
				resp.Data = map[string]any{"indexing": idx}
			}
			return resp.JSON()

…ct root/idMap in ResumeIndexingOnConnect

… refactor/indexing-progress

ResumeIndexingOnConnect and DetectContext auto-trigger could both call StartIndexingAsync for the same workspace simultaneously, bypassing the LoadOrStore dedup guard via TOCTOU race window.

Changes:
- ResumeIndexingOnConnect now marks connectTriggered before StartIndexingAsync
- Removed redundant indexingJobs.Load check from DetectContext (TOCTOU)
- Changed 'go e.StartIndexingAsync(...)' to direct call (goroutine created internally)

Fixes system freeze when indexing large workspaces (~5000+ files).
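
To illustrate why a single `LoadOrStore` is the safe guard here, the following sketch races many callers against one workspace ID. The `engine` type, its `jobs` field and `raceStarts` are simplified stand-ins, not the project's Engine.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// engine is a stand-in holding only the dedup map.
type engine struct {
	jobs sync.Map // workspace ID -> struct{}
}

// StartIndexingAsync registers the job and spawns the goroutine itself.
// LoadOrStore checks and claims in one atomic step; a separate Load followed
// by Store would leave a window in which two callers both see "no job".
func (e *engine) StartIndexingAsync(id string, run func()) bool {
	if _, loaded := e.jobs.LoadOrStore(id, struct{}{}); loaded {
		return false // another caller already owns this workspace
	}
	go func() {
		defer e.jobs.Delete(id)
		run()
	}()
	return true
}

// raceStarts (hypothetical helper) fires `callers` concurrent start requests
// for one workspace and reports how many actually started a job.
func raceStarts(callers int) int {
	e := &engine{}
	release := make(chan struct{})
	var wins atomic.Int32
	var wg sync.WaitGroup
	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The job blocks until release, so it stays registered for the whole race.
			if e.StartIndexingAsync("ws-1", func() { <-release }) {
				wins.Add(1)
			}
		}()
	}
	wg.Wait()
	close(release)
	return int(wins.Load())
}

func main() {
	fmt.Println("jobs started:", raceStarts(50))
}
```

Because the goroutine is created inside `StartIndexingAsync`, callers get the dedup result synchronously, which is what the switch from `go e.StartIndexingAsync(...)` to a direct call enables.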

Copilot

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 2 comments.


The VKCOM PHP parser AST already includes $ in Identifier.Value (e.g. "$role" not "role"), so adding another $ prefix resulted in $$role in method signatures and $$table in property signatures.

- buildMethodSignature: remove explicit "$" + prefix (line 663)
- convertToChunks: remove "$" from property Signature format (line 944)

Verified: all php parser tests pass, manual test on Laravel project confirms single $ in all signatures.

Three PHP parser improvements:

1. uses_type relations: PHP 'use' import statements now generate uses_type relations on class chunks. This enables find_usages to discover all classes importing a given type (e.g. find_usages('Lawyer') finds all controllers with 'use App\Lawyer').
2. Route file extraction: PHP files in routes/ directories that yield 0 symbols from standard AST analysis now fall back to regex-based Route::get/post/resource extraction. routes/web.php goes from 0 to 39 symbols.
3. Fix $$ double dollar: Remove extra $ prefix from parameter and property signatures since VKCOM AST already includes $ in Identifier.Value.

These file types are not documentation; they are code that was incorrectly classified as docs. Removing them from the docs parser:

- SQL: query language
- SH: shell scripts
- Svelte: frontend framework components

This reduces docs from 551 to ~49 files on the barou Laravel project, making the language sort put PHP/JS first and dramatically reducing indexing time for documentation. Updated tests to verify these extensions are no longer handled by docs.

Copilot

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 4 comments.


- Replace Unix socket and .pid lock files with TCP port binding (localhost:39000) for singleton enforcement.
- Update IsDaemonRunning, StartDaemon and StopDaemon to fetch process ID via HTTP /health.
- Remove tracking logic around pidfile and sockets.
- Recreate adapter and lifecycle tests to connect over loopback TCP instead of sockets.
- Update rag-code-install graceful stop procedure to pull daemon PID from health endpoint.
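
Singleton enforcement through port binding relies on the operating system refusing a second listener on the same address. A minimal sketch, with `acquireSingleton` and `secondBindFails` as hypothetical names; the daemon's actual ListenAndServe does more than this.

```go
package main

import (
	"fmt"
	"net"
)

// acquireSingleton tries to bind the loopback address. Holding the listener
// is the lock: the kernel releases it when the process exits, so no stale
// PID or socket file can be left behind.
func acquireSingleton(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, fmt.Errorf("daemon already running or port unavailable on %s: %w", addr, err)
	}
	return ln, nil
}

// secondBindFails (hypothetical helper) binds an ephemeral port, then tries
// to bind the very same address again and reports whether that was refused.
func secondBindFails() (bool, error) {
	first, err := acquireSingleton("127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer first.Close()

	second, err := acquireSingleton(first.Addr().String())
	if err == nil {
		second.Close()
		return false, nil
	}
	return true, nil
}

func main() {
	refused, err := secondBindFails()
	fmt.Println("second instance refused:", refused, "setup error:", err)
}
```

In the daemon the address would be the fixed localhost:39000 from the commit message; the sketch uses an ephemeral port only so it can run anywhere.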

- Introduce FrameworkEnricher interface in core PHP analyzer
- Isolate Laravel and WordPress specific analysis into enricher.go
- Resolve plugin overhead with blank imports on run/test files
- Maintain lazy-loading decoupled structure to prevent import cycles

Copilot

Pull request overview

Copilot reviewed 59 out of 62 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (2)

internal/service/tools/find_usages.go:105

In the ErrNoCollectionsFound branch, the tool sets status="indexing_in_progress" whenever GetIndexStatus returns non-nil. With the new file-based IndexStatus, a non-nil status file can exist even when indexing is completed/failed, so this can incorrectly report “in progress”. Consider checking something like idx.EndedAt == "" && idx.StartedAt != "" (and/or idx.Error=="") before switching the status.

internal/service/tools/call_hierarchy.go:136

Same issue as in FindUsagesTool: idx := GetIndexStatus(...) is used as a boolean to decide status="indexing_in_progress". Since IndexStatus persists after completion, this can mislabel a workspace as “in progress” even when it’s done. Gate this on an “active” condition (e.g., EndedAt == "").



…emon

- Parsers: Introduced gotreesitter parser caching & explicit 'arenagc' draining. Arena memory is now freed after each file, fixing a severe memory leak.
- HTML/CSS: Dropped CSS/SCSS tracking in the HTML parser to avoid Tree-Sitter GLR explosions and extreme slowdowns during embedding.
- Indexer: Added strict ignoring of minified/vendored files (.min.js, .bundle.css, etc.) to skip massive auto-generated files.
- Indexer: Added watchdog and auto-recovery for Ollama embedded deadlocks.
- Daemon: Reverted to simple and stable 'Setsid' background daemon spawn pattern in lifecycle.go.
- Main: Removed unnecessary --fork-exec flag logic and bumped version.

…tter memory explosions

- Extracted CSS parsing from html/analyzer.go to a dedicated css/analyzer.go
- Replaces GLR AST generation with linear bracket-depth text scanning
- Caps huge CSS rule chunks to 8KB to prevent vector DB overload
- Removed old unused css_regex.go implementation
- Registered the new generic CSS parser globally in the daemon

Resolves Trello Task 1, Task 2, Task 3
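
Linear bracket-depth scanning, as named in this commit, can be sketched as follows. `splitCSSRules` is a hypothetical function, not the code in css/analyzer.go, and it deliberately ignores braces inside strings and comments to stay short.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCSSRules walks the source once, tracking brace depth, and emits one
// chunk per top-level rule (nested blocks such as @media stay together).
// Chunks longer than maxChunk bytes are cut at maxChunk; this simple cut may
// split a multi-byte character.
func splitCSSRules(src string, maxChunk int) []string {
	var chunks []string
	emit := func(s string) {
		s = strings.TrimSpace(s)
		if s == "" {
			return
		}
		if len(s) > maxChunk {
			s = s[:maxChunk]
		}
		chunks = append(chunks, s)
	}

	depth, start := 0, 0
	for i := 0; i < len(src); i++ {
		switch src[i] {
		case '{':
			depth++
		case '}':
			if depth > 0 {
				depth--
			}
			if depth == 0 { // a top-level rule just closed
				emit(src[start : i+1])
				start = i + 1
			}
		}
	}
	emit(src[start:]) // trailing statements without braces, e.g. @import
	return chunks
}

func main() {
	css := "body { color: #fff; }\n@media (min-width: 1px) { a { b: c; } }"
	for i, c := range splitCSSRules(css, 8*1024) {
		fmt.Printf("%d: %s\n", i, c)
	}
}
```

The scan is a single pass with constant extra state, so its cost grows linearly with file size regardless of how deeply rules are nested.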

Copilot

Pull request overview

Copilot reviewed 75 out of 78 changed files in this pull request and generated 14 comments.


 - **Python**: Complete native AST support
- **HTML & Markdown**: Structural documentation mappings
- **Generic Support**: CSS, JSON, YAML, Shell scripts, SQL
+- **HTML & CSS**: HTML structural mappings, CSS/SCSS/SASS/LESS via tree-sitter


+
+	body {
+		color: #fff;
+	}
+


 // ListenConfig configures the daemon's network listeners and lifecycle.
 type ListenConfig struct {
-	SocketPath string       // Unix domain socket path (required)
-	PIDPath    string       // PID file path (required)
-	Version    string       // Server version string
-	HTTPPort   int          // TCP port for optional HTTP listener (0 = disabled)
-	Handler    http.Handler // MCP handler (must handle /mcp)
-	OnReady    func()       // Called when daemon is ready to accept connections (optional)
+	Port    int          // TCP port for localhost listener
+	Version string       // Server version string
+	Handler http.Handler // MCP handler (must handle /mcp)
+	OnReady func()       // Called when daemon is ready to accept connections (optional)
 }

 // ListenAndServe starts the daemon listeners and blocks until ctx is cancelled
-// or SIGTERM/SIGINT is received. Cleans up socket and PID file on exit.
-//
-// It sets up two listeners:
-//  1. Unix domain socket at SocketPath (primary, for stdio adapters)
-//  2. TCP HTTP on HTTPPort (optional, for curl/debug/external agents, localhost only)
-//
-// Both serve the same handler mux with /health and the provided MCP handler.
+// or SIGTERM/SIGINT is received. It binds exclusively to a local TCP port to
+// guarantee it is a singleton, avoiding file locking issues.
 func ListenAndServe(ctx context.Context, cfg ListenConfig) error {


+| **PHP** | [`/php`](./php/README.md) | Deep Laravel integration (Eloquent, Routes, Controllers) & WordPress. | ✅ Production |
+| **HTML & CSS** | [`/html`](./html/README.md) | HTML semantic sectioning + CSS/SCSS/SASS/LESS via tree-sitter. | ✅ Production |
+| **JavaScript** | [`/javascript`](./javascript/README.md) | React, Vue, & TypeScript support. | ✅ Production |
+| **Docs** | [`/docs`](./docs/README.md) | Markdown, JSON, YAML, XML, TOML, reStructuredText. | ✅ Production |


 	var fileErrs []string
 	for _, path := range changedFiles {
 		fileNum := int(doneFiles.Load()) + 1
-		logger.Instance.Debug("[IDX] ws=%s lang=%s [%d/%d] %s (indexing...)",
+		logger.Instance.Info("[IDX] ws=%s lang=%s [%d/%d] %s (indexing...)",
 			wsName, opts.Language, fileNum, totalFiles, filepath.Base(path))

 		symCount, indexErr := s.IndexFile(ctx, collection, path, state)
 		if indexErr != nil {
 			logger.Instance.Warn("[IDX] ws=%s lang=%s ⚠️ %s: %v", wsName, opts.Language, filepath.Base(path), indexErr)
 			fileErrs = append(fileErrs, fmt.Sprintf("%s: %v", path, indexErr))
 		} else {
-			logger.Instance.Debug("[IDX] ws=%s lang=%s %s → %d symbol(s)", wsName, opts.Language, filepath.Base(path), symCount)
+			logger.Instance.Info("[IDX] ws=%s lang=%s %s → %d symbol(s)", wsName, opts.Language, filepath.Base(path), symCount)


+// Analyzer implements chunk-based processing of CSS/SCSS/LESS files.
+// Because it does not depend on Tree-sitter, it does not run out of memory even on very large bundles.
+type Analyzer struct{}
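
The diff shows only the type declaration, so the chunking strategy itself is not visible here. As an illustration of what chunk-based processing without a parse tree could look like, the sketch below streams the input and emits one chunk per top-level block by tracking brace depth. `splitTopLevelRules` and `collect` are hypothetical names, and the sketch deliberately ignores braces inside strings and comments.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// splitTopLevelRules is a hypothetical sketch, not the PR's Analyzer.
// It reads r rune by rune and calls emit once for every top-level block,
// so only the current block is held in memory.
// Limitation: braces inside strings or comments are counted like any other.
func splitTopLevelRules(r io.Reader, emit func(chunk string)) error {
	br := bufio.NewReader(r)
	var buf strings.Builder
	depth := 0
	for {
		c, _, err := br.ReadRune()
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
		buf.WriteRune(c)
		switch c {
		case '{':
			depth++
		case '}':
			if depth > 0 {
				depth--
			}
			if depth == 0 {
				emit(strings.TrimSpace(buf.String()))
				buf.Reset()
			}
		}
	}
	// Flush trailing text that is not followed by a closing brace.
	if rest := strings.TrimSpace(buf.String()); rest != "" {
		emit(rest)
	}
	return nil
}

// collect gathers all chunks of a CSS string into a slice.
func collect(css string) []string {
	var chunks []string
	_ = splitTopLevelRules(strings.NewReader(css), func(c string) {
		chunks = append(chunks, c)
	})
	return chunks
}

func main() {
	css := "body { color: #fff; }\n@media print { a { color: red; } }"
	for i, chunk := range collect(css) {
		fmt.Printf("chunk %d: %s\n", i, chunk)
	}
}
```

Nested blocks such as `@media` stay together in one chunk because a chunk is emitted only when the depth returns to zero.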


+	// Profiling endpoints
+	mcpMux.HandleFunc("/debug/pprof/", pprof.Index)
+	mcpMux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
+	mcpMux.HandleFunc("/debug/pprof/profile", pprof.Profile)
+	mcpMux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
+	mcpMux.HandleFunc("/debug/pprof/trace", pprof.Trace)
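
The handlers above come from the standard `net/http/pprof` package, which can be mounted on any mux. A small sketch of the same registration on a standalone `http.ServeMux`, with an in-process check that the routes respond, might look like this; `newDebugMux` and `statusFor` are hypothetical helpers, not names from the PR.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux is a hypothetical helper that mounts the pprof handlers on a
// private mux. Importing net/http/pprof also registers them on
// http.DefaultServeMux as a side effect.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	// The trailing slash makes Index serve the named profiles as well,
	// for example /debug/pprof/heap.
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// statusFor sends a GET request to h without opening a socket and returns
// the HTTP status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newDebugMux()
	for _, p := range []string{"/debug/pprof/", "/debug/pprof/cmdline", "/other"} {
		fmt.Println(p, statusFor(mux, p))
	}
}
```

Profiling endpoints expose process internals, so they are usually kept on a loopback-only listener.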


@@ -0,0 +1 @@
+404 page not found


+	atIdx := strings.Index(argsStr, "@")
+	if atIdx > 0 {
+		// Find the quoted string containing @
+		for _, q := range []byte{'\'', '"'} {
+			startQ := strings.IndexByte(argsStr[strings.Index(argsStr, string(q))+1:], q)
+			_ = startQ
+		}


+func getFreePort() (int, error) {
+	addr, err := net.ResolveTCPAddr("tcp", "localhost:0")
+	if err != nil {
+		return 0, err
+	}
+	l, err := net.ListenTCP("tcp", addr)
+	if err != nil {
+		return 0, err
+	}
+	defer l.Close()
+	return l.Addr().(*net.TCPAddr).Port, nil
+}


Copilot AI review requested due to automatic review settings March 8, 2026 23:04

doITmagic self-assigned this Mar 8, 2026

Copilot started reviewing on behalf of doITmagic March 8, 2026 23:05

Copilot AI reviewed Mar 8, 2026

doITmagic requested a review from Copilot March 8, 2026 23:20

Copilot AI reviewed Mar 8, 2026


razvan added 2 commits March 9, 2026 08:56

Copilot AI review requested due to automatic review settings March 9, 2026 07:17

Copilot started reviewing on behalf of doITmagic March 9, 2026 07:18

Copilot AI reviewed Mar 9, 2026

razvan added 2 commits March 9, 2026 11:52

Copilot AI review requested due to automatic review settings March 9, 2026 13:31

Copilot started reviewing on behalf of doITmagic March 9, 2026 13:31

Copilot AI reviewed Mar 9, 2026

doITmagic and others added 4 commits March 10, 2026 11:51

chore(engine): make indexing on connect adhere to auto_index config

1f26a48

Copilot AI review requested due to automatic review settings March 10, 2026 19:21

Copilot started reviewing on behalf of doITmagic March 10, 2026 19:21

Copilot AI reviewed Mar 10, 2026

razvan added 2 commits March 10, 2026 22:15

test(indexer): fix TestIndexStatusRoundTrip — Changed is json:"-" and not persisted

31aa1f3
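
The commit above hinges on how `encoding/json` treats a field tagged `json:"-"`: it is skipped on both marshal and unmarshal, so a round trip through the status file resets it to the zero value. A minimal sketch of that behaviour follows. The struct mirrors the `OnDisk`, `Changed` and `Processed` counters named in the PR description, but the JSON tags on `OnDisk` and `Processed` and the `roundTrip` helper are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LangStatus is an illustrative stand-in for the PR's per-language status.
// Only the json:"-" tag on Changed is stated in the commit message; the
// other two tags are assumptions.
type LangStatus struct {
	OnDisk    int `json:"on_disk"`
	Changed   int `json:"-"` // never written to or read from JSON
	Processed int `json:"processed"`
}

// roundTrip is a hypothetical helper that encodes a status to JSON and
// decodes it back, as a save followed by a load would.
func roundTrip(in LangStatus) (LangStatus, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return LangStatus{}, err
	}
	var out LangStatus
	err = json.Unmarshal(data, &out)
	return out, err
}

func main() {
	out, err := roundTrip(LangStatus{OnDisk: 10, Changed: 3, Processed: 2})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%+v\n", out) // prints {OnDisk:10 Changed:0 Processed:2}
}
```

A round-trip test therefore has to expect `Changed` to be zero after loading, or compare only the persisted fields.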

Copilot AI review requested due to automatic review settings March 10, 2026 20:18

Copilot started reviewing on behalf of doITmagic March 10, 2026 20:19

Copilot AI reviewed Mar 10, 2026

Comment thread pkg/parser/docs/treesitter.go Outdated

Comment thread pkg/indexer/index_status.go Outdated

Comment thread internal/service/engine/engine.go

Copilot AI review requested due to automatic review settings March 11, 2026 08:39

Copilot started reviewing on behalf of doITmagic March 11, 2026 08:39

Copilot AI reviewed Mar 11, 2026

Comment thread pkg/indexer/service.go

Comment thread pkg/indexer/index_status.go

Comment thread pkg/parser/python/treesitter.go

doITmagic added 3 commits March 11, 2026 14:03

refactor: move GetLastInterruptedWorkspace to indexer package + extract root/idMap in ResumeIndexingOnConnect

0f5d3ba

Merge remote-tracking branch 'origin/refactor/indexing-progress' into refactor/indexing-progress

2e5cc25

Copilot AI review requested due to automatic review settings March 11, 2026 17:18

Copilot started reviewing on behalf of doITmagic March 11, 2026 17:19

Copilot AI reviewed Mar 11, 2026

Comment thread internal/service/engine/engine_searchcode_test.go

Comment thread pkg/parser/python/treesitter.go

razvan added 4 commits March 11, 2026 21:37

Copilot AI review requested due to automatic review settings March 11, 2026 22:24

Copilot started reviewing on behalf of doITmagic March 11, 2026 22:24

Copilot AI reviewed Mar 11, 2026

Comment thread pkg/parser/php/php_analyzer.go

Comment thread pkg/indexer/index_status.go

Comment thread pkg/indexer/index_status.go

Comment thread pkg/parser/php/analyzer.go

doITmagic added 3 commits March 12, 2026 15:40

WIP: local changes

16ec20d

Copilot AI review requested due to automatic review settings March 12, 2026 15:06

Copilot started reviewing on behalf of doITmagic March 12, 2026 15:07

Copilot AI reviewed Mar 12, 2026

Comment thread pkg/parser/html/analyzer.go Outdated

razvan and others added 3 commits March 13, 2026 11:01

chore: fix golangci-lint warnings in html, drain, and test files

b3cd229

Copilot AI review requested due to automatic review settings March 13, 2026 21:35

Copilot started reviewing on behalf of doITmagic March 13, 2026 21:35

Copilot AI reviewed Mar 13, 2026

doITmagic merged commit 90419b7 into dev Mar 13, 2026
7 checks passed

doITmagic deleted the refactor/indexing-progress branch March 13, 2026 21:42

Conversation

doITmagic commented Mar 8, 2026

