feat: implement Git-based real-time incremental sync by yanurag-dev · Pull Request #12 · yanurag-dev/GrepTurbo

yanurag-dev · 2026-05-03T16:54:23Z

Added Baseline and Overlay search model to include uncommitted changes.
Implemented automatic index rebuild on Commit Drift (branch/HEAD change).
Added Tombstone system to filter out deleted files from index results.
Stored index metadata (commit, root dir, skip) in metadata.json.
Added unit tests for Git Sync logic in internal/index/sync_test.go.
Updated domain language in CONTEXT.md and documented strategy in ADR 0001.

Summary by CodeRabbit

New Features
- Search now includes uncommitted changes via an in-memory overlay and hides deleted files via tombstones.
- Index automatically rebuilds when Git HEAD differs from the stored baseline commit; searches retry after rebuild.
Documentation
- Added conceptual docs and an ADR describing the baseline/overlay/tombstone sync model and trade-offs.
Tests
- Added integration tests validating Git-based sync, overlay behavior, and commit-drift detection.
Chores
- Index metadata (commit, root, skip) is now persisted alongside the index.

- Added Baseline and Overlay search model to include uncommitted changes. - Implemented automatic index rebuild on Commit Drift (branch/HEAD change). - Added Tombstone system to filter out deleted files from index results. - Stored index metadata (commit, root dir, skip) in metadata.json. - Added unit tests for Git Sync logic in internal/index/sync_test.go. - Updated domain language in CONTEXT.md and documented strategy in ADR 0001.

coderabbitai · 2026-05-03T16:54:34Z

Warning

Rate limit exceeded

@yanurag-dev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 55 minutes and 37 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9ca7deb6-924f-46b4-8252-8bab268d0276

📥 Commits

Reviewing files that changed from the base of the PR and between 15f65f4 and 9062c84.

📒 Files selected for processing (2)

GEMINI.md
internal/query/search.go

📝 Walkthrough

Walkthrough

Implements Git-based index sync: persists baseline metadata, detects commit drift, builds an in-memory overlay of modified/untracked files with tombstones for deletions, merges Baseline+Overlay during search, and triggers automatic rebuilds on commit drift. Adds tests and documentation.

Changes

Git-Based Index Synchronization and Commit Drift Handling

Layer / File(s)	Summary
Data Structures `internal/index/metadata.go`, `internal/index/sync.go`	Adds `Metadata {Commit, RootDir, Skip}`, `GitStatus`, and `Overlay {Posts, Files, Tombstones}`.
Metadata Persistence `internal/index/metadata.go`	`CurrentCommit` runs `git rev-parse HEAD`; `WriteMetadata` writes `metadata.json` (fallback `"unknown"`); `ReadMetadata` reads it.
Reader Foundation `internal/index/reader.go`	`Reader.Meta *Metadata` added; `NewReader` loads `metadata.json` and returns read errors.
Builder/Writer Updates `internal/index/builder.go`, `internal/index/writer.go`	`Builder` exposes `RootDir` and `Skip` and stores them during `Build`. `Write` now calls `WriteMetadata(dir, b.RootDir, b.Skip)` and returns its error.
Git Sync Overlay `internal/index/sync.go`	`(*Reader).Sync()` obtains current commit, returns early when no git, reports drift when baseline ≠ HEAD, runs `GetGitStatus()` to enumerate modified/untracked/deleted, builds tombstones and an in-memory trigram posting `Overlay` for dirty files.
Search Merging Logic `internal/query/search.go`	`Search` calls `r.Sync()`; on drift returns `ErrCommitDrift{Baseline, Current}`. Candidate selection now intersects trigram postings separately for Baseline (`r.Lookup`) and Overlay (`overlay.Posts`), excludes baseline tombstoned paths, maps overlay IDs to `overlay.Files`, and then runs regex on merged candidate list.
CLI Rebuild Handling `cmd/grepturbo/main.go`	`runSearch` catches `*query.ErrCommitDrift`, closes reader, runs `runBuild` with stored metadata, reopens the reader, and retries the search.
Tests & Documentation `internal/index/sync_test.go`, `CONTEXT.md`, `docs/adr/0001-git-sync-strategy.md`	Adds `TestGitSync` to validate overlay behavior and drift detection. Adds `CONTEXT.md` and ADR documenting Baseline/Overlay/Tombstone/Drift model and trade-offs.

Sequence Diagram

sequenceDiagram
    participant CLI as CLI (main)
    participant Search as Search (internal/query)
    participant Reader as Index Reader
    participant Git as Git
    participant Disk as Disk

    CLI->>Search: Search(pattern)
    Search->>Reader: Sync()
    Reader->>Git: git rev-parse HEAD
    Git-->>Reader: current_commit

    alt Commit Drift Detected
        Reader-->>Search: ErrCommitDrift {Baseline, Current}
        Search-->>CLI: ErrCommitDrift
        CLI->>CLI: runBuild() rebuild index
        CLI->>Reader: NewReader() (reopen)
    end

    Reader->>Git: git status --porcelain
    Git-->>Reader: modified/untracked/deleted paths

    par Build Overlay
        Disk->>Reader: read dirty files
        Reader->>Reader: index trigrams -> overlay.Posts
        Reader->>Reader: mark tombstones for deleted/modified
    end

    Reader-->>Search: Overlay {Posts, Files, Tombstones}
    Search->>Reader: Lookup.Get(trigram) [baseline]
    Reader-->>Search: baseline candidates (filtered by tombstones)
    Search->>Search: overlay.Posts.Get(trigram) -> overlay candidates
    Search->>Disk: open candidate files & run regex
    Disk-->>Search: matches
    Search-->>CLI: results

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly Related PRs

feat: implement concurrent trigram index building #11: Edits internal/index/builder.go as well—overlapping changes to Builder.Build and related struct fields.

Poem

🐰 A baseline sleeps beneath the tree,
Overlay hums with edits wild and free,
Tombstones hush the files that slipped away,
When HEAD jumps—rebuild—then search will play.
Hop, grep, and stitch—the index bounces back!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately captures the main change: implementing Git-based real-time incremental sync for uncommitted file changes via Baseline/Overlay/Commit Drift/Tombstone models.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/git-sync-incremental

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Review rate limit: 0/1 reviews remaining, refill in 55 minutes and 37 seconds.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cmd/grepturbo/main.go (1)
85-107: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Double-close of r: defer r.Close() and the explicit r.Close() at line 91 both fire.

defer r.Close() is registered at line 85 and fires whenever runSearch returns — including every return path after the explicit r.Close() at line 91 (the runBuild error return, the NewReader error return, the retry search error return, and the normal return). Calling unix.Munmap twice on the same region is undefined behaviour and can cause SIGBUS. Calling pf.Close() twice silently returns a use of closed file error that the defer discards.

On Linux, runBuild calls os.Create(...) which creates fresh inodes for the new index files; the existing mmap still refers to the old inodes and is unaffected, so closing r before the rebuild is not necessary.
🐛 Proposed fix — remove the early close and let the defer handle cleanup
 	matches, err := query.Search(r, pattern)
 	if err != nil {
 		if drift, ok := err.(*query.ErrCommitDrift); ok {
 			fmt.Fprintf(os.Stderr, "Notice: %s. Rebuilding...\n", drift.Error())
-			r.Close() // close before rebuild
 			if err := runBuild(r.Meta.RootDir, idxDir, r.Meta.Skip); err != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/grepturbo/main.go` around lines 85 - 107, The code currently calls defer
r.Close() and then explicitly calls r.Close() before runBuild, causing a
double-close; remove the explicit r.Close() call (the one immediately before
runBuild) and let the existing defer r.Close() perform cleanup, keeping the rest
of the ErrCommitDrift handling (the call to runBuild(r.Meta.RootDir, idxDir,
r.Meta.Skip), re-opening via index.NewReader, and retrying query.Search)
unchanged so only the defer handles closing the original reader.

🧹 Nitpick comments (3)

cmd/grepturbo/main.go (1)

89-89: ⚡ Quick win

Prefer errors.As over a direct type assertion for ErrCommitDrift.

err.(*query.ErrCommitDrift) silently fails if the error is ever wrapped (e.g., via fmt.Errorf("...: %w", drift)). errors.As is the idiomatic Go 1.13+ approach and unwraps automatically.

♻️ Proposed refactor

+import "errors"
 ...
-		if drift, ok := err.(*query.ErrCommitDrift); ok {
-			fmt.Fprintf(os.Stderr, "Notice: %s. Rebuilding...\n", drift.Error())
+		var drift *query.ErrCommitDrift
+		if errors.As(err, &drift) {
+			fmt.Fprintf(os.Stderr, "Notice: %s. Rebuilding...\n", drift.Error())

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cmd/grepturbo/main.go` at line 89, The current direct type assertion
err.(*query.ErrCommitDrift) will fail to detect wrapped errors; change it to use
errors.As to unwrap and match the concrete type: declare a variable of type
*query.ErrCommitDrift (e.g., var drift *query.ErrCommitDrift) and call
errors.As(err, &drift) to test and obtain the drift value, then proceed as
before; also ensure the standard "errors" package is imported.

internal/index/sync_test.go (1)

93-100: ⚡ Quick win

exec.Command in runCmd should be exec.CommandContext to respect test timeouts.

noctx finds function calls without context.Context; golangci-lint flags this at line 95. A git command that hangs (e.g., a blocking hook in CI) will block the test goroutine indefinitely because there is no cancellation path.

♻️ Proposed refactor

-func runCmd(t *testing.T, dir string, name string, args ...string) {
+func runCmd(t *testing.T, dir string, name string, args ...string) {
 	t.Helper()
-	cmd := exec.Command(name, args...)
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+	cmd := exec.CommandContext(ctx, name, args...)
 	cmd.Dir = dir
 	if out, err := cmd.CombinedOutput(); err != nil {
 		t.Fatalf("cmd failed: %s\nOutput: %s", err, string(out))
 	}
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/index/sync_test.go` around lines 93 - 100, The runCmd helper uses
exec.Command which ignores test timeouts; change runCmd to use
exec.CommandContext with the test context (use t.Context()) so the subprocess is
canceled when the test times out or is canceled. Locate the runCmd function and
replace exec.Command(...) with exec.CommandContext(t.Context(), ...), keeping
the CombinedOutput call and existing error handling logic in place.

internal/index/sync.go (1)

89-96: 💤 Low value

Consider using exec.CommandContext for better control (optional).

The static analyzer flags the use of exec.Command without a context. For CLI tooling this is often acceptable, but adding context support would allow callers to cancel long-running git operations if needed.

♻️ Optional: Add context parameter

-func GetGitStatus(dir string) (*GitStatus, error) {
-	cmd := exec.Command("git", "status", "--porcelain")
+import "context"
+
+func GetGitStatus(ctx context.Context, dir string) (*GitStatus, error) {
+	cmd := exec.CommandContext(ctx, "git", "status", "--porcelain")

This would require updating callers to pass a context.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/index/sync.go` around lines 89 - 96, GetGitStatus currently uses
exec.Command which the static analyzer flags; change its signature to accept a
context (e.g., func GetGitStatus(ctx context.Context, dir string) (*GitStatus,
error)), replace exec.Command with exec.CommandContext(ctx, "git", "status",
"--porcelain"), and propagate the context to all callers so they can cancel
long-running git operations; if you cannot change all callers immediately,
create an internal wrapper that calls the new ctx-aware GetGitStatus with
context.Background() and migrate callers gradually.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/adr/0001-git-sync-strategy.md`:
- Line 5: The ADR uses h3 headings that skip from the h1 title; change the
section headings "Context", "Decision", and "Consequences" from "### ..." to "##
..." so they are proper top-level ADR sections; locate the heading lines
matching "### Context", "### Decision", and "### Consequences" and replace each
"###" with "##" to restore correct heading hierarchy.

In `@internal/index/builder.go`:
- Around line 81-86: The builder currently sets b.RootDir to an absolute path
but continues to call filepath.WalkDir with the original rootDir, producing
relative entries in b.Files; change the WalkDir call to use absRoot (the
absolutized root path you computed) so that b.Files contains absolute paths
consistent with b.RootDir and the overlay/tombstone logic in sync.go; update any
references in the WalkDir closure that assume rootDir to instead use absRoot or
b.RootDir to avoid mismatches.

In `@internal/index/metadata.go`:
- Around line 53-61: The CurrentCommit function uses exec.Command which can
hang; change it to create a context with a bounded timeout (e.g.,
context.WithTimeout(ctx, 5s)), call exec.CommandContext(ctx, "git", "rev-parse",
"HEAD"), defer cancel(), and use cmd.Output() as before so the child is killed
when the context times out; update any other callsites that invoke CurrentCommit
if they need to propagate a caller context (or accept a timeout param) so commit
lookups can't block indefinitely.

In `@internal/index/sync_test.go`:
- Line 71: The test currently discards errors from r.Sync() by assigning all
three return values to blank identifiers (overlay, _, _), which hides failures;
change the calls to capture all returns (e.g., overlay, drift, err := r.Sync()
or similar), check err and fail the test (t.Fatalf or require.NoError) when err
!= nil, and only use blank-identifiers for values you intentionally ignore;
update both occurrences that call r.Sync() (the one assigning overlay and the
one at line 87) to handle and assert the returned error.
- Around line 55-64: The test calls r.Sync() and assigns overlay, drift, err but
never asserts the expected value of drift, which triggers ineffassign; update
the test in sync_test.go to assert the drift value after the call to r.Sync()
(the variables overlay, drift returned by r.Sync()) — e.g., verify drift
indicates "no drift" (compare against the zero/false value your Sync() uses)
before proceeding to the overlay assertions, and ensure you don't
unintentionally overwrite the drift variable later in the test.

In `@internal/query/search.go`:
- Around line 38-41: The call to index.CurrentCommit(r.Meta.RootDir) in the
drift branch ignores the returned error which can leave current empty and later
cause a panic in ErrCommitDrift.Error(); update the drift handling in search.go
to check the error from index.CurrentCommit (the call and its err), and if it
fails either return that error (or a wrapped error) instead of constructing
ErrCommitDrift, or set Current to a safe sentinel like "<unknown>" and include
the err as contextual info so ErrCommitDrift.Error() cannot panic; reference
index.CurrentCommit, ErrCommitDrift, r.Meta.RootDir and r.Sync() when modifying
the logic.
- Around line 26-28: The Error() method on ErrCommitDrift currently slices
e.Baseline and e.Current with [:7], which will panic if either string is shorter
than 7 (e.g., when Current is empty after CurrentCommit failure); update
ErrCommitDrift.Error to safely truncate both fields (use a small helper or
inline min(len(s),7) logic) and fall back to the full string or a placeholder
when empty so no slicing panic occurs (refer to ErrCommitDrift.Error, and the
Baseline and Current fields).

---

Outside diff comments:
In `@cmd/grepturbo/main.go`:
- Around line 85-107: The code currently calls defer r.Close() and then
explicitly calls r.Close() before runBuild, causing a double-close; remove the
explicit r.Close() call (the one immediately before runBuild) and let the
existing defer r.Close() perform cleanup, keeping the rest of the ErrCommitDrift
handling (the call to runBuild(r.Meta.RootDir, idxDir, r.Meta.Skip), re-opening
via index.NewReader, and retrying query.Search) unchanged so only the defer
handles closing the original reader.

---

Nitpick comments:
In `@cmd/grepturbo/main.go`:
- Line 89: The current direct type assertion err.(*query.ErrCommitDrift) will
fail to detect wrapped errors; change it to use errors.As to unwrap and match
the concrete type: declare a variable of type *query.ErrCommitDrift (e.g., var
drift *query.ErrCommitDrift) and call errors.As(err, &drift) to test and obtain
the drift value, then proceed as before; also ensure the standard "errors"
package is imported.

In `@internal/index/sync_test.go`:
- Around line 93-100: The runCmd helper uses exec.Command which ignores test
timeouts; change runCmd to use exec.CommandContext with the test context (use
t.Context()) so the subprocess is canceled when the test times out or is
canceled. Locate the runCmd function and replace exec.Command(...) with
exec.CommandContext(t.Context(), ...), keeping the CombinedOutput call and
existing error handling logic in place.

In `@internal/index/sync.go`:
- Around line 89-96: GetGitStatus currently uses exec.Command which the static
analyzer flags; change its signature to accept a context (e.g., func
GetGitStatus(ctx context.Context, dir string) (*GitStatus, error)), replace
exec.Command with exec.CommandContext(ctx, "git", "status", "--porcelain"), and
propagate the context to all callers so they can cancel long-running git
operations; if you cannot change all callers immediately, create an internal
wrapper that calls the new ctx-aware GetGitStatus with context.Background() and
migrate callers gradually.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5626e066-a381-4d0e-b564-21aed029307e

📥 Commits

Reviewing files that changed from the base of the PR and between a8cfafa and e455428.

📒 Files selected for processing (10)

CONTEXT.md
cmd/grepturbo/main.go
docs/adr/0001-git-sync-strategy.md
internal/index/builder.go
internal/index/metadata.go
internal/index/reader.go
internal/index/sync.go
internal/index/sync_test.go
internal/index/writer.go
internal/query/search.go

coderabbitai · 2026-05-03T17:04:52Z

+
+To ensure search results reflect real-time code changes without the overhead of a background daemon or full index rebuilds, we are implementing a Git-based on-demand synchronization strategy.
+
+### Context


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Heading skips a level — use ## for top-level ADR sections.

### Context (h3) directly follows the document's h1 title, skipping h2. The same applies to ### Decision (line 8) and ### Consequences (line 15).

✏️ Proposed fix

-### Context +## Context ... -### Decision +## Decision ... -### Consequences +## Consequences

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 5-5: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3

(MD001, heading-increment)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/adr/0001-git-sync-strategy.md` at line 5, The ADR uses h3 headings that skip from the h1 title; change the section headings "Context", "Decision", and "Consequences" from "### ..." to "## ..." so they are proper top-level ADR sections; locate the heading lines matching "### Context", "### Decision", and "### Consequences" and replace each "###" with "##" to restore correct heading hierarchy.

coderabbitai · 2026-05-03T17:04:52Z

+func CurrentCommit(dir string) (string, error) {
+	cmd := exec.Command("git", "rev-parse", "HEAD")
+	cmd.Dir = dir
+	out, err := cmd.Output()
+	if err != nil {
+		return "", err
+	}
+	return strings.TrimSpace(string(out)), nil
+}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -name "metadata.go" -type f 2>/dev/null | head -20

Repository: yanurag-dev/GrepTurbo

Length of output: 47

🏁 Script executed:

# First, list the repository structure git ls-files | head -30

Repository: yanurag-dev/GrepTurbo

Length of output: 661

🏁 Script executed:

# Look for the metadata.go file specifically git ls-files | grep metadata.go

Repository: yanurag-dev/GrepTurbo

Length of output: 92

🏁 Script executed:

# Check if internal/index/metadata.go exists and read it if [ -f "internal/index/metadata.go" ]; then wc -l internal/index/metadata.go fi

Repository: yanurag-dev/GrepTurbo

Length of output: 95

🏁 Script executed:

# Read the metadata.go file cat -n internal/index/metadata.go

Repository: yanurag-dev/GrepTurbo

Length of output: 1830

🏁 Script executed:

# Now search for calls to CurrentCommit to verify if it's called from r.Sync() rg "CurrentCommit" --type go -B 2 -A 2

Repository: yanurag-dev/GrepTurbo

Length of output: 1374

🏁 Script executed:

# Check sync.go to see how CurrentCommit is used rg "r\.Sync\|Sync\(" --type go -B 3 -A 5 | head -100

Repository: yanurag-dev/GrepTurbo

Length of output: 47

🏁 Script executed:

# Look for how the search path calls Sync rg "\.Sync\(" --type go -B 2 -A 2

Repository: yanurag-dev/GrepTurbo

Length of output: 1482

🏁 Script executed:

# Read sync.go to see the Sync method cat -n internal/index/sync.go | head -50

Repository: yanurag-dev/GrepTurbo

Length of output: 1571

🏁 Script executed:

# Check if there are any Go linting tools configured that would flag noctx if [ -f ".golangci.yml" ] || [ -f "golangci.yml" ]; then echo "=== .golangci.yml ==="; cat -n .golangci.yml fi # Also check go.mod for any linters cat -n go.mod

Repository: yanurag-dev/GrepTurbo

Length of output: 368

🏁 Script executed:

# Search for context usage patterns in the codebase to see if context passing is already used rg "context\." --type go | head -20

Repository: yanurag-dev/GrepTurbo

Length of output: 47

🏁 Script executed:

# Check the search.go more carefully to see the full request path cat -n internal/query/search.go

Repository: yanurag-dev/GrepTurbo

Length of output: 5054

CurrentCommit can hang indefinitely — use exec.CommandContext with a timeout.

exec.Command without a context can block forever if the git process hangs (slow NFS mount, blocking hook, etc.). This blocks every search call via the request path: Search() → r.Sync() → CurrentCommit() (and again at line 39 when reporting commit drift).

Use exec.CommandContext with a bounded timeout:

🔒️ Proposed fix

+import ( + "context" + "time" + ... +) + // CurrentCommit returns the current Git HEAD commit hash in the given dir. func CurrentCommit(dir string) (string, error) { - cmd := exec.Command("git", "rev-parse", "HEAD") + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + cmd := exec.CommandContext(ctx, "git", "rev-parse", "HEAD") cmd.Dir = dir out, err := cmd.Output() if err != nil { return "", err } return strings.TrimSpace(string(out)), nil }

🧰 Tools

🪛 golangci-lint (2.11.4)

[error] 54-54: os/exec.Command must not be called. use os/exec.CommandContext

(noctx)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/index/metadata.go` around lines 53 - 61, The CurrentCommit function uses exec.Command which can hang; change it to create a context with a bounded timeout (e.g., context.WithTimeout(ctx, 5s)), call exec.CommandContext(ctx, "git", "rev-parse", "HEAD"), defer cancel(), and use cmd.Output() as before so the child is killed when the context times out; update any other callsites that invoke CurrentCommit if they need to propagate a caller context (or accept a timeout param) so commit lookups can't block indefinitely.

coderabbitai · 2026-05-03T17:04:52Z

+	overlay, drift, err = r.Sync()
+	if err != nil {
+		t.Fatal(err)
+	}
+	if len(overlay.Files) != 1 || overlay.Files[0] != file1 {
+		t.Errorf("expected %s in overlay, got %v", file1, overlay.Files)
+	}
+	if !overlay.Tombstones[file1] {
+		t.Errorf("expected %s in tombstones", file1)
+	}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

drift is assigned but never checked in this block — the static analysis ineffassign flag is valid.

After a file modification (not a new commit), the test should verify there is no drift. As written, the value is silently overwritten at line 87.

🐛 Proposed fix

overlay, drift, err = r.Sync() if err != nil { t.Fatal(err) } + if drift { + t.Error("expected no drift after file modification without commit") + } if len(overlay.Files) != 1 || overlay.Files[0] != file1 {

🧰 Tools

🪛 golangci-lint (2.11.4)

[error] 55-55: ineffectual assignment to drift

(ineffassign)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/index/sync_test.go` around lines 55 - 64, The test calls r.Sync() and assigns overlay, drift, err but never asserts the expected value of drift, which triggers ineffassign; update the test in sync_test.go to assert the drift value after the call to r.Sync() (the variables overlay, drift returned by r.Sync()) — e.g., verify drift indicates "no drift" (compare against the zero/false value your Sync() uses) before proceeding to the overlay assertions, and ensure you don't unintentionally overwrite the drift variable later in the test.

coderabbitai · 2026-05-03T17:04:52Z

+	os.WriteFile(file2, []byte("package main\nfunc C() {}\n"), 0644)
+
+	// 8. Verify sync detects untracked file
+	overlay, _, _ = r.Sync()


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Errors from r.Sync() are silently dropped — use blank-identifier assignments only for overlay/drift, not err.

Lines 71 and 87 discard all three return values. If Sync() returns an error, the test continues with a zero-value overlay and misleading assertions.

🐛 Proposed fix

- overlay, _, _ = r.Sync() + overlay, _, err = r.Sync() + if err != nil { + t.Fatal(err) + } foundFile2 := false ... - _, drift, _ = r.Sync() + _, drift, err = r.Sync() + if err != nil { + t.Fatal(err) + }

Also applies to: 87-87

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/index/sync_test.go` at line 71, The test currently discards errors from r.Sync() by assigning all three return values to blank identifiers (overlay, _, _), which hides failures; change the calls to capture all returns (e.g., overlay, drift, err := r.Sync() or similar), check err and fail the test (t.Fatalf or require.NoError) when err != nil, and only use blank-identifiers for values you intentionally ignore; update both occurrences that call r.Sync() (the one assigning overlay and the one at line 87) to handle and assert the returned error.

Using absRoot in WalkDir ensures b.Files contains absolute paths, matching the expectations of the Git Sync logic.

Added a truncate helper to prevent panics when Baseline or Current hashes are shorter than 7 characters.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/index/builder.go (1)
151-157: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Data race: collector goroutine still writes b.Files/b.Posts after Build() returns on error.

close(paths) at line 151 is unconditional (correct), but the return err at line 154 exits without waiting on <-done. The collector goroutine is still appending to b.Files and updating b.Posts at that point. Per the Go memory model there is no happens-before between the collector's writes and any reads the caller may perform on those fields after Build() returns — a data race.

The goroutines are not permanently leaked (they drain quickly once paths is closed), but the race window exists.
🔒 Proposed fix — always drain done before returning
 	close(paths)

+	<-done // always wait: prevents data race on b.Files/b.Posts on the error path
+
 	if err != nil {
 		return err
 	}
-
-	<-done
 	b.Posts.Finalize()
 	return nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/index/builder.go` around lines 151 - 157, The collector goroutine
may still append to b.Files and update b.Posts after Build() returns because
Build() returns early on err without waiting for the collector to finish; change
Build() (the function containing close(paths) and the <-done receiver) to
capture the error into a local variable, always wait on <-done after
close(paths), then return the captured error so the collector has finished
before any reads of b.Files/b.Posts occur; reference the existing close(paths),
the <-done channel receive, and the collector goroutine that writes
b.Files/b.Posts when applying this change.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@internal/index/builder.go`:
- Around line 151-157: The collector goroutine may still append to b.Files and
update b.Posts after Build() returns because Build() returns early on err
without waiting for the collector to finish; change Build() (the function
containing close(paths) and the <-done receiver) to capture the error into a
local variable, always wait on <-done after close(paths), then return the
captured error so the collector has finished before any reads of b.Files/b.Posts
occur; reference the existing close(paths), the <-done channel receive, and the
collector goroutine that writes b.Files/b.Posts when applying this change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3a675e64-a7cd-40f9-9aa1-ba8f9b3596b4

📥 Commits

Reviewing files that changed from the base of the PR and between e455428 and 15f65f4.

📒 Files selected for processing (1)

internal/index/builder.go

coderabbitai Bot reviewed May 3, 2026

View reviewed changes

yanurag-dev added 3 commits May 3, 2026 23:53

fix: ensure absolute paths are used in index builder

15f65f4

Using absRoot in WalkDir ensures b.Files contains absolute paths, matching the expectations of the Git Sync logic.

fix: safely truncate commit hashes in ErrCommitDrift.Error

fad111a

Added a truncate helper to prevent panics when Baseline or Current hashes are shorter than 7 characters.

docs: symlink GEMINI.md to AGENTS.md

9062c84

coderabbitai Bot reviewed May 3, 2026

View reviewed changes

yanurag-dev merged commit 53622a1 into main May 3, 2026
2 checks passed

yanurag-dev deleted the feat/git-sync-incremental branch May 24, 2026 07:16


		To ensure search results reflect real-time code changes without the overhead of a background daemon or full index rebuilds, we are implementing a Git-based on-demand synchronization strategy.

		### Context

Conversation

yanurag-dev commented May 3, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Possibly Related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yanurag-dev commented May 3, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 3, 2026 •

edited

Loading