
Support one local machine daemon with shared Git base indexes and worktree overlays #170

@rudironsoni


Need

CocoIndex Code currently indexes by checkout/worktree path, which makes multiple worktrees of the same repository pay for separate full indexes and keeps daemon state tied to .cocoindex_code inside each checkout. For developers using Git worktrees, this is expensive and makes branch switching or parallel feature work heavier than it needs to be.

We need a local machine daemon model where one daemon per OS user owns durable state outside the repository, with one reusable base index per Git repository and cheap overlays for branch and dirty worktree changes.

Why now

We are raising this now because the existing daemon, client, and protocol pieces are already in place, so the next natural scaling step is a local one, not a cloud service or hosted control plane. The daemon can become Git-aware, keep the CLI/MCP ergonomics intact, and avoid repeated full indexing across linked worktrees.

This also gives a path to correct query behavior for branch-local changes: modified files should shadow base chunks, deleted files should suppress base results, and dirty worktree changes should win over committed branch/base content.
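The intended precedence can be illustrated with a minimal sketch. It is not the existing CocoIndex Code implementation: the `Hit` and `Layer` types and the `merge_layers` function are hypothetical names, and a plain score sort stands in for whatever reranking the daemon ends up using.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    path: str
    score: float
    text: str


@dataclass
class Layer:
    # Hypothetical in-memory view of one layer's results for a single query.
    name: str
    hits: list[Hit]
    affected: set[str] = field(default_factory=set)    # paths added or modified in this layer
    tombstones: set[str] = field(default_factory=set)  # paths deleted or renamed away


def merge_layers(layers: list[Layer], limit: int = 10) -> list[Hit]:
    """Merge per-layer hits; `layers` is ordered highest priority first
    (dirty, then branch, then base)."""
    shadowed: set[str] = set()
    merged: list[Hit] = []
    for layer in layers:
        # Keep only hits whose path no higher layer has touched or deleted.
        merged.extend(h for h in layer.hits if h.path not in shadowed)
        # Everything this layer touched now hides lower layers.
        shadowed |= layer.affected | layer.tombstones
    # Stand-in for reranking: sort by score, best first.
    merged.sort(key=lambda h: h.score, reverse=True)
    return merged[:limit]
```

With a dirty layer that modifies `c.py`, a branch layer that modifies `a.py` and deletes `b.py`, and a base layer containing all three files, only the dirty `c.py` hit and the branch `a.py` hit survive.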

Proposed solution

Implement a daemon-owned Git layer model:

  • Store daemon state under ${XDG_DATA_HOME:-~/.local/share}/cocoindex-code, with a test/user override.
  • Resolve each request into a Git worktree context: repo root, common dir, origin URL, branch, HEAD, base ref, merge-base, and dirty snapshot hash.
  • Identify reusable repos/layers by normalized remote URL plus indexing configuration hash, not by checkout path alone.
  • Maintain SQLite metadata for layers, overlay manifests, and layer lifecycle state.
  • Reuse the existing CocoIndex filesystem indexer by materializing immutable layer source directories under daemon state:
    • base layer from the base commit,
    • branch layer from merge-base..HEAD changed files,
    • dirty layer from uncommitted worktree files.
  • Store tombstones for deleted/renamed paths in overlay manifests.
  • Query dirty, branch, then base layers with one query embedding, drop lower-layer results shadowed by higher-layer affected paths/tombstones, then merge/rerank.
  • Preserve existing ccc index, ccc search, MCP search, and daemon auto-start behavior, while adding --cwd, --base, ccc overlay status, and ccc overlay prune.
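As a rough illustration of the first and third bullets, the sketch below resolves the state directory and derives a repository/layer key from a normalized remote URL plus an indexing configuration hash. All names here are assumptions made for illustration: the `COCOINDEX_CODE_STATE_DIR` override variable, the normalization rules, and the key format are not defined by this issue.

```python
import hashlib
import json
import os
import re
from pathlib import Path


def state_dir(env=None) -> Path:
    """Mirror ${XDG_DATA_HOME:-~/.local/share}/cocoindex-code, with an override."""
    env = os.environ if env is None else env
    override = env.get("COCOINDEX_CODE_STATE_DIR")  # hypothetical override name
    if override:
        return Path(override)
    xdg = env.get("XDG_DATA_HOME")  # unset or empty falls back to the default
    base = Path(xdg) if xdg else Path.home() / ".local" / "share"
    return base / "cocoindex-code"


def normalize_remote_url(url: str) -> str:
    """Reduce common Git remote spellings to 'host/path' (assumed rules)."""
    url = url.strip()
    # scp-like syntax, e.g. git@github.com:owner/repo.git
    m = None if "://" in url else re.match(r"^(?:[^@/]+@)?([^:/]+):(?!/)(.+)$", url)
    if m is None:
        # URL syntax, e.g. https://host/owner/repo.git or ssh://git@host:22/owner/repo
        m = re.match(r"^[A-Za-z][A-Za-z0-9+.-]*://(?:[^@/]+@)?([^/:]+)(?::\d+)?/(.+)$", url)
    if m is None:
        return url  # unknown form (e.g. a local path): leave untouched
    host, path = m.group(1), m.group(2).rstrip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"{host.lower()}/{path}"


def layer_key(remote_url: str, index_config: dict) -> str:
    """Stable key from normalized remote URL plus indexing configuration hash."""
    config_hash = hashlib.sha256(
        json.dumps(index_config, sort_keys=True).encode("utf-8")
    ).hexdigest()
    payload = f"{normalize_remote_url(remote_url)}\n{config_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

Under these assumptions, two checkouts cloned over SSH and HTTPS from the same origin resolve to the same key, while a change to the indexing configuration produces a different key and therefore a separate base layer.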

Acceptance criteria

  • A new worktree on an unchanged branch does not require a full per-worktree reindex.
  • Feature branches index only changed files.
  • Modified files shadow base chunks.
  • Deleted files suppress base chunks.
  • Dirty worktree changes beat branch/base results.
  • Two worktrees on different branches share the same base layer.
  • Daemon restart preserves ready layers.
  • Removing a worktree does not delete shared base data.
