roadmap: rewrite Phases 33-40 — Search Dominance strategy to beat ck

M9nx · M9nx · commit 70ada2708e2d · 2026-03-11T05:20:20.000+02:00
Phases 33-37 focus on closing every search gap with BeaconBay/ck:
- Phase 33: JSONL streaming, scored output, snippet control, full-section extraction
- Phase 34: Chunk-level incremental indexing, cache hits, interrupt safety
- Phase 35: Tantivy + FastEmbed in Rust, sub-100ms queries, pre-built wheels
- Phase 36: MCP v2 with pagination/cursors, Claude/Cursor/Windsurf configs
- Phase 37: grep flag parity, hybrid/sem shorthands, single binary distro
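
The chunk-level incremental indexing planned for Phase 34 can be sketched as a content-addressed cache: hash each chunk and re-embed only chunks whose hash is new. This is a minimal illustration, not CodexA's implementation. The names `chunk_key`, `embed_incremental`, and `toy_embed` are hypothetical, and `hashlib.blake2b` from the standard library stands in for the blake3 hash named in the roadmap. Including the model name in the key is one possible way to get the model-consistency guard.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def chunk_key(model: str, chunk: str) -> str:
    """Content address for a chunk; the model name is part of the key.

    Assumption: blake2b is a stdlib stand-in for blake3.
    """
    h = hashlib.blake2b(digest_size=16)
    h.update(model.encode("utf-8"))
    h.update(b"\0")  # separator so model and chunk bytes cannot run together
    h.update(chunk.encode("utf-8"))
    return h.hexdigest()


def embed_incremental(
    chunks: List[str],
    model: str,
    embed: Callable[[str], Vector],
    cache: Dict[str, Vector],
) -> Tuple[List[Vector], int]:
    """Return one vector per chunk plus the number of cache hits."""
    vectors: List[Vector] = []
    hits = 0
    for chunk in chunks:
        key = chunk_key(model, chunk)
        if key in cache:
            hits += 1
        else:
            cache[key] = embed(chunk)  # only new or changed chunks are embedded
        vectors.append(cache[key])
    return vectors, hits


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for a real embedding model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


cache: Dict[str, Vector] = {}
before = ["def a(): return 1", "def b(): return 2", "def c(): return 3"]
after = ["def a(): return 1", "def b(): return 20", "def c(): return 3"]

_, first_hits = embed_incremental(before, "model-x", toy_embed, cache)
_, second_hits = embed_incremental(after, "model-x", toy_embed, cache)
_, other_model_hits = embed_incremental(after, "model-y", toy_embed, cache)
```

In this toy run, editing one of three chunks leaves the other two as cache hits, and switching the model name produces no hits at all, so vectors from different models are never mixed.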

Phases 38-40 build on the search lead:
- Phase 38: Hot-swap embedding models, model benchmarking, HF tokenizers
- Phase 39: Cloud mode, team features, CI plugin
- Phase 40: Multi-agent orchestration, multi-IDE, semantic diff, codegen
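
The semantic diff idea in Phase 40 (separating signature changes from cosmetic edits) can be illustrated for Python sources with the standard `ast` module. This is a sketch under assumptions: the roadmap does not say which parser the feature will use, and the helper names `is_cosmetic_change` and `signature_changes` are hypothetical.

```python
import ast
from typing import Dict, List, Tuple


def is_cosmetic_change(old_source: str, new_source: str) -> bool:
    """True when both versions parse to the same AST.

    Comments, whitespace, and redundant parentheses do not appear in the
    AST, so edits limited to those compare equal. Docstrings are AST nodes,
    so a docstring edit is reported as non-cosmetic by this check.
    """
    return ast.dump(ast.parse(old_source)) == ast.dump(ast.parse(new_source))


def _signatures(source: str) -> Dict[str, List[str]]:
    """Map each function name to its positional parameter names."""
    tree = ast.parse(source)
    return {
        node.name: [arg.arg for arg in node.args.args]
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def signature_changes(
    old_source: str, new_source: str
) -> Dict[str, Tuple[List[str], List[str]]]:
    """Functions present in both versions whose positional parameters differ."""
    old, new = _signatures(old_source), _signatures(new_source)
    return {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }


old = "def add(a, b):\n    return a + b\n"
reformatted = "def add(a,b):  # sum\n\n    return (a + b)\n"
changed = "def add(a, b, c=0):\n    return a + b + c\n"

cosmetic = is_cosmetic_change(old, reformatted)
semantic = is_cosmetic_change(old, changed)
changes = signature_changes(old, changed)
```

Detecting renamed symbols and moved functions would need more than an equality check on the dumped tree, for example matching function bodies across names or files.
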
diff --git a/ROADMAP.md b/ROADMAP.md
@@ -533,38 +533,110 @@ providing high-performance alternatives to the Python search and indexing stack.
 
 ---
 
-### Phase 33: Remote / Cloud Mode
-Package CodexA as a Docker container with a shared REST API so teams can share
-one index server. Add authentication, rate limiting, and team dashboards.
-
-### Phase 34: GitHub / GitLab CI Plugin
-GitHub Actions / GitLab CI plugin that runs `codexa quality` on PRs, blocks
-merges on regressions, and posts inline review comments.
-
-### Phase 35: Multi-Agent Orchestration
-Allow multiple AI agents to share a single CodexA instance with isolated
-sessions, concurrent tool invocations, and coordinated context windows.
-
-### Phase 36: Incremental Embedding Models
-Hot-swap embedding models without full re-index — store raw chunks alongside
-vectors and re-embed lazily on model change.
-
-### Phase 37: Code Generation Pipeline
-Use RAG context + LLM to generate code scaffolds, tests, and documentation
-from natural language descriptions grounded in the actual codebase.
-
-### Phase 38: Distributed Workspace Federation
-Federate multiple CodexA instances across machines/orgs — search across remote
-indexes without copying data, with result merging and access control.
-
-### Phase 39: IDE Extension v2 — Multi-IDE Support
-Extend the VS Code extension to JetBrains (IntelliJ plugin) and Neovim (Lua),
-sharing the same bridge server backend.
-
-### Phase 40: Semantic Diff & Code Review AI
-AST-level diff analysis — detect semantic changes (renamed symbols, moved
-functions, signature changes) vs. cosmetic changes (formatting, comments).
-Power AI code review with structural understanding.
+### Phase 33: Search Dominance — JSONL Streaming & Output Parity
+Close the output-format gap with ck. Make CodexA the best tool for both humans
+and AI agents to consume search results.
+
+| Feature | Description |
+|---------|-------------|
+| **JSONL streaming output** | `--jsonl` flag on `search`, `grep`, `tool run` — one JSON object per line, streaming-friendly for LLMs and pipelines |
+| **Scored output** | `--scores` flag — prepend `[0.847]` relevance scores to every result line with color highlighting |
+| **Snippet length control** | `--snippet-length N` to control how much context is shown per match |
+| **No-snippet mode** | `--no-snippet` for metadata-only output (file, line, score) |
+| **Full-section extraction** | `--full-section` returns the complete function/class containing the match (tree-sitter aware) |
+| **Clean stdout/stderr separation** | All progress/status to stderr, only results to stdout — reliable piping |
+
+### Phase 34: Search Dominance — Chunk-Level Incremental Indexing
+Eliminate full re-index overhead. Match and exceed ck's delta indexing with
+chunk-level content-addressed caching.
+
+| Feature | Description |
+|---------|-------------|
+| **Chunk-level caching** | blake3 hash per chunk — only re-embed changed chunks (80-90% cache hit on typical edits) |
+| **Content-aware invalidation** | Doc comment and whitespace changes properly invalidate affected chunks |
+| **Model-consistency guard** | Detect embedding model switches and prevent silent vector corruption |
+| **Interruption safety** | Ctrl+C saves partial index; next run resumes from where it stopped |
+| **`--add` single file** | `codexa index --add <file>` indexes a single file without a full scan |
+| **`--inspect` file** | `codexa index --inspect <file>` shows the chunk breakdown, token counts, and cache status |
+
+### Phase 35: Search Dominance — Native Rust Search Engine v2
+Push all hot-path search operations into `codexa-core` for sub-100ms queries
+on million-LOC codebases. Beat ck's Tantivy/FastEmbed stack on raw speed.
+
+| Feature | Description |
+|---------|-------------|
+| **Tantivy integration** | Replace the Python BM25 implementation with the Tantivy full-text engine in Rust (the same library ck uses) |
+| **FastEmbed in Rust** | ONNX embedding inference fully in Rust — no Python overhead on the hot path |
+| **ANN index persistence** | HNSW index saved to disk and loaded via mmap in <50 ms (it is currently rebuilt instead) |
+| **Parallel query** | Rayon-parallel semantic + BM25 + regex queries fused in a single Rust call |
+| **Benchmarking parity** | `codexa benchmark` reports indexing speed (LOC/s), query latency (p50/p95/p99), cache hit rate |
+| **Pre-built wheels** | Publish manylinux, macOS (arm64+x86), Windows wheels to PyPI — `pip install codexa` just works |
+
+### Phase 36: Search Dominance — MCP Server v2 & Agent Protocol
+Make CodexA the best MCP server for every AI client — Claude, Cursor, Copilot,
+Windsurf. Full pagination, cursors, and streaming.
+
+| Feature | Description |
+|---------|-------------|
+| **MCP pagination** | `page_size`, `cursor`, `next_cursor` on all search tools — handle 10K+ results gracefully |
+| **MCP streaming** | SSE token-by-token delivery for long search results |
+| **`codexa --serve`** | Single-flag MCP server start (match ck's `ck --serve` simplicity) |
+| **Claude config** | `claude mcp add codexa` one-liner for Claude Code, plus an auto-generated config JSON for Claude Desktop |
+| **Tool permissions** | Per-tool read/write permission model for safe agent use |
+| **Health + status** | `index_status`, `reindex`, `health_check` MCP tools (match ck's tool set) |
+| **Cursor / Windsurf integration** | Documented setup guides and tested configs for Cursor, Windsurf, Continue.dev |
+
+### Phase 37: Search Dominance — grep Parity & Single-Binary Distribution
+Make `codexa` a true drop-in replacement for grep/ripgrep with zero-config
+install on every platform.
+
+| Feature | Description |
+|---------|-------------|
+| **grep flag parity** | `-l` (list files), `-L` (list files without match), `-R` (recursive default), `--exclude` glob patterns, `--no-ignore` |
+| **Hybrid search flag** | `codexa --hybrid "query"` — combined semantic+keyword in one flag (match ck --hybrid) |
+| **Semantic search flag** | `codexa --sem "query"` — shorthand for semantic search (match ck --sem) |
+| **`.codexaignore` auto-create** | Auto-generate `.codexaignore` on first index with sensible defaults (images, binaries, config files) |
+| **Single binary** | PyInstaller/Nuitka-compiled standalone binary — no Python required |
+| **Homebrew tap** | `brew install m9nx/tap/codexa` with auto-updating formula |
+| **Cargo install** | `cargo install codexa` for the Rust engine with embedded Python runtime (stretch goal) |
+| **Scoop / Chocolatey** | Windows package manager support |
+
+### Phase 38: Incremental Embedding Models & Model Hub
+Hot-swap embedding models without full re-index. Built-in model benchmarking
+and recommendations.
+
+| Feature | Description |
+|---------|-------------|
+| **Lazy re-embedding** | Store raw chunks alongside vectors; if the model has changed, re-embed chunks lazily at query time instead of re-indexing everything |
+| **`codexa models benchmark`** | Benchmark all installed models on your actual codebase (speed, quality, memory) |
+| **`--switch-model`** | `codexa index --switch-model jina-code` with smart cache invalidation |
+| **Model download** | `codexa models download bge-small` with progress bar and verification |
+| **HuggingFace tokenizers** | Token-exact chunk boundaries (match ck's tokenizer precision) |
+
+### Phase 39: Remote / Cloud Mode & Team Features
+Package CodexA as a shared server for teams. Authentication, dashboards, and
+collaborative search.
+
+| Feature | Description |
+|---------|-------------|
+| **Docker image** | Production multi-stage image with pre-loaded models |
+| **Team REST API** | Shared index server with API key authentication |
+| **Rate limiting** | Per-user requests-per-minute (RPM) and tokens-per-minute (TPM) limits on the shared server |
+| **Team dashboard** | Web UI showing search analytics, popular queries, index health |
+| **GitHub / GitLab CI plugin** | `codexa quality` on PRs, block merges on regressions, inline review comments |
+
+### Phase 40: Multi-Agent Orchestration & IDE v2
+Multiple AI agents sharing one CodexA instance. Multi-IDE support beyond
+VS Code.
+
+| Feature | Description |
+|---------|-------------|
+| **Concurrent sessions** | Isolated agent sessions with independent context windows |
+| **Coordinated context** | Agents share discovered context to avoid redundant searches |
+| **JetBrains plugin** | IntelliJ/PyCharm plugin sharing the same bridge server |
+| **Neovim integration** | Lua plugin with telescope.nvim integration |
+| **Semantic diff** | AST-level diff that distinguishes renamed symbols, moved functions, and signature changes from cosmetic edits |
+| **Code generation** | RAG context + LLM → code scaffolds, tests, and docs grounded in the actual codebase |
 
 ---