docs: redesign roadmap phases based on self-analysis and ck competitor comparison

M9nx · M9nx · commit bdca6542ce75 · 2026-03-11T00:56:46.000+02:00
- Reorder phases by competitive priority (close UX/perf gaps first)
- Phase 32: Search UX & Output Modes (scores, full-section, JSONL, inspect)
- Phase 33: Precise Token Management (moved up — ck already has this)
- Phase 34: Performance & Smart Indexing (content-hash, parallel embedding)
- Phase 35: Advanced Embedding & Model Selection (multi-model, RRF weights)
- Phase 36: CI/CD Deep Integration (unique strength, kept)
- Phase 37: VS Code Extension & Editor Integration (merged old 37+39)
- Phase 38: Async Web & Real-Time Streaming (kept)
- Phase 39: Cross-Language Intelligence (moved down)
- Plugin Marketplace moved to Low Priority
- Removed old Phase 33 (Team & Cloud Mode) — local-only tool
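
Note on the Phase 34 content-hash item: the idea of skipping re-embedding for unchanged chunks can be sketched roughly as below. This is only an illustration under stated assumptions, not CodexA's implementation. The names `chunk_digest`, `embed_changed`, and `embed_fn` are hypothetical, the cache is a plain dict, and `hashlib.blake2b` from the standard library stands in for blake3, which the roadmap names but which needs a third-party package.

```python
import hashlib
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def chunk_digest(text: str) -> str:
    """Hash a chunk's text. blake2b is a stand-in for blake3 (assumption)."""
    return hashlib.blake2b(text.encode("utf-8"), digest_size=16).hexdigest()


def embed_changed(
    chunks: Sequence[str],
    cache: Dict[str, Vector],
    embed_fn: Callable[[List[str]], List[Vector]],
) -> Tuple[List[Vector], int]:
    """Return one vector per chunk, embedding only chunks missing from the cache.

    `cache` maps digest -> vector and is updated in place.
    The second return value is the number of chunks actually embedded.
    """
    digests = [chunk_digest(c) for c in chunks]

    # Collect chunks whose digest is unknown, de-duplicated, in first-seen order.
    missing: Dict[str, str] = {}
    for digest, chunk in zip(digests, chunks):
        if digest not in cache and digest not in missing:
            missing[digest] = chunk

    if missing:
        # One batched call for all new or changed chunks.
        vectors = embed_fn(list(missing.values()))
        for digest, vector in zip(missing, vectors):
            cache[digest] = vector

    return [cache[d] for d in digests], len(missing)
```

On a second indexing run over the same file, only chunks whose text changed produce a new digest, so `embed_fn` receives just those chunks. A real index would also need to persist the cache and evict digests that no longer appear in any file.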
diff --git a/docs/guide/roadmap.md b/docs/guide/roadmap.md
@@ -56,82 +56,102 @@ Planned improvements for CodexA, organized by priority.
 
 ## Upcoming Improvements
 
-### Phase 32 — Cross-Language Intelligence
+> Phases redesigned after self-analysis with CodexA tools and competitive
+> comparison with [ck](https://github.com/BeaconBay/ck) (Rust-based semantic
+> search, v0.7.4). Priorities: close visible UX/performance gaps first, then
+> double down on CodexA's unique AI-powered strengths.
 
-Unified code intelligence across language boundaries:
+### Phase 32 — Search UX & Output Modes
 
-- Cross-language symbol resolution (e.g., Python calling Rust via FFI)
-- Polyglot dependency graphs linking imports across languages
-- Language-aware search boosting (prefer results in the query's context language)
-- Universal call graph spanning multiple languages in a workspace
+Close the biggest visible gaps in search ergonomics:
+
+- `--scores` flag to display similarity scores with color highlighting
+- `--full-section` flag to return complete function/class bodies, not just chunk snippets
+- `--threshold` flag to filter results below a minimum similarity score
+- JSONL streaming output mode (`--jsonl`) for piping into downstream tools
+- `codexa search --inspect <file>` to visualize chunks, token counts, and embeddings for a file
+- `.codexaignore` auto-generation from detected binary/vendored/generated files
+- Smart binary detection to skip non-text files during indexing
 
-### Phase 33 — Team & Cloud Mode
+### Phase 33 — Precise Token Management
 
-Optional team collaboration features (privacy-first, opt-in):
+Replace rough token estimation with model-specific counting (ck already ships exact tokenization):
 
-- Shared search indices with team-scoped access control
-- Remote index hosting for large monorepos (gRPC or HTTP)
-- Index sharding and distributed search across machines
-- Audit logging for compliance-sensitive environments
+- `tiktoken` for OpenAI models, HuggingFace `tokenizers` for local/Ollama models
+- Accurate context window budgeting with overflow protection in the RAG pipeline
+- Token usage reporting and cost estimation per query
+- Smart context truncation preserving semantic boundaries (function/class edges)
+- `codexa search --tokens` to show token count per result
 
-### Phase 34 — CI/CD Deep Integration
+### Phase 34 — Performance & Smart Indexing
 
-First-class CI pipeline integration beyond quality gates:
+Content-aware incremental indexing to close the speed gap:
 
-- PR diff-aware indexing — only re-index changed files in CI
-- Automated PR review comments via GitHub Actions / GitLab CI
-- Quality trend dashboards exported as CI artifacts
-- Breaking-change detection based on call graph + reference analysis
-- Configurable CI profiles (fast/thorough/security-only)
+- Content-hash (blake3) per chunk — skip re-embedding unchanged code
+- Parallel embedding with configurable worker count
+- Batch FAISS insertion instead of one-by-one vector adds
+- Memory-mapped FAISS indices for low-RAM machines
+- `codexa index --diff` to index only git-changed files
+- Indexing progress bar with ETA and throughput stats
 
-### Phase 35 — Advanced Embedding & Search
+### Phase 35 — Advanced Embedding & Model Selection
 
-Next-generation search infrastructure:
+Multiple embedding models and smarter search infrastructure:
 
-- Fine-tuned code embedding models (CodeBERT, StarEncoder)
+- Support BGE, mxbai-embed, nomic-embed, jina-code-v2 alongside current MiniLM
+- Model switching at query time without full re-index (dual-index mode)
 - GPU-accelerated FAISS with IVF-PQ indices for million-file repos
 - Field-scoped search filters (`--lang`, `--symbol-type`, `--file`)
 - Configurable RRF weights for hybrid search tuning
-- Re-ranking with cross-encoders for precision-critical queries
+- `codexa models compare` to benchmark models on the user's actual codebase
 
-### Phase 36 — Async Web & Real-Time Streaming
+### Phase 36 — CI/CD Deep Integration
 
-Migrate the web server to a modern async framework:
+First-class CI pipeline integration — a unique CodexA strength:
 
-- WebSocket streaming for live search results
-- Non-blocking request handling with connection pooling
-- Server-sent events for long-running operations (indexing progress)
-- Real-time collaboration widgets in the web UI
+- PR diff-aware indexing — only re-index changed files in CI
+- Automated PR review comments via GitHub Actions / GitLab CI
+- Quality trend dashboards exported as CI artifacts (HTML + JSON)
+- Breaking-change detection based on call graph + reference analysis
+- Configurable CI profiles (`fast` / `thorough` / `security-only`)
 
-### Phase 37 — Plugin Marketplace & Sandboxing
+### Phase 37 — VS Code Extension & Editor Integration
 
-Mature the plugin ecosystem:
+Marketplace-ready VS Code extension with deep editor features:
 
-- Plugin sandboxing with resource limits and restricted filesystem access
-- Community plugin registry with versioning and discovery
-- Plugin dependency resolution and conflict detection
-- Visual plugin configuration in the web UI
+- Inline code explanations as CodeLens / inlay hints
+- Semantic go-to-definition across indexed repos
+- Live quality annotations in the editor gutter
+- Multi-root workspace support with cross-repo navigation
+- Extension marketplace publishing and auto-update
 
-### Phase 38 — Precise Token Management
+### Phase 38 — Async Web & Real-Time Streaming
 
-Replace rough token estimation with model-specific counting:
+Migrate the web server to a modern async framework:
 
-- `tiktoken` for OpenAI models, model-specific tokenizers for Ollama
-- Accurate context window budgeting with overflow protection
-- Token usage reporting and cost estimation per query
-- Smart context truncation preserving semantic boundaries
+- WebSocket streaming for live search results
+- Non-blocking request handling with connection pooling
+- Server-sent events for long-running operations (indexing progress)
+- Real-time dashboard with quality trends and search analytics
 
-### Phase 39 — LSP 2.0 & Editor Deep Integration
+### Phase 39 — Cross-Language Intelligence
 
-Enhanced editor integration beyond current LSP:
+Unified code intelligence across language boundaries:
 
-- Inline code explanations as CodeLens / inlay hints
-- Semantic go-to-definition across indexed repos
-- Live quality annotations in the editor gutter
-- Multi-root workspace support with cross-repo navigation
+- Cross-language symbol resolution (e.g., Python calling Rust via FFI)
+- Polyglot dependency graphs linking imports across languages
+- Language-aware search boosting (prefer results in the query's context language)
+- Universal call graph spanning multiple languages in a workspace
 
 ## Low Priority (Future)
 
+### Plugin Marketplace & Sandboxing
+
+- Plugin sandboxing with resource limits and restricted filesystem access
+- Community plugin registry with versioning and discovery
+- Plugin dependency resolution and conflict detection
+- Visual plugin configuration in the web UI
+
 ### Fine-Tuned Embedding Models
 
 - Domain-specific vocabulary handling