CodeAlive-AI · rodion-m · Apr 13, 2026 · Apr 8, 2026 · Apr 8, 2026 · gemini-code-assist
diff --git a/README.md b/README.md
@@ -81,8 +81,10 @@ The key is stored once and shared across all agents on the same machine.
 Start your agent and ask naturally:
 
 - *"How is authentication implemented?"*
+- *"Find the exact regex or string match for this token parser"*
 - *"Show me error handling patterns across services"*
 - *"Find similar features to guide my implementation"*
+- *"Show me who calls this handler and what it depends on"*
 
 No special commands needed — the agent picks up the skill automatically.
 

diff --git a/skills/codealive-context-engine/SKILL.md b/skills/codealive-context-engine/SKILL.md
@@ -38,7 +38,8 @@ Do NOT retry the failed script until setup completes successfully.
 | Tool | Script | Speed | Cost | Best For |
 |------|--------|-------|------|----------|
 | **List Data Sources** | `datasources.py` | Instant | Free | Discovering indexed repos and workspaces |
-| **Search** | `search.py` | Fast | Low | Finding code locations, descriptions, identifiers |
+| **Semantic Search** | `search.py` | Fast | Low | Finding relevant artifacts by meaning |
+| **Grep Search** | `grep.py` | Fast | Low | Exact text and regex matches with line previews |
 | **Fetch Artifacts** | `fetch.py` | Fast | Low | Retrieving full content for search results |
 | **Artifact Relationships** | `relationships.py` | Fast | Low | Drilling into call graph, inheritance, references for one artifact |
 | **Chat with Codebase** | `chat.py` | Slow | High | Synthesized answers, architectural explanations |
@@ -85,8 +86,9 @@ python scripts/datasources.py
 
 ```bash
 python scripts/search.py "JWT token validation" my-backend
-python scripts/search.py "error handling patterns" workspace:platform-team --mode deep
-python scripts/search.py "authentication flow" my-repo --description-detail full
+python scripts/search.py "authentication flow" my-repo --path src/auth --ext .py
+python scripts/grep.py "AuthService" my-repo
+python scripts/grep.py "auth\\(" my-repo --regex
 ```
 
 ### 3. Fetch full content (for external repos)
@@ -135,11 +137,9 @@ python scripts/search.py <query> <data_sources...> [options]
 
 | Option | Description |
 |--------|-------------|
-| `--mode auto` | Default. Intelligent semantic search — use 80% of the time |
-| `--mode fast` | Quick lexical search for known terms |
-| `--mode deep` | Exhaustive search for complex cross-cutting queries. Resource-intensive |
-| `--description-detail short` | Default. Brief description of each result |
-| `--description-detail full` | More detailed description of each result |
+| `--max-results N` | Optional cap for the number of returned artifacts |
+| `--path PATH` | Repo-relative path or directory scope (repeatable) |
+| `--ext EXT` | File extension scope such as `.py` or `.ts` (repeatable) |
 
 **`description` is a triage pointer ONLY** — it tells you which artifacts are
 worth a closer look. It is NOT the source of truth and you must NOT draw
@@ -148,6 +148,25 @@ source: use `fetch.py <identifier>` for external repos, or your editor's
 file-read tool on the path for repos in the current working directory. Treat
 only that real `content` as ground truth.
 
+### `grep.py` — Exact / Regex Search
+
+Returns artifact-level matches with line previews. Use this when the pattern
+itself matters more than semantic similarity.
+
+```bash
+python scripts/grep.py <query> <data_sources...> [--regex] [--max-results N] [--path PATH] [--ext EXT]
+```
+
+| Option | Description |
+|--------|-------------|
+| `--regex` | Interpret the query as a regex pattern |
+| `--max-results N` | Optional cap for the number of returned artifacts |
+| `--path PATH` | Repo-relative path or directory scope (repeatable) |
+| `--ext EXT` | File extension scope such as `.py` or `.ts` (repeatable) |
+
+Line previews are still search evidence, not source of truth. Use `fetch.py`
+or your local file-read tool before drawing conclusions about behavior.
+
 ### `fetch.py` — Fetch Artifact Content
 
 Retrieves the full source code content for artifacts found via search. Use this for external repositories you cannot access locally.
@@ -270,7 +289,7 @@ This skill works standalone, but delivers the best experience when combined with
 | Component | What it provides |
 |-----------|-----------------|
 | **This skill** | Query patterns, workflow guidance, cost-aware tool selection |
-| **MCP server** | Direct `codebase_search`, `fetch_artifacts`, `get_artifact_relationships`, `codebase_consultant`, `get_data_sources` tools |
+| **MCP server** | Direct `semantic_search`, `grep_search`, `fetch_artifacts`, `get_artifact_relationships`, `codebase_consultant`, `get_data_sources` tools |
 
 When both are installed, prefer the MCP server's tools for direct operations and this skill's scripts for guided workflows.
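
Review note: the difference between a plain query and `--regex` in the `grep.py` section can be illustrated locally. The sketch below is not the CodeAlive backend. It is a minimal stand-in that assumes literal queries are matched verbatim (here via `re.escape`) and that columns are 1-based, which the diff does not confirm. `grep_lines` and `SAMPLE` are hypothetical names; only the record keys (`lineNumber`, `startColumn`, `endColumn`, `lineText`) come from `format_grep_results`.

```python
import re
from typing import Dict, List


def grep_lines(text: str, query: str, regex: bool = False) -> List[Dict[str, object]]:
    """Return one record per match, using the field names that grep.py prints."""
    # Literal queries are escaped so characters such as "(" match themselves.
    pattern = re.compile(query if regex else re.escape(query))
    matches: List[Dict[str, object]] = []
    for line_number, line_text in enumerate(text.splitlines(), start=1):
        for m in pattern.finditer(line_text):
            matches.append({
                "lineNumber": line_number,
                "startColumn": m.start() + 1,  # assumption: 1-based columns
                "endColumn": m.end(),          # assumption: inclusive end column
                "lineText": line_text,
            })
    return matches


SAMPLE = "def auth(user):\n    return AuthService().check(user)\n"

# Plain text search: "(" needs no escaping because the query is literal.
literal_hits = grep_lines(SAMPLE, "AuthService")

# Regex search: "\(" is required to match a literal parenthesis.
regex_hits = grep_lines(SAMPLE, r"auth\(", regex=True)
```

As with the real previews, these records only locate a match. Reading the surrounding source is still needed before reasoning about behavior.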
 

diff --git a/skills/codealive-context-engine/scripts/fetch.py b/skills/codealive-context-engine/scripts/fetch.py
@@ -15,7 +15,7 @@
     # Fetch multiple artifacts
     python fetch.py "my-org/backend::src/auth.py::login" "my-org/backend::src/utils.py::helper"
 
-Identifiers come from codebase_search results (the `identifier` field).
+Identifiers come from semantic/grep search results (the `identifier` field).
 The format is: {owner/repo}::{path}::{symbol} (for symbols/chunks)
                {owner/repo}::{path} (for files)
 

diff --git a/skills/codealive-context-engine/scripts/grep.py b/skills/codealive-context-engine/scripts/grep.py
@@ -0,0 +1,115 @@
+#!/usr/bin/env python3
+"""
+CodeAlive Grep Search - exact text or regex search across indexed repositories.
+
+Usage:
+    python grep.py "AuthService" my-repo
+    python grep.py "auth\\(" my-repo --regex --max-results 25
+    python grep.py "TODO" workspace:backend-team --path src --ext .py
+"""
+
+import sys
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).parent / "lib"))
+
+from api_client import CodeAliveClient
+
+
+def format_grep_results(results: dict) -> str:
+    items = results.get("results", []) if isinstance(results, dict) else []
+    if not items:
+        return "No results found."
+
+    output = []
+    for idx, result in enumerate(items, 1):
+        location = result.get("location", {})
+        file_path = location.get("path") or result.get("path")
+        matches = result.get("matches", [])
+
+        output.append(f"\n--- Result #{idx} [{result.get('kind', 'Artifact')}] ---")
+        if file_path:
+            output.append(f"  File: {file_path}")
+        if result.get("identifier"):
+            output.append(f"  Identifier: {result['identifier']}")
+        if result.get("matchCount") is not None:
+            output.append(f"  Match count: {result['matchCount']}")
+
+        for match in matches:
+            output.append(
+                "  "
+                f"{match.get('lineNumber', '?')}:{match.get('startColumn', '?')}-"
+                f"{match.get('endColumn', '?')}  {match.get('lineText', '')}"
+            )
+
+    output.append(
+        "\nHint: match previews are search evidence only. Fetch the full source "
+        "with `python fetch.py <identifier>` or read the local file before reasoning about behavior."
+    )
+    return "\n".join(output)
+
+
+def main():
+    if len(sys.argv) < 3:
+        print("Error: Missing required arguments.", file=sys.stderr)
+        print(
+            "Usage: python grep.py <query> <data_source> [data_source2...] "
+            "[--regex] [--max-results N] [--path PATH] [--ext EXT]",
+            file=sys.stderr,
+        )
+        sys.exit(1)
+
+    query = sys.argv[1]
+    data_sources = []
+    paths = []
+    extensions = []
+    max_results = None
+    regex = False
+
+    i = 2
+    while i < len(sys.argv):
+        arg = sys.argv[i]
+        if arg == "--regex":
+            regex = True
+            i += 1
+        elif arg == "--max-results" and i + 1 < len(sys.argv):
+            max_results = int(sys.argv[i + 1])
+            i += 2
-        elif arg == "--max-results" and i + 1 < len(sys.argv):
-            max_results = int(sys.argv[i + 1])
-            i += 2
+        elif arg == "--max-results" and i + 1 < len(sys.argv):
+            try:
+                max_results = int(sys.argv[i + 1])
+            except ValueError:
+                print(f"Error: --max-results must be an integer, got '{sys.argv[i + 1]}'", file=sys.stderr)
+                sys.exit(1)
+            i += 2
+        elif arg == "--path" and i + 1 < len(sys.argv):
+            paths.append(sys.argv[i + 1])
+            i += 2
+        elif arg == "--ext" and i + 1 < len(sys.argv):
+            extensions.append(sys.argv[i + 1])
+            i += 2
+        elif arg == "--help":
+            print(__doc__)
+            sys.exit(0)
+        else:
+            data_sources.append(arg)
+            i += 1
+
+    if not data_sources:
+        print(
+            "Error: At least one data source is required. Run datasources.py to see available sources.",
+            file=sys.stderr,
+        )
+        sys.exit(1)
+
+    try:
+        client = CodeAliveClient()
+        results = client.grep_search(
+            query=query,
+            data_sources=data_sources,
+            paths=paths or None,
+            extensions=extensions or None,
+            max_results=max_results,
+            regex=regex,
+        )
+        print(format_grep_results(results))
+    except Exception as e:
+        print(f"Error: {e}", file=sys.stderr)
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
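
Review note: the hand-rolled loop in `main()` silently treats a trailing `--path`, `--ext`, or `--max-results` with no value as a data source, because the `i + 1 < len(sys.argv)` guard fails and the argument falls through to the `else` branch. One possible alternative, sketched below, is to let `argparse` handle the same flags. This is a suggestion rather than what the PR does; `build_parser` is a hypothetical name, and the `dest` names are chosen to mirror the local variables in `grep.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="grep.py",
        description="Exact text or regex search across indexed repositories.",
    )
    parser.add_argument("query", help="Text or regex pattern to search for")
    parser.add_argument("data_sources", nargs="+", help="Repository or workspace names")
    parser.add_argument("--regex", action="store_true", help="Interpret the query as a regex")
    # type=int makes argparse reject non-integers with a usage error (exit code 2).
    parser.add_argument("--max-results", type=int, default=None)
    # action="append" keeps --path and --ext repeatable; None when never given.
    parser.add_argument("--path", dest="paths", action="append", default=None)
    parser.add_argument("--ext", dest="extensions", action="append", default=None)
    return parser


args = build_parser().parse_args(
    ["TODO", "workspace:backend-team", "--path", "src", "--ext", ".py"]
)
```

A missing option value or a non-integer `--max-results` then produces a usage error instead of being passed on as a data source name. Note that plain `parse_args` expects the data sources to be contiguous; `parse_intermixed_args` would be needed to accept data sources on both sides of an option.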
diff --git a/skills/codealive-context-engine/scripts/lib/api_client.py b/skills/codealive-context-engine/scripts/lib/api_client.py
@@ -268,6 +268,52 @@ def search(
         }
         return self._make_request("GET", "/api/search", params=params)
 
+    def semantic_search(
+        self,
+        query: str,
+        data_sources: List[str],
+        paths: Optional[List[str]] = None,
+        extensions: Optional[List[str]] = None,
+        max_results: Optional[int] = None,
+    ) -> Dict[str, Any]:
+        """Search indexed artifacts semantically using the canonical API."""
+        params: Dict[str, Any] = {
+            "Query": query,
+            "Names": data_sources,
+        }
+        if paths:
+            params["Paths"] = paths
+        if extensions:
+            params["Extensions"] = extensions
+        if max_results is not None:
+            params["MaxResults"] = max_results
+
+        return self._make_request("GET", "/api/search/semantic", params=params)
+
+    def grep_search(
+        self,
+        query: str,
+        data_sources: List[str],
+        paths: Optional[List[str]] = None,
+        extensions: Optional[List[str]] = None,
+        max_results: Optional[int] = None,
+        regex: bool = False,
+    ) -> Dict[str, Any]:
+        """Search indexed artifacts by exact text or regex using the canonical API."""
+        params: Dict[str, Any] = {
+            "Query": query,
+            "Names": data_sources,
+            "Regex": str(regex).lower(),
+        }
+        if paths:
+            params["Paths"] = paths
+        if extensions:
+            params["Extensions"] = extensions
+        if max_results is not None:
+            params["MaxResults"] = max_results
+
+        return self._make_request("GET", "/api/search/grep", params=params)
+
     def fetch_artifacts(
         self,
         identifiers: List[str],
@@ -393,6 +439,8 @@ def main():
         print("Commands:")
         print("  datasources [--all]")
         print("  search <query> <data_source1> [data_source2...] [--mode auto|fast|deep] [--description-detail short|full]")
+        print("  semantic-search <query> <data_source1> [data_source2...] [--path PATH] [--ext EXT] [--max-results N]")
+        print("  grep-search <query> <data_source1> [data_source2...] [--regex] [--path PATH] [--ext EXT] [--max-results N]")
         print("  fetch <identifier1> [identifier2...]")
         print("  relationships <identifier> [--profile callsOnly|inheritanceOnly|allRelevant|referencesOnly] [--max-count N]")
         print("  chat <question> <data_source1> [data_source2...] [--conversation-id ID]")
@@ -433,6 +481,83 @@ def main():
             result = client.search(query, data_sources, mode, description_detail)
             print(json.dumps(result, indent=2))
 
+        elif command == "semantic-search":
+            if len(sys.argv) < 4:
+                print("Usage: semantic-search <query> <data_source1> [data_source2...] [--path PATH] [--ext EXT] [--max-results N]")
+                sys.exit(1)
+
+            query = sys.argv[2]
+            data_sources = []
+            paths = []
+            extensions = []
+            max_results = None
+
+            i = 3
+            while i < len(sys.argv):
+                arg = sys.argv[i]
+                if arg == "--path" and i + 1 < len(sys.argv):
+                    paths.append(sys.argv[i + 1])
+                    i += 2
+                elif arg == "--ext" and i + 1 < len(sys.argv):
+                    extensions.append(sys.argv[i + 1])
+                    i += 2
+                elif arg == "--max-results" and i + 1 < len(sys.argv):
+                    max_results = int(sys.argv[i + 1])
+                    i += 2
-                elif arg == "--max-results" and i + 1 < len(sys.argv):
-                    max_results = int(sys.argv[i + 1])
-                    i += 2
+                elif arg == "--max-results" and i + 1 < len(sys.argv):
+                    try:
+                        max_results = int(sys.argv[i + 1])
+                    except ValueError:
+                        print(f"Error: --max-results must be an integer, got '{sys.argv[i + 1]}'", file=sys.stderr)
+                        sys.exit(1)
+                    i += 2
+                else:
+                    data_sources.append(arg)
+                    i += 1
+
+            result = client.semantic_search(
+                query,
+                data_sources,
+                paths=paths or None,
+                extensions=extensions or None,
+                max_results=max_results,
+            )
+            print(json.dumps(result, indent=2))
+
+        elif command == "grep-search":
+            if len(sys.argv) < 4:
+                print("Usage: grep-search <query> <data_source1> [data_source2...] [--regex] [--path PATH] [--ext EXT] [--max-results N]")
+                sys.exit(1)
+
+            query = sys.argv[2]
+            data_sources = []
+            paths = []
+            extensions = []
+            max_results = None
+            regex = False
+
+            i = 3
+            while i < len(sys.argv):
+                arg = sys.argv[i]
+                if arg == "--regex":
+                    regex = True
+                    i += 1
+                elif arg == "--path" and i + 1 < len(sys.argv):
+                    paths.append(sys.argv[i + 1])
+                    i += 2
+                elif arg == "--ext" and i + 1 < len(sys.argv):
+                    extensions.append(sys.argv[i + 1])
+                    i += 2
+                elif arg == "--max-results" and i + 1 < len(sys.argv):
+                    max_results = int(sys.argv[i + 1])
+                    i += 2
-                elif arg == "--max-results" and i + 1 < len(sys.argv):
-                    max_results = int(sys.argv[i + 1])
-                    i += 2
+                elif arg == "--max-results" and i + 1 < len(sys.argv):
+                    try:
+                        max_results = int(sys.argv[i + 1])
+                    except ValueError:
+                        print(f"Error: --max-results must be an integer, got '{sys.argv[i + 1]}'", file=sys.stderr)
+                        sys.exit(1)
+                    i += 2
+                else:
+                    data_sources.append(arg)
+                    i += 1
+
+            result = client.grep_search(
+                query,
+                data_sources,
+                paths=paths or None,
+                extensions=extensions or None,
+                max_results=max_results,
+                regex=regex,
+            )
+            print(json.dumps(result, indent=2))
+
         elif command == "fetch":
             if len(sys.argv) < 3:
                 print("Usage: fetch <identifier1> [identifier2...]")
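
Review note: `semantic_search` and `grep_search` pass list values (`Names`, `Paths`, `Extensions`) as GET parameters, and `_make_request` is not part of this diff, so how those lists are serialized is not visible here. The sketch below shows one common encoding, repeated query keys via `urllib.parse.urlencode(..., doseq=True)`. Whether `_make_request` or the `/api/search/grep` endpoint actually uses this form is an assumption. `build_grep_params` is a hypothetical helper that mirrors the parameter dict built in `grep_search`.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode


def build_grep_params(
    query: str,
    data_sources: List[str],
    paths: Optional[List[str]] = None,
    extensions: Optional[List[str]] = None,
    max_results: Optional[int] = None,
    regex: bool = False,
) -> Dict[str, Any]:
    """Mirror the params dict assembled in grep_search()."""
    params: Dict[str, Any] = {
        "Query": query,
        "Names": data_sources,
        "Regex": str(regex).lower(),  # "true" / "false", as in the diff
    }
    if paths:
        params["Paths"] = paths
    if extensions:
        params["Extensions"] = extensions
    if max_results is not None:
        params["MaxResults"] = max_results
    return params


params = build_grep_params(
    r"auth\(", ["my-repo", "workspace:backend-team"], extensions=[".py"], regex=True
)
# doseq=True expands each list into repeated keys (Names=...&Names=...).
query_string = urlencode(params, doseq=True)
```

Without `doseq=True`, `urlencode` would serialize each list as its Python `repr`, so it is worth confirming that the client encodes list parameters in the form the API expects.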