
Commit 1e859d7

rodion-m and claude committed
Sharpen semantic vs grep search responsibility boundary
- SKILL.md: table and "When to Use" section now clearly separate semantic search (default, by meaning) from grep (exact text/regex). Tool reference descriptions updated.
- search.py: docstring marks it as "the default discovery tool", empty-result message guides agent to rephrase or try grep.
- grep.py: docstring explains when to use exact text search, empty-result message guides agent to check case or try semantic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 0656df7 commit 1e859d7
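
The fallback guidance in the new empty-result messages (rephrase, or switch to the other tool) could be followed by an agent roughly as below. This is only a sketch and not part of the commit: `discover`, `semantic_search`, and `grep_search` are hypothetical names standing in for however the agent actually invokes `search.py` and `grep.py`.

```python
from typing import Callable, List, Optional

# Hypothetical signature: a searcher takes a query string and returns hits.
Searcher = Callable[[str], List[str]]


def discover(
    description: str,
    semantic_search: Searcher,
    grep_search: Searcher,
    exact_text: Optional[str] = None,
) -> List[str]:
    """Start with semantic search; fall back to grep only if exact text is known."""
    hits = semantic_search(description)
    if hits:
        return hits
    # An empty result does not mean the code is absent: try the other tool
    # when there is a concrete identifier or literal to look for.
    if exact_text is not None:
        return grep_search(exact_text)
    return []


if __name__ == "__main__":
    # Stub searchers so the sketch runs without the real scripts.
    def fake_semantic(query: str) -> List[str]:
        return []

    def fake_grep(query: str) -> List[str]:
        return ["src/auth/service.py:12"] if query == "AuthService" else []

    print(discover("how are users authenticated", fake_semantic, fake_grep,
                   exact_text="AuthService"))
```

Passing the searchers in as callables keeps the sketch independent of how the scripts are launched, which the commit does not specify.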

3 files changed

Lines changed: 41 additions & 12 deletions


skills/codealive-context-engine/SKILL.md

Lines changed: 17 additions & 8 deletions
@@ -38,8 +38,8 @@ Do NOT retry the failed script until setup completes successfully.
 | Tool | Script | Speed | Cost | Best For |
 |------|--------|-------|------|----------|
 | **List Data Sources** | `datasources.py` | Instant | Free | Discovering indexed repos and workspaces |
-| **Semantic Search** | `search.py` | Fast | Low | Finding relevant artifacts by meaning |
-| **Grep Search** | `grep.py` | Fast | Low | Exact text and regex matches with line previews |
+| **Semantic Search** | `search.py` | Fast | Low | Default discovery — finds code by meaning (concepts, behavior, architecture) |
+| **Grep Search** | `grep.py` | Fast | Low | Finds code containing a specific string or regex (identifiers, literals, patterns) |
 | **Fetch Artifacts** | `fetch.py` | Fast | Low | Retrieving full content for search results |
 | **Artifact Relationships** | `relationships.py` | Fast | Low | Drilling into call graph, inheritance, references for one artifact |
 | **Chat with Codebase** | `chat.py` | Slow | High | Synthesized answers, architectural explanations |
@@ -64,12 +64,18 @@ or references.
 
 ## When to Use
 
-**Use this skill for semantic understanding:**
+**Semantic search (default) — you describe behavior or concept:**
 - "How is authentication implemented?"
 - "Show me error handling patterns across services"
 - "How does this library work internally?"
 - "Find similar features to guide my implementation"
 
+**Grep search — you know the exact text:**
+- "Find all usages of `RepositoryDeleted`"
+- "Where is `ConnectionString` configured?"
+- "Search for `TODO: fix` across the codebase"
+- Error messages, URLs, config keys, import paths, regex patterns
+
 **Use local file tools instead for:**
 - Finding specific files by name or pattern
 - Exact keyword search in the current directory
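
The split described in this hunk can be approximated by a small routing heuristic. The following is a minimal sketch, not something the commit adds: `choose_tool` and its patterns are hypothetical, and a real agent would more likely decide from context than from regexes.

```python
import re

# Hypothetical heuristic, not part of the CodeAlive skill: guess whether a
# query names exact text (grep.py) or describes behavior (search.py).
_EXACT_TEXT_PATTERNS = [
    re.compile(r"`[^`]+`"),                                 # backtick-quoted literal
    re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b"),   # CamelCase identifier
    re.compile(r"\b[a-z0-9]+(?:_[a-z0-9]+)+\b"),            # snake_case identifier
    re.compile(r"https?://\S+"),                            # URL
    re.compile(r"\bTODO\b|\bFIXME\b"),                      # comment markers
]


def choose_tool(query: str) -> str:
    """Return "grep.py" when the query contains exact text, else "search.py"."""
    if any(pattern.search(query) for pattern in _EXACT_TEXT_PATTERNS):
        return "grep.py"
    return "search.py"  # semantic search is the default


if __name__ == "__main__":
    for q in (
        "How is authentication implemented?",
        "Find all usages of `RepositoryDeleted`",
        "Where is ConnectionString configured?",
    ):
        print(f"{choose_tool(q):10} <- {q}")
```

Defaulting to `search.py` mirrors the "default discovery" wording in the table; grep is chosen only when the query carries something that looks like a literal.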
@@ -129,9 +135,11 @@ python scripts/datasources.py --all # All (including processing)
 python scripts/datasources.py --json # JSON output
 ```
 
-### `search.py` — Semantic Code Search
+### `search.py` — Semantic Code Search (default discovery tool)
 
-Returns file paths, line numbers, descriptions, identifiers, and content sizes. Fast and cheap.
+The default starting point. Finds code by WHAT it does — concepts, behavior,
+architecture — not by exact text. Use when you can describe what you're
+looking for but don't know the exact names in the codebase.
 
 ```bash
 python scripts/search.py <query> <data_sources...> [options]
@@ -150,10 +158,11 @@ source: use `fetch.py <identifier>` for external repos, or your editor's
 file-read tool on the path for repos in the current working directory. Treat
 only that real `content` as ground truth.
 
-### `grep.py` — Exact / Regex Search
+### `grep.py` — Exact Text / Regex Search
 
-Returns artifact-level matches with line previews. Use this when the pattern
-itself matters more than semantic similarity.
+Finds code containing a specific string or regex pattern. Use when you know
+the exact text to look for: identifiers, error messages, config keys, URLs,
+domain events, import paths, TODO comments.
 
 ```bash
 python scripts/grep.py <query> <data_sources...> [--regex] [--max-results N] [--path PATH] [--ext EXT]

skills/codealive-context-engine/scripts/grep.py

Lines changed: 12 additions & 2 deletions
@@ -1,6 +1,10 @@
 #!/usr/bin/env python3
 """
-CodeAlive Grep Search - exact text or regex search across indexed repositories.
+CodeAlive Grep Search — exact text or regex search across indexed repositories.
+
+Finds code containing a specific string or pattern. Use when you know the
+exact identifier, error message, config key, or regex to match.
+For concept-based discovery, use search.py instead.
 
 Usage:
 python grep.py "AuthService" my-repo
@@ -19,7 +23,13 @@
 def format_grep_results(results: dict) -> str:
     items = results.get("results", []) if isinstance(results, dict) else []
     if not items:
-        return "No results found."
+        return (
+            "No grep matches found. This does NOT mean the code doesn't exist.\n"
+            "Try: (1) check case — grep is case-sensitive by default; "
+            "(2) use search.py for concept-based discovery if unsure of exact naming; "
+            "(3) check that the data source is correct (run datasources.py); "
+            "(4) remove --path/--ext filters if used."
+        )
 
     output = []
     for idx, result in enumerate(items, 1):

skills/codealive-context-engine/scripts/search.py

Lines changed: 12 additions & 2 deletions
@@ -1,6 +1,10 @@
 #!/usr/bin/env python3
 """
-CodeAlive Semantic Search - semantic retrieval across indexed repositories.
+CodeAlive Semantic Search — the default discovery tool.
+
+Finds code by meaning (concepts, behavior, architecture), not by exact text.
+Use when you can describe WHAT the code does but don't know exact names.
+For exact identifiers, literals, or regex, use grep.py instead.
 
 Usage:
 python search.py "How is authentication handled?" my-repo
@@ -36,7 +40,13 @@ def format_search_results(results: dict) -> str:
         items = [results]
 
     if not items:
-        return "No results found."
+        return (
+            "No results found. This does NOT mean the code doesn't exist.\n"
+            "Try: (1) rephrase with synonyms or broader terms; "
+            "(2) use grep.py if you know a specific identifier or literal string; "
+            "(3) check that the data source is correct (run datasources.py); "
+            "(4) remove --path/--ext filters if used."
+        )
 
     output = []
     for idx, result in enumerate(items, 1):
