
Commit f2ae8a3

rodion-m and claude committed
Surface grep_search file-name matching + matchedByName flag (#375)
Pairs with CodeAlive-AI/backend#376. The backend's grep_search now also matches file names/paths for literal queries and flags name-only hits with matchedByName=true (omitted when null via global JsonIgnoreCondition on the .NET side). Previously the MCP layer dropped matchedByName entirely in transform_grep_response, so the new signal never reached LLM agents even though the backend emitted it.

Changes:
- response_transformer.transform_grep_response now forwards matchedByName into the MCP dict output, only when the backend set it (mirrors the backend's omit-on-null wire semantics so content-match responses stay byte-identical to the pre-change shape).
- grep_search tool docstring updated: mentions literal file-name matching, explains the Form.xml use case, documents the matchedByName contract (empty matches, location points at line 1 as a file-level reference; do NOT interpret it as a content match), and flags the Phase 1 limitation that regex=true still only searches content.
- README.md one-line summary of grep_search extended accordingly.
- Unit test test_grep_forwards_matched_by_name_flag asserts name-only hits surface the flag and content hits do not.

Tests: 17/17 response_transformer tests pass (16 pre-existing + 1 new). Full MCP unit suite: 249 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent e0415b5 commit f2ae8a3

4 files changed

Lines changed: 74 additions & 6 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ Once connected, you'll have access to these powerful tools:
 
 1. **`get_data_sources`** - List your indexed repositories and workspaces
 2. **`semantic_search`** - Canonical semantic search across indexed artifacts
-3. **`grep_search`** - Exact text or regex search with line-level matches
+3. **`grep_search`** - Exact literal or regex text search inside file content, plus literal file-name/path matching (returns files like `Form.xml` even when their content never mentions the name), with line-level previews for content matches
 4. **`fetch_artifacts`** - Load the full source for relevant search hits
 5. **`get_artifact_relationships`** - Expand call graph, inheritance, and reference relationships for one artifact
 6. **`chat`** - Slower synthesized codebase Q&A, typically only after search

src/tests/test_response_transformer.py

Lines changed: 46 additions & 0 deletions
@@ -354,3 +354,49 @@ def test_grep_unicode_in_line_text(self):
         line = result["results"][0]["matches"][0]["lineText"]
         assert "ТипШтрихкода" in line
         assert "GS1_DataMatrix" in line
+
+    def test_grep_forwards_matched_by_name_flag(self):
+        """Name-only hits must carry matchedByName=True through to the MCP output
+        so LLM agents can distinguish a file-level name match from a content match.
+        Content hits must NOT include the field (backend omits null via
+        JsonIgnoreCondition.WhenWritingNull; the transformer mirrors that)."""
+        response = {
+            "results": [
+                {
+                    "kind": "File",
+                    "identifier": "biterp/.../Ext/Form.xml",
+                    "location": {
+                        "path": "bsl-checks/src/test/resources/checks/VerifyMetadata/CommonForms/Форма/Ext/Form.xml",
+                        "range": {"start": {"line": 1}, "end": {"line": 1}},
+                    },
+                    "matchCount": 0,
+                    "matches": [],
+                    "matchedByName": True,
+                },
+                {
+                    "kind": "File",
+                    "identifier": "biterp/.../renames.txt",
+                    "location": {"path": "renames.txt"},
+                    "matchCount": 2,
+                    "matches": [
+                        {
+                            "lineNumber": 3,
+                            "startColumn": 1,
+                            "endColumn": 9,
+                            "lineText": "Form.xml -> Form2.xml",
+                        }
+                    ],
+                    # matchedByName intentionally absent — backend omits it for content hits
+                },
+            ]
+        }
+
+        result = transform_grep_response(response)
+
+        assert len(result["results"]) == 2
+        name_only, content_hit = result["results"]
+        assert name_only["matchedByName"] is True
+        assert name_only["matchCount"] == 0
+        assert "matches" not in name_only  # transformer only copies matches when non-empty
+        assert "matchedByName" not in content_hit
+        assert content_hit["matchCount"] == 2

src/tools/search.py

Lines changed: 21 additions & 5 deletions
@@ -239,16 +239,20 @@ async def grep_search(
     regex: bool = False,
 ) -> Dict[str, Any]:
     """
-    Search indexed code by exact text or regex — finds code containing
-    a specific string.
+    Search indexed code by exact text or regex — matches file content
+    and, for literal queries, also file names/paths.
 
     Use this when you know WHAT TEXT to look for: an identifier, an error
-    message, a config key, a literal string that must appear in the source.
+    message, a config key, or a file whose name you know (even if nothing
+    inside the file references that name — 1C `Form.xml`, `.mdo`, config
+    XML, media files, etc.).
 
     **When to use grep_search:**
     - Specific identifiers: class/function/variable names, domain events
       (e.g. `RepositoryDeleted`, `handlePayment`, `AUTH_PROVIDERS`)
     - Literal strings: error messages, URLs, config keys, file paths
+    - File names whose content may never contain their own name
+      (e.g. `Form.xml`, `schema.graphql`, `appsettings.json`)
     - Import paths, TODO/FIXME comments, annotations
     - Regex patterns: `def test_.*async`, `Status\\.(Alive|Failed)`
     - Finding ALL occurrences of a known symbol across the codebase
@@ -276,16 +280,23 @@ async def grep_search(
         max_results: Maximum number of results to return (1–500).
 
         regex: If True, treat `query` as a regex pattern. Default: False (literal).
+            **Regex currently matches file content only** — file-name/path
+            matching is literal-substring only. This is a known limitation.
 
     Returns:
         {"results": [...], "hint": "..."}
 
         Each result contains:
         - path: file path
         - identifier: pass to `fetch_artifacts` for full source
-        - matchCount: total matches in this file
+        - matchCount: total matches in this file (0 for file-name-only hits)
         - matches: array of line-level hits, each with:
             - lineNumber, startColumn, endColumn, lineText
+        - matchedByName: present and `true` only when the artifact matched
+          by its file name/path and has no content match. In that case
+          `matches` is empty and `location.line` defaults to 1 as a
+          file-level reference — do NOT interpret `location.line` as an
+          actual line match. Content-match results omit this field.
 
         The `hint` reminds you that line previews are evidence only — load
         full source via `fetch_artifacts` or local `Read()` before reasoning.
@@ -295,7 +306,12 @@ async def grep_search(
            grep_search(query="ConnectionString",
                        data_sources=["backend"])
 
-        2. Regex search for test methods:
+        2. Find a file by name (returns the file even if nothing inside
+           it references `Form.xml`):
+           grep_search(query="Form.xml",
+                       data_sources=["biterp-bsl"])
+
+        3. Regex search for test methods (content only):
            grep_search(query="def test_.*auth",
                        data_sources=["backend"],
                        extensions=[".py"],
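The matchedByName contract described in the docstring above could be consumed on the agent side roughly as follows. This is a minimal sketch, not part of the commit: `split_grep_results` is a hypothetical helper, and the sample response is made up to follow the result shape the docstring lists (`path`, `identifier`, `matchCount`, `matches`, `matchedByName`).

```python
from typing import Any, Dict, List, Tuple


def split_grep_results(
    response: Dict[str, Any],
) -> Tuple[List[str], List[Tuple[str, int, str]]]:
    """Separate file-name-only hits from line-level content hits (hypothetical helper)."""
    name_only_paths: List[str] = []
    content_lines: List[Tuple[str, int, str]] = []
    for result in response.get("results", []):
        if result.get("matchedByName"):
            # Name-only hit: there is no line evidence, so keep just the path.
            name_only_paths.append(result["path"])
            continue
        # Content hit: the field is absent, and each match carries a real line.
        for match in result.get("matches", []):
            content_lines.append(
                (result["path"], match["lineNumber"], match["lineText"])
            )
    return name_only_paths, content_lines


# Made-up response shaped like the docstring's description of grep_search output.
sample = {
    "results": [
        {"path": "Ext/Form.xml", "identifier": "repo/Ext/Form.xml",
         "matchCount": 0, "matchedByName": True},
        {"path": "renames.txt", "identifier": "repo/renames.txt",
         "matchCount": 1,
         "matches": [{"lineNumber": 3, "startColumn": 1, "endColumn": 9,
                      "lineText": "Form.xml -> Form2.xml"}]},
    ],
    "hint": "...",
}

name_only_paths, content_lines = split_grep_results(sample)
```

Treating a missing key and a falsy value the same way keeps the caller compatible with responses that omit the field for content hits.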

src/utils/response_transformer.py

Lines changed: 6 additions & 0 deletions
@@ -95,6 +95,12 @@ def transform_grep_response(grep_results: Dict[str, Any]) -> Dict[str, Any]:
             item["matches"] = [
                 _build_match_dict(match) for match in result["matches"]
             ]
+        # Forward matchedByName only when the backend set it (name-only hits).
+        # The backend omits the field for content matches via System.Text.Json
+        # WhenWritingNull, so `get("matchedByName")` is None/missing for those
+        # and we skip it here to keep the happy path free of an extra key.
+        if result.get("matchedByName"):
+            item["matchedByName"] = True
         formatted_results.append(item)
 
     if not formatted_results:
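The omit-on-null forwarding in this hunk can be exercised in isolation. The sketch below is a simplified stand-in, assuming a flat result dict; `forward_matched_by_name` is a hypothetical name and does not exist in the repository, where the check lives inline in `transform_grep_response`.

```python
from typing import Any, Dict


def forward_matched_by_name(result: Dict[str, Any], item: Dict[str, Any]) -> Dict[str, Any]:
    """Copy matchedByName into item only when the backend sent a truthy value."""
    # Missing, None, and False all fall through, so content hits gain no extra key.
    if result.get("matchedByName"):
        item["matchedByName"] = True
    return item


name_only = forward_matched_by_name({"matchedByName": True}, {"matchCount": 0})
content_hit = forward_matched_by_name({"matchCount": 2}, {"matchCount": 2})
explicit_null = forward_matched_by_name({"matchedByName": None}, {"matchCount": 1})
```

Because the check is on truthiness, an explicit `false` from the backend would also be dropped, which matches the docstring's statement that the field is present only when `true`.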
