AGENTS.md: 7 additions & 3 deletions
@@ -148,9 +148,12 @@ If no new rule is detected -> do not update the file.
 - Implement code and tests together for every behavior change.
 - Keep the gateway reusable as a NuGet library, not as an app-specific host.
 - Preserve one public execution surface for local `AITool` instances and MCP tools.
-- Preserve one searchable catalog that supports vector ranking when embeddings are available and lexical fallback when they are not.
+- Preserve one searchable catalog that uses Markdown-LD graph ranking by default and supports vector ranking only when embeddings are explicitly selected.
+- Tool search must support sparse high-confidence selection plus an explicit related/next-step expansion path; do not make consumers pass the full tool catalog when a smaller capability set can answer the request.
 - For multilingual or noisy search inputs, prefer a generic English-normalization step before ranking when an AI/query-rewrite component is available, because the user wants the searchable representation to converge to English instead of relying only on language-specific token overlap.
 - Keep meta-tools available through `McpGatewayToolSet` and `IMcpGateway.CreateMetaTools(...)`.
+- When Markdown-LD graph search is selected, startup or explicit index initialization must build and validate the tool graph before search/tool discovery so LLM-facing MCP tool selection is based on the correct focused graph.
+- Markdown-LD graph search must support both startup-generated graphs and filesystem-provided graph files; tests for file-backed graph mode must generate the graph fixture through the package flow rather than relying on a hand-authored static artifact.
 - If a user adds or corrects a persistent workflow rule, update `AGENTS.md` first and only then continue with the task.
 
 ### Repository Layout
@@ -209,7 +212,7 @@ If no new rule is detected -> do not update the file.
 - local tool indexing and invocation
 - MCP tool indexing and invocation
 - vector search behavior
-- lexical fallback behavior
+- Markdown-LD graph search and vector-to-graph fallback behavior
 - Keep embedding-based search covered with deterministic local tests by using a fake or test-only embedding generator.
 - Keep request context behavior covered when search or invocation consumes contextual inputs.
 - Do not remove tests to get green builds.
@@ -252,7 +255,8 @@ If no new rule is detected -> do not update the file.
 - Prefer direct generic DI registrations such as `services.TryAddSingleton<IService, Implementation>()` over lambda alias registrations when wiring package services, because the lambda style has already been called out as unreadable and error-prone in this repository.
 - Keep runtime services DI-native from their public/internal constructors; types such as `McpGatewayRegistry` must be creatable through `IOptions<McpGatewayOptions>` and other DI-managed dependencies rather than ad-hoc state-only constructors, because the package design requires services to live fully inside the container.
 - When emitting package identity to external protocols such as MCP client info, never hardcode a fake version string; use the actual assembly/build version so runtime metadata stays aligned with the package being shipped.
-- For search-quality improvements, prefer mathematical or statistical ranking changes over hardcoded phrase lists or ad-hoc query text hacks, because the user explicitly wants tokenizer search to improve through general scoring behavior rather than manual exceptions.
+- For search-quality improvements, prefer mathematical, statistical, or graph-ranking changes over hardcoded phrase lists or ad-hoc query text hacks, because the user explicitly wants token-distance search to improve through general scoring behavior rather than manual exceptions.
+- Do not keep a separate local tokenizer search path when `ManagedCode.MarkdownLd.Kb` already provides token-based graph search; route tokenizer-backed retrieval through Markdown-LD so the package does not carry duplicate ranking implementations.
 - Prefer framework-provided in-memory caching primitives such as `IMemoryCache` over custom process-local storage implementations when they cover the lifecycle and lookup needs, because self-rolled memory stores age poorly and make scaling/concurrency behavior harder to trust.
 - Never keep legacy compatibility shims, obsolete paths, or lingering documentation references to removed implementations when a replacement is accepted, because this repository should converge on the current design instead of carrying dead historical baggage.
 - Never leave `ManagedCode`-prefixed DI/setup extension method names such as `AddManagedCodeMcpGateway(...)` in the public API once concise `McpGateway` naming is available, because these branded leftovers make the package surface inconsistent and read like stale legacy.
 - one gateway for local `AITool` instances and MCP tools
-- one search surface with vector ranking when embeddings are available and lexical fallback when they are not
+- one search surface with default Markdown-LD graph ranking and opt-in vector ranking
 - one invoke surface for both local tools and MCP tools
 - runtime registration through `IMcpGatewayRegistry`
 - reusable gateway meta-tools for chat clients and agents
@@ -75,8 +75,9 @@ var invoke = await gateway.InvokeAsync(new McpGatewayInvokeRequest(
 
 Important defaults:
 
-- search is `Auto` by default
-- `Auto` uses embeddings when available and lexical fallback otherwise
+- search is `Graph` by default
+- graph search uses `ManagedCode.MarkdownLd.Kb` and does not require embeddings
+- embeddings are opt-in through `McpGatewaySearchStrategy.Embeddings` or `McpGatewaySearchStrategy.Auto`
 - the default result size is `5`
 - the maximum result size is `15`
 - the index is built lazily on first list, search, or invoke
@@ -350,7 +351,7 @@ var response = await agent.RunAsync(
 
 ## Optional Warmup
 
-The gateway works without explicit initialization, but you can warm the index eagerly when you want startup validation or a pre-built cache.
+The gateway works without explicit initialization, but you can warm the index eagerly when you want startup validation or a pre-built cache. When Markdown-LD graph search is selected, warmup builds the graph during startup instead of waiting for the first search.
If no embedding generator is registered, the same gateway still works and falls back to lexical search automatically.
+If vector search cannot run for a request, the gateway falls back to the same Markdown-LD graph index used by the default mode and reports a diagnostic. If you register an embedding generator but leave the default `Graph` strategy in place, the generator is not used.
Description = "Search GitHub repositories by user query."
+            }));
+});
 ```
 
 This built-in store reuses the application's shared `IMemoryCache` and only caches embeddings inside the current process. It is useful for local reuse, but it is not durable and does not synchronize across replicas.
@@ -447,25 +473,117 @@ For multi-instance or durable caching, register your own `IMcpGatewayToolEmbeddi
Description = "Search GitHub repositories by user query."
+            }));
+});
+```
+
+You can also build the same Markdown-LD source documents ahead of time and point the gateway at a file or directory. This is useful when the graph should be generated in a separate step and loaded by the runtime:
+
+```csharp
+var authoringServices = new ServiceCollection();
+authoringServices.AddMcpGateway(options =>
+{
+    options.AddTool(
+        "local",
+        AIFunctionFactory.Create(
+            static (string query) => $"github:{query}",
+            new AIFunctionFactoryOptions
+            {
+                Name = "github_search_repositories",
+                Description = "Search GitHub repositories by user query."
Description = "Search GitHub repositories by user query."
+            }));
+});
 ```
 
+`UseMarkdownLdGraphFile(...)` accepts:
+
+- a gateway graph bundle JSON file created by `McpGatewayMarkdownLdGraphFile.WriteAsync(...)`
+- a directory containing Markdown-LD source documents
+- a single Markdown-LD source file supported by `ManagedCode.MarkdownLd.Kb`
+
+The bundle is a portable set of Markdown-LD source documents, not a serialized RDF store. The runtime still builds the in-memory `ManagedCode.MarkdownLd.Kb` graph from those documents so focused graph search, related matches, and next-step matches behave the same way as generated startup mode.
+
 ## Search Modes
 
-`McpGatewaySearchStrategy.Auto` is the default and usually the right choice:
+`McpGatewaySearchStrategy.Graph` is the default and usually the right choice for zero-cost local retrieval:
 
-- use vector ranking when embeddings are available
-- fall back to lexical ranking when they are not
+- build or load a Markdown-LD graph during index build
+- use deterministic token-distance search from `ManagedCode.MarkdownLd.Kb`
+- return primary matches, related matches, next-step matches, and focused graph counts
Graph mode uses `ManagedCode.MarkdownLd.Kb` to convert every local `AITool` and MCP tool descriptor into an in-memory Markdown-LD knowledge graph. Each tool becomes a Markdown document with structured front matter, source metadata, required arguments, input schema text, graph groups, related-tool hints, and next-step hints. Search uses the graph's deterministic Tiktoken token-distance focused search to rank tool documents and returns normal `McpGatewaySearchMatch` results, so invocation still uses the same `ToolId` flow.
+
+The old separate local tokenizer strategy is intentionally not exposed. Token-based search is provided by `ManagedCode.MarkdownLd.Kb` inside the Markdown-LD graph path.
+
 `McpGatewaySearchResult.RankingMode` reports:
 
 - `vector`
-- `lexical`
+- `graph`
 - `browse`
 - `empty`
 
+`McpGatewayIndexBuildResult` also reports graph index state through `IsGraphSearchEnabled`, `GraphNodeCount`, and `GraphEdgeCount`. These values are useful for startup validation and tests when a host requires graph-backed search to be available.
+
 ## Deeper Docs
 
 Use these when you need design details rather than package onboarding:
@@ -489,6 +622,7 @@ Use these when you need design details rather than package onboarding:
 - [ADR-0001: Runtime boundaries and index lifecycle](docs/ADR/ADR-0001-runtime-boundaries-and-index-lifecycle.md)
 - [ADR-0002: Search ranking and query normalization](docs/ADR/ADR-0002-search-ranking-and-query-normalization.md)
 - [ADR-0003: Reusable chat-client and agent auto-discovery modules](docs/ADR/ADR-0003-reusable-chat-client-and-agent-tool-modules.md)
+- [ADR-0005: Markdown-LD graph search for tool retrieval](docs/ADR/ADR-0005-markdown-ld-graph-search-for-tool-retrieval.md)
 - [Feature spec: Search query normalization and ranking](docs/Features/SearchQueryNormalizationAndRanking.md)