Skip to content

Commit cb495c8

Browse files
committed
refactoring
1 parent d8cb920 commit cb495c8

48 files changed

Lines changed: 2357 additions & 279 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

AGENTS.md

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -169,6 +169,10 @@ If no new rule is detected -> do not update the file.
169169
- When a search-quality fix is requested as a concrete architectural change, finish the intended runtime behavior in the same task instead of stopping at a partial scoring-only step, because partial search fixes leave the real retrieval defect unresolved.
170170
- When the user asks for a shipped feature set, implement the runtime behavior end-to-end with production-ready code and tests instead of leaving placeholder, mocked, or temporary execution paths, because partial delivery is explicitly rejected in this repository.
171171
- During stabilization and release-hardening work, proactively search for analogous fragile patterns after fixing the first defect, and fix every confirmed issue with the correct SDK, platform, or .NET primitive plus tests instead of hacks, suppressions, or narrow one-off reactions.
172+
- Empty `catch {}` blocks are forbidden in source and tests; every caught exception must either be intentionally handled with visible control flow, logged/asserted, wrapped with context, or rethrown so failures are not silently hidden.
173+
- Runtime memory stores must have explicit ownership, bounded retention, and cleanup/disposal paths; unbounded in-memory dictionaries keyed by sessions, tasks, resources, prompts, or upstream bindings are allowed only when test-scoped or when production code removes entries deterministically and has regression coverage for cleanup.
174+
- Public/runtime state, cache, and store surfaces must be designed interface-first and implementation-agnostic; in-memory, persistent, distributed, or test implementations are interchangeable details behind clear interfaces rather than assumptions baked into the package contracts.
175+
- In-memory cache/store implementations must use host-owned infrastructure through DI, such as `IMemoryCache`, or an explicit caller-provided dependency; production package types must not create hidden private `MemoryCache` instances because ownership, size limits, and disposal must stay centralized.
172176
- Do not hide runtime or transport timeouts as magic numbers in package code; when a timeout is required, expose it as an explicit `TimeSpan` option with a clear default, caller override, and validation. Test-only bounded waits are acceptable only as harness hang guards, not as product behavior.
173177
- When adopting new upstream graph/search capabilities such as `ManagedCode.MarkdownLd.Kb` schema-aware search, implement the real hybrid runtime benefits with tests and docs instead of limiting the task to dependency bumps or export helpers, because the user expects the package to capture the upstream search value end-to-end.
174178
- When adding federated graph search, do not reject federation as a category; expose it through explicit allowlisted APIs or built-in tools with diagnostics and tests, because the user wants powerful hybrid/federated search while keeping hidden unconfigured network calls out of the default path.
@@ -214,7 +218,7 @@ If no new rule is detected -> do not update the file.
214218
- When a README refresh is requested, remove stale sections and replace them with the current shipped behavior instead of appending changelog-style notes, because the user wants the README to read as the authoritative current guide.
215219
- For non-trivial architecture, runtime-flow, or cross-cutting search changes, always add or update an ADR under `docs/ADR/`, update `docs/Architecture/Overview.md`, and keep `README.md` synchronized with the shipped behavior and examples so the docs describe the real package rather than an older design snapshot.
216220
- When the package requires an initialization step such as index building, provide an ergonomic optional integration path (for example DI extension or hosted background warmup) instead of forcing every consumer to call it manually, and document when manual initialization is still appropriate.
217-
- Keep documented configuration defaults synchronized with the actual `McpGatewayOptions` defaults; for example, `MaxSearchResults` default is `15`, not stale sample values.
221+
- Keep documented configuration defaults synchronized with the actual `McpGatewayOptions` defaults; public numeric limits must be named constants/options with clear intent instead of unexplained magic numbers.
218222
- Keep the README focused on package usage and onboarding, not internal implementation notes.
219223
- Keep `README.md` free of unnecessary internal detail; it should stay clear, example-driven, and focused on what consumers need to understand and use the package quickly.
220224
- Document optional DI dependencies explicitly in README examples so consumers know which services they must register themselves, such as embedding generators.

README.md

Lines changed: 21 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -99,11 +99,14 @@ Default behavior:
9999
- `MarkdownLdGraphSearchMode = Hybrid`
100100
- `SearchQueryNormalization = TranslateToEnglishWhenAvailable`
101101
- `DefaultSearchLimit = 5`
102-
- `MaxSearchResults = 15`
102+
- `MaxSearchResults = 50`
103+
- `MaxDescriptorLength = 16384`
103104
- the index is built lazily on first list, search, or invoke
104105

105106
`Hybrid` means the Markdown-LD graph path runs schema-aware SPARQL search first, then uses the gateway-built ranked graph candidate path as supporting evidence and fuzzy fallback for noisy queries. It is not a tokenizer-only search path.
106107

108+
`DefaultSearchLimit` is the normal top-N result size when a caller does not ask for a count. `MaxSearchResults` is only a hard cap for caller-requested result sizes so LLM-facing tool discovery cannot accidentally flood context. `MaxDescriptorLength` bounds the generated descriptor text used by search and Markdown-LD graph indexing; hosts with very large tool schemas can raise it explicitly.
109+
107110
## Register Tools And Sources
108111

109112
Register local tools during startup:
@@ -329,6 +332,20 @@ services.AddMcpServer()
329332

330333
Exported MCP tool and prompt names are source-qualified gateway ids such as `docs:search_repository`, `ops:deployment_review_system_prompt`, or `local:release_review_bundle`, so multiple upstream servers and gateway-owned prompts can be combined without name collisions. Exported MCP resource URIs and URI templates are rewritten into gateway-owned opaque URIs so downstream `resources/read` calls route back to the correct upstream source even when multiple servers expose overlapping URI spaces. The same source-aware rewrite is also used for `completion/complete`, forwarded prompt list changes, and forwarded resource update notifications, so downstream clients always talk in terms of gateway-owned prompt names and resource URIs while the gateway proxies the corresponding upstream MCP operations. When an upstream MCP tool already advertises task support, the gateway preserves that contract on the exported tool and proxies the corresponding upstream task flow. Local gateway tools are exported as optional task-capable tools and are executed through the gateway-owned task store.
331334

335+
The exported task store uses the official SDK `InMemoryMcpTaskStore` with MCPGateway-owned bounded defaults: task TTL 30 minutes, maximum task TTL 2 hours, cleanup every minute, maximum 10,000 tasks globally, and maximum 1,000 tasks per downstream session. Hosts can override those limits through `McpGatewayOptions.McpTaskStore`, or replace `McpServerOptions.TaskStore` with a durable production store when tasks must survive process restarts:
336+
337+
```csharp
338+
services.AddMcpGateway(options =>
339+
{
340+
options.McpTaskStore.TaskTimeToLive = TimeSpan.FromHours(1);
341+
options.McpTaskStore.MaximumTaskTimeToLive = TimeSpan.FromHours(4);
342+
options.McpTaskStore.MaximumTasks = 50_000;
343+
options.McpTaskStore.MaximumTasksPerSession = 5_000;
344+
});
345+
```
346+
347+
`WithMcpGatewayCatalog()` does not replace the official SDK HTTP session manager with a custom session store. When used with `ModelContextProtocol.AspNetCore` Streamable HTTP transport, the gateway composes the SDK `HttpServerTransportOptions.RunSessionHandler` lifecycle so gateway-owned per-session prompt notification, resource subscription, and active task binding state is removed when the SDK session ends. Hosts should still use the official transport options for session policy such as `IdleTimeout`, `MaxIdleSessionCount`, `EventStreamStore`, and `SessionMigrationHandler`.
348+
332349
If the downstream MCP host cannot use the default singleton `IMcpGateway`, `IMcpGatewayPromptCatalog`, and `IMcpGatewayResourceCatalog` registrations directly, register a custom `IMcpGatewayServerBindingResolver`. The resolver can create or select a request-specific or session-specific gateway instance and return it through `McpGatewayServerBinding`, while `WithMcpGatewayCatalog()` continues to own the exported MCP handlers, prompt/resource notifications, subscriptions, and task flow:
333350

334351
```csharp
@@ -779,7 +796,7 @@ services.AddMcpGatewayInMemorySearchCache();
779796
services.AddMcpGateway();
780797
```
781798

782-
If the host needs a different cache technology or policy, register its own `IMcpGatewaySearchCache`.
799+
`McpGatewayInMemorySearchCache` is an adapter over a host-owned `IMemoryCache`; it does not create or dispose a private cache. The DI helper provisions `IMemoryCache`, and direct construction must pass an explicit cache from the host or test container. Entries include short TTLs and cache sizes so host-owned `MemoryCacheOptions.SizeLimit` policies can bound memory. If the host needs a different cache technology or policy, register its own `IMcpGatewaySearchCache`.
783800

784801
### Tool Embedding Store
785802

@@ -807,6 +824,8 @@ services.AddMcpGateway(options =>
807824
});
808825
```
809826

827+
`McpGatewayInMemoryToolEmbeddingStore` is a process-local implementation of the `IMcpGatewayToolEmbeddingStore` persistence boundary. It requires a host-owned `IMemoryCache`, either through `AddMcpGatewayInMemoryToolEmbeddingStore()` or an explicit constructor argument, so production hosts keep cache ownership, size policy, and disposal centralized.
828+
810829
## Routing, Meta-Tools, And Chat Integration
811830

812831
For category-first discovery:

docs/ADR/ADR-0002-search-ranking-and-query-normalization.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,8 @@ Search modes:
2424
- `Embeddings`: vector ranking first, Markdown-LD graph fallback when vector query generation fails or returns an unusable vector
2525
- `Auto`: vector-first policy mode that keeps semantic ranking as the primary result set and uses Markdown-LD graph expansion only after vector ranking succeeds
2626

27+
Search-size and descriptor-size limits are public configuration, not hidden runtime constants. The default result size stays small for LLM-facing discovery, while `MaxSearchResults` is only the caller-requested hard cap and `MaxDescriptorLength` only bounds generated descriptor text used by search and graph indexing.
28+
2729
## Diagram
2830

2931
```mermaid
@@ -122,6 +124,7 @@ Mitigations:
122124
- `Auto` MUST run vector ranking first when query vectors are available and usable.
123125
- `Auto` MUST preserve vector primary ordering and MUST use graph search only as bounded supplementation or fallback.
124126
- Search-quality improvements MUST prefer mathematical or graph-ranking changes over hardcoded phrase exceptions.
127+
- Public search and descriptor-size limits MUST be named configuration defaults with validation instead of inline magic numbers or silent runtime clamping.
125128

126129
## Rollout And Rollback
127130

docs/ADR/ADR-0004-process-local-embedding-store-uses-imemorycache.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -13,9 +13,9 @@ Horizontal scale, durability, and cross-replica cache coherence remain separate
1313

1414
## Decision
1515

16-
The built-in `McpGatewayInMemoryToolEmbeddingStore` will use `IMemoryCache` for process-local embedding reuse, and the package will expose a dedicated `AddMcpGatewayInMemoryToolEmbeddingStore()` registration path that wires the store to the host's shared cache services.
16+
The built-in `McpGatewayInMemoryToolEmbeddingStore` will use `IMemoryCache` for process-local embedding reuse, and the package will expose a dedicated `AddMcpGatewayInMemoryToolEmbeddingStore()` registration path that wires the store to host-owned cache services. Direct store construction must pass an explicit `IMemoryCache`; the store must not create or dispose a private cache. Cache entries set sizes so host-owned `MemoryCacheOptions.SizeLimit` policies can bound memory.
1717

18-
The gateway runtime will expose `IMcpGatewaySearchCache` as the runtime cache boundary. `AddMcpGateway(...)` registers a no-op default so the core package path does not force `IMemoryCache` into every host. Hosts that want process-local search-path reuse can opt into `AddMcpGatewayInMemorySearchCache()`, which wires `McpGatewayInMemorySearchCache` to the host's shared `IMemoryCache` registration.
18+
The gateway runtime will expose `IMcpGatewaySearchCache` as the runtime cache boundary. `AddMcpGateway(...)` registers a no-op default so the core package path does not force `IMemoryCache` into every host. Hosts that want process-local search-path reuse can opt into `AddMcpGatewayInMemorySearchCache()`, which wires `McpGatewayInMemorySearchCache` to host-owned `IMemoryCache`. Direct search cache construction must pass an explicit `IMemoryCache`; entries carry both TTL and size metadata.
1919

2020
Durable or distributed embedding reuse will remain the responsibility of host-provided `IMcpGatewayToolEmbeddingStore` implementations.
2121

@@ -85,6 +85,7 @@ Positive:
8585

8686
- the built-in store relies on a standard .NET caching primitive
8787
- hosts can register the process-local store with one DI call and reuse the shared `IMemoryCache`
88+
- direct cache/store construction requires an explicit caller-owned cache, so ownership and disposal remain visible
8889
- the runtime can reuse expensive search-path artifacts without wrapping `IChatClient` or `IEmbeddingGenerator`
8990
- hosts can choose no-op, `IMemoryCache`, or custom runtime search-cache behavior explicitly
9091
- process-local cache behavior stays explicitly separate from durable/distributed storage concerns
@@ -94,7 +95,7 @@ Trade-offs:
9495

9596
- the package takes a new runtime dependency on `Microsoft.Extensions.Caching.Memory`
9697
- the built-in store is still process-local only and does not solve multi-instance cache sharing
97-
- direct construction without DI now owns a private `MemoryCache` instance and must be disposed like any other cache owner
98+
- hosts that need hard process-local size limits must configure the supplied `IMemoryCache` policy or provide a dedicated cache instance through DI
9899

99100
Mitigations:
100101

@@ -109,6 +110,8 @@ Mitigations:
109110
- `AddMcpGatewayInMemorySearchCache()` MUST remain opt-in and MUST provision `IMemoryCache`.
110111
- `McpGatewayInMemoryToolEmbeddingStore` MUST remain optional and MUST NOT become a mandatory dependency for gateway usage.
111112
- `AddMcpGatewayInMemoryToolEmbeddingStore()` MUST register the built-in store through the host `IServiceCollection` and MUST provision `IMemoryCache`.
113+
- `McpGatewayInMemorySearchCache` and `McpGatewayInMemoryToolEmbeddingStore` MUST require caller-provided `IMemoryCache` and MUST NOT create hidden private `MemoryCache` instances.
114+
- Built-in `IMemoryCache` entries MUST set entry sizes so host-owned cache size policies work when configured.
112115
- Hosts that need cross-instance persistence or replication MUST continue to provide their own `IMcpGatewayToolEmbeddingStore`.
113116
- The built-in store MUST clone vectors on read/write boundaries so callers cannot mutate cached embedding buffers in place.
114117

docs/ADR/ADR-0009-mcp-export-completion-subscriptions-and-logging-parity.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,8 @@ The repository now treats the official C# SDK as the MCP baseline. Stable MCP ex
1919
- completion requests for exported prompts and resource references will be resolved back to the owning upstream `SourceId` and proxied through the upstream MCP client
2020
- resource subscriptions will be created against the owning upstream MCP client and tracked per downstream MCP session plus exported gateway URI
2121
- upstream `notifications/resources/updated` messages received through those subscriptions will be forwarded to the subscribed downstream client, but with the exported gateway URI instead of the raw upstream URI
22+
- per-session resource subscription state will be removed when unsubscribe, subscribe failure, or notification-forwarding failure leaves no active upstream subscription
23+
- when hosted over the official SDK ASP.NET Core Streamable HTTP transport, per-session resource subscription state will also be removed from the gateway when the SDK-owned HTTP session ends
2224
- the gateway export will advertise MCP logging capability and accept `logging/setLevel`, relying on the SDK server runtime to track the latest requested logging level on the downstream server instance
2325

2426
No new public catalog surface is introduced for these protocol features. They remain part of the downstream MCP export behavior owned by the `Hosting` slice.
@@ -30,6 +32,7 @@ Positive:
3032
- downstream MCP clients can now use prompt and resource completions against the aggregated gateway export
3133
- resource subscriptions become source-aware and continue to work when resource URIs are gateway-rewritten
3234
- forwarded resource update notifications keep downstream clients on gateway-owned URIs instead of leaking upstream transport details
35+
- per-session subscription bookkeeping does not grow permanently for resource URIs that have already been unsubscribed, failed during setup/forwarding, or belonged to an ended Streamable HTTP session
3336
- the exported MCP server behavior is closer to the stable SDK baseline without widening the public DI surface unnecessarily
3437

3538
Trade-offs:
@@ -43,6 +46,8 @@ Trade-offs:
4346
- `WithMcpGatewayCatalog()` MUST export MCP `completion/complete`.
4447
- `WithMcpGatewayCatalog()` MUST export MCP `resources/subscribe` and `resources/unsubscribe`.
4548
- Forwarded `notifications/resources/updated` MUST use the exported gateway URI that the downstream client subscribed to, not the raw upstream URI.
49+
- Resource subscription bookkeeping MUST remove inactive session/resource entries after unsubscribe, setup failure, or forwarding failure.
50+
- Streamable HTTP hosting MUST compose the official SDK session lifecycle and remove gateway-owned subscription state when the SDK session ends.
4651
- Completion requests for exported prompts and resources MUST resolve back to the owning upstream `SourceId` before proxying.
4752
- `logging/setLevel` support MUST stay aligned with the official SDK server behavior instead of introducing a gateway-specific logging protocol.
4853

0 commit comments

Comments
 (0)