You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: AGENTS.md
+5-1Lines changed: 5 additions & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -169,6 +169,10 @@ If no new rule is detected -> do not update the file.
169
169
- When a search-quality fix is requested as a concrete architectural change, finish the intended runtime behavior in the same task instead of stopping at a partial scoring-only step, because partial search fixes leave the real retrieval defect unresolved.
170
170
- When the user asks for a shipped feature set, implement the runtime behavior end-to-end with production-ready code and tests instead of leaving placeholder, mocked, or temporary execution paths, because partial delivery is explicitly rejected in this repository.
171
171
- During stabilization and release-hardening work, proactively search for analogous fragile patterns after fixing the first defect, and fix every confirmed issue with the correct SDK, platform, or .NET primitive plus tests instead of hacks, suppressions, or narrow one-off reactions.
172
+
- Empty `catch {}` blocks are forbidden in source and tests; every caught exception must either be intentionally handled with visible control flow, logged/asserted, wrapped with context, or rethrown so failures are not silently hidden.
173
+
- Runtime memory stores must have explicit ownership, bounded retention, and cleanup/disposal paths; unbounded in-memory dictionaries keyed by sessions, tasks, resources, prompts, or upstream bindings are allowed only when test-scoped or when production code removes entries deterministically and has regression coverage for cleanup.
174
+
- Public/runtime state, cache, and store surfaces must be designed interface-first and implementation-agnostic; in-memory, persistent, distributed, or test implementations are interchangeable details behind clear interfaces rather than assumptions baked into the package contracts.
175
+
- In-memory cache/store implementations must use host-owned infrastructure through DI, such as `IMemoryCache`, or an explicit caller-provided dependency; production package types must not create hidden private `MemoryCache` instances because ownership, size limits, and disposal must stay centralized.
172
176
- Do not hide runtime or transport timeouts as magic numbers in package code; when a timeout is required, expose it as an explicit `TimeSpan` option with a clear default, caller override, and validation. Test-only bounded waits are acceptable only as harness hang guards, not as product behavior.
173
177
- When adopting new upstream graph/search capabilities such as `ManagedCode.MarkdownLd.Kb` schema-aware search, implement the real hybrid runtime benefits with tests and docs instead of limiting the task to dependency bumps or export helpers, because the user expects the package to capture the upstream search value end-to-end.
174
178
- When adding federated graph search, do not reject federation as a category; expose it through explicit allowlisted APIs or built-in tools with diagnostics and tests, because the user wants powerful hybrid/federated search while keeping hidden unconfigured network calls out of the default path.
@@ -214,7 +218,7 @@ If no new rule is detected -> do not update the file.
214
218
- When a README refresh is requested, remove stale sections and replace them with the current shipped behavior instead of appending changelog-style notes, because the user wants the README to read as the authoritative current guide.
215
219
- For non-trivial architecture, runtime-flow, or cross-cutting search changes, always add or update an ADR under `docs/ADR/`, update `docs/Architecture/Overview.md`, and keep `README.md` synchronized with the shipped behavior and examples so the docs describe the real package rather than an older design snapshot.
216
220
- When the package requires an initialization step such as index building, provide an ergonomic optional integration path (for example DI extension or hosted background warmup) instead of forcing every consumer to call it manually, and document when manual initialization is still appropriate.
217
-
- Keep documented configuration defaults synchronized with the actual `McpGatewayOptions` defaults; for example, `MaxSearchResults` default is `15`, not stale sample values.
221
+
- Keep documented configuration defaults synchronized with the actual `McpGatewayOptions` defaults; public numeric limits must be named constants/options with clear intent instead of unexplained magic numbers.
218
222
- Keep the README focused on package usage and onboarding, not internal implementation notes.
219
223
- Keep `README.md` free of unnecessary internal detail; it should stay clear, example-driven, and focused on what consumers need to understand and use the package quickly.
220
224
- Document optional DI dependencies explicitly in README examples so consumers know which services they must register themselves, such as embedding generators.
- the index is built lazily on first list, search, or invoke
104
105
105
106
`Hybrid` means the Markdown-LD graph path runs schema-aware SPARQL search first, then uses the gateway-built ranked graph candidate path as supporting evidence and fuzzy fallback for noisy queries. It is not a tokenizer-only search path.
106
107
108
+
`DefaultSearchLimit` is the normal top-N result size when a caller does not ask for a count. `MaxSearchResults` is only a hard cap for caller-requested result sizes so LLM-facing tool discovery cannot accidentally flood context. `MaxDescriptorLength` bounds the generated descriptor text used by search and Markdown-LD graph indexing; hosts with very large tool schemas can raise it explicitly.
109
+
107
110
## Register Tools And Sources
108
111
109
112
Register local tools during startup:
@@ -329,6 +332,20 @@ services.AddMcpServer()
329
332
330
333
Exported MCP tool and prompt names are source-qualified gateway ids such as `docs:search_repository`, `ops:deployment_review_system_prompt`, or `local:release_review_bundle`, so multiple upstream servers and gateway-owned prompts can be combined without name collisions. Exported MCP resource URIs and URI templates are rewritten into gateway-owned opaque URIs so downstream `resources/read` calls route back to the correct upstream source even when multiple servers expose overlapping URI spaces. The same source-aware rewrite is also used for `completion/complete`, forwarded prompt list changes, and forwarded resource update notifications, so downstream clients always talk in terms of gateway-owned prompt names and resource URIs while the gateway proxies the corresponding upstream MCP operations. When an upstream MCP tool already advertises task support, the gateway preserves that contract on the exported tool and proxies the corresponding upstream task flow. Local gateway tools are exported as optional task-capable tools and are executed through the gateway-owned task store.
331
334
335
+
The exported task store uses the official SDK `InMemoryMcpTaskStore` with MCPGateway-owned bounded defaults: task TTL 30 minutes, maximum task TTL 2 hours, cleanup every minute, maximum 10,000 tasks globally, and maximum 1,000 tasks per downstream session. Hosts can override those limits through `McpGatewayOptions.McpTaskStore`, or replace `McpServerOptions.TaskStore` with a durable production store when tasks must survive process restarts:
`WithMcpGatewayCatalog()` does not replace the official SDK HTTP session manager with a custom session store. When used with `ModelContextProtocol.AspNetCore` Streamable HTTP transport, the gateway composes the SDK `HttpServerTransportOptions.RunSessionHandler` lifecycle so gateway-owned per-session prompt notification, resource subscription, and active task binding state is removed when the SDK session ends. Hosts should still use the official transport options for session policy such as `IdleTimeout`, `MaxIdleSessionCount`, `EventStreamStore`, and `SessionMigrationHandler`.
348
+
332
349
If the downstream MCP host cannot use the default singleton `IMcpGateway`, `IMcpGatewayPromptCatalog`, and `IMcpGatewayResourceCatalog` registrations directly, register a custom `IMcpGatewayServerBindingResolver`. The resolver can create or select a request-specific or session-specific gateway instance and return it through `McpGatewayServerBinding`, while `WithMcpGatewayCatalog()` continues to own the exported MCP handlers, prompt/resource notifications, subscriptions, and task flow:
If the host needs a different cache technology or policy, register its own `IMcpGatewaySearchCache`.
799
+
`McpGatewayInMemorySearchCache` is an adapter over a host-owned `IMemoryCache`; it does not create or dispose a private cache. The DI helper provisions `IMemoryCache`, and direct construction must pass an explicit cache from the host or test container. Entries include short TTLs and cache sizes so host-owned `MemoryCacheOptions.SizeLimit` policies can bound memory. If the host needs a different cache technology or policy, register its own `IMcpGatewaySearchCache`.
`McpGatewayInMemoryToolEmbeddingStore` is a process-local implementation of the `IMcpGatewayToolEmbeddingStore` persistence boundary. It requires a host-owned `IMemoryCache`, either through `AddMcpGatewayInMemoryToolEmbeddingStore()` or an explicit constructor argument, so production hosts keep cache ownership, size policy, and disposal centralized.
Copy file name to clipboardExpand all lines: docs/ADR/ADR-0002-search-ranking-and-query-normalization.md
+3Lines changed: 3 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -24,6 +24,8 @@ Search modes:
24
24
-`Embeddings`: vector ranking first, Markdown-LD graph fallback when vector query generation fails or returns an unusable vector
25
25
-`Auto`: vector-first policy mode that keeps semantic ranking as the primary result set and uses Markdown-LD graph expansion only after vector ranking succeeds
26
26
27
+
Search-size and descriptor-size limits are public configuration, not hidden runtime constants. The default result size stays small for LLM-facing discovery, while `MaxSearchResults` is only the caller-requested hard cap and `MaxDescriptorLength` only bounds generated descriptor text used by search and graph indexing.
28
+
27
29
## Diagram
28
30
29
31
```mermaid
@@ -122,6 +124,7 @@ Mitigations:
122
124
-`Auto` MUST run vector ranking first when query vectors are available and usable.
123
125
-`Auto` MUST preserve vector primary ordering and MUST use graph search only as bounded supplementation or fallback.
124
126
- Search-quality improvements MUST prefer mathematical or graph-ranking changes over hardcoded phrase exceptions.
127
+
- Public search and descriptor-size limits MUST be named configuration defaults with validation instead of inline magic numbers or silent runtime clamping.
Copy file name to clipboardExpand all lines: docs/ADR/ADR-0004-process-local-embedding-store-uses-imemorycache.md
+6-3Lines changed: 6 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -13,9 +13,9 @@ Horizontal scale, durability, and cross-replica cache coherence remain separate
13
13
14
14
## Decision
15
15
16
-
The built-in `McpGatewayInMemoryToolEmbeddingStore` will use `IMemoryCache` for process-local embedding reuse, and the package will expose a dedicated `AddMcpGatewayInMemoryToolEmbeddingStore()` registration path that wires the store to the host's shared cache services.
16
+
The built-in `McpGatewayInMemoryToolEmbeddingStore` will use `IMemoryCache` for process-local embedding reuse, and the package will expose a dedicated `AddMcpGatewayInMemoryToolEmbeddingStore()` registration path that wires the store to host-owned cache services. Direct store construction must pass an explicit `IMemoryCache`; the store must not create or dispose a private cache. Cache entries set sizes so host-owned `MemoryCacheOptions.SizeLimit` policies can bound memory.
17
17
18
-
The gateway runtime will expose `IMcpGatewaySearchCache` as the runtime cache boundary. `AddMcpGateway(...)` registers a no-op default so the core package path does not force `IMemoryCache` into every host. Hosts that want process-local search-path reuse can opt into `AddMcpGatewayInMemorySearchCache()`, which wires `McpGatewayInMemorySearchCache` to the host's shared `IMemoryCache` registration.
18
+
The gateway runtime will expose `IMcpGatewaySearchCache` as the runtime cache boundary. `AddMcpGateway(...)` registers a no-op default so the core package path does not force `IMemoryCache` into every host. Hosts that want process-local search-path reuse can opt into `AddMcpGatewayInMemorySearchCache()`, which wires `McpGatewayInMemorySearchCache` to host-owned `IMemoryCache`. Direct search cache construction must pass an explicit `IMemoryCache`; entries carry both TTL and size metadata.
19
19
20
20
Durable or distributed embedding reuse will remain the responsibility of host-provided `IMcpGatewayToolEmbeddingStore` implementations.
21
21
@@ -85,6 +85,7 @@ Positive:
85
85
86
86
- the built-in store relies on a standard .NET caching primitive
87
87
- hosts can register the process-local store with one DI call and reuse the shared `IMemoryCache`
88
+
- direct cache/store construction requires an explicit caller-owned cache, so ownership and disposal remain visible
88
89
- the runtime can reuse expensive search-path artifacts without wrapping `IChatClient` or `IEmbeddingGenerator`
89
90
- hosts can choose no-op, `IMemoryCache`, or custom runtime search-cache behavior explicitly
90
91
- process-local cache behavior stays explicitly separate from durable/distributed storage concerns
@@ -94,7 +95,7 @@ Trade-offs:
94
95
95
96
- the package takes a new runtime dependency on `Microsoft.Extensions.Caching.Memory`
96
97
- the built-in store is still process-local only and does not solve multi-instance cache sharing
97
-
-direct construction without DI now owns a private `MemoryCache` instance and must be disposed like any other cache owner
98
+
-hosts that need hard process-local size limits must configure the supplied `IMemoryCache` policy or provide a dedicated cache instance through DI
98
99
99
100
Mitigations:
100
101
@@ -109,6 +110,8 @@ Mitigations:
109
110
-`AddMcpGatewayInMemorySearchCache()` MUST remain opt-in and MUST provision `IMemoryCache`.
110
111
-`McpGatewayInMemoryToolEmbeddingStore` MUST remain optional and MUST NOT become a mandatory dependency for gateway usage.
111
112
-`AddMcpGatewayInMemoryToolEmbeddingStore()` MUST register the built-in store through the host `IServiceCollection` and MUST provision `IMemoryCache`.
113
+
-`McpGatewayInMemorySearchCache` and `McpGatewayInMemoryToolEmbeddingStore` MUST require caller-provided `IMemoryCache` and MUST NOT create hidden private `MemoryCache` instances.
114
+
- Built-in `IMemoryCache` entries MUST set entry sizes so host-owned cache size policies work when configured.
112
115
- Hosts that need cross-instance persistence or replication MUST continue to provide their own `IMcpGatewayToolEmbeddingStore`.
113
116
- The built-in store MUST clone vectors on read/write boundaries so callers cannot mutate cached embedding buffers in place.
Copy file name to clipboardExpand all lines: docs/ADR/ADR-0009-mcp-export-completion-subscriptions-and-logging-parity.md
+5Lines changed: 5 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -19,6 +19,8 @@ The repository now treats the official C# SDK as the MCP baseline. Stable MCP ex
19
19
- completion requests for exported prompts and resource references will be resolved back to the owning upstream `SourceId` and proxied through the upstream MCP client
20
20
- resource subscriptions will be created against the owning upstream MCP client and tracked per downstream MCP session plus exported gateway URI
21
21
- upstream `notifications/resources/updated` messages received through those subscriptions will be forwarded to the subscribed downstream client, but with the exported gateway URI instead of the raw upstream URI
22
+
- per-session resource subscription state will be removed when unsubscribe, subscribe failure, or notification-forwarding failure leaves no active upstream subscription
23
+
- when hosted over the official SDK ASP.NET Core Streamable HTTP transport, per-session resource subscription state will also be removed from the gateway when the SDK-owned HTTP session ends
22
24
- the gateway export will advertise MCP logging capability and accept `logging/setLevel`, relying on the SDK server runtime to track the latest requested logging level on the downstream server instance
23
25
24
26
No new public catalog surface is introduced for these protocol features. They remain part of the downstream MCP export behavior owned by the `Hosting` slice.
@@ -30,6 +32,7 @@ Positive:
30
32
- downstream MCP clients can now use prompt and resource completions against the aggregated gateway export
31
33
- resource subscriptions become source-aware and continue to work when resource URIs are gateway-rewritten
32
34
- forwarded resource update notifications keep downstream clients on gateway-owned URIs instead of leaking upstream transport details
35
+
- per-session subscription bookkeeping does not grow permanently for resource URIs that have already been unsubscribed, failed during setup/forwarding, or belonged to an ended Streamable HTTP session
33
36
- the exported MCP server behavior is closer to the stable SDK baseline without widening the public DI surface unnecessarily
34
37
35
38
Trade-offs:
@@ -43,6 +46,8 @@ Trade-offs:
43
46
-`WithMcpGatewayCatalog()` MUST export MCP `completion/complete`.
44
47
-`WithMcpGatewayCatalog()` MUST export MCP `resources/subscribe` and `resources/unsubscribe`.
45
48
- Forwarded `notifications/resources/updated` MUST use the exported gateway URI that the downstream client subscribed to, not the raw upstream URI.
49
+
- Resource subscription bookkeeping MUST remove inactive session/resource entries after unsubscribe, setup failure, or forwarding failure.
50
+
- Streamable HTTP hosting MUST compose the official SDK session lifecycle and remove gateway-owned subscription state when the SDK session ends.
46
51
- Completion requests for exported prompts and resources MUST resolve back to the owning upstream `SourceId` before proxying.
47
52
-`logging/setLevel` support MUST stay aligned with the official SDK server behavior instead of introducing a gateway-specific logging protocol.
0 commit comments