rfcs/THV-0038-session-scoped-client-lifecycle.md: 21 additions & 9 deletions
@@ -616,7 +616,7 @@ The default factory implementation follows this pattern:
- **Client initialization includes MCP handshake**: Each client sends `InitializeRequest` to its backend, and the backend responds with capabilities and its own `Mcp-Session-Id`. The client stores the session ID for protocol compliance (includes it in subsequent request headers).
- **Capture backend session IDs**: Factory also captures each backend's session ID (via `client.SessionID()`) for observability, storing them in a map to pass to the session
- **Performance requirement**: Use parallel initialization (e.g., `errgroup` with bounded concurrency) to avoid sequential latency accumulation. Connection initialization (TCP handshake + TLS negotiation + MCP protocol handshake) can take tens to hundreds of milliseconds per backend depending on network latency and backend responsiveness. With 20 backends, sequential initialization could easily exceed acceptable session creation latency.
-
- **Bounded concurrency**: Limit parallel goroutines (e.g., 10 concurrent initializations) to avoid resource exhaustion. This limit is **per-session-creation** (not global), implemented as a semaphore inside the factory. It should be a configurable vMCP server-level parameter (e.g., `max_backend_init_concurrency`, default: 10). Operators with many backends on a fast private network can raise it; resource-constrained deployments or backends with expensive initialization should lower it. A global limit across concurrent session creations is not necessary; the per-session semaphore already bounds the worst case per event.
+
- **Bounded concurrency**: Limit parallel goroutines per session (e.g., 10 concurrent initializations) to avoid resource exhaustion. This limit is **per-session-creation**, implemented as a semaphore inside the factory and set by a configurable vMCP server-level parameter (`max_backend_init_concurrency`, default: 10). Aggregate system load is bounded by the global `TOOLHIVE_MAX_SESSIONS` limit (see [Resource Exhaustion & DoS Protection](#concurrency--resource-safety)). For additional safety during traffic spikes, the factory may optionally implement a global initialization semaphore (e.g., `max_global_backend_init_concurrency`, default: 100) that caps the total number of simultaneous connection attempts across all in-progress session creations. Without such a cap, 100 concurrent session requests at the default per-session limit of 10 could trigger up to 1,000 simultaneous backend initializations.
- **Per-backend timeout**: Apply context timeout (e.g., 5s per backend) so one slow backend doesn't block session creation
- **Partial initialization**: If some backends fail, log warnings and continue with successfully initialized backends (failed backends not added to clients map)
- Clients are connection-ready and stateful (each maintains its backend session for protocol use)
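
The initialization rules in this hunk (bounded parallelism, per-backend timeout, partial initialization) can be sketched as follows. This is only an illustration under stated assumptions, not the factory's actual code: `backendClient`, `initFunc`, `initBackends`, and `demoInit` are hypothetical names, and a buffered channel stands in for the `errgroup` limit so the example needs only the standard library.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// backendClient is a hypothetical stand-in for an initialized MCP client.
type backendClient struct {
	backendID string
	sessionID string // backend-assigned Mcp-Session-Id, kept for observability
}

// initFunc is a hypothetical hook: connect and run the MCP handshake.
type initFunc func(ctx context.Context, backendID string) (*backendClient, error)

// initBackends starts at most maxConcurrent initializations at a time, gives
// each one its own timeout, and returns only the clients that succeeded.
func initBackends(ctx context.Context, ids []string, maxConcurrent int,
	perBackend time.Duration, initOne initFunc) map[string]*backendClient {
	if maxConcurrent < 1 {
		maxConcurrent = 1
	}
	sem := make(chan struct{}, maxConcurrent) // per-session-creation semaphore
	clients := make(map[string]*backendClient, len(ids))
	var mu sync.Mutex
	var wg sync.WaitGroup

	for _, id := range ids {
		sem <- struct{}{} // acquire before spawning, so goroutines are bounded too
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			defer func() { <-sem }()

			bctx, cancel := context.WithTimeout(ctx, perBackend)
			defer cancel()

			c, err := initOne(bctx, id)
			if err != nil {
				// Partial initialization: warn and leave this backend out.
				fmt.Printf("warning: backend %s not initialized: %v\n", id, err)
				return
			}
			mu.Lock()
			clients[id] = c
			mu.Unlock()
		}(id)
	}
	wg.Wait()
	return clients
}

// demoInit runs initBackends against fake backends and reports how many
// clients were initialized and the peak number of concurrent initializations.
func demoInit() (initialized, peak int) {
	var mu sync.Mutex
	inFlight := 0
	fake := func(ctx context.Context, id string) (*backendClient, error) {
		mu.Lock()
		inFlight++
		if inFlight > peak {
			peak = inFlight
		}
		mu.Unlock()
		defer func() { mu.Lock(); inFlight--; mu.Unlock() }()

		delay := 5 * time.Millisecond
		switch id {
		case "b-fail":
			return nil, errors.New("handshake rejected")
		case "b-slow":
			delay = 5 * time.Second // exceeds the per-backend timeout
		}
		select {
		case <-time.After(delay):
			return &backendClient{backendID: id, sessionID: "sess-" + id}, nil
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}
	ids := []string{"b1", "b2", "b-fail", "b3", "b-slow", "b4", "b5", "b6"}
	clients := initBackends(context.Background(), ids, 3, 200*time.Millisecond, fake)
	return len(clients), peak
}

func main() {
	n, peak := demoInit()
	fmt.Printf("initialized=%d peak=%d\n", n, peak)
}
```

In this sketch the failing and the slow backend are simply absent from the returned map, while the other six are initialized with no more than three handshakes in flight. An optional global cap could be layered on by acquiring a second, process-wide semaphore before the per-session one.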
@@ -1055,9 +1055,15 @@ When vMCP detects backend session expiration (404 or "session expired" error), a
- **Approach**: Best effort - attempt keepalive, gracefully handle backends that don't support it
- **Configuration**: Enable per backend, configurable interval (default: 5 min)
-
The preferred keepalive method is the MCP spec-defined `ping` protocol request, which is side-effect-free and supported by all compliant servers; explicit tool calls should only be used as a fallback. Keepalive failures must not affect healthy sessions — after N consecutive failures the feature should be disabled for that backend, with a periodic probe to re-enable on recovery. Keepalive should default to disabled for stateless backends or where TTL alignment already covers the session lifetime. The keepalive goroutine must hold the backend lock to avoid races with session re-initialization. Operators should be able to observe keepalive health via per-backend metrics covering attempt counts, failure reasons, and auto-disable events.
+
The preferred keepalive method is the MCP spec-defined `ping` protocol request, which is side-effect-free and supported by all compliant servers; explicit tool calls should only be used as a fallback. Keepalive failures must not affect healthy sessions — after N consecutive failures the feature should be disabled for that backend, with a periodic probe to re-enable on recovery. Keepalive should default to disabled for stateless backends or where TTL alignment already covers the session lifetime. The keepalive goroutine must use the same in-flight counter (`sync.WaitGroup`) approach as other operations to avoid races with session re-initialization while ensuring no locks are held during network I/O. Operators should be able to observe keepalive health via per-backend metrics covering attempt counts, failure reasons, and auto-disable events.
2. **Session TTL alignment**:
1154
1166
- Configure backend session TTLs longer than vMCP session TTL
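
The keepalive behaviour described in this hunk (auto-disable after N consecutive failures, a recovery probe, and an in-flight counter instead of a lock held across I/O) might look roughly like the sketch below. It is one possible reading of the RFC text, not its implementation: the `keepalive` type, its fields, and `tick`, `reinitialize`, and `demoKeepalive` are all hypothetical, and `ping` is a plain function standing in for the MCP `ping` request.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// keepalive tracks ping health for one backend. All names are hypothetical.
type keepalive struct {
	mu           sync.Mutex     // guards the fields below; never held during I/O
	inFlight     sync.WaitGroup // counts operations using the current session
	reinit       bool           // true while re-initialization is draining
	ping         func() error   // stand-in for the MCP ping request
	maxFailures  int            // N consecutive failures before auto-disable
	failures     int
	disabled     bool
	attempts     int // metric: pings actually sent
	autoDisables int // metric: auto-disable events
}

// tick performs one keepalive attempt. probe=true is the periodic recovery
// probe that is allowed to run even while keepalive is disabled.
func (k *keepalive) tick(probe bool) {
	k.mu.Lock()
	if k.reinit || (k.disabled && !probe) {
		k.mu.Unlock()
		return
	}
	k.inFlight.Add(1) // registered under the lock, before any Wait can start
	ping := k.ping
	k.mu.Unlock()

	err := ping() // network I/O happens with no lock held
	k.inFlight.Done()

	k.mu.Lock()
	defer k.mu.Unlock()
	k.attempts++
	if err == nil {
		k.failures = 0
		k.disabled = false // a successful probe re-enables keepalive
		return
	}
	k.failures++
	if !k.disabled && k.failures >= k.maxFailures {
		k.disabled = true
		k.autoDisables++
	}
}

// reinitialize blocks new operations, waits for in-flight ones to drain,
// then swaps in the new session's ping function.
func (k *keepalive) reinitialize(newPing func() error) {
	k.mu.Lock()
	k.reinit = true
	k.mu.Unlock()

	k.inFlight.Wait() // drain without holding the lock

	k.mu.Lock()
	k.ping = newPing
	k.failures = 0
	k.reinit = false
	k.mu.Unlock()
}

// demoKeepalive drives a failing backend until auto-disable, then recovers it.
func demoKeepalive() (disabledAfter bool, attempts, autoDisables int, reenabled bool) {
	healthy := false
	k := &keepalive{maxFailures: 3}
	k.ping = func() error {
		if healthy {
			return nil
		}
		return errors.New("ping timeout")
	}
	for i := 0; i < 5; i++ {
		k.tick(false) // only the first three reach the backend
	}
	disabledAfter, attempts, autoDisables = k.disabled, k.attempts, k.autoDisables
	healthy = true
	k.tick(true) // periodic recovery probe
	return disabledAfter, attempts, autoDisables, !k.disabled
}

func main() {
	fmt.Println(demoKeepalive())
}
```

Registering with the `sync.WaitGroup` while the mutex is held, and only when no re-initialization is pending, keeps every `Add` ordered before the matching `Wait`, which is what allows the ping itself to run outside the lock.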