Update docs for ToolHive v0.12.3–v0.13.0

rdimitrov · claude · rdimitrov · commit b9da28d3f327 · 2026-03-27T12:27:01.000-04:00
Catch up documentation with features shipped in v0.12.3 through v0.13.0.
Auto-generated CLI/CRD reference docs were already current; these changes
cover manual doc updates verified against source code at each release tag.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/docs/toolhive/concepts/backend-auth.mdx b/docs/toolhive/concepts/backend-auth.mdx
@@ -211,18 +211,25 @@ deployments using the ToolHive Operator.
 - **Direct upstream redirect:** The embedded authorization server redirects
   clients directly to the upstream provider for authentication (for example,
   GitHub or Atlassian).
-- **Single upstream provider:** Currently supports one upstream identity
-  provider per configuration.
-
-:::info[Chained authentication not yet supported]
-
-The embedded authorization server redirects clients directly to the upstream
-provider. This means the upstream provider must be the service whose API the MCP
-server calls. Chained authentication—where a client authenticates with a
+- **Multiple upstream providers (VirtualMCPServer):** VirtualMCPServer supports
+  configuring multiple upstream identity providers with sequential
+  authentication. When multiple providers are configured, the authorization
+  server chains the authentication flow through each provider in sequence,
+  collecting tokens from all of them. This enables scenarios where backend tools
+  require tokens from different providers (such as a corporate IdP and
+  GitHub). MCPServer and MCPRemoteProxy support a single upstream provider per
+  configuration.
+
+:::info[Chained authentication for MCPServer]
+
+MCPServer and MCPRemoteProxy support only one upstream provider. The embedded
+authorization server redirects clients directly to that provider, so the
+provider must be the service whose API the MCP server calls. If your MCPServer
+deployment requires chained authentication—where a client authenticates with a
 corporate IdP like Okta, which then federates to an external provider like
-GitHub—is not yet supported. If your deployment requires this pattern, consider
-using [token exchange](#same-idp-with-token-exchange) with a federated identity
-provider instead.
+GitHub—consider using
+[token exchange](#same-idp-with-token-exchange) with a federated identity
+provider instead, or use a VirtualMCPServer with multiple upstream providers.
 
 :::
 
diff --git a/docs/toolhive/concepts/skills.mdx b/docs/toolhive/concepts/skills.mdx
@@ -95,16 +95,17 @@ an older version does not change the latest pointer.
 You can retrieve a specific version or request `latest` to get the most recent
 one.
 
-## Current status and what's next
+## Current status
 
 The skills API is available as an extension endpoint on the Registry server
 (`/{registryName}/v0.1/x/dev.toolhive/skills`). You can publish, list, search,
 retrieve, and delete skills through this API.
 
-Skill installation via agent clients (such as the ToolHive CLI or IDE
-extensions) is planned for a future release. For now, the registry serves as a
-discovery and distribution layer where you can browse available skills and
-retrieve their package references.
+The ToolHive CLI supports installing skills with `thv skill install`. You can
+install skills by plain name (resolved from the registry) or from a git
+repository. See
+[Manage skills](../guides-registry/skills.mdx#install-skills-with-the-cli) for
+details and examples.
 
 ## Next steps
 
diff --git a/docs/toolhive/guides-k8s/auth-k8s.mdx b/docs/toolhive/guides-k8s/auth-k8s.mdx
@@ -470,7 +470,7 @@ kubectl apply -f embedded-auth-config.yaml
 | `signingKeySecretRefs` | References to Secrets containing JWT signing keys. First key is active; additional keys support rotation.              |
 | `hmacSecretRefs`       | References to Secrets with symmetric keys for signing authorization codes and refresh tokens.                          |
 | `tokenLifespans`       | Configurable durations for access tokens (default: 1h), refresh tokens (default: 168h), and auth codes (default: 10m). |
-| `upstreamProviders`    | Configuration for the upstream identity provider. Currently supports one provider.                                     |
+| `upstreamProviders`    | Configuration for upstream identity providers. MCPServer and MCPRemoteProxy support one provider; VirtualMCPServer supports multiple providers for sequential authentication. |
 
 **Step 5: Create the MCPServer resource**
 
diff --git a/docs/toolhive/guides-k8s/redis-session-storage.mdx b/docs/toolhive/guides-k8s/redis-session-storage.mdx
@@ -2,7 +2,7 @@
 title: Redis Sentinel session storage
 description:
   How to deploy Redis Sentinel and configure persistent session storage for the
-  ToolHive embedded authorization server.
+  ToolHive embedded authorization server and horizontal scaling.
 ---
 
 Deploy Redis Sentinel and configure it as the session storage backend for the
@@ -12,6 +12,11 @@ re-authenticate. Redis Sentinel provides persistent storage with automatic
 master discovery, ACL-based access control, and optional failover when replicas
 are configured.
 
+Redis session storage is also required for
+[horizontal scaling](../guides-vmcp/scaling-and-performance.mdx#session-storage-for-multi-replica-deployments)
+when running multiple MCPServer or VirtualMCPServer replicas, so that sessions
+are shared across pods.
+
 :::info[Prerequisites]
 
 Before you begin, ensure you have:
diff --git a/docs/toolhive/guides-k8s/run-mcp-k8s.mdx b/docs/toolhive/guides-k8s/run-mcp-k8s.mdx
@@ -455,6 +455,8 @@ kubectl -n <NAMESPACE> describe mcpserver <NAME>
 
 - [Kubernetes CRD reference](../reference/crd-spec.md#apiv1alpha1mcpserver) -
   Reference for the `MCPServer` Custom Resource Definition (CRD)
+- [Scaling and performance](../guides-vmcp/scaling-and-performance.mdx#mcpserver-horizontal-scaling) -
+  Configure horizontal scaling with `replicas` and `backendReplicas`
 - [Deploy the operator](./deploy-operator.mdx) - Install the ToolHive operator
 - [Build MCP containers](../guides-cli/build-containers.mdx) - Create custom MCP
   server container images
diff --git a/docs/toolhive/guides-registry/skills.mdx b/docs/toolhive/guides-registry/skills.mdx
@@ -189,11 +189,45 @@ The API returns standard HTTP status codes:
 | 409  | Version already exists                                        |
 | 500  | Internal server error                                         |
 
-## Next steps
+## Install skills with the CLI
+
+The ToolHive CLI can install skills directly from the registry or from git
+repositories.
+
+### Install from the registry
+
+Install a skill by its plain name. The CLI resolves the name against the
+configured registry:
+
+```bash
+thv skill install <SKILL_NAME>
+```
+
+For example:
+
+```bash
+thv skill install osv
+```
+
+### Install from a git repository
+
+Install a skill from a git repository using the `git://` prefix:
+
+```bash
+thv skill install git://<REPOSITORY_URL>
+```
+
+### CLI flags
 
-Skill installation via agent clients (such as the ToolHive CLI or IDE
-extensions) is planned for a future release. For now, the registry serves as a
-discovery and distribution layer.
+| Flag             | Description                                            |
+| ---------------- | ------------------------------------------------------ |
+| `--client`       | Target client application (for example, `claude-code`) |
+| `--scope`        | Installation scope: `user` or `project`                |
+| `--force`        | Overwrite an existing skill directory                  |
+| `--project-root` | Project root path for project-scoped installs          |
+| `--group`        | Group to add the skill to (defaults to `default`)      |
+
+## Next steps
 
 - [Configure telemetry](./telemetry-metrics.mdx) to monitor your registry
   deployment
diff --git a/docs/toolhive/guides-vmcp/composite-tools.mdx b/docs/toolhive/guides-vmcp/composite-tools.mdx
@@ -19,6 +19,7 @@ backend MCP servers, handling dependencies and collecting results.
   wait for their prerequisites
 - **Template expansion**: Dynamic arguments using step outputs
 - **Elicitation**: Request user input mid-workflow (approval gates, choices)
+- **Iteration**: Loop over collections with forEach steps
 - **Error handling**: Configurable abort, continue, or retry behavior
 - **Timeouts**: Workflow and per-step timeout configuration
 
@@ -290,7 +291,7 @@ spec:
 
 ### Steps
 
-Each step can be a tool call or an elicitation:
+Each step can be a tool call, an elicitation, or a forEach loop:
 
 ```yaml title="VirtualMCPServer resource"
 spec:
@@ -344,6 +345,61 @@ spec:
             timeout: '5m'
 ```
 
+### forEach steps
+
+Iterate over a collection from a previous step's output and execute a tool call
+for each item:
+
+```yaml title="VirtualMCPServer resource"
+spec:
+  config:
+    compositeTools:
+      - name: scan_repositories
+        description: Check each repository for security advisories
+        parameters:
+          type: object
+          properties:
+            org:
+              type: string
+          required:
+            - org
+        steps:
+          - id: list_repos
+            tool: github_list_repos
+            arguments:
+              org: '{{.params.org}}'
+          # highlight-start
+          - id: check_advisories
+            type: forEach
+            collection: '{{json .steps.list_repos.output.repositories}}'
+            itemVar: repo
+            maxParallel: 5
+            step:
+              type: tool
+              tool: github_list_security_advisories
+              arguments:
+                repo: '{{.forEach.repo.name}}'
+            onError:
+              action: continue
+            dependsOn: [list_repos]
+          # highlight-end
+```
+
+**forEach fields:**
+
+| Field           | Description                                           | Default |
+| --------------- | ----------------------------------------------------- | ------- |
+| `collection`    | Template expression that produces an array            | —       |
+| `itemVar`       | Variable name for the current item                    | —       |
+| `maxParallel`   | Maximum concurrent iterations (max 50)                | 10      |
+| `maxIterations` | Maximum total iterations (max 1000)                   | 100     |
+| `step`          | Inner step definition (tool call to execute per item) | —       |
+| `onError`       | Error handling: `abort` (stop) or `continue` (skip)   | abort   |
+
+Access the current item inside the inner step using
+`{{.forEach.<itemVar>.<field>}}`. In the example above, `{{.forEach.repo.name}}`
+accesses the `name` field of the current repository.
+
 ### Error handling
 
 Configure behavior when steps fail:
@@ -507,13 +563,15 @@ without defaultResults defined
 
 Access workflow context in arguments:
 
-| Template                    | Description                                |
-| --------------------------- | ------------------------------------------ |
-| `{{.params.name}}`          | Input parameter                            |
-| `{{.steps.id.output}}`      | Step output (map)                          |
-| `{{.steps.id.output.text}}` | Text content from step output              |
-| `{{.steps.id.content}}`     | Elicitation response content               |
-| `{{.steps.id.action}}`      | Elicitation action (accept/decline/cancel) |
+| Template                          | Description                                |
+| --------------------------------- | ------------------------------------------ |
+| `{{.params.name}}`                | Input parameter                            |
+| `{{.steps.id.output}}`            | Step output (map)                          |
+| `{{.steps.id.output.text}}`       | Text content from step output              |
+| `{{.steps.id.content}}`           | Elicitation response content               |
+| `{{.steps.id.action}}`            | Elicitation action (accept/decline/cancel) |
+| `{{.forEach.<itemVar>}}`          | Current forEach item                       |
+| `{{.forEach.<itemVar>.<field>}}`  | Field on current forEach item              |
 
 ### Template functions
 
diff --git a/docs/toolhive/guides-vmcp/scaling-and-performance.mdx b/docs/toolhive/guides-vmcp/scaling-and-performance.mdx
@@ -1,10 +1,12 @@
 ---
 title: Scaling and Performance
 description:
-  How to scale Virtual MCP Server deployments vertically and horizontally.
+  How to scale MCPServer and Virtual MCP Server deployments vertically and
+  horizontally.
 ---
 
-This guide explains how to scale Virtual MCP Server (vMCP) deployments.
+This guide explains how to scale MCPServer and Virtual MCP Server (vMCP)
+deployments.
 
 ## Vertical scaling
 
@@ -37,24 +39,91 @@ higher request volumes.
 
 ### How to scale horizontally
 
-The VirtualMCPServer CRD does not have a `replicas` field. The operator creates
-a Deployment named `vmcp-<NAME>` (where `<NAME>` is your VirtualMCPServer name)
-with 1 replica and preserves the replicas count, allowing you to manage scaling
-separately.
+Set the `replicas` field in your VirtualMCPServer spec to control the number of
+vMCP pods:
+
+```yaml title="VirtualMCPServer resource"
+spec:
+  replicas: 3
+```
+
+When `replicas` is not set, the operator does not manage the replica count, so
+an HPA or other external controller can own it. In that case, scale the
+Deployment manually or with an HPA:
 
 **Option 1: Manual scaling**
 
 ```bash
-kubectl scale deployment vmcp-<vmcp-name> -n <NAMESPACE> --replicas=3
+kubectl scale deployment vmcp-<VMCP_NAME> -n <NAMESPACE> --replicas=3
 ```
 
 **Option 2: Autoscaling with HPA**
 
 ```bash
-kubectl autoscale deployment vmcp-<vmcp-name> -n <NAMESPACE> \
+kubectl autoscale deployment vmcp-<VMCP_NAME> -n <NAMESPACE> \
   --min=2 --max=5 --cpu-percent=70
 ```
 
+### Session storage for multi-replica deployments
+
+When running multiple replicas, configure Redis session storage so that sessions
+are shared across pods. Without session storage, a request routed to a different
+replica than the one that established the session will fail.
+
+```yaml title="VirtualMCPServer resource"
+spec:
+  replicas: 3
+  sessionStorage:
+    provider: redis
+    address: redis-master.toolhive-system.svc.cluster.local:6379
+    db: 0
+    keyPrefix: vmcp-sessions
+    passwordRef:
+      name: redis-secret
+      key: password
+```
+
+See
+[Redis Sentinel session storage](../guides-k8s/redis-session-storage.mdx)
+for a complete Redis deployment guide.
+
+:::warning
+
+The operator warns if you configure multiple replicas without session storage.
+Ensure Redis is available before scaling beyond a single replica.
+
+:::
+
+### MCPServer horizontal scaling
+
+MCPServer creates two separate Deployments: one for the proxy runner and one for
+the MCP server backend. You can scale each independently:
+
+- `spec.replicas` controls the proxy runner pod count
+- `spec.backendReplicas` controls the backend MCP server pod count
+
+```yaml title="MCPServer resource"
+spec:
+  replicas: 2
+  backendReplicas: 3
+  sessionStorage:
+    provider: redis
+    address: redis-master.toolhive-system.svc.cluster.local:6379
+    db: 0
+    keyPrefix: mcp-sessions
+    passwordRef:
+      name: redis-secret
+      key: password
+```
+
+:::warning[Stdio transport limitation]
+
+Backends using the `stdio` transport are limited to a single replica. The
+operator rejects configurations with `backendReplicas` greater than 1 for stdio
+backends.
+
+:::
+
 ### When horizontal scaling is challenging
 
 Horizontal scaling works well for **stateless backends** (fetch, search,
@@ -63,22 +132,22 @@ read-only operations) where sessions can be resumed on any instance.
 However, **stateful backends** make horizontal scaling difficult:
 
 - **Stateful backends** (Playwright browser sessions, database connections, file
-  system operations) require requests to be routed to the same vMCP instance
-  that established the session
+  system operations) require requests to be routed to the same instance that
+  established the session
 - Session resumption may not work reliably for stateful backends
 
-The `VirtualMCPServer` CRD includes a `sessionAffinity` field that controls how
-the Kubernetes Service routes repeated client connections. By default, it uses
-`ClientIP` affinity, which routes connections from the same client IP to the
-same pod. You can configure this using the `sessionAffinity` field:
+The `VirtualMCPServer` and `MCPServer` CRDs include a `sessionAffinity` field
+that controls how the Kubernetes Service routes repeated client connections. By
+default, it uses `ClientIP` affinity, which routes connections from the same
+client IP to the same pod:
 
 ```yaml
 spec:
   sessionAffinity: ClientIP # default
 ```
 
-For stateful backends, vertical scaling or dedicated vMCP instances per team/use
-case are recommended instead of horizontal scaling.
+For stateful backends, vertical scaling or dedicated instances per team/use case
+are recommended instead of horizontal scaling.
 
 ## Next steps
 
diff --git a/docs/toolhive/guides-vmcp/tool-aggregation.mdx b/docs/toolhive/guides-vmcp/tool-aggregation.mdx