Registry System

The registry system is one of ToolHive's key innovations: a curated catalog of trusted MCP servers with metadata, configuration, and provenance information. This document explains how registries work, how to use them, and how to host your own.

Overview

ToolHive was early to adopt the concept of an MCP server registry. The registry provides:

  • Curated catalog of trusted MCP servers
  • Metadata including tools, permissions, and configuration
  • Provenance information for supply chain security
  • Easy deployment - just reference by name
  • Custom registries for organizations

Registry Architecture

graph TB
    subgraph "Registry Sources"
        Builtin[Built-in Registry<br/>Embedded JSON]
        Git[Git Repository]
        CM[ConfigMap]
        ExtAPI[External Registry API<br/>ToolHive Registry Server<br/>or MCP Registry]
    end

    subgraph "ToolHive CLI"
        CLI[thv CLI]
        Provider[Provider Interface<br/>Local/Remote/API]
    end

    subgraph "Kubernetes"
        MCPReg[MCPRegistry CRD]
        Operator[thv-operator]
        IntAPI[Internal Registry API<br/>Optional per-CRD]
    end

    Builtin --> Provider
    ExtAPI --> Provider
    Git --> MCPReg
    CM --> MCPReg
    Provider --> CLI

    MCPReg --> Operator
    Operator --> IntAPI

    style Builtin fill:#81c784
    style Git fill:#90caf9
    style CM fill:#90caf9
    style ExtAPI fill:#ce93d8

Built-in Registry

ToolHive ships with a curated registry from toolhive-catalog.

Features:

  • Maintained by Stacklok
  • Trusted and verified servers
  • Provenance information
  • Regular updates

Browse registry:

thv registry list
thv search <query>

Run from registry:

thv run server-name

Implementation:

  • Embedded: pkg/registry/data/registry.json
  • Manager: pkg/registry/provider.go, pkg/registry/provider_local.go, pkg/registry/provider_remote.go

Registry Format

Top-Level Structure

Implementation: pkg/registry/types.go

{
  "version": "1.0.0",
  "last_updated": "2025-10-13T12:00:00Z",
  "servers": {
    "server-name": { /* ImageMetadata */ }
  },
  "remote_servers": {
    "remote-name": { /* RemoteServerMetadata */ }
  },
  "groups": [
    { /* Group */ }
  ]
}
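As a rough illustration of how this top-level structure can be consumed, the following Python sketch parses a registry document and lists its container and remote entries. The sample data and the `list_entries` helper are hypothetical and exist only for this example; they are not part of ToolHive.

```python
import json

# Hypothetical sample that mirrors the top-level structure shown above.
SAMPLE = """
{
  "version": "1.0.0",
  "last_updated": "2025-10-13T12:00:00Z",
  "servers": {
    "weather-server": {"image": "ghcr.io/stacklok/mcp-weather:v1.0.0"}
  },
  "remote_servers": {
    "cloud-mcp-server": {"url": "https://mcp.example.com/sse"}
  },
  "groups": []
}
"""


def list_entries(registry: dict) -> list:
    """Return sorted (kind, name) pairs for container and remote servers."""
    entries = [("container", name) for name in registry.get("servers", {})]
    entries += [("remote", name) for name in registry.get("remote_servers", {})]
    return sorted(entries)


registry = json.loads(SAMPLE)
```

Calling `list_entries(registry)` on the sample yields one container entry and one remote entry.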

Server Entry (Container-based)

Implementation: pkg/registry/types.go

{
  "name": "weather-server",
  "description": "Provides weather information for locations",
  "tier": "Official",
  "status": "active",
  "image": "ghcr.io/stacklok/mcp-weather:v1.0.0",
  "transport": "sse",
  "target_port": 3000,
  "tools": ["get-weather", "get-forecast"],
  "permissions": {
    "network": {
      "outbound": {
        "allow_host": ["api.weather.gov"],
        "allow_port": [443]
      }
    }
  },
  "env_vars": [
    {
      "name": "API_KEY",
      "description": "Weather API key",
      "required": true,
      "secret": true
    }
  ],
  "args": ["--port", "3000"],
  "docker_tags": ["v1.0.0", "latest"],
  "metadata": {
    "stars": 150,
    "pulls": 5000,
    "last_updated": "2025-10-01T10:00:00Z"
  },
  "repository_url": "https://github.com/example/weather-mcp",
  "tags": ["weather", "api", "official"],
  "provenance": {
    "sigstore_url": "https://rekor.sigstore.dev",
    "repository_uri": "https://github.com/example/weather-mcp",
    "signer_identity": "build@example.com",
    "runner_environment": "github-actions",
    "cert_issuer": "https://token.actions.githubusercontent.com"
  }
}

Remote Server Entry

Implementation: pkg/registry/types.go

{
  "name": "cloud-mcp-server",
  "description": "Cloud-hosted MCP server",
  "tier": "Partner",
  "status": "active",
  "url": "https://mcp.example.com/sse",
  "transport": "sse",
  "tools": ["data-analysis", "ml-inference"],
  "headers": [
    {
      "name": "X-API-Key",
      "description": "API key for authentication",
      "required": true,
      "secret": true
    }
  ],
  "env_vars": [
    {
      "name": "REGION",
      "description": "Cloud region",
      "required": false,
      "default": "us-east-1"
    }
  ],
  "metadata": {
    "stars": 200,
    "last_updated": "2025-10-10T15:00:00Z"
  },
  "repository_url": "https://github.com/example/cloud-mcp",
  "tags": ["cloud", "ml", "partner"]
}

Group Entry

Implementation: pkg/registry/types.go

{
  "name": "data-pipeline",
  "description": "Data processing pipeline tools",
  "servers": {
    "data-ingestion": { /* ImageMetadata */ },
    "data-transform": { /* ImageMetadata */ }
  },
  "remote_servers": {
    "data-storage": { /* RemoteServerMetadata */ }
  }
}

Using the Registry

Discovery

List all servers:

thv registry list

Search by keyword:

thv search weather

Show server details:

thv registry info weather-server

Implementation: cmd/thv/app/registry.go, cmd/thv/app/search.go

Running from Registry

Simple run:

thv run weather-server

What happens:

  1. Look up weather-server in registry
  2. Get image, transport, permissions from metadata
  3. Prompt for required env vars
  4. Create RunConfig with registry defaults
  5. Deploy workload

With overrides:

thv run weather-server \
  --env API_KEY=xyz \
  --proxy-port 9000 \
  --permission-profile custom.json

User overrides take precedence over registry defaults.
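The precedence rule can be sketched in a few lines of Python. This is only an illustration of "registry defaults first, user overrides on top"; the `build_run_config` function and the dictionary keys it uses (such as `proxy_port`) are assumptions made for the example and do not reflect the actual RunConfig type.

```python
def build_run_config(registry_entry: dict, overrides: dict) -> dict:
    """Start from registry defaults, then apply user overrides on top."""
    config = {
        "image": registry_entry["image"],
        "transport": registry_entry["transport"],
        "target_port": registry_entry.get("target_port"),
        "permissions": registry_entry.get("permissions", {}),
        "env": {},
    }
    for key, value in overrides.items():
        if key == "env":
            # Environment variables are merged key by key.
            config["env"].update(value)
        elif value is not None:
            # Any other explicit override replaces the registry default.
            config[key] = value
    return config
```

An override that is absent (or `None`) leaves the registry default untouched, which matches the behavior described above.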

Implementation: cmd/thv/app/run.go

Environment Variables from Registry

Registry defines requirements:

{
  "env_vars": [
    {
      "name": "API_KEY",
      "description": "Weather API key from weather.gov",
      "required": true,
      "secret": true
    },
    {
      "name": "CACHE_TTL",
      "description": "Cache TTL in seconds",
      "required": false,
      "default": "3600"
    }
  ]
}

ToolHive handles:

  • Prompts for required variables if not provided
  • Uses defaults for optional variables
  • Stores secrets securely
  • Adds to RunConfig
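The handling above can be approximated as follows. This Python sketch resolves values from user input first, then from registry defaults, and reports required variables that are still missing (the point at which ToolHive would prompt). The `resolve_env_vars` helper is hypothetical, and secret storage is deliberately left out.

```python
def resolve_env_vars(definitions: list, provided: dict) -> tuple:
    """Return (resolved, missing) for a list of env_vars definitions.

    resolved: name -> value, taken from user input or the registry default.
    missing:  names of required variables that have no value yet.
    """
    resolved, missing = {}, []
    for definition in definitions:
        name = definition["name"]
        if name in provided:
            resolved[name] = provided[name]
        elif "default" in definition:
            resolved[name] = definition["default"]
        elif definition.get("required", False):
            missing.append(name)
    return resolved, missing
```

With the definitions shown above and no user input, `CACHE_TTL` resolves to its default and `API_KEY` is reported as missing.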

Implementation: pkg/registry/types.go

Custom Registries

Organizations can provide their own registries.

File-Based Registry

Create registry JSON:

{
  "version": "1.0.0",
  "servers": {
    "internal-tool": {
      "name": "internal-tool",
      "image": "registry.company.com/mcp/internal-tool:latest",
      "transport": "stdio",
      "permissions": { "network": { "outbound": { "insecure_allow_all": true }}}
    }
  }
}

Add to ToolHive:

Custom registries are configured in the ToolHive configuration file, for example by running thv config set-registry /path/to/registry.json.

Configuration location:

  • Linux: ~/.config/toolhive/config.yaml
  • macOS: ~/Library/Application Support/toolhive/config.yaml

Implementation: pkg/config/

Remote Registry

Remote registries can be configured in the ToolHive configuration file to fetch registry data from external sources.

ToolHive fetches the remote registry on startup and caches it locally.

Authentication:

  • Basic auth: https://user:pass@registry.company.com/registry.json
  • Bearer token: via environment variable

Implementation: pkg/registry/provider.go, pkg/registry/provider_local.go, pkg/registry/provider_remote.go, pkg/registry/factory.go

API Registry Provider

ToolHive supports live MCP Registry API endpoints that implement the official MCP Registry API v0.1 specification. This enables on-demand querying of servers from dynamic registry APIs.

Key differences from Remote Registry:

  • On-demand queries: Fetches servers as needed, not bulk download
  • Live data: Always queries the latest data from the API
  • Standard protocol: Uses official MCP Registry API specification
  • Pagination support: Handles large registries via cursor-based pagination
  • Search capabilities: Supports server search via API queries

Set API registry:

# URLs without .json extension are probed - if they implement /v0.1/servers, they're treated as API endpoints
thv config set-registry https://registry.example.com

With private IP support:

thv config set-registry https://registry.internal.company.com --allow-private-ip

Check current registry:

thv config get-registry
# Output: Current registry: https://registry.example.com (API endpoint)

Unset API registry:

thv config unset-registry

API Requirements:

The API endpoint must implement:

  • GET /v0.1/servers - List all servers with pagination
  • GET /v0.1/servers/:name - Get specific server by reverse-DNS name
  • GET /v0.1/servers?search=<query> - Search servers
  • GET /openapi.yaml - OpenAPI specification (version 1.0.0)

Response format:

Servers are returned in the upstream MCP Registry format:

{
  "server": {
    "name": "io.github.example/weather",
    "description": "Weather information MCP server",
    "packages": [
      {
        "registry_type": "oci",
        "identifier": "ghcr.io/example/weather-mcp:v1.0.0",
        "version": "v1.0.0"
      }
    ],
    "remotes": [],
    "repository": {
      "type": "git",
      "url": "https://github.com/example/weather-mcp"
    }
  }
}
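Cursor-based pagination over such an API can be sketched as below. The list envelope is an assumption for illustration: the example supposes each page looks like `{"servers": [{"server": {...}}], "metadata": {"next_cursor": "..."}}`, and the field names `servers`, `metadata`, and `next_cursor` should be checked against the API's OpenAPI specification. The `fetch_page` callable is a hypothetical stand-in for an HTTP request to GET /v0.1/servers.

```python
from typing import Callable, Iterator, Optional


def iter_servers(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every server object, following cursors until none is returned.

    fetch_page(cursor) must return one decoded page; cursor is None for
    the first request. The page field names are assumptions (see above).
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        for item in page.get("servers", []):
            yield item["server"]
        cursor = page.get("metadata", {}).get("next_cursor")
        if not cursor:
            break
```

Passing the fetcher in as a callable keeps the pagination loop independent of any particular HTTP client.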

Type conversion:

ToolHive automatically converts upstream MCP Registry types to internal format:

  • Container servers: packages with registry_type: "oci" → ImageMetadata
  • Remote servers: remotes with SSE/HTTP transport → RemoteServerMetadata
  • Package formats:
    • oci/docker → Docker image reference
    • npm → npx://<package>@<version>
    • pypi → uvx://<package>@<version>
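The package format mapping listed above can be expressed as a small lookup. This Python sketch is not the conversion code in pkg/registry/provider_api.go; the `package_to_reference` function is hypothetical, and it assumes the package name is carried in `identifier` and the version in `version`, as in the response example.

```python
# Maps upstream registry types to the URI scheme used for the run reference.
_SCHEMES = {"npm": "npx", "pypi": "uvx"}


def package_to_reference(package: dict) -> str:
    """Convert an upstream package entry into a runnable reference string."""
    registry_type = package["registry_type"]
    identifier = package["identifier"]
    if registry_type in ("oci", "docker"):
        # Container packages already carry a full image reference.
        return identifier
    scheme = _SCHEMES.get(registry_type)
    if scheme is None:
        raise ValueError(f"unsupported registry_type: {registry_type}")
    version = package.get("version")
    return f"{scheme}://{identifier}@{version}" if version else f"{scheme}://{identifier}"
```

Unknown registry types raise an error rather than guessing a scheme.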

Implementation:

  • pkg/registry/api/client.go - MCP Registry API client
  • pkg/registry/provider_api.go - API provider implementation with type conversion
  • pkg/config/registry.go - Configuration methods (setRegistryAPI)
  • pkg/registry/factory.go - Provider factory with API support
  • cmd/thv/app/config.go - CLI commands

Use cases:

  • Connect to official MCP Registry at https://registry.modelcontextprotocol.io
  • Point to organization's private MCP Registry API
  • Use third-party registry services
  • Dynamic server catalogs that update frequently

Stacklok's Registry Server Implementation:

For organizations needing a full-featured registry server, ToolHive Registry Server provides enterprise features:

  • Multiple data sources (Git, API, File, Managed, Kubernetes)
  • PostgreSQL backend for scalable storage
  • Enterprise OAuth 2.0/OIDC authentication (Okta, Auth0, Azure AD)
  • Background synchronization with automatic updates
  • Docker Compose and Kubernetes/Helm deployment options

For detailed setup and configuration, see the Registry Server documentation.

Registry Priority

When multiple registries are configured, ToolHive uses this priority order:

  1. API Registry (if configured) - Highest priority for live data
  2. Remote Registry (if configured) - Static remote registry URL
  3. Local Registry (if configured) - Custom local file
  4. Built-in Registry - Default embedded registry

The factory selects the first configured registry type in this order. The thv config set-registry command auto-detects the registry type:

# API registry - URLs without .json are probed for /v0.1/servers endpoint
thv config set-registry https://registry.modelcontextprotocol.io

# Remote static registry - URLs ending in .json are treated as static files
thv config set-registry https://example.com/registry.json

# Local file registry
thv config set-registry /path/to/registry.json

# Check current registry configuration
thv config get-registry

# Remove custom registry (fall back to built-in)
thv config unset-registry
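The selection logic can be summarized with a short Python sketch. It is an approximation under stated assumptions: `classify_registry_source` only inspects the string and does not perform the /v0.1/servers probe, and both function names and the `configured` dictionary are hypothetical.

```python
def classify_registry_source(value: str) -> str:
    """Classify a set-registry argument by its shape alone.

    URLs ending in .json are static remote registries; other URLs are
    only API candidates, because the real check probes /v0.1/servers.
    Anything that is not a URL is treated as a local file path.
    """
    if value.startswith(("http://", "https://")):
        return "remote" if value.endswith(".json") else "api-candidate"
    return "local"


# Priority order from the list above; the built-in registry is the fallback.
PRIORITY = ["api", "remote", "local"]


def select_provider(configured: dict) -> str:
    """Return the first configured registry kind in priority order."""
    for kind in PRIORITY:
        if configured.get(kind):
            return kind
    return "builtin"
```

For example, `select_provider({"local": "/path/to/registry.json"})` returns `"local"`, and an empty configuration falls back to `"builtin"`.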

Implementation: pkg/registry/factory.go, pkg/registry/provider.go, pkg/registry/provider_local.go, pkg/registry/provider_remote.go, pkg/registry/provider_api.go

Enterprise Registry Deployment

For organizations requiring a centralized, scalable registry server, ToolHive Registry Server provides enterprise-grade capabilities.

When to Use ToolHive Registry Server

| Scenario | Recommended Solution |
| --- | --- |
| Single user, local development | Built-in embedded registry (default) |
| Team sharing curated servers | Static JSON file via thv config set-registry https://example.com/registry.json |
| Dynamic organization-wide registry | Standalone ToolHive Registry Server with thv config set-registry https://registry.company.com |
| Kubernetes cluster with shared registry | MCPRegistry CRD (deploys ToolHive Registry Server in-cluster) |
| Multi-cluster enterprise | Standalone ToolHive Registry Server as central API, connect via thv config set-registry |

Architecture Overview

ToolHive Registry Server implements a 4-layer architecture:

  1. API Layer: Chi router with OAuth/OIDC middleware
  2. Service Layer: PostgreSQL or in-memory backends
  3. Registry Layer: Git, API, File, Managed, Kubernetes registry handlers
  4. Sync Layer: Background coordinator for automatic updates

Registry Types

| Type | Sync Mode | Description |
| --- | --- | --- |
| API | Automatic | Upstream MCP Registry API endpoints |
| Git | Automatic | Git repositories containing registry JSON |
| File | Automatic | Local filesystem (ToolHive or upstream format) |
| Managed | On-demand | API-managed registries with publish/delete |
| Kubernetes | On-demand | K8s deployment discovery |

Connecting ToolHive to Registry Server

CLI configuration:

# Point CLI to your registry server
thv config set-registry https://registry.company.com

# For internal deployments
thv config set-registry https://registry.internal.company.com --allow-private-ip

Documentation Resources

For complete registry server documentation, see:

MCPRegistry CRD (Kubernetes)

For Kubernetes deployments, registries are managed via the MCPRegistry CRD.

Implementation: cmd/thv-operator/api/v1beta1/mcpregistry_types.go

How configYAML Works

The MCPRegistry CRD uses a configYAML field that contains the complete ToolHive Registry Server config.yaml verbatim. The operator passes this content through to the registry server without parsing or transforming it -- configuration validation is the registry server's responsibility.

Any files referenced in configYAML (registry data, Git credentials, TLS certs) must be mounted into the registry-api container via explicit volumes and volumeMounts fields on the CRD.

Example CRD

apiVersion: toolhive.stacklok.dev/v1beta1
kind: MCPRegistry
metadata:
  name: company-registry
  namespace: toolhive-system
spec:
  configYAML: |
    sources:
      - name: company-repo
        git:
          repository: https://github.com/company/mcp-registry
          branch: main
          path: registry.json
        syncPolicy:
          interval: 1h
    registries:
      - name: default
        sources: ["company-repo"]
    database:
      host: registry-db-rw
      port: 5432
      user: db_app
      database: registry
    auth:
      mode: anonymous

Source Types

Sources are defined inside configYAML. The registry server supports several source types; the most common are Git, file (ConfigMap-backed), and Kubernetes.

Git Source

configYAML: |
  sources:
    - name: my-source
      git:
        repository: https://github.com/example/registry
        branch: main
        path: registry.json
      syncPolicy:
        interval: 1h
  registries:
    - name: default
      sources: ["my-source"]
  database:
    host: postgres
    port: 5432
    user: db_app
    database: registry
  auth:
    mode: anonymous

Features:

  • Automatic sync from Git repository
  • Branch or tag tracking
  • Shallow clones for efficiency
  • Private repository authentication via HTTP Basic Auth

Private Repository Authentication:

Git credentials are mounted as files using volumes/volumeMounts and referenced via passwordFile in the source configuration.

spec:
  configYAML: |
    sources:
      - name: private-repo
        git:
          repository: https://github.com/org/private-registry
          branch: main
          path: registry.json
          auth:
            username: "git"  # Use "git" for GitHub PATs
            passwordFile: /secrets/git-credentials/token
        syncPolicy:
          interval: 1h
    registries:
      - name: default
        sources: ["private-repo"]
    database:
      host: postgres
      port: 5432
      user: db_app
      database: registry
    auth:
      mode: anonymous
  volumes:
    - name: git-auth-credentials
      secret:
        secretName: git-credentials
        items:
          - key: token
            path: token
  volumeMounts:
    - name: git-auth-credentials
      mountPath: /secrets/git-credentials
      readOnly: true

The password Secret is mounted explicitly into the registry-api pod via the volumes and volumeMounts fields. The passwordFile path in configYAML must match the mountPath.
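Because the operator passes configYAML through without parsing it, a mismatch between a file path and its mount only surfaces at runtime. A pre-apply check along these lines could catch it earlier. The sketch is a hypothetical helper, not operator code, and it works on already-decoded Python dictionaries to avoid assuming a YAML library.

```python
from typing import Optional


def find_mount_for(path: str, volume_mounts: list) -> Optional[str]:
    """Return the name of the volumeMount that contains path, or None.

    A path is covered when it equals the mountPath or sits below it;
    a plain string prefix is not enough ("/a/b-other" is not under "/a/b").
    """
    for mount in volume_mounts:
        mount_path = mount["mountPath"].rstrip("/")
        if path == mount_path or path.startswith(mount_path + "/"):
            return mount["name"]
    return None
```

For the manifest above, `/secrets/git-credentials/token` resolves to the `git-auth-credentials` mount, while a path outside every mountPath returns `None`.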

Implementation: cmd/thv-operator/pkg/registryapi/

ConfigMap Source

Registry data from a ConfigMap is served by using a file: source in configYAML and mounting the ConfigMap with volumes/volumeMounts.

spec:
  configYAML: |
    sources:
      - name: production
        file:
          path: /config/registry/production/registry.json
        syncPolicy:
          interval: 1h
    registries:
      - name: default
        sources: ["production"]
    database:
      host: postgres
      port: 5432
      user: db_app
      database: registry
    auth:
      mode: anonymous
  volumes:
    - name: registry-data-production
      configMap:
        name: mcp-registry-data
        items:
          - key: registry.json
            path: registry.json
  volumeMounts:
    - name: registry-data-production
      mountPath: /config/registry/production
      readOnly: true

Features:

  • Native Kubernetes resource
  • Direct updates via kubectl
  • No external dependencies
  • File path in configYAML must match the mountPath

Implementation: cmd/thv-operator/pkg/registryapi/

Sync Policy

Sync intervals are configured per-source inside configYAML:

configYAML: |
  sources:
    - name: my-source
      git:
        repository: https://github.com/example/registry
        branch: main
        path: registry.json
      syncPolicy:
        interval: 1h

Omit the syncPolicy block on a source for manual-only sync.
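The interval values in these examples (such as 1h) look like Go-style duration strings. Assuming that is the accepted format, a tool that inspects configYAML could convert them to seconds as sketched below. The `parse_interval` helper is hypothetical and supports only the h, m, and s units.

```python
import re

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_interval(text: str) -> int:
    """Convert a duration such as "1h" or "1h30m" into seconds.

    Only whole-number h/m/s components are supported in this sketch;
    anything else raises ValueError.
    """
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject input with leftover characters, e.g. "90" or "1 h".
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unsupported interval: {text!r}")
    return sum(int(number) * _UNIT_SECONDS[unit] for number, unit in parts)
```

The strict round-trip check avoids silently accepting partially matched input.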

Implementation: cmd/thv-operator/controllers/mcpregistry_controller.go

API Service

The operator always creates a registry API deployment for each MCPRegistry:

  1. Deployment: Running ToolHive Registry Server (image: ghcr.io/stacklok/thv-registry-api)
  2. Service: Exposing API endpoints
  3. ConfigMap: Containing the configYAML content mounted at /config/config.yaml

Access:

# Within cluster
curl http://company-registry-api.default.svc.cluster.local:8080/api/v1/registry

# Via port-forward
kubectl port-forward svc/company-registry-api 8080:8080
curl http://localhost:8080/api/v1/registry

Implementation: cmd/thv-operator/pkg/registryapi/

Status Management

Status fields:

status:
  phase: Ready
  message: "Registry API is ready and serving requests"
  url: "http://company-registry-api.default.svc.cluster.local:8080"
  readyReplicas: 1
  observedGeneration: 1
  conditions:
    - type: Ready
      status: "True"
      reason: Ready
      message: "Registry API is ready and serving requests"

Phases:

  • Pending - Initial state, deployment not ready yet
  • Ready - Registry API is ready and serving requests
  • Failed - Deployment or reconciliation failed
  • Terminating - Registry being deleted
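A script that waits for a registry to come up might evaluate the status like this. The sketch assumes the status has already been decoded into a dictionary (for example from kubectl get ... -o json), and the `is_ready` function is a hypothetical helper that requires both the phase and the Ready condition to agree.

```python
def is_ready(status: dict) -> bool:
    """Return True when phase is Ready and the Ready condition is "True"."""
    if status.get("phase") != "Ready":
        return False
    return any(
        condition.get("type") == "Ready" and condition.get("status") == "True"
        for condition in status.get("conditions", [])
    )
```

Checking the condition as well as the phase guards against reading a status that is only partially updated.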

Implementation: cmd/thv-operator/controllers/mcpregistry_controller.go

Storage

Registry data is managed by the registry server itself. The operator creates a {name}-registry-server-config ConfigMap containing the registry server's configuration (from configYAML), and the registry server fetches and stores data from its configured sources (Git, API, Kubernetes, etc.) at runtime.

Registry Schema

ImageMetadata (Container Servers)

Required fields:

  • image - Container image reference
  • description - What the server does
  • transport - Communication protocol
  • tier - Classification (Official, Partner, Community)

Optional fields:

  • target_port - Port for SSE/Streamable HTTP
  • permissions - Permission profile
  • env_vars - Environment variable definitions
  • args - Default command arguments
  • docker_tags - Available tags
  • provenance - Supply chain metadata
  • tools - List of tool names
  • metadata - Stars, pulls, last updated
  • repository_url - Source code URL
  • tags - Categorization labels

Implementation: pkg/registry/types.go

RemoteServerMetadata (Remote Servers)

Required fields:

  • url - Remote server endpoint
  • description - What the server does
  • transport - Must be sse or streamable-http
  • tier - Classification

Optional fields:

  • headers - HTTP headers for authentication
  • oauth_config - OAuth/OIDC configuration
  • env_vars - Client environment variables
  • tools - List of tool names
  • metadata - Popularity metrics
  • repository_url - Documentation URL
  • tags - Categorization labels

Implementation: pkg/registry/types.go

Group

Structure:

{
  "name": "data-pipeline",
  "description": "Complete data processing pipeline",
  "servers": {
    "data-reader": { /* ImageMetadata */ },
    "data-processor": { /* ImageMetadata */ }
  },
  "remote_servers": {
    "data-warehouse": { /* RemoteServerMetadata */ }
  }
}

Use cases:

  • Deploy related servers together
  • Virtual MCP aggregation
  • Organizational structure

Run all servers in group:

thv group run data-pipeline  # assuming 'data-pipeline' is defined in your registry

Implementation: pkg/registry/types.go

Provenance and Security

Image Provenance

ToolHive supports Sigstore verification:

Provenance fields:

  • sigstore_url - Sigstore/Rekor instance
  • repository_uri - Source repository
  • repository_ref - Git ref (tag, commit)
  • signer_identity - Who built the image
  • runner_environment - Build environment
  • cert_issuer - Certificate authority
  • attestation - SLSA attestation data

Verification:

thv run weather-server --image-verification enabled

Implementation:

  • pkg/registry/types.go - Provenance type definitions
  • pkg/container/verifier/ - Sigstore/cosign verification using sigstore-go library
  • pkg/runner/retriever/retriever.go - Image verification orchestration

Supply Chain Security

Best practices:

  1. Pin image tags: Use specific versions, not latest
  2. Verify provenance: Check signer identity
  3. Review permissions: Audit network/file access
  4. Check repository: Review source code
  5. Monitor updates: Track registry updates
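The first practice is easy to check mechanically. The following Python sketch flags image references that use latest or no tag at all; it is a simplified, hypothetical linter helper and not part of ToolHive's verification code.

```python
def is_pinned(image: str) -> bool:
    """Return True when the image reference names a specific version.

    Digest references (containing "@") count as pinned. A missing tag is
    treated as unpinned because it resolves to "latest" by convention.
    """
    if "@" in image:
        return True
    # Only look at the last path segment so a registry port such as
    # "localhost:5000/img" is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False
    tag = last_segment.rsplit(":", 1)[1]
    return tag != "latest"
```

Run against the examples in this document, the weather server image with tag v1.0.0 passes, while the internal-tool image tagged latest does not.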

Upstream MCP Registry Format

ToolHive consumes registries in the upstream MCP registry format. The legacy ToolHive-native format is no longer accepted; existing files can be migrated with thv registry convert --in <file> --in-place.

Key features:

  1. Standardized schema: Upstream MCP server format from the modelcontextprotocol/registry project
  2. Publisher-provided extensions: ToolHive-specific metadata via _meta["io.modelcontextprotocol.registry/publisher-provided"]
  3. Lossless migration: Every legacy ToolHive field maps to a publisher-provided extension on the corresponding upstream server entry

Publisher-Provided Extensions

ToolHive uses the io.modelcontextprotocol.registry/publisher-provided extension mechanism to add custom metadata to MCP server definitions in the upstream format. This allows ToolHive to provide:

  • Security permissions for container-based servers
  • OAuth/OIDC configuration for remote servers
  • Categorization metadata (tags, tier, tools)
  • Supply chain provenance information
  • Popularity metrics (stars, pulls, last_updated)

Extension structure:

{
  "_meta": {
    "io.modelcontextprotocol.registry/publisher-provided": {
      "io.github.stacklok": {
        "ghcr.io/stacklok/mcp-server-example:latest": {
          "status": "active",
          "tier": "Official",
          "tools": ["example-tool"],
          "permissions": {
            "network": {
              "outbound": {
                "allow_host": ["api.example.com"]
              }
            }
          }
        }
      }
    }
  }
}
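Reading these extensions back out is a matter of walking the nested keys. The Python sketch below follows the structure shown above; the `toolhive_extensions` helper is hypothetical, and the default namespace `io.github.stacklok` is taken from the example rather than from a schema guarantee.

```python
PUBLISHER_KEY = "io.modelcontextprotocol.registry/publisher-provided"


def toolhive_extensions(server: dict, namespace: str = "io.github.stacklok") -> dict:
    """Return the per-image (or per-URL) extension map for one server entry.

    Missing levels yield an empty dict instead of raising, so servers
    without ToolHive metadata can be handled uniformly.
    """
    meta = server.get("_meta", {})
    return meta.get(PUBLISHER_KEY, {}).get(namespace, {})
```

The result is keyed by image reference, so a caller would look up the entry for the specific package it intends to run.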

For the complete schema definition, see:

  • Schemas: published in stacklok/toolhive-core under registry/types/data/
  • Documentation: docs/registry/schema.md
  • Validation: pkg/registry/schema_validation.go

Implementation: pkg/registry/

Registry Operations

CLI Operations

List servers:

thv registry list

Show server info:

thv registry info <server-name>

Implementation: cmd/thv/app/registry.go

Kubernetes Operations

Create registry:

kubectl apply -f mcpregistry.yaml

Check status:

kubectl get mcpregistry company-registry -o yaml

Trigger manual sync:

kubectl annotate mcpregistry company-registry toolhive.stacklok.dev/sync-trigger=true

Implementation: cmd/thv-operator/controllers/mcpregistry_controller.go

Related Documentation

Internal Documentation

External Documentation

Related Repositories