| 1 | +# RFC-0015: Add Tool Definitions to Publisher Extensions |
| 2 | + |
| 3 | +> **Note**: This was originally [THV-2733](https://github.com/stacklok/toolhive/pull/2733) in the toolhive repository. |
| 4 | + |
| 5 | +- **Status**: Draft |
| 6 | +- **Author(s)**: Juan Antonio Osorio (@JAORMX) |
| 7 | +- **Created**: 2025-11-25 |
| 8 | +- **Last Updated**: 2025-11-25 |
| 9 | +- **Target Repository**: toolhive-registry |
| 10 | +- **Related Issues**: [toolhive#2733](https://github.com/stacklok/toolhive/pull/2733) |
| 11 | + |
| 12 | +## Summary |
| 13 | + |
| 14 | +Add full tool metadata to the existing ToolHive publisher extension (`io.github.stacklok`) in the upstream MCP Registry format. Beyond tool names, each entry can carry descriptions, input/output schemas, and annotations. |
| 15 | + |
| 16 | +## Problem Statement |
| 17 | + |
| 18 | +The current registry stores only tool names (`tools: []string`), which limits discoverability and tooling capabilities. Users and AI agents need richer metadata to: |
| 19 | +- Understand what each tool does before running a server |
| 20 | +- Validate tool inputs/outputs programmatically |
| 21 | +- Enable better tool selection and filtering in aggregation scenarios (e.g., Virtual MCP Server) |
| 22 | +- Support IDE/editor integrations with type information |
| 23 | + |
| 24 | +## Goals |
| 25 | + |
| 26 | +- Store full MCP tool definitions in the registry without breaking existing consumers |
| 27 | +- Reuse the existing `io.github.stacklok` publisher extension namespace |
| 28 | +- Align with the MCP specification's tool schema |
| 29 | +- Maintain backward compatibility with the existing `tools` field |
| 30 | + |
| 31 | +## Non-Goals |
| 32 | + |
| 33 | +- Not changing how tools are discovered at runtime (still via `tools/list`) |
| 34 | +- Not requiring `tool_definitions` for registry entries |
| 35 | +- Not adding new publisher extension namespaces |
| 36 | +- Not modifying the upstream MCP Registry schema |
| 37 | + |
| 38 | +## Proposed Solution |
| 39 | + |
| 40 | +Add a new `tool_definitions` field to the existing ToolHive publisher extension structure. This field contains an array of tool objects matching the MCP specification. |
| 41 | + |
| 42 | +### High-Level Design |
| 43 | + |
| 44 | +The `tool_definitions` field is added alongside existing fields in the `io.github.stacklok` extension: |
| 45 | + |
| 46 | +```json |
| 47 | +{ |
| 48 | + "meta": { |
| 49 | + "publisher_provided": { |
| 50 | + "io.github.stacklok": { |
| 51 | + "<image_or_url>": { |
| 52 | + "status": "active", |
| 53 | + "tier": "Official", |
| 54 | + "tools": ["get_weather", "search_location"], |
| 55 | + "tool_definitions": [ |
| 56 | + { |
| 57 | + "name": "get_weather", |
| 58 | + "title": "Weather Information", |
| 59 | + "description": "Get current weather for a location", |
| 60 | + "inputSchema": { |
| 61 | + "type": "object", |
| 62 | + "properties": { |
| 63 | + "location": { |
| 64 | + "type": "string", |
| 65 | + "description": "City name or coordinates" |
| 66 | + } |
| 67 | + }, |
| 68 | + "required": ["location"] |
| 69 | + }, |
| 70 | + "outputSchema": { |
| 71 | + "type": "object", |
| 72 | + "properties": { |
| 73 | + "temperature": { "type": "number" }, |
| 74 | + "conditions": { "type": "string" } |
| 75 | + } |
| 76 | + }, |
| 77 | + "annotations": { |
| 78 | + "readOnlyHint": true |
| 79 | + } |
| 80 | + } |
| 81 | + ] |
| 82 | + } |
| 83 | + } |
| 84 | + } |
| 85 | + } |
| 86 | +} |
| 87 | +``` |
| 88 | + |
| 89 | +### Detailed Design |
| 90 | + |
| 91 | +#### Tool Definition Fields |
| 92 | + |
| 93 | +Per the [MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/tools): |
| 94 | + |
| 95 | +| Field | Type | Required | Description | |
| 96 | +|-------|------|----------|-------------| |
| 97 | +| `name` | `string` | Yes | Unique identifier for the tool | |
| 98 | +| `title` | `string` | No | Human-readable display name | |
| 99 | +| `description` | `string` | No | Human-readable description of functionality | |
| 100 | +| `inputSchema` | `object` | No | JSON Schema defining expected parameters (required by the MCP `Tool` type; optional in the registry) | |
| 101 | +| `outputSchema` | `object` | No | JSON Schema defining expected output structure | |
| 102 | +| `annotations` | `object` | No | Properties describing tool behavior | |
| 103 | + |
| 104 | +#### Annotations |
| 105 | + |
| 106 | +Tool annotations provide hints about tool behavior: |
| 107 | + |
| 108 | +| Annotation | Type | Description | |
| 109 | +|------------|------|-------------| |
| 110 | +| `readOnlyHint` | `boolean` | Tool only reads data, no side effects | |
| 111 | +| `destructiveHint` | `boolean` | Tool may perform destructive operations | |
| 112 | +| `idempotentHint` | `boolean` | Repeated calls with same args have same effect | |
| 113 | +| `openWorldHint` | `boolean` | Tool interacts with external entities | |
| 114 | + |
| 115 | +#### Data Model Changes |
| 116 | + |
| 117 | +Add a new type to `pkg/registry/registry/registry_types.go`: |
| 118 | + |
| 119 | +```go |
| 120 | +// ToolDefinition represents the full metadata for an MCP tool |
| 121 | +type ToolDefinition struct { |
| 122 | + Name string `json:"name"` |
| 123 | + Title string `json:"title,omitempty"` |
| 124 | + Description string `json:"description,omitempty"` |
| 125 | + InputSchema map[string]any `json:"inputSchema,omitempty"` |
| 126 | + OutputSchema map[string]any `json:"outputSchema,omitempty"` |
| 127 | + Annotations map[string]any `json:"annotations,omitempty"` |
| 128 | +} |
| 129 | +``` |
| 130 | + |
| 131 | +Update `BaseServerMetadata` to include the new field: |
| 132 | + |
| 133 | +```go |
| 134 | +type BaseServerMetadata struct { |
| 135 | + // ... existing fields ... |
| 136 | + Tools []string `json:"tools" yaml:"tools"` |
| 137 | + ToolDefinitions []*ToolDefinition `json:"tool_definitions,omitempty" yaml:"tool_definitions,omitempty"` |
| 138 | + // ... remaining fields ... |
| 139 | +} |
| 140 | +``` |
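
The effect of `omitempty` on the new field can be sketched as follows. This is a minimal illustration, not the actual registry code: `serverMetadata` is a hypothetical, trimmed-down stand-in for `BaseServerMetadata`, and `encode` is a helper introduced only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolDefinition mirrors the type proposed in this RFC.
type ToolDefinition struct {
	Name         string         `json:"name"`
	Title        string         `json:"title,omitempty"`
	Description  string         `json:"description,omitempty"`
	InputSchema  map[string]any `json:"inputSchema,omitempty"`
	OutputSchema map[string]any `json:"outputSchema,omitempty"`
	Annotations  map[string]any `json:"annotations,omitempty"`
}

// serverMetadata is a hypothetical stand-in for BaseServerMetadata that
// keeps only the two fields relevant to this RFC.
type serverMetadata struct {
	Tools           []string          `json:"tools"`
	ToolDefinitions []*ToolDefinition `json:"tool_definitions,omitempty"`
}

// encode returns the JSON form of m as a string.
func encode(m serverMetadata) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

func main() {
	// An entry without definitions serializes exactly as it does today.
	legacy, err := encode(serverMetadata{Tools: []string{"get_weather"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(legacy)

	// An entry with definitions gains the extra key; unset fields are omitted.
	rich, err := encode(serverMetadata{
		Tools: []string{"get_weather"},
		ToolDefinitions: []*ToolDefinition{{
			Name:        "get_weather",
			Description: "Get current weather for a location",
		}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(rich)
}
```

Because the key is dropped when the slice is empty, entries that never set `tool_definitions` keep their current serialized form.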
| 141 | + |
| 142 | +#### Converter Updates |
| 143 | + |
| 144 | +**ToolHive to Upstream (`toolhive_to_upstream.go`)**: |
| 145 | + |
| 146 | +Add `tool_definitions` to the extension creation functions: |
| 147 | + |
| 148 | +```go |
| 149 | +if len(metadata.ToolDefinitions) > 0 { |
| 150 | + extensions["tool_definitions"] = metadata.ToolDefinitions |
| 151 | +} |
| 152 | +``` |
| 153 | + |
| 154 | +**Upstream to ToolHive (`upstream_to_toolhive.go`)**: |
| 155 | + |
| 156 | +Extract `tool_definitions` from the extension data: |
| 157 | + |
| 158 | +```go |
| 159 | +if toolDefs, ok := extensions["tool_definitions"].([]interface{}); ok { |
| 160 | + metadata.ToolDefinitions = remarshalToType[[]*ToolDefinition](toolDefs) |
| 161 | +} |
| 162 | +``` |
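
One way such an extraction could work is to re-encode the untyped extension value as JSON and decode it into the typed slice. The sketch below assumes that approach; `remarshal` and `extractToolDefinitions` are hypothetical names for this example, and the real `remarshalToType` helper may have a different signature and error handling.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolDefinition mirrors the type proposed in this RFC.
type ToolDefinition struct {
	Name         string         `json:"name"`
	Title        string         `json:"title,omitempty"`
	Description  string         `json:"description,omitempty"`
	InputSchema  map[string]any `json:"inputSchema,omitempty"`
	OutputSchema map[string]any `json:"outputSchema,omitempty"`
	Annotations  map[string]any `json:"annotations,omitempty"`
}

// remarshal is a hypothetical helper: it converts an untyped value into T
// by encoding it to JSON and decoding the result.
func remarshal[T any](v any) (T, error) {
	var out T
	raw, err := json.Marshal(v)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(raw, &out)
	return out, err
}

// extractToolDefinitions reads tool_definitions from decoded extension data.
// A missing or non-array value is treated as "no definitions".
func extractToolDefinitions(extensions map[string]any) ([]*ToolDefinition, error) {
	raw, ok := extensions["tool_definitions"].([]any)
	if !ok {
		return nil, nil
	}
	return remarshal[[]*ToolDefinition](raw)
}

func main() {
	payload := `{"tools":["get_weather"],"tool_definitions":[{"name":"get_weather","description":"Get current weather for a location"}]}`

	var extensions map[string]any
	if err := json.Unmarshal([]byte(payload), &extensions); err != nil {
		panic(err)
	}
	defs, err := extractToolDefinitions(extensions)
	if err != nil {
		panic(err)
	}
	for _, d := range defs {
		fmt.Println(d.Name, "-", d.Description)
	}
}
```

Returning the decode error rather than discarding it lets the converter decide whether a malformed definition should fail the whole entry or only drop the optional field.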
| 163 | + |
| 164 | +## Security Considerations |
| 165 | + |
| 166 | +### Threat Model |
| 167 | + |
| 168 | +This change only adds metadata storage, so the new risk is limited to the content of that metadata: |
| 169 | + |
| 170 | +- **Data integrity**: Tool definitions are provided by MCP server publishers. Malicious publishers could provide misleading metadata (e.g., claiming a tool is read-only when it isn't), so consumers should treat annotations as untrusted hints rather than guarantees. |
| 171 | +- **Schema injection**: Input/output schemas are stored as JSON objects. These could potentially contain malicious content if rendered unsafely. |
| 172 | + |
| 173 | +### Data Security |
| 174 | + |
| 175 | +- Tool definitions are public metadata, similar to existing tool names |
| 176 | +- No sensitive data is stored in tool definitions |
| 177 | +- Schemas follow JSON Schema standard, which is data-only (no executable code) |
| 178 | + |
| 179 | +### Input Validation |
| 180 | + |
| 181 | +- Tool names must be valid identifiers (alphanumeric, underscores) |
| 182 | +- Schemas should be validated as proper JSON Schema format |
| 183 | +- Size limits should be enforced to prevent oversized definitions |
| 184 | + |
| 185 | +### Mitigations |
| 186 | + |
| 187 | +- Validate schema structure on ingestion |
| 188 | +- Sanitize any user-facing display of descriptions |
| 189 | +- Enforce reasonable size limits on tool definitions |
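
The checks listed above could be combined into a single ingestion-time validator. The sketch below is illustrative only: `validateToolDefinition` is a hypothetical function, the 64 KiB limit is an assumed placeholder (the actual limit is still an open question), and the schema check is a shallow structural test rather than full JSON Schema validation.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"regexp"
)

// maxDefinitionBytes is an assumed placeholder, not a value from this RFC.
const maxDefinitionBytes = 64 * 1024

// toolNamePattern accepts alphanumeric characters and underscores only.
var toolNamePattern = regexp.MustCompile(`^[A-Za-z0-9_]+$`)

// ToolDefinition mirrors the type proposed in this RFC.
type ToolDefinition struct {
	Name         string         `json:"name"`
	Title        string         `json:"title,omitempty"`
	Description  string         `json:"description,omitempty"`
	InputSchema  map[string]any `json:"inputSchema,omitempty"`
	OutputSchema map[string]any `json:"outputSchema,omitempty"`
	Annotations  map[string]any `json:"annotations,omitempty"`
}

// checkSchema performs a shallow check: a present schema must declare
// "type": "object" at the top level.
func checkSchema(field string, schema map[string]any) error {
	if schema == nil {
		return nil
	}
	if t, ok := schema["type"].(string); !ok || t != "object" {
		return fmt.Errorf("%s: top-level type must be \"object\"", field)
	}
	return nil
}

// validateToolDefinition applies the name, schema, and size checks.
func validateToolDefinition(def ToolDefinition) error {
	if !toolNamePattern.MatchString(def.Name) {
		return fmt.Errorf("invalid tool name %q", def.Name)
	}
	if err := checkSchema("inputSchema", def.InputSchema); err != nil {
		return err
	}
	if err := checkSchema("outputSchema", def.OutputSchema); err != nil {
		return err
	}
	raw, err := json.Marshal(def)
	if err != nil {
		return err
	}
	if len(raw) > maxDefinitionBytes {
		return errors.New("tool definition exceeds size limit")
	}
	return nil
}

func main() {
	fmt.Println(validateToolDefinition(ToolDefinition{Name: "get_weather"}))
	fmt.Println(validateToolDefinition(ToolDefinition{Name: "get weather"}))
}
```

Measuring the serialized size keeps the limit independent of how the definition is split across description, schemas, and annotations.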
| 190 | + |
| 191 | +## Alternatives Considered |
| 192 | + |
| 193 | +### Alternative 1: Separate Extension Namespace |
| 194 | + |
| 195 | +Create a new extension namespace specifically for tool definitions. |
| 196 | + |
| 197 | +- **Pros**: Cleaner separation, independent evolution |
| 198 | +- **Cons**: More complexity, duplicated tool references |
| 199 | +- **Why not chosen**: Reusing existing namespace is simpler and maintains consistency |
| 200 | + |
| 201 | +### Alternative 2: External Tool Definition Registry |
| 202 | + |
| 203 | +Store tool definitions in a separate registry or service. |
| 204 | + |
| 205 | +- **Pros**: Independent scaling, specialized service |
| 206 | +- **Cons**: Additional infrastructure, synchronization challenges |
| 207 | +- **Why not chosen**: Adds operational complexity without clear benefit |
| 208 | + |
| 209 | +## Compatibility |
| 210 | + |
| 211 | +### Backward Compatibility |
| 212 | + |
| 213 | +- The `tools` field remains unchanged and continues to store tool names as strings |
| 214 | +- Consumers that only read `tools` are unaffected |
| 215 | +- The `tool_definitions` field is optional; servers without it continue to work |
| 216 | +- When both fields exist, `tool_definitions` is the authoritative source; `tools` serves as a quick lookup |
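
A consumer could apply the precedence rule above roughly as follows. This is a sketch under assumptions: `effectiveToolNames` is a hypothetical helper, and `ToolDefinition` is trimmed to the one field the example needs.

```go
package main

import "fmt"

// ToolDefinition is trimmed to the name field for this example.
type ToolDefinition struct {
	Name string `json:"name"`
}

// effectiveToolNames prefers names from tool_definitions when any are
// present and falls back to the legacy tools list otherwise.
func effectiveToolNames(tools []string, defs []*ToolDefinition) []string {
	if len(defs) == 0 {
		return tools
	}
	names := make([]string, 0, len(defs))
	for _, d := range defs {
		if d != nil {
			names = append(names, d.Name)
		}
	}
	return names
}

func main() {
	// Legacy entry: only tools is set.
	fmt.Println(effectiveToolNames([]string{"get_weather"}, nil))

	// Newer entry: definitions take precedence over the tools list.
	defs := []*ToolDefinition{{Name: "get_weather"}, {Name: "search_location"}}
	fmt.Println(effectiveToolNames([]string{"get_weather"}, defs))
}
```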
| 217 | + |
| 218 | +### Forward Compatibility |
| 219 | + |
| 220 | +- Schema is designed to accommodate additional MCP tool fields as the spec evolves |
| 221 | +- Using `map[string]any` for schemas allows flexibility without breaking changes |
| 222 | + |
| 223 | +## Implementation Plan |
| 224 | + |
| 225 | +### Phase 1: Type Definitions |
| 226 | + |
| 227 | +- Add `ToolDefinition` type to registry types |
| 228 | +- Update `BaseServerMetadata` with new field |
| 229 | +- Add JSON/YAML serialization support |
| 230 | + |
| 231 | +### Phase 2: Converter Updates |
| 232 | + |
| 233 | +- Update ToolHive-to-upstream converter |
| 234 | +- Update upstream-to-ToolHive converter |
| 235 | +- Add unit tests for round-trip conversion |
| 236 | + |
| 237 | +### Phase 3: CLI Integration |
| 238 | + |
| 239 | +- Add `--show-tools` flag to `thv list` command |
| 240 | +- Display tool descriptions in output |
| 241 | + |
| 242 | +## Testing Strategy |
| 243 | + |
| 244 | +- **Unit tests**: Serialization/deserialization of tool definitions |
| 245 | +- **Integration tests**: Round-trip conversion between formats |
| 246 | +- **E2E tests**: Registry operations with tool definitions |
| 247 | + |
| 248 | +## Documentation |
| 249 | + |
| 250 | +- Update registry format documentation |
| 251 | +- Add examples of tool definitions in registry entries |
| 252 | +- Document use cases for tool definitions |
| 253 | + |
| 254 | +## Open Questions |
| 255 | + |
| 256 | +1. Should we validate that `tools` array matches `tool_definitions[*].name`? |
| 257 | +2. What size limits should be enforced on tool definitions? |
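
If the answer to the first question is yes, the check could be as small as the sketch below. `compareToolNames` is a hypothetical helper that reports names present in only one of the two fields; whether a mismatch should be an error or a warning is left open.

```go
package main

import (
	"fmt"
	"sort"
)

// compareToolNames reports names that appear in tools but not in the
// definitions, and names that appear in the definitions but not in tools.
func compareToolNames(tools, definedNames []string) (onlyInTools, onlyInDefinitions []string) {
	defined := make(map[string]bool, len(definedNames))
	for _, n := range definedNames {
		defined[n] = true
	}
	listed := make(map[string]bool, len(tools))
	for _, n := range tools {
		listed[n] = true
		if !defined[n] {
			onlyInTools = append(onlyInTools, n)
		}
	}
	for _, n := range definedNames {
		if !listed[n] {
			onlyInDefinitions = append(onlyInDefinitions, n)
		}
	}
	sort.Strings(onlyInTools)
	sort.Strings(onlyInDefinitions)
	return onlyInTools, onlyInDefinitions
}

func main() {
	// Same shape as the High-Level Design example: two names, one definition.
	missingDefs, missingNames := compareToolNames(
		[]string{"get_weather", "search_location"},
		[]string{"get_weather"},
	)
	fmt.Println(missingDefs, missingNames)
}
```

Note that the example in the High-Level Design section lists two names in `tools` but defines only one of them, so a strict equality rule would reject it.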
| 258 | + |
| 259 | +## References |
| 260 | + |
| 261 | +- [MCP Specification - Tools](https://modelcontextprotocol.io/specification/2025-06-18/server/tools) |
| 262 | +- [JSON Schema Specification](https://json-schema.org/) |
| 263 | + |
| 264 | +--- |
| 265 | + |
| 266 | +## RFC Lifecycle |
| 267 | + |
| 268 | +### Review History |
| 269 | + |
| 270 | +| Date | Reviewer | Decision | Notes | |
| 271 | +|------|----------|----------|-------| |
| 272 | +| 2025-11-25 | - | Draft | Ported from toolhive PR #2733 | |
| 273 | + |
| 274 | +### Implementation Tracking |
| 275 | + |
| 276 | +| Repository | PR | Status | |
| 277 | +|------------|-----|--------| |
| 278 | +| toolhive | - | Pending | |