|
| 1 | +# THV-0032: ToolHive Core Shared Library with Package Graduation Criteria |
| 2 | + |
| 3 | +- **Status**: Draft |
| 4 | +- **Author(s)**: Juan Antonio Osorio (@JAORMX) |
| 5 | +- **Created**: 2026-01-29 |
| 6 | +- **Last Updated**: 2026-01-30 |
| 7 | +- **Target Repository**: multiple (new repository: `toolhive-core`) |
| 8 | +- **Related Issues**: N/A |
| 9 | + |
| 10 | +## Summary |
| 11 | + |
| 12 | +This RFC proposes creating a shared Go library (`toolhive-core`) to provide stable, well-tested utilities with explicit API guarantees for the ToolHive Go ecosystem. The proposal establishes formal graduation criteria for promoting packages from internal/unstable status to the shared library, enabling projects like `dockyard` (which already imports from toolhive) to depend on stable APIs without risk of unexpected breakage. |
| 13 | + |
| 14 | +## Problem Statement |
| 15 | + |
| 16 | +The ToolHive ecosystem spans multiple Go repositories: |
| 17 | +- **toolhive** - Core runtime, CLI, operator, proxy-runner, virtual MCP |
| 18 | +- **toolhive-registry** - Registry data and CLI tools |
| 19 | +- **toolhive-registry-server** - Registry API server |
| 20 | +- **dockyard** - Container packaging for MCP servers |
| 21 | + |
| 22 | +> **Note**: This RFC focuses on shared library extraction. While a stable core library could eventually enable components like the operator to be versioned independently, that would require additional considerations (e.g., RunConfig format stability, API versioning) beyond the scope of this proposal. |
| 23 | +
|
| 24 | +Additionally, TypeScript projects exist (toolhive-studio, toolhive-cloud-ui) but are out of scope for a Go shared library. |
| 25 | + |
| 26 | +**Current state of code sharing:** |
| 27 | + |
| 28 | +| Project | Imports from toolhive | Pattern Divergence | |
| 29 | +|---------|----------------------|-------------------| |
| 30 | +| **dockyard** | `pkg/logger`, `pkg/container/images`, `pkg/runner` | Aligned - uses toolhive utilities | |
| 31 | +| **toolhive-registry** | `pkg/logger`, `pkg/permissions`, `pkg/registry/*`, `pkg/container/verifier` | Aligned - heavily uses toolhive utilities | |
| 32 | +| **toolhive-registry-server** | Only `pkg/versions` + `pkg/registry` types | Uses stdlib `log/slog` instead of toolhive's zap logger; custom error handling | |
| 33 | + |
| 34 | +> **Future sharing opportunities**: Beyond the packages listed above, `pkg/auth` contains OAuth-related utilities (protected resource metadata, auth middleware) that could be refactored into a standalone `pkg/oauth` package. Similarly, `pkg/networking` has HTTP builder patterns and endpoint validation utilities that could benefit multiple projects. |
| 35 | +
|
| 36 | +**Current limitations:** |
| 37 | + |
| 38 | +1. **No stability guarantees**: Projects like dockyard import toolhive's `pkg/` packages directly, but these are internal packages with no API stability commitment. A breaking change in toolhive could break dockyard unexpectedly. |
| 39 | + |
| 40 | +2. **Pattern divergence without clear rationale**: toolhive-registry-server uses Go stdlib `log/slog` while toolhive uses zap-based logging. Both are valid choices, but there's no documented decision about when to use which approach. |
| 41 | + |
| 42 | +3. **Unclear package maturity**: The `toolhive` repository contains 93+ packages in `pkg/`, with `logger` imported 185 times internally. Contributors and downstream consumers have no way to know which packages are stable vs. experimental. |
| 43 | + |
| 44 | +4. **Friction for new Go projects**: When creating new ToolHive ecosystem projects, developers must decide whether to import from toolhive (risking breakage) or reimplement utilities (causing divergence). |
| 45 | + |
| 46 | +5. **No graduation path**: There's no process for promoting battle-tested internal packages to "blessed" shared status with stability guarantees. |
| 47 | + |
| 48 | +## Goals |
| 49 | + |
| 50 | +- **Establish `toolhive-core`**: Create a new repository for shared, stable Go packages with explicit API stability guarantees |
| 51 | +- **Define graduation criteria**: Formalize how packages move from internal/unstable to shared/stable |
| 52 | +- **Provide stability guarantees**: Implement semantic versioning so downstream projects (dockyard, future services) can depend on toolhive-core without fear of breakage |
| 53 | +- **Document architectural decisions**: Clarify when to use toolhive-core utilities vs. stdlib alternatives (e.g., zap vs. slog) |
| 54 | +- **Enable independent versioning**: Allow the shared library to evolve on its own release cadence, decoupled from toolhive releases |
| 55 | +- **Create a graduation path**: Provide a clear process for promoting mature internal packages to shared status |
| 56 | + |
| 57 | +## Non-Goals |
| 58 | + |
| 59 | +- **Migrate all packages immediately**: Only graduated packages move to `toolhive-core` |
| 60 | +- **Break existing APIs**: The initial release must maintain compatibility with current usage |
| 61 | +- **Force adoption**: Projects may continue using internal implementations (e.g., toolhive-registry-server can keep using slog) |
| 62 | +- **Include domain-specific code**: Container runtime, MCP protocol, VMCP logic stay in their respective repos |
| 63 | +- **Create a monorepo**: `toolhive-core` is a standalone library, not a consolidation of all projects |
| 64 | +- **TypeScript/frontend utilities**: toolhive-studio and toolhive-cloud-ui are TypeScript projects and out of scope |
| 65 | + |
| 66 | +## Proposed Solution |
| 67 | + |
| 68 | +### High-Level Design |
| 69 | + |
| 70 | +```mermaid |
| 71 | +flowchart TB |
| 72 | + subgraph "Package Lifecycle" |
| 73 | + Internal[Internal Package<br/>pkg/foo in toolhive] |
| 74 | + Candidate[Graduation Candidate<br/>Meets criteria] |
| 75 | + Core[toolhive-core/foo<br/>Stable, versioned] |
| 76 | +
|
| 77 | + Internal -->|"Meets graduation criteria"| Candidate |
| 78 | + Candidate -->|"RFC + Review"| Core |
| 79 | + end |
| 80 | +
|
| 81 | + subgraph "Go Consumers" |
| 82 | + TH[toolhive] |
| 83 | + RS[toolhive-registry-server] |
| 84 | + DY[dockyard] |
| 85 | + Future[future Go services] |
| 86 | + end |
| 87 | +
|
| 88 | + Core --> TH |
| 89 | + Core --> RS |
| 90 | + Core --> DY |
| 91 | + Core --> Future |
| 92 | +
|
| 93 | + style Core fill:#81c784 |
| 94 | + style Candidate fill:#ffb74d |
| 95 | + style Internal fill:#90caf9 |
| 96 | +``` |
| 97 | + |
| 98 | +### Package Graduation Criteria |
| 99 | + |
| 100 | +**Guiding Principle**: Packages graduating to `toolhive-core` must provide genuinely reusable value and be designed as reusable from the start. This means they should not be tied to the inner workings of toolhive in either their API surface or internal logic. A package that requires knowledge of toolhive internals to use correctly is not a good candidate for graduation. |
| 101 | + |
| 102 | +Packages can graduate via two tracks depending on complexity: |
| 103 | + |
| 104 | +#### Fast Track (Simple Packages) |
| 105 | + |
| 106 | +For small, focused packages with minimal dependencies (e.g., `env`, `errors`, `validation`): |
| 107 | + |
| 108 | +| Criterion | Requirement | |
| 109 | +|-----------|-------------| |
| 110 | +| **Production usage** | Deployed in production for ≥1 month | |
| 111 | +| **No internal dependencies** | Cannot depend on non-graduated internal packages | |
| 112 | +| **No global state** | No singletons, global variables for state, or `init()` side effects | |
| 113 | +| **Test coverage** | ≥70% line coverage | |
| 114 | +| **Documentation** | Package-level godoc | |
| 115 | +| **Approval** | GitHub issue approved by one maintainer | |
| 116 | + |
| 117 | +#### Standard Track (Complex Packages) |
| 118 | + |
| 119 | +For packages with external dependencies, multiple types, or broader API surface: |
| 120 | + |
| 121 | +| Criterion | Requirement | |
| 122 | +|-----------|-------------| |
| 123 | +| **Production usage** | Deployed in production for ≥2 months without breaking changes | |
| 124 | +| **API stability** | No breaking changes in the last 2 minor releases | |
| 125 | +| **Interface design** | Uses Go interfaces for dependency injection and testability | |
| 126 | +| **Error handling** | Returns typed errors; no panics except for programming bugs | |
| 127 | +| **No global state** | No singletons, global variables for state, or `init()` side effects | |
| 128 | +| **Test coverage** | ≥70% line coverage with meaningful assertions | |
| 129 | +| **Documentation** | Package-level godoc with usage examples | |
| 130 | +| **Linting** | Passes `golangci-lint` with project configuration | |
| 131 | +| **Minimal dependencies** | Only essential external dependencies | |
| 132 | +| **No circular imports** | Must not create import cycles when extracted | |
| 133 | +| **No internal dependencies** | Cannot depend on non-graduated internal packages | |
| 134 | +| **Stable external deps** | External dependencies must be v1.0+ or widely adopted | |
| 135 | +| **Sponsorship** | At least one maintainer sponsors the graduation | |
| 136 | +| **Approval** | RFC or detailed GitHub issue reviewed and approved | |
| 137 | + |
| 138 | +### Graduation Process |
| 139 | + |
| 140 | +1. **Proposal**: Open a GitHub issue identifying the graduation candidate and proposed track (fast/standard) |
| 141 | +2. **Track determination**: Maintainers confirm which track applies based on package complexity |
| 142 | +3. **Evaluation**: Assess against the relevant track's criteria |
| 143 | +4. **Approval**: Fast track requires one maintainer approval; standard track requires RFC or detailed issue review |
| 144 | +5. **Extraction**: Move package to `toolhive-core` with necessary adaptations |
| 145 | +6. **Release**: Tag a new semver release of `toolhive-core` |
| 146 | +7. **Migration**: Update consuming projects to import from `toolhive-core` |
| 147 | + |
| 148 | +### Stability Levels |
| 149 | + |
| 150 | +Each package in `toolhive-core` is marked with a stability level in its godoc: |
| 151 | + |
| 152 | +| Level | Meaning | API Guarantees | |
| 153 | +|-------|---------|----------------| |
| 154 | +| **Stable** | Production-ready, fully supported | No breaking changes without major version bump | |
| 155 | +| **Beta** | Feature-complete, may have minor changes | Breaking changes possible with deprecation notice | |
| 156 | +| **Alpha** | Experimental, subject to significant changes | No stability guarantees | |
| 157 | + |
| 158 | +### Versioning Strategy |
| 159 | + |
| 160 | +`toolhive-core` follows [Semantic Versioning 2.0.0](https://semver.org/): |
| 161 | + |
| 162 | +- **Major (vX.0.0)**: Breaking API changes |
| 163 | +- **Minor (v0.X.0)**: New features, backward-compatible |
| 164 | +- **Patch (v0.0.X)**: Bug fixes, backward-compatible |
| 165 | + |
| 166 | +**Release cadence**: |
| 167 | +- Patch releases: As needed for critical fixes |
| 168 | +- Minor releases: Monthly or when significant features graduate |
| 169 | +- Major releases: Rare, only when breaking changes are necessary |
| 170 | + |
| 171 | +### Initial Package Tiers |
| 172 | + |
| 173 | +Based on analysis of the `toolhive` codebase: |
| 174 | + |
| 175 | +#### Tier 1: Immediate Graduation Candidates |
| 176 | + |
| 177 | +These packages meet all graduation criteria (zero ToolHive-specific coupling, well-tested, minimal dependencies) and should be included in the initial `toolhive-core` release: |
| 178 | + |
| 179 | +| Package | Current Location | Stability | Rationale | |
| 180 | +|---------|------------------|-----------|-----------| |
| 181 | +| **errors** | `pkg/errors/` | Beta | HTTP-aware error handling, zero deps, 10mo stable | |
| 182 | +| **oauth** | `pkg/oauth/` | Beta | RFC-compliant OAuth/OIDC types, fosite dep only | |
| 183 | +| **env** | `pkg/env/` | Beta | Testable environment access, zero deps | |
| 184 | +| **permissions** | `pkg/permissions/` | Beta | Container permission profiles, stdlib-only, security validations | |
| 185 | +| **validation** | `pkg/validation/` | Beta | RFC 7230 HTTP header validation, security-focused | |
| 186 | +| **versions** | `pkg/versions/` | Beta | Build metadata, User-Agent generation | |
| 187 | +| **recovery** | `pkg/recovery/` | Beta | Panic recovery middleware, zero deps | |
| 188 | + |
| 189 | +> **Note**: All packages start as Beta in v0.x releases. Once the library reaches v1.0.0, packages meeting all graduation criteria will be promoted to Stable. |
| 190 | +
|
| 191 | +#### Tier 2: Minor Refactoring Required |
| 192 | + |
| 193 | +These packages have solid implementations but need minor changes to remove ToolHive-specific coupling: |
| 194 | + |
| 195 | +| Package | Current Location | Status | Required Changes | |
| 196 | +|---------|------------------|--------|------------------| |
| 197 | +| **healthcheck** | `pkg/healthcheck/` | Beta | Remove unused logger import | |
| 198 | +| **logger** | `pkg/logger/` | Beta | Remove `viper.GetBool("debug")` coupling; accept config via parameters | |
| 199 | +| **auth/tokenexchange** | `pkg/auth/tokenexchange/` | Beta | Split core RFC 8693 logic from ToolHive middleware | |
| 200 | +| **auth/metadata** | `pkg/auth/` (to extract) | Beta | Refactor protected resource metadata into standalone package for reuse | |
| 201 | +| **networking** | `pkg/networking/` | Beta | Replace ToolHive logger with slog or interface | |
| 202 | +| **lockfile** | `pkg/lockfile/` | Beta | Accept logger interface instead of importing pkg/logger | |
| 203 | + |
| 204 | +#### Tier 3: Partial Extraction or Future Work |
| 205 | + |
| 206 | +These packages have useful components but require significant refactoring or are too coupled: |
| 207 | + |
| 208 | +| Package | Current Location | Status | Notes | |
| 209 | +|---------|------------------|--------|-------| |
| 210 | +| **labels** | `pkg/labels/` | Beta | Extract K8s validation funcs only; keep ToolHive constants | |
| 211 | +| **secrets** | `pkg/secrets/` | Beta | Multiple providers, needs interface review | |
| 212 | +| **transport/types** | `pkg/transport/types/` | Beta | Middleware abstraction, widely used but coupled | |
| 213 | +| **mcp** | `pkg/mcp/` | Beta | MCP primitives, protocol-specific | |
| 214 | +| **registry/types** | `pkg/registry/` | Beta | Registry types already imported by toolhive-registry and toolhive-registry-server | |
| 215 | +| **container/verifier** | `pkg/container/verifier/` | Beta | Sigstore verification, already used by toolhive-registry | |
| 216 | +| **cache** | multiple locations | Beta | Generic cache abstraction (see note below) | |
| 217 | + |
| 218 | +> **Cache Consolidation Opportunity**: The codebase has several bespoke cache implementations that could be consolidated into a generic, configurable cache package: |
| 219 | +> - `TokenCache` interface in `pkg/vmcp/cache/cache.go` (planned memory/Redis implementations) |
| 220 | +> - `CachedAPIRegistryProvider` in `pkg/registry/provider_cached.go` (in-memory registry cache) |
| 221 | +> - `DefaultManager` cache in `pkg/vmcp/discovery/manager.go` (capability discovery cache with TTL) |
| 222 | +> |
| 223 | +> Consolidating on a generic cache implementation would improve testability, simplify application code (e.g., the cache handles synchronization instead of requiring mutexes in each consumer), reduce cognitive load, and enable reuse of backend implementations (e.g., a Redis-backed cache could be shared across all use cases). |
| 224 | +
|
| 225 | +#### Not Recommended for Extraction |
| 226 | + |
| 227 | +These packages are too coupled to ToolHive internals: |
| 228 | + |
| 229 | +| Package | Reason | |
| 230 | +|---------|--------| |
| 231 | +| **auth** (heavy pieces) | Product-level wiring: token.go, oauth/*, remote/*, secrets/* | |
| 232 | +| **runner**, **workloads** | Deep orchestration coupling, RunConfig migrations | |
| 233 | +| **config** | Viper global state, ToolHive-specific env vars | |
| 234 | +| **ignore** | Hardcoded `.thvignore`, ToolHive paths | |
| 235 | +| **process** | Hardcoded paths, imports pkg/container/runtime | |
| 236 | + |
| 237 | +## Security Considerations |
| 238 | + |
| 239 | +### Threat Model |
| 240 | + |
| 241 | +- **Supply chain attacks**: Malicious code injected into shared library affects all consumers |
| 242 | +- **Dependency confusion**: Typosquatting or namespace confusion attacks |
| 243 | +- **Credential exposure**: Logging or error handling inadvertently exposing secrets |
| 244 | + |
| 245 | +### Authentication and Authorization |
| 246 | + |
| 247 | +- No direct auth changes; auth packages remain in Tier 2/3 until fully reviewed |
| 248 | +- Logger and error packages must not log sensitive data by default |
| 249 | + |
| 250 | +### Data Security |
| 251 | + |
| 252 | +- Validation package must reject inputs that could cause injection attacks |
| 253 | +- Error wrapping must not expose internal implementation details in error messages |
| 254 | + |
| 255 | +### Input Validation |
| 256 | + |
| 257 | +- The `validation` package provides security-focused validation for: |
| 258 | + - HTTP headers (preventing header injection) |
| 259 | + - Resource URIs (preventing path traversal) |
| 260 | + - Group names (preventing null byte injection) |
| 261 | + |
| 262 | +### Secrets Management |
| 263 | + |
| 264 | +- Logger package must provide redaction capabilities for sensitive fields |
| 265 | +- Error messages must not include credential values |
| 266 | + |
| 267 | +### Audit and Logging |
| 268 | + |
| 269 | +- All package releases are tagged and signed |
| 270 | +- CHANGELOG documents all changes |
| 271 | +- Security advisories published via GitHub Security Advisories |
| 272 | + |
| 273 | +### Mitigations |
| 274 | + |
| 275 | +| Threat | Mitigation | |
| 276 | +|--------|------------| |
| 277 | +| Supply chain attacks | Signed releases, dependency review, SLSA provenance | |
| 278 | +| Dependency confusion | Use `github.com/stacklok/` namespace, Go module proxy | |
| 279 | +| Credential exposure | Redaction functions, sensitive field markers | |
| 280 | + |
| 281 | +## Compatibility |
| 282 | + |
| 283 | +### Backward Compatibility |
| 284 | + |
| 285 | +- **Initial release**: API-compatible with existing `toolhive/pkg/*` packages |
| 286 | +- **Migration path**: |
| 287 | + 1. Add `toolhive-core` dependency |
| 288 | + 2. Update imports (can be automated with `gofmt -r`) |
| 289 | + 3. Remove now-unused internal packages |
| 290 | +- **Deprecation**: Internal packages deprecated but kept for 2 minor versions (serves as rollback window if issues arise) |
| 291 | + |
| 292 | +### Forward Compatibility |
| 293 | + |
| 294 | +- **Extensibility**: Interfaces allow adding implementations without breaking changes |
| 295 | +- **Optional features**: New features use functional options pattern |
| 296 | +- **Version constraints**: Consumers can pin to major versions for stability |
| 297 | + |
| 298 | +## Testing Strategy |
| 299 | + |
| 300 | +### Unit Tests |
| 301 | + |
| 302 | +- All packages must have ≥70% coverage |
| 303 | +- Table-driven tests for validation functions |
| 304 | +- Mock-based tests for interface implementations |
| 305 | + |
| 306 | +### Integration Tests |
| 307 | + |
| 308 | +- Cross-package integration tests (e.g., validation + errors) |
| 309 | +- Import cycle detection in CI |
| 310 | + |
| 311 | +### Compatibility Tests |
| 312 | + |
| 313 | +- Build `toolhive` and `dockyard` against new versions |
| 314 | +- Automated compatibility matrix testing |
| 315 | + |
| 316 | +### Performance Tests |
| 317 | + |
| 318 | +- Benchmark validation functions |
| 319 | +- Memory allocation tests for hot paths |
| 320 | + |
| 321 | +### Security Tests |
| 322 | + |
| 323 | +- Static analysis with `gosec` |
| 324 | +- Dependency vulnerability scanning with `govulncheck` |
| 325 | + |
| 326 | +## Documentation |
| 327 | + |
| 328 | +- **README.md**: Quick start, installation, package overview |
| 329 | +- **STABILITY.md**: Graduation criteria, stability levels explained |
| 330 | +- **CHANGELOG.md**: All changes in keep-a-changelog format |
| 331 | +- **Package godoc**: Usage examples, API documentation |
| 332 | +- **Migration guide**: How to migrate from internal packages |
| 333 | +- **Architecture docs**: Updated to reference `toolhive-core` |
| 334 | + |
| 335 | +## Open Questions |
| 336 | + |
| 337 | +1. **Release cadence**: Monthly minor releases, or release when ready? |
| 338 | +2. **Tier 2 timeline**: When should we start refactoring healthcheck/tokenexchange for graduation? |
| 339 | +3. **Logger approach**: When decoupling logger from Viper, should we use zap with config injection, switch to slog (stdlib), or provide an interface that supports both? |
| 340 | +4. **permissions package scope**: Is `pkg/permissions` (container security profiles) generic enough for a shared library, or is it too domain-specific to ToolHive? |
| 341 | + |
| 342 | +## References |
| 343 | + |
| 344 | +- [Go Modules Reference](https://go.dev/ref/mod) |
| 345 | +- [Semantic Versioning 2.0.0](https://semver.org/) |
| 346 | +- [Keep a Changelog](https://keepachangelog.com/) |
| 347 | +- [Go API Stability](https://go.dev/doc/modules/version-numbers) |
| 348 | +- [SLSA Supply Chain Security](https://slsa.dev/) |
| 349 | +- [ToolHive Architecture Documentation](https://github.com/stacklok/toolhive/tree/main/docs/arch) |
| 350 | + |
| 351 | +--- |
| 352 | + |
| 353 | +## RFC Lifecycle |
| 354 | + |
| 355 | +<!-- This section is maintained by RFC reviewers --> |
| 356 | + |
| 357 | +### Review History |
| 358 | + |
| 359 | +| Date | Reviewer | Decision | Notes | |
| 360 | +|------|----------|----------|-------| |
| 361 | +| 2026-01-29 | @JAORMX | Draft | Initial submission | |
| 362 | + |
| 363 | +### Implementation Tracking |
| 364 | + |
| 365 | +| Repository | PR | Status | |
| 366 | +|------------|-----|--------| |
| 367 | +| toolhive-core | N/A | Not started | |
| 368 | +| toolhive | N/A | Not started | |
| 369 | +| dockyard | N/A | Not started | |
| 370 | +| toolhive-registry | N/A | Not started | |
| 371 | +| toolhive-registry-server | N/A | Optional - may continue using slog | |
0 commit comments