Wire optimizer flags into thv vmcp serve #4887

@yrobla

Description

Add --optimizer, --optimizer-embedding, --embedding-model, and --embedding-image flags to the serve subcommand in cmd/thv/app/vmcp.go, and wire the EmbeddingServiceManager (from #4884) into the serve lifecycle in pkg/vmcp/cli/serve.go. When --optimizer is set, the optimizer config is injected with FTS5-only mode. When --optimizer-embedding is set, the TEI container is started via EmbeddingServiceManager.Start() before vMCP starts, the returned URL is injected into OptimizerConfig.EmbeddingService, and Stop() is called cleanly on shutdown. This is Phase 4 of RFC THV-0059 and completes the Tier 1 (FTS5-only) and Tier 2 (managed TEI) optimizer tiers for thv vmcp serve.

Context

#4886 extended pkg/vmcp/cli/serve.go with quick-mode logic and added ServeConfig.GroupRef. #4884 created pkg/vmcp/cli/embedding_manager.go with EmbeddingServiceManager (Start/Stop/URL methods) and the ContainerFactory interface. This item is the integration layer: the Cobra thin wrapper in cmd/thv/app/vmcp.go receives four new flags and passes the values into an expanded ServeConfig. The serve function in pkg/vmcp/cli/serve.go reads those values, conditionally starts the TEI container, injects the URL into cfg.Optimizer.EmbeddingService, calls optimizer.GetAndValidateConfig, creates an optimizer factory, and defers Stop() + factory cleanup for graceful shutdown.

RFC THV-0059 defines three optimizer tiers: Tier 0 (no optimizer), Tier 1 (--optimizer, FTS5-only, no external service), and Tier 2 (--optimizer-embedding, FTS5 + TEI managed container). --optimizer-embedding implies --optimizer, so passing --optimizer-embedding alone is sufficient to enable the full stack. If --optimizer-embedding is passed and TEI fails to start, Serve() fails hard; there is no silent FTS5 fallback.
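The tier rules above can be sketched as a small pure function. This is only an illustration: the Tier type, its constants, and resolveTier are hypothetical names that do not appear in the RFC or in the codebase.

```go
package main

import "fmt"

// Tier is a hypothetical enum for the optimizer tiers described in RFC THV-0059.
type Tier int

const (
	TierNone      Tier = iota // Tier 0: no optimizer
	TierFTS5                  // Tier 1: FTS5-only keyword search
	TierEmbedding             // Tier 2: FTS5 plus managed TEI container
)

// resolveTier maps the two boolean flags onto a tier.
// --optimizer-embedding implies --optimizer, so it wins regardless of
// whether --optimizer was also passed.
func resolveTier(enableOptimizer, enableEmbedding bool) Tier {
	switch {
	case enableEmbedding:
		return TierEmbedding
	case enableOptimizer:
		return TierFTS5
	default:
		return TierNone
	}
}

func main() {
	// --optimizer-embedding alone is enough to select Tier 2.
	fmt.Println(resolveTier(false, true) == TierEmbedding)
}
```

Keeping the decision in one function makes the "embedding implies optimizer" rule easy to test in isolation from container and server lifecycles.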

Dependencies: Depends on #4886, #4884
Blocks: #4889, #4890

Acceptance Criteria

  • cmd/thv/app/vmcp.go adds --optimizer (bool, default false, description: "Enable FTS5 keyword optimizer (Tier 1): exposes find_tool and call_tool instead of all backend tools"), --optimizer-embedding (bool, default false, description: "Enable managed TEI semantic optimizer (Tier 2); implies --optimizer"), --embedding-model (string, default "BAAI/bge-small-en-v1.5"), and --embedding-image (string, default "ghcr.io/huggingface/text-embeddings-inference:cpu-latest") flags on the serve subcommand
  • ServeConfig in pkg/vmcp/cli/serve.go gains four new fields: EnableOptimizer bool, EnableEmbedding bool, EmbeddingModel string, EmbeddingImage string
  • The Cobra flag values are passed through to ServeConfig without transformation (thin wrapper principle)
  • When EnableOptimizer is false and EnableEmbedding is false, the serve lifecycle is unchanged from pre-#4887 behaviour (no optimizer, no TEI container)
  • When EnableOptimizer is true (and EnableEmbedding is false), Serve() constructs a *config.OptimizerConfig with EmbeddingService empty and injects it into the loaded config before calling optimizer.GetAndValidateConfig; the FTS5-only optimizer factory is created and registered with the server
  • When EnableEmbedding is true, Serve() creates an EmbeddingServiceManager via embedding.NewEmbeddingServiceManager(factory, cfg), calls Start(ctx) to obtain the TEI URL, sets OptimizerConfig.EmbeddingService to that URL, and defers Stop(ctx) so the container is stopped cleanly on shutdown (or context cancellation)
  • When EnableEmbedding is true and EmbeddingServiceManager.Start() returns an error, Serve() returns that error immediately without starting the vMCP server (fail-fast, no silent FTS5 fallback)
  • --optimizer-embedding implies --optimizer; the serve code does not require the user to pass both flags — enabling embedding automatically enables the optimizer
  • The TEI container Stop() is deferred before the vMCP server starts, ensuring cleanup occurs even if the server exits with an error after TEI has started
  • go build ./cmd/thv/... and go build ./pkg/vmcp/cli/... both succeed
  • go vet ./cmd/thv/... and go vet ./pkg/vmcp/cli/... both pass with no warnings
  • No new external Go module dependencies are introduced
  • task docs is run and the regenerated CLI documentation reflects the four new flags on thv vmcp serve
  • Unit tests are added or extended in pkg/vmcp/cli/ covering the optimizer wiring logic (see Testing Strategy)
  • All existing tests pass (no regressions)
  • Code reviewed and approved

Technical Approach

Recommended Implementation

The change spans two files: the thin CLI wrapper (cmd/thv/app/vmcp.go) and the business logic (pkg/vmcp/cli/serve.go). No other files need modification for this item.

In cmd/thv/app/vmcp.go (newVMCPServeCommand()):

Add four flag declarations after the existing --group flag added by #4886:

cmd.Flags().BoolVar(&enableOptimizer,  "optimizer",          false, "Enable FTS5 keyword optimizer (Tier 1)")
cmd.Flags().BoolVar(&enableEmbedding,  "optimizer-embedding", false, "Enable managed TEI semantic optimizer (Tier 2); implies --optimizer")
cmd.Flags().StringVar(&embeddingModel, "embedding-model",    "BAAI/bge-small-en-v1.5", "HuggingFace model name for semantic search")
cmd.Flags().StringVar(&embeddingImage, "embedding-image",    "ghcr.io/huggingface/text-embeddings-inference:cpu-latest", "TEI container image")

Pass these into vmcpcli.ServeConfig{..., EnableOptimizer: enableOptimizer, EnableEmbedding: enableEmbedding, EmbeddingModel: embeddingModel, EmbeddingImage: embeddingImage}.

In pkg/vmcp/cli/serve.go (the Serve() function):

After the existing config-load / quick-mode branching, add optimizer wiring logic:

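  1. Compute effectiveOptimizer := cfg.EnableOptimizer || cfg.EnableEmbedding, so that enabling embedding always enables the optimizer.
  2. If effectiveOptimizer is true, ensure vmcpCfg.Optimizer is non-nil: initialise it to an empty &config.OptimizerConfig{} only when it is nil, so that an optimizer block loaded from a YAML config is preserved.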
  3. If cfg.EnableEmbedding is true:
    • Construct embeddingManagerCfg := embedding.EmbeddingServiceManagerConfig{Model: cfg.EmbeddingModel, Image: cfg.EmbeddingImage}.
    • Call mgr, err := embedding.NewEmbeddingServiceManager(container.NewFactory(), embeddingManagerCfg).
    • Call teiURL, err := mgr.Start(ctx) — return the error immediately on failure.
    • defer func() { _ = mgr.Stop(context.Background()) }() (use a background context for cleanup to avoid cancellation race).
    • Set vmcpCfg.Optimizer.EmbeddingService = teiURL.
  4. If effectiveOptimizer is true, call optimizer.GetAndValidateConfig(vmcpCfg.Optimizer) to obtain the validated *optimizer.Config.
  5. Still under effectiveOptimizer, call optimizer.NewOptimizerFactory(optCfg) to get the factory and cleanup function; defer cleanup(context.Background()).
  6. Pass the factory to vmcpsession.NewSessionFactory (via the existing opts slice or as an additional option).

Use immediately-assigned variables (no var x T reassigned across branches) and wrap errors with fmt.Errorf("...: %w", err).

Patterns & Frameworks

  • Thin wrapper principle: All optimizer wiring logic lives in pkg/vmcp/cli/serve.go; cmd/thv/app/vmcp.go only declares flags and passes values into ServeConfig
  • SPDX headers: All modified files must retain their existing SPDX headers; no new files are created by this item
  • Immutable variable assignment: Use immediately-invoked anonymous functions or helper functions so optCfg and teiURL are single-assignment
  • Defer ordering: Defers are LIFO; register TEI Stop() defer before the vMCP server's shutdown defer so TEI is stopped after the server closes its connections
  • Error wrapping: fmt.Errorf("failed to start TEI embedding service: %w", err) — include the original error for debuggability
  • No new external Go module dependencies: pkg/container, pkg/vmcp/cli/embedding_manager.go, and pkg/vmcp/optimizer/ are all already in the module
  • Signal handling: The --optimizer-embedding path's defer mgr.Stop(context.Background()) uses context.Background() (not the command context) to ensure Stop runs even after the serve context is cancelled
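The defer-ordering and signal-handling points can be checked with a standard-library sketch. serveSketch, cleanupFunc, and the recorded labels are hypothetical; the blocking receive on ctx.Done() stands in for the real server run.

```go
package main

import (
	"context"
	"fmt"
)

// cleanupFunc is a hypothetical shape for Stop()/shutdown callbacks.
type cleanupFunc func(ctx context.Context) error

// ctxState reports whether the context handed to a cleanup is still usable.
func ctxState(ctx context.Context) string {
	if ctx.Err() != nil {
		return "cancelled"
	}
	return "live"
}

// serveSketch registers the TEI stop first and the server shutdown second.
// Defers run LIFO, so the server shuts down first and TEI stops last.
func serveSketch(ctx context.Context, stopTEI, shutdownServer cleanupFunc) error {
	// context.Background() keeps cleanup usable after ctx is cancelled.
	defer func() { _ = stopTEI(context.Background()) }()
	defer func() { _ = shutdownServer(context.Background()) }()

	<-ctx.Done() // stand-in for the blocking server run
	return ctx.Err()
}

// cleanupOrder runs serveSketch with an already-cancelled context and
// returns the order in which the cleanups fired.
func cleanupOrder() []string {
	var order []string
	record := func(name string) cleanupFunc {
		return func(ctx context.Context) error {
			order = append(order, name+":"+ctxState(ctx))
			return nil
		}
	}
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate a shutdown signal
	_ = serveSketch(ctx, record("tei-stop"), record("server-shutdown"))
	return order
}

func main() {
	fmt.Println(cleanupOrder()) // prints [server-shutdown:live tei-stop:live]
}
```

Had the cleanups received the serve context instead, both would see a cancelled context and any container API call bound to it could abort before the container is actually stopped.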

Code Pointers

  • cmd/thv/app/vmcp.go (created by #4883 "Add thv vmcp serve and thv vmcp validate subcommands", modified by #4886 "Implement zero-config quick mode for thv vmcp serve"): add four new flags in newVMCPServeCommand(); pass them into ServeConfig
  • pkg/vmcp/cli/serve.go (modified by #4879 "Extract shared vMCP logic into pkg/vmcp/cli/ (serve + validate)" and #4886 "Implement zero-config quick mode for thv vmcp serve"): expand ServeConfig with four new fields; add optimizer wiring after the config-load block
  • pkg/vmcp/cli/embedding_manager.go (created by #4884 "Implement EmbeddingServiceManager in pkg/vmcp/cli/"): NewEmbeddingServiceManager, EmbeddingServiceManagerConfig, Start(ctx), Stop(ctx); the primary dependency of this item
  • pkg/vmcp/optimizer/optimizer.go: GetAndValidateConfig(cfg *vmcpconfig.OptimizerConfig) (*Config, error) and NewOptimizerFactory(cfg *Config); the two calls that activate the optimizer
  • pkg/vmcp/config/config.go: Config.Optimizer *OptimizerConfig and OptimizerConfig.EmbeddingService string; the field populated with the TEI URL
  • cmd/thv/app/inspector.go: prior art for creating a container factory and managing a container lifecycle with deferred Stop inside a command function
  • cmd/vmcp/app/commands.go: shows how runServe currently invokes optimizer.GetAndValidateConfig and optimizer.NewOptimizerFactory; this is the pattern to replicate in pkg/vmcp/cli/serve.go
  • .claude/rules/go-style.md: immutable variable assignment, error wrapping, drain response bodies, SPDX headers
  • .claude/rules/cli-commands.md: thin wrapper principle; adding new flags checklist

Component Interfaces

// pkg/vmcp/cli/serve.go — updated ServeConfig (additions only; existing fields preserved)

// ServeConfig holds all parameters for starting the vMCP server.
// ConfigPath and GroupRef are mutually exclusive sources of config;
// exactly one must be non-empty (enforced by #4886 validation).
type ServeConfig struct {
    ConfigPath  string // path to YAML config file; takes precedence over GroupRef
    GroupRef    string // ToolHive group name; used for zero-config quick mode
    Host        string
    Port        int
    EnableAudit bool

    // Optimizer tier selection (Phase 4 additions)
    EnableOptimizer bool   // Tier 1: FTS5-only keyword search
    EnableEmbedding bool   // Tier 2: TEI semantic search (implies EnableOptimizer)
    EmbeddingModel  string // HuggingFace model name; default "BAAI/bge-small-en-v1.5"
    EmbeddingImage  string // TEI container image; default TEI CPU image
}

// pkg/vmcp/cli/serve.go: optimizer wiring sketch (within Serve())

func Serve(ctx context.Context, cfg ServeConfig) error {
    // ... existing config-load / quick-mode logic from #4886 ...

    // Optimizer wiring — Phase 4
    effectiveOptimizer := cfg.EnableOptimizer || cfg.EnableEmbedding
    if effectiveOptimizer {
        if vmcpCfg.Optimizer == nil {
            vmcpCfg.Optimizer = &vmcpconfig.OptimizerConfig{}
        }
    }

    if cfg.EnableEmbedding {
        mgrCfg := embedding.EmbeddingServiceManagerConfig{
            Model: cfg.EmbeddingModel,
            Image: cfg.EmbeddingImage,
        }
        mgr, err := embedding.NewEmbeddingServiceManager(container.NewFactory(), mgrCfg)
        if err != nil {
            return fmt.Errorf("failed to create embedding service manager: %w", err)
        }
        teiURL, err := mgr.Start(ctx)
        if err != nil {
            return fmt.Errorf("failed to start TEI embedding service: %w", err)
        }
        defer func() { _ = mgr.Stop(context.Background()) }()
        vmcpCfg.Optimizer.EmbeddingService = teiURL
    }

    var optimizerFactory func(context.Context, []server.ServerTool) (optimizer.Optimizer, error)
    if effectiveOptimizer {
        optCfg, err := optimizer.GetAndValidateConfig(vmcpCfg.Optimizer)
        if err != nil {
            return fmt.Errorf("invalid optimizer config: %w", err)
        }
        factory, cleanup, err := optimizer.NewOptimizerFactory(optCfg)
        if err != nil {
            return fmt.Errorf("failed to create optimizer factory: %w", err)
        }
        defer func() { _ = cleanup(context.Background()) }()
        optimizerFactory = factory
    }

    // ... pass optimizerFactory into session factory options ...
}

Testing Strategy

Unit Tests (pkg/vmcp/cli/serve_test.go or pkg/vmcp/cli/optimizer_wiring_test.go)

Use go.uber.org/mock/gomock with a MockEmbeddingServiceManager (or via a narrower interface extracted for testability) and table-driven tests.

  • When EnableOptimizer: false and EnableEmbedding: false, the serve path does not call optimizer.GetAndValidateConfig or any embedding manager method (verify by asserting no mock calls)
  • When EnableOptimizer: true and EnableEmbedding: false, optimizer.GetAndValidateConfig is called with a non-nil OptimizerConfig whose EmbeddingService is empty
  • When EnableEmbedding: true, EmbeddingServiceManager.Start() is called; the returned URL is set on OptimizerConfig.EmbeddingService before optimizer.GetAndValidateConfig is called
  • When EnableEmbedding: true and Start() returns an error, Serve() returns a non-nil error and the vMCP server is never started
  • When EnableEmbedding: true and Start() succeeds but the serve context is cancelled immediately, Stop() is called (deferred cleanup runs)
  • --optimizer-embedding without --embedding-model uses the default model "BAAI/bge-small-en-v1.5" (verify ServeConfig.EmbeddingModel default is set correctly in the Cobra command)

Integration Tests

Edge Cases

  • --optimizer-embedding passed alone (without --optimizer) still activates the FTS5 layer — effectiveOptimizer must be true whenever EnableEmbedding is true
  • EmbeddingImage empty string falls through to EmbeddingServiceManager defaults (validated in #4884, "Implement EmbeddingServiceManager in pkg/vmcp/cli/"); Serve() does not need to validate the image separately
  • vmcpCfg.Optimizer already set (e.g. loaded from a YAML config file with optimizer: block) — the embedding URL is injected into the existing struct, not replacing it wholesale
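The last edge case, injecting into an existing struct rather than replacing it, can be sketched as follows. OptimizerConfig here is a local stand-in for config.OptimizerConfig: only EmbeddingService is named in this issue, and the MaxResults field and injectEmbeddingURL helper are hypothetical.

```go
package main

import "fmt"

// OptimizerConfig is a local stand-in for config.OptimizerConfig.
// MaxResults is a hypothetical field representing "other YAML settings".
type OptimizerConfig struct {
	EmbeddingService string
	MaxResults       int
}

// injectEmbeddingURL reuses the existing config when one was loaded from YAML
// and only allocates a fresh one when none exists. Other fields are untouched.
func injectEmbeddingURL(existing *OptimizerConfig, teiURL string) *OptimizerConfig {
	cfg := existing
	if cfg == nil {
		cfg = &OptimizerConfig{}
	}
	cfg.EmbeddingService = teiURL
	return cfg
}

func main() {
	fromYAML := &OptimizerConfig{MaxResults: 5}
	got := injectEmbeddingURL(fromYAML, "http://127.0.0.1:8080")
	// The same struct is returned, with the YAML-provided field preserved.
	fmt.Println(got == fromYAML, got.MaxResults, got.EmbeddingService)
}
```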

Out of Scope

  • E2E tests for the full optimizer lifecycle: those are covered by #4889 (E2E tests: optimizer tiers and regression)
  • Tier 3 (config-file based) optimizer, where the user specifies optimizer.embeddingService directly in the YAML config — that already works via the existing config load path and does not require new flags
  • Auto-detection of GPU availability for TEI image selection — explicitly deferred per intake
  • Changes to the standalone vmcp binary (cmd/vmcp/) — preserved unchanged
  • thv vmcp status subcommand — deferred per RFC open questions
  • Architecture documentation: that is covered by #4890 (Architecture documentation for local vMCP)

Labels: cli, enhancement, vmcp