Description
Add --optimizer, --optimizer-embedding, --embedding-model, and --embedding-image flags to the serve subcommand in cmd/thv/app/vmcp.go, and wire the EmbeddingServiceManager (from #4884) into the serve lifecycle in pkg/vmcp/cli/serve.go. When --optimizer is set, the optimizer config is injected with FTS5-only mode. When --optimizer-embedding is set, the TEI container is started via EmbeddingServiceManager.Start() before vMCP starts, the returned URL is injected into OptimizerConfig.EmbeddingService, and Stop() is called cleanly on shutdown. This is Phase 4 of RFC THV-0059 and completes the Tier 1 (FTS5-only) and Tier 2 (managed TEI) optimizer tiers for thv vmcp serve.
Context
#4886 extended pkg/vmcp/cli/serve.go with quick-mode logic and added ServeConfig.GroupRef. #4884 created pkg/vmcp/cli/embedding_manager.go with EmbeddingServiceManager (Start/Stop/URL methods) and the ContainerFactory interface. This item is the integration layer: the Cobra thin wrapper in cmd/thv/app/vmcp.go receives four new flags and passes the values into an expanded ServeConfig. The serve function in pkg/vmcp/cli/serve.go reads those values, conditionally starts the TEI container, injects the URL into cfg.Optimizer.EmbeddingService, calls optimizer.GetAndValidateConfig, creates an optimizer factory, and defers Stop() + factory cleanup for graceful shutdown.
RFC THV-0059 defines three optimizer tiers: Tier 0 (no optimizer), Tier 1 (--optimizer, FTS5-only, no external service), and Tier 2 (--optimizer-embedding, FTS5 + TEI managed container). --optimizer-embedding implies --optimizer; passing --optimizer-embedding alone is sufficient to enable the full stack. If --optimizer-embedding is passed and TEI fails to start, that is a hard failure: there is no silent fallback to FTS5.
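The flag-to-tier mapping described above can be sketched as a small pure function. This is an illustration only: the Tier type, its constants, and resolveTier are hypothetical names that do not appear in the RFC or the codebase.

```go
package main

import "fmt"

// Tier is a hypothetical enum for the optimizer tiers named in the text.
type Tier int

const (
	TierNone       Tier = iota // Tier 0: no optimizer
	TierFTS5                   // Tier 1: --optimizer, FTS5 only
	TierManagedTEI             // Tier 2: --optimizer-embedding, FTS5 + managed TEI
)

// resolveTier maps the two boolean flags to a tier. The embedding flag is
// checked first because --optimizer-embedding implies --optimizer.
func resolveTier(enableOptimizer, enableEmbedding bool) Tier {
	switch {
	case enableEmbedding:
		return TierManagedTEI
	case enableOptimizer:
		return TierFTS5
	default:
		return TierNone
	}
}

func main() {
	fmt.Println(resolveTier(false, false)) // 0
	fmt.Println(resolveTier(true, false))  // 1
	fmt.Println(resolveTier(false, true))  // 2
	fmt.Println(resolveTier(true, true))   // 2
}
```

Passing both flags resolves to the same tier as passing --optimizer-embedding alone, which matches the "implies" rule.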
Dependencies: Depends on #4886, #4884 (use Temp IDs until GitHub issue numbers are known)
Blocks: #4889, #4890
Acceptance Criteria
cmd/thv/app/vmcp.go adds --optimizer (bool, default false, description: "Enable FTS5 keyword optimizer (Tier 1): exposes find_tool and call_tool instead of all backend tools"), --optimizer-embedding (bool, default false, description: "Enable managed TEI semantic optimizer (Tier 2); implies --optimizer"), --embedding-model (string, default "BAAI/bge-small-en-v1.5"), and --embedding-image (string, default "ghcr.io/huggingface/text-embeddings-inference:cpu-latest") flags on the serve subcommand
ServeConfig in pkg/vmcp/cli/serve.go gains four new fields: EnableOptimizer bool, EnableEmbedding bool, EmbeddingModel string, EmbeddingImage string
The Cobra flag values are passed through to ServeConfig without transformation (thin wrapper principle)
When EnableOptimizer is false and EnableEmbedding is false, the serve lifecycle is unchanged from its behaviour before #4887 (Wire optimizer flags into thv vmcp serve): no optimizer, no TEI container
When EnableOptimizer is true (and EnableEmbedding is false), Serve() constructs a *config.OptimizerConfig with EmbeddingService empty and injects it into the loaded config before calling optimizer.GetAndValidateConfig; the FTS5-only optimizer factory is created and registered with the server
When EnableEmbedding is true, Serve() creates an EmbeddingServiceManager via embedding.NewEmbeddingServiceManager(factory, cfg), calls Start(ctx) to obtain the TEI URL, sets OptimizerConfig.EmbeddingService to that URL, and defers Stop(ctx) so the container is stopped cleanly on shutdown (or context cancellation)
When EnableEmbedding is true and EmbeddingServiceManager.Start() returns an error, Serve() returns that error immediately without starting the vMCP server (fail-fast, no silent FTS5 fallback)
--optimizer-embedding implies --optimizer; the serve code does not require the user to pass both flags — enabling embedding automatically enables the optimizer
The TEI container Stop() is deferred before the vMCP server starts, ensuring cleanup occurs even if the server exits with an error after TEI has started
go build ./cmd/thv/... and go build ./pkg/vmcp/cli/... both succeed
go vet ./cmd/thv/... and go vet ./pkg/vmcp/cli/... both pass with no warnings
No new external Go module dependencies are introduced
task docs is run and the regenerated CLI documentation reflects the four new flags on thv vmcp serve
Unit tests are added or extended in pkg/vmcp/cli/ covering the optimizer wiring logic (see Testing Strategy)
All existing tests pass (no regressions)
Code reviewed and approved
Technical Approach
Recommended Implementation
The change spans two files: the thin CLI wrapper (cmd/thv/app/vmcp.go) and the business logic (pkg/vmcp/cli/serve.go). No other files need modification for this item.
In cmd/thv/app/vmcp.go (newVMCPServeCommand()):
Add four flag declarations after the existing --group flag added by #4886:
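The declarations might look roughly like the sketch below. To stay self-contained it uses the standard library flag package as a stand-in; with Cobra the FlagSet would come from cmd.Flags(), whose BoolVar and StringVar take the same arguments. The serveFlags struct, the helper names, and the usage strings for --embedding-model and --embedding-image are assumptions; the flag names, defaults, and the two boolean descriptions come from the acceptance criteria.

```go
package main

import (
	"flag"
	"fmt"
)

// serveFlags is a hypothetical holder for the four new flag values.
type serveFlags struct {
	enableOptimizer bool
	enableEmbedding bool
	embeddingModel  string
	embeddingImage  string
}

// registerOptimizerFlags declares the four flags on fs.
// With Cobra, fs would be cmd.Flags() instead of a stdlib FlagSet.
func registerOptimizerFlags(fs *flag.FlagSet, f *serveFlags) {
	fs.BoolVar(&f.enableOptimizer, "optimizer", false,
		"Enable FTS5 keyword optimizer (Tier 1): exposes find_tool and call_tool instead of all backend tools")
	fs.BoolVar(&f.enableEmbedding, "optimizer-embedding", false,
		"Enable managed TEI semantic optimizer (Tier 2); implies --optimizer")
	// The two usage strings below are placeholders, not taken from the issue.
	fs.StringVar(&f.embeddingModel, "embedding-model", "BAAI/bge-small-en-v1.5",
		"Embedding model served by the managed TEI container")
	fs.StringVar(&f.embeddingImage, "embedding-image",
		"ghcr.io/huggingface/text-embeddings-inference:cpu-latest",
		"Container image for the managed TEI service")
}

// parseServeFlags parses args into a serveFlags value for demonstration.
func parseServeFlags(args []string) (serveFlags, error) {
	fs := flag.NewFlagSet("serve", flag.ContinueOnError)
	var f serveFlags
	registerOptimizerFlags(fs, &f)
	err := fs.Parse(args)
	return f, err
}

func main() {
	f, err := parseServeFlags([]string{"--optimizer-embedding"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", f)
}
```

Note that the wrapper leaves enableOptimizer false when only --optimizer-embedding is passed; under the thin wrapper principle, the "implies" rule is applied later in the serve logic rather than in the flag layer.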
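Pass these into vmcpcli.ServeConfig{..., EnableOptimizer: enableOptimizer, EnableEmbedding: enableEmbedding, EmbeddingModel: embeddingModel, EmbeddingImage: embeddingImage}.
In pkg/vmcp/cli/serve.go (the Serve() function):
After the existing config-load / quick-mode branching, add optimizer wiring logic:
If cfg.EnableEmbedding is true, set effectiveOptimizer = true.
If effectiveOptimizer is true, ensure vmcpCfg.Optimizer is non-nil, initializing it to an empty *config.OptimizerConfig{} if it is nil.
If cfg.EnableEmbedding is true:
    Build embeddingManagerCfg := embedding.EmbeddingServiceManagerConfig{Model: cfg.EmbeddingModel, Image: cfg.EmbeddingImage}.
    Create mgr, err := embedding.NewEmbeddingServiceManager(container.NewFactory(), embeddingManagerCfg).
    Call teiURL, err := mgr.Start(ctx) and return the error immediately on failure.
    defer func() { _ = mgr.Stop(context.Background()) }() (use a background context for cleanup to avoid a cancellation race).
    Set vmcpCfg.Optimizer.EmbeddingService = teiURL.
Call optimizer.GetAndValidateConfig(vmcpCfg.Optimizer) to obtain the validated *optimizer.Config.
Call optimizer.NewOptimizerFactory(optCfg) to get the factory and cleanup function; defer cleanup(context.Background()).
Pass the factory to vmcpsession.NewSessionFactory (via the existing opts slice or as an additional option).
Use immediately-assigned variables (no var x T reassigned across branches) and wrap errors with fmt.Errorf("...: %w", err).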
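A minimal sketch of the embedding branch of that wiring follows, with local stand-in types so it compiles on its own. The embeddingService interface, optimizerConfig, serveConfig, wireOptimizer, and fakeManager are all hypothetical; the real types live in pkg/vmcp/cli, pkg/vmcp/config, and pkg/vmcp/optimizer, and their exact signatures are not reproduced here. The optimizer validation and factory calls are folded into the runServer callback.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// embeddingService is a hypothetical narrow view of the manager from #4884.
type embeddingService interface {
	Start(ctx context.Context) (string, error)
	Stop(ctx context.Context) error
}

// optimizerConfig stands in for config.OptimizerConfig (one field modelled).
type optimizerConfig struct{ EmbeddingService string }

// serveConfig stands in for the two new boolean fields on ServeConfig.
type serveConfig struct{ EnableOptimizer, EnableEmbedding bool }

// orEmpty returns c, or a fresh empty config when c is nil.
func orEmpty(c *optimizerConfig) *optimizerConfig {
	if c == nil {
		return &optimizerConfig{}
	}
	return c
}

// wireOptimizer applies the steps above; runServer represents the rest of
// the serve lifecycle (validation, factory creation, server start).
func wireOptimizer(ctx context.Context, cfg serveConfig, loaded *optimizerConfig,
	mgr embeddingService, runServer func(*optimizerConfig) error) error {
	if !cfg.EnableOptimizer && !cfg.EnableEmbedding {
		return runServer(loaded) // Tier 0: lifecycle unchanged
	}
	optCfg := orEmpty(loaded)
	if cfg.EnableEmbedding {
		teiURL, err := mgr.Start(ctx)
		if err != nil {
			// Fail fast: no silent FTS5 fallback.
			return fmt.Errorf("failed to start TEI embedding service: %w", err)
		}
		// Background context so Stop still runs after ctx is cancelled.
		defer func() { _ = mgr.Stop(context.Background()) }()
		optCfg.EmbeddingService = teiURL
	}
	return runServer(optCfg)
}

// fakeManager is a test double used only by this example.
type fakeManager struct {
	url      string
	startErr error
	stopped  bool
}

func (f *fakeManager) Start(context.Context) (string, error) { return f.url, f.startErr }
func (f *fakeManager) Stop(context.Context) error            { f.stopped = true; return nil }

var errStartFailed = errors.New("image pull failed")

// runDemo reports whether the server callback ran and what error came back.
func runDemo(mgr *fakeManager, cfg serveConfig) (served bool, err error) {
	err = wireOptimizer(context.Background(), cfg, nil, mgr, func(*optimizerConfig) error {
		served = true
		return nil
	})
	return served, err
}

func main() {
	ok := &fakeManager{url: "http://127.0.0.1:8080"}
	served, err := runDemo(ok, serveConfig{EnableEmbedding: true})
	fmt.Println(served, err, ok.stopped) // true <nil> true

	bad := &fakeManager{startErr: errStartFailed}
	served, err = runDemo(bad, serveConfig{EnableEmbedding: true})
	fmt.Println(served, errors.Is(err, errStartFailed), bad.stopped) // false true false
}
```

In the failure case the server callback never runs and Stop is never called, because the defer is only registered after Start succeeds.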
Patterns & Frameworks
Thin wrapper principle: All optimizer wiring logic lives in pkg/vmcp/cli/serve.go; cmd/thv/app/vmcp.go only declares flags and passes values into ServeConfig
SPDX headers: All modified files must retain their existing SPDX headers; no new files are created by this item
Immutable variable assignment: Use immediately-invoked anonymous functions or helper functions so optCfg and teiURL are single-assignment
Defer ordering: Defers are LIFO; register TEI Stop() defer before the vMCP server's shutdown defer so TEI is stopped after the server closes its connections
Error wrapping: fmt.Errorf("failed to start TEI embedding service: %w", err) — include the original error for debuggability
No new external Go module dependencies: pkg/container, pkg/vmcp/cli/embedding_manager.go, and pkg/vmcp/optimizer/ are all already in the module
Signal handling: The --optimizer-embedding path's defer mgr.Stop(context.Background()) uses context.Background() (not the command context) to ensure Stop runs even after the serve context is cancelled
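The defer-ordering and background-context points can be checked with a small standalone program. It uses only the standard library; shutdownOrder, stopWith, and cleanupResults are hypothetical helpers, and stopWith merely models a Stop implementation that honours context cancellation.

```go
package main

import (
	"context"
	"fmt"
)

// shutdownOrder registers two defers in the recommended order and records
// the order in which they actually run.
func shutdownOrder() (order []string) {
	defer func() { order = append(order, "TEI Stop") }()        // registered first, runs last
	defer func() { order = append(order, "server shutdown") }() // registered second, runs first
	return nil
}

// stopWith models a Stop call that gives up when its context is cancelled.
func stopWith(ctx context.Context) error { return ctx.Err() }

// cleanupResults compares cleanup with an already-cancelled serve context
// against cleanup with a fresh background context.
func cleanupResults() (withServeCtx, withBackground error) {
	serveCtx, cancel := context.WithCancel(context.Background())
	cancel() // simulate shutdown: the serve context is already cancelled
	return stopWith(serveCtx), stopWith(context.Background())
}

func main() {
	fmt.Println(shutdownOrder()) // [server shutdown TEI Stop]
	a, b := cleanupResults()
	fmt.Println(a, b) // context canceled <nil>
}
```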
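Code Pointers
cmd/thv/app/vmcp.go (created by #4883 "Add thv vmcp serve and thv vmcp validate subcommands", modified by #4886 "Implement zero-config quick mode for thv vmcp serve"): add four new flags in newVMCPServeCommand(); pass them into ServeConfig
pkg/vmcp/cli/serve.go (modified by #4879 "Extract shared vMCP logic into pkg/vmcp/cli/ (serve + validate)" and #4886): expand ServeConfig with four new fields; add optimizer wiring after the config-load block
pkg/vmcp/cli/embedding_manager.go (created by #4884 "Implement EmbeddingServiceManager in pkg/vmcp/cli/"): NewEmbeddingServiceManager, EmbeddingServiceManagerConfig, Start(ctx), Stop(ctx); the primary dependency of this item
pkg/vmcp/optimizer/optimizer.go: GetAndValidateConfig(cfg *vmcpconfig.OptimizerConfig) (*Config, error) and NewOptimizerFactory(cfg *Config); the two calls that activate the optimizer
pkg/vmcp/config/config.go: Config.Optimizer *OptimizerConfig and OptimizerConfig.EmbeddingService string; the field populated with the TEI URL
cmd/thv/app/inspector.go: prior art for creating a container factory and managing a container lifecycle with deferred Stop inside a command function
cmd/vmcp/app/commands.go: shows how runServe currently invokes optimizer.GetAndValidateConfig and optimizer.NewOptimizerFactory; this is the pattern to replicate in pkg/vmcp/cli/serve.go
.claude/rules/go-style.md: immutable variable assignment, error wrapping, drain response bodies, SPDX headers
.claude/rules/cli-commands.md: thin wrapper principle; adding new flags checklist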
Testing Strategy
Unit Tests (pkg/vmcp/cli/serve_test.go or pkg/vmcp/cli/optimizer_wiring_test.go)
Use go.uber.org/mock/gomock with a MockEmbeddingServiceManager (or via a narrower interface extracted for testability) and table-driven tests.
When EnableOptimizer: false and EnableEmbedding: false, the serve path does not call optimizer.GetAndValidateConfig or any embedding manager method (verify by asserting no mock calls)
When EnableOptimizer: true and EnableEmbedding: false, optimizer.GetAndValidateConfig is called with a non-nil OptimizerConfig whose EmbeddingService is empty
When EnableEmbedding: true, EmbeddingServiceManager.Start() is called; the returned URL is set on OptimizerConfig.EmbeddingService before optimizer.GetAndValidateConfig is called
When EnableEmbedding: true and Start() returns an error, Serve() returns a non-nil error and the vMCP server is never started
When EnableEmbedding: true and Start() succeeds but the serve context is cancelled immediately, Stop() is called (deferred cleanup runs)
--optimizer-embedding without --embedding-model uses the default model "BAAI/bge-small-en-v1.5" (verify ServeConfig.EmbeddingModel default is set correctly in the Cobra command)
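The call-sequence cases above could be laid out as a table along the following lines. This sketch swaps in a hand-rolled recording fake for the gomock-generated mock so that it needs no external modules, and it runs as a plain program rather than under go test. The recorder type, the wire function, and the URL are hypothetical stand-ins for the real wiring under test.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// recorder logs every call it receives, standing in for a gomock mock.
type recorder struct {
	calls    []string
	startErr error
}

func (r *recorder) Start() (string, error) {
	r.calls = append(r.calls, "Start")
	return "http://tei.test", r.startErr
}
func (r *recorder) Stop()               { r.calls = append(r.calls, "Stop") }
func (r *recorder) Validate(url string) { r.calls = append(r.calls, "Validate("+url+")") }
func (r *recorder) Serve()              { r.calls = append(r.calls, "Serve") }

// wire is a deliberately tiny stand-in for the optimizer wiring under test.
func wire(r *recorder, enableOptimizer, enableEmbedding bool) error {
	if !enableEmbedding {
		if enableOptimizer {
			r.Validate("") // FTS5-only: EmbeddingService stays empty
		}
		r.Serve()
		return nil
	}
	url, err := r.Start()
	if err != nil {
		return fmt.Errorf("failed to start TEI embedding service: %w", err)
	}
	defer r.Stop()
	r.Validate(url)
	r.Serve()
	return nil
}

type testCase struct {
	name      string
	optimizer bool
	embedding bool
	startErr  error
	wantCalls []string
	wantErr   bool
}

func cases() []testCase {
	return []testCase{
		{name: "tier 0", wantCalls: []string{"Serve"}},
		{name: "fts5 only", optimizer: true, wantCalls: []string{"Validate()", "Serve"}},
		{name: "embedding alone", embedding: true,
			wantCalls: []string{"Start", "Validate(http://tei.test)", "Serve", "Stop"}},
		{name: "start fails", embedding: true, startErr: errors.New("boom"),
			wantCalls: []string{"Start"}, wantErr: true},
	}
}

// runCases executes the table and returns one message per failing case.
func runCases() []string {
	var failures []string
	for _, tc := range cases() {
		r := &recorder{startErr: tc.startErr}
		err := wire(r, tc.optimizer, tc.embedding)
		if (err != nil) != tc.wantErr || !reflect.DeepEqual(r.calls, tc.wantCalls) {
			failures = append(failures, fmt.Sprintf("%s: calls=%v err=%v", tc.name, r.calls, err))
		}
	}
	return failures
}

func main() {
	fmt.Println("failures:", len(runCases()))
}
```

In a real _test.go file each row would typically run under t.Run(tc.name, ...) with gomock expectations replacing the recorded call slice.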
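Edge Cases
--optimizer-embedding passed alone (without --optimizer) still activates the FTS5 layer: effectiveOptimizer must be true whenever EnableEmbedding is true
An empty EmbeddingImage string falls through to the EmbeddingServiceManager defaults (validated in #4884); Serve() does not need to validate the image separately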
vmcpCfg.Optimizer already set (e.g. loaded from a YAML config file with optimizer: block) — the embedding URL is injected into the existing struct, not replacing it wholesale
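The "inject, do not replace" behaviour for an optimizer block that was already loaded can be sketched as below. optimizerConfig is a local stand-in for config.OptimizerConfig; EmbeddingService is the field named in the issue, while MaxResults is a purely hypothetical field representing any other setting loaded from YAML.

```go
package main

import "fmt"

// optimizerConfig stands in for config.OptimizerConfig.
// MaxResults is hypothetical and only represents "other YAML settings".
type optimizerConfig struct {
	EmbeddingService string
	MaxResults       int
}

// withEmbeddingURL sets the TEI URL on the existing struct when there is
// one, and only allocates a new struct when nothing was loaded.
func withEmbeddingURL(existing *optimizerConfig, teiURL string) *optimizerConfig {
	if existing == nil {
		return &optimizerConfig{EmbeddingService: teiURL}
	}
	existing.EmbeddingService = teiURL
	return existing
}

func main() {
	loaded := &optimizerConfig{MaxResults: 5} // as if read from an optimizer: block
	got := withEmbeddingURL(loaded, "http://127.0.0.1:8080")
	fmt.Println(got == loaded, got.MaxResults, got.EmbeddingService)

	fresh := withEmbeddingURL(nil, "http://127.0.0.1:8080")
	fmt.Println(fresh.MaxResults, fresh.EmbeddingService)
}
```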
Out of Scope
Tier 3 (config-file based) optimizer, where the user specifies optimizer.embeddingService directly in the YAML config; that already works via the existing config load path and does not require new flags
Auto-detection of GPU availability for TEI image selection — explicitly deferred per intake
Changes to the standalone vmcp binary (cmd/vmcp/) — preserved unchanged
thv vmcp status subcommand — deferred per RFC open questions