Skip to content

Commit 28d4f81

Browse files
authored
refactor(mcp-optimizer): remove the feature and migrate legacy users on startup (#2086)
* refactor(mcp-optimizer): remove sunset UI, route, and feature module Follows the deprecation shipped in #1910. Deletes the /mcp-optimizer route, the sunset banner, the features/meta-mcp module, and every optimizer-specific branch that lived across MCP server orchestration, group actions, client registration, and routing. Stops tracking the toolhive-mcp-optimizer image in Renovate and drops the sunset blog URL constant. Trims constants.ts down to the identifiers still used by the legacy-user cleanup migration. * feat(mcp-optimizer): run cleanup once on startup for legacy users Users who had MCP Optimizer enabled still have the __mcp-optimizer__ group, its meta-mcp workload, and their clients registered to the optimizer group after the feature is removed. On first launch post-upgrade, wait for the ToolHive daemon to become ready and then run the same teardown that used to fire when the experimental toggle was flipped off: restore clients to the ALLOWED_GROUPS target and delete the optimizer group with its workloads. - Add useMcpOptimizerStartupCleanup and wire it into RootComponent. It polls electronAPI.isToolhiveRunning before marking itself as run, so a slow daemon start cannot cause the migration to silently no-op. - Rewrite useCleanupMetaOptimizer to fetch groups and the meta-mcp workload at call-time via queryClient.fetchQuery, decoupling it from React Query closures and from the feature-flag gate. - Flip META_OPTIMIZER to isDisabled: true so the flag key is still defined (for one release) and the UI no longer surfaces it, while the cleanup path remains callable. * test(e2e): cover MCP optimizer startup cleanup migration Two-session Playwright spec that relaunches the app against the same userDataDir to exercise the one-shot startup hook: - Session 1 creates a custom group via the UI, then seeds __mcp-optimizer__, registers a client against it, and creates a meta-mcp workload pointing at a local test MCP server with ALLOWED_GROUPS set to the custom group. - Session 2 polls the thv HTTP API until the optimizer group is gone, the client is re-registered to the custom group, and the meta-mcp workload returns 404 — with a best-effort thv CLI teardown in afterAll as a safety net. Adds a reusable launchApp + thvFetch helper so future multi-session specs don't have to re-implement Electron + userDataDir wiring. * fix(mcp-optimizer): harden startup cleanup against edge cases Addresses review feedback on #2086: - useMcpOptimizerStartupCleanup no longer latches `hasRunCleanup` before awaiting ToolHive readiness. If the 60s readiness wait times out the hook can now retry on a subsequent effect re-fire. An `inFlight` ref still prevents concurrent starts. - useCleanupMetaOptimizer parses ALLOWED_GROUPS defensively (trim, first non-empty comma entry) and only restores clients when the target group still exists in the fetched groups list. Unregister + group delete still proceed regardless, so the optimizer state never lingers. - mcp-optimizer-startup-cleanup.spec.ts bumps the cleanup poll timeout from 60s to 120s so slow CI runners don't flake when the app-side readiness wait consumes most of the original budget. Adds a regression test that asserts restoration is skipped when ALLOWED_GROUPS points to a group that no longer exists. Made-with: Cursor * fix(mcp-optimizer): address deeper review feedback on startup migration Second follow-up on #2086, addressing the more nuanced review points: - useMcpOptimizerStartupCleanup now logs the elapsed cleanup duration on success, making slow CI runs easier to diagnose. - useCleanupMetaOptimizer aborts before touching the optimizer group when restoring clients to the ALLOWED_GROUPS target fails. The startup hook retries on the next launch, so clients are never left orphaned between "removed from optimizer" and "restored to target". - useCleanupMetaOptimizer wraps the final deleteGroup call in a try/catch that swallows 404s. In the rare race where the group is removed between our fetch and the delete (e.g. concurrent CLI teardown), the migration treats the end state as success instead of surfacing a phantom error. - useDeleteServer drops the unused `group` param from its type and from its sole caller in RemoveServerMenuItem, keeping the hook contract honest now that the optimizer-specific notification path is gone. Adds a regression test asserting that a failing clients/register call leaves the optimizer group untouched (no unregister, no delete). Made-with: Cursor * fix(e2e): force-exit electron between sessions in optimizer cleanup test On Linux CI, `ElectronApplication.close()` hung for the full 180s test timeout after the first session seeded a running workload via the thv API - the before-quit / graceful-shutdown path waits on a workload that never drains. Add a `close()` helper on `LaunchedApp` that force-exits via `app.exit(0)`, races `app.close()` against a 5s fallback, and SIGKILLs the leftover process. Use it from both session teardowns in the optimizer cleanup spec. Made-with: Cursor
1 parent 4edcbff commit 28d4f81

68 files changed

Lines changed: 754 additions & 4417 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

common/app-info.ts

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -49,14 +49,11 @@ export const GITHUB_PAGES_MANIFEST_URL =
4949
export const DOCS_BASE_URL = 'https://docs.stacklok.com/toolhive'
5050
export const DISCORD_URL = 'https://discord.gg/stacklok'
5151
export const DEMO_URL = 'https://stacklok.com/demo/'
52-
export const MCP_OPTIMIZER_SUNSET_BLOG_URL =
53-
'https://stacklok.com/blog/mcp-optimizer-is-now-built-into-the-stacklok-platform/'
5452

5553
// ── Registry ─────────────────────────────────────────────────────────────────
5654

5755
export const DEFAULT_REGISTRY_JSON_URL =
5856
'https://raw.githubusercontent.com/stacklok/toolhive/refs/heads/main/pkg/registry/data/registry.json'
59-
export const MCP_OPTIMIZER_IMAGE = 'ghcr.io/stackloklabs/mcp-optimizer'
6057

6158
// ── Privacy / legal ──────────────────────────────────────────────────────────
6259

e2e-tests/helpers/app-relaunch.ts

Lines changed: 173 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,173 @@
1+
import path from 'path'
2+
import {
3+
_electron as electron,
4+
type ElectronApplication,
5+
type Page,
6+
} from '@playwright/test'
7+
8+
function getExecutablePath(): string {
9+
const platform = process.platform
10+
const arch = process.arch
11+
const basePath = path.join(__dirname, '..', '..', 'out')
12+
13+
if (platform === 'darwin') {
14+
return path.join(
15+
basePath,
16+
`ToolHive-darwin-${arch}`,
17+
'ToolHive.app',
18+
'Contents',
19+
'MacOS',
20+
'ToolHive'
21+
)
22+
} else if (platform === 'win32') {
23+
return path.join(basePath, `ToolHive-win32-${arch}`, 'ToolHive.exe')
24+
} else {
25+
return path.join(basePath, `ToolHive-linux-${arch}`, 'ToolHive')
26+
}
27+
}
28+
29+
export interface LaunchedApp {
30+
app: ElectronApplication
31+
window: Page
32+
baseUrl: string
33+
/**
34+
* Terminate the app without waiting on the renderer's before-quit teardown.
35+
*
36+
* On Linux CI the regular `ElectronApplication.close()` has been observed to
37+
* hang indefinitely when a session has seeded a running workload via the thv
38+
* API (the graceful shutdown path waits on a remote workload that never
39+
* drains). We bypass that path via `app.exit(0)` and fall back to SIGKILL.
40+
*/
41+
close: () => Promise<void>
42+
}
43+
44+
/**
45+
* Launch the Electron app bound to a specific userDataDir so the same
46+
* directory can be reused across launches within a single test.
47+
*
48+
* Mirrors the setup in e2e-tests/fixtures/electron.ts but exposes the raw
49+
* app/window instead of running the shared MCP server group bootstrap.
50+
*/
51+
export async function launchApp(userDataDir: string): Promise<LaunchedApp> {
52+
const app = await electron.launch({
53+
executablePath: getExecutablePath(),
54+
...(process.env.CI ? { recordVideo: { dir: 'test-videos' } } : {}),
55+
args: ['--no-sandbox', `--user-data-dir=${userDataDir}`],
56+
env: {
57+
...process.env,
58+
TOOLHIVE_E2E: 'true',
59+
},
60+
})
61+
62+
const window = await app.firstWindow()
63+
64+
await window.route('https://*.sentry.io/**', (route) => {
65+
throw new Error(`Sentry request blocked: ${route.request().url()}`)
66+
})
67+
68+
// Disable quit confirmation dialog as a safety net; our close() helper also
69+
// force-exits, so this mostly keeps manual tear-downs clean.
70+
await window.evaluate(async () => {
71+
// eslint-disable-next-line @typescript-eslint/no-explicit-any
72+
await (globalThis as any).electronAPI.setSkipQuitConfirmation(true)
73+
})
74+
75+
await window.getByRole('link', { name: /mcp servers/i }).waitFor()
76+
77+
const port = await window.evaluate(async () => {
78+
// eslint-disable-next-line @typescript-eslint/no-explicit-any
79+
return (await (globalThis as any).electronAPI.getToolhivePort()) as number
80+
})
81+
82+
if (!port) {
83+
throw new Error('Failed to resolve ToolHive port from the launched app')
84+
}
85+
86+
const baseUrl = `http://127.0.0.1:${port}`
87+
88+
await waitForThvReady(baseUrl)
89+
90+
const close = async () => {
91+
// Force an immediate exit via Electron's app.exit(), bypassing before-quit
92+
// handlers, the confirmation dialog, and any pending graceful-shutdown
93+
// work (e.g. waiting on seeded workloads to drain).
94+
try {
95+
await app.evaluate(({ app: electronApp }) => electronApp.exit(0))
96+
} catch {
97+
// Renderer/main may already be gone; ignore.
98+
}
99+
100+
// Give the process a short window to exit cleanly, then hard-kill.
101+
const proc = app.process()
102+
await Promise.race([
103+
app.close().catch(() => {}),
104+
new Promise<void>((resolve) => setTimeout(resolve, 5_000)),
105+
])
106+
107+
if (proc.exitCode === null && !proc.killed) {
108+
try {
109+
proc.kill('SIGKILL')
110+
} catch {
111+
// Process may have exited between the check and the kill.
112+
}
113+
}
114+
}
115+
116+
return { app, window, baseUrl, close }
117+
}
118+
119+
/**
120+
* Thin wrapper around `fetch` that raises on non-2xx/4xx responses the caller
121+
* wants to treat as failures, optionally returning parsed JSON.
122+
*/
123+
export async function thvFetch<T = unknown>(
124+
baseUrl: string,
125+
apiPath: string,
126+
init?: RequestInit & { expectStatus?: number[] }
127+
): Promise<{ status: number; json: T | null }> {
128+
const { expectStatus, ...rest } = init ?? {}
129+
const res = await fetch(`${baseUrl}${apiPath}`, {
130+
...rest,
131+
headers: {
132+
'content-type': 'application/json',
133+
...(rest.headers ?? {}),
134+
},
135+
})
136+
137+
if (expectStatus && !expectStatus.includes(res.status)) {
138+
const body = await res.text()
139+
throw new Error(
140+
`thvFetch ${apiPath} expected status in [${expectStatus.join(',')}], got ${res.status}: ${body}`
141+
)
142+
}
143+
144+
const text = await res.text()
145+
let json: T | null = null
146+
if (text) {
147+
try {
148+
json = JSON.parse(text) as T
149+
} catch {
150+
json = null
151+
}
152+
}
153+
return { status: res.status, json }
154+
}
155+
156+
async function waitForThvReady(
157+
baseUrl: string,
158+
{ timeoutMs = 30_000 } = {}
159+
): Promise<void> {
160+
const deadline = Date.now() + timeoutMs
161+
while (Date.now() < deadline) {
162+
try {
163+
const res = await fetch(`${baseUrl}/api/v1beta/groups`)
164+
if (res.ok) return
165+
} catch {
166+
// keep polling
167+
}
168+
await new Promise((resolve) => setTimeout(resolve, 250))
169+
}
170+
throw new Error(
171+
`ToolHive API at ${baseUrl} did not become ready within ${timeoutMs}ms`
172+
)
173+
}
Lines changed: 186 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,186 @@
1+
import path from 'path'
2+
import fs from 'fs'
3+
import os from 'os'
4+
import { execSync } from 'child_process'
5+
import { test, expect } from '@playwright/test'
6+
import { launchApp, thvFetch, type LaunchedApp } from './helpers/app-relaunch'
7+
import {
8+
startTestMcpServer,
9+
type TestMcpServer,
10+
} from './helpers/test-mcp-server'
11+
12+
const OPTIMIZER_GROUP = '__mcp-optimizer__'
13+
const META_MCP_SERVER = 'toolhive-mcp-optimizer'
14+
const CUSTOM_GROUP = 'pw-optimizer-custom'
15+
const TEST_CLIENT = 'vscode'
16+
17+
function getThvPath(): string {
18+
const platform = process.platform
19+
const arch = process.arch
20+
const binName = platform === 'win32' ? 'thv.exe' : 'thv'
21+
return path.join(__dirname, '..', 'bin', `${platform}-${arch}`, binName)
22+
}
23+
24+
/**
25+
* Best-effort cleanup of leftover groups via the thv CLI. The CLI requires a
26+
* Docker-compatible runtime but we already require that for e2e tests.
27+
*/
28+
function bestEffortCliCleanup(): void {
29+
const thvPath = getThvPath()
30+
const env = { ...process.env, TOOLHIVE_SKIP_DESKTOP_CHECK: 'true' }
31+
for (const group of [OPTIMIZER_GROUP, CUSTOM_GROUP]) {
32+
try {
33+
execSync(`"${thvPath}" group rm "${group}" --with-workloads`, {
34+
input: 'y\n',
35+
stdio: ['pipe', 'ignore', 'ignore'],
36+
env,
37+
})
38+
} catch {
39+
// Group does not exist - that's the happy path.
40+
}
41+
}
42+
}
43+
44+
async function createGroupViaUi(
45+
launched: LaunchedApp,
46+
groupName: string
47+
): Promise<void> {
48+
const { window } = launched
49+
await window.getByRole('button', { name: /add a group/i }).click()
50+
await window.getByRole('dialog').waitFor()
51+
await window.getByLabel(/name/i).fill(groupName)
52+
await window.getByRole('button', { name: /create/i }).click()
53+
await window.getByRole('dialog').waitFor({ state: 'hidden' })
54+
}
55+
56+
async function seedOptimizerState(
57+
baseUrl: string,
58+
testServer: TestMcpServer
59+
): Promise<void> {
60+
await thvFetch(baseUrl, '/api/v1beta/groups', {
61+
method: 'POST',
62+
body: JSON.stringify({ name: OPTIMIZER_GROUP }),
63+
expectStatus: [200, 201],
64+
})
65+
66+
await thvFetch(baseUrl, '/api/v1beta/clients/register', {
67+
method: 'POST',
68+
body: JSON.stringify({
69+
names: [TEST_CLIENT],
70+
groups: [OPTIMIZER_GROUP],
71+
}),
72+
expectStatus: [200, 201],
73+
})
74+
75+
// Create a remote meta-mcp workload so GET /workloads/meta-mcp later returns
76+
// the ALLOWED_GROUPS env var that drives the restoration path. A remote
77+
// workload avoids any Docker image pull complications.
78+
await thvFetch(baseUrl, '/api/v1beta/workloads', {
79+
method: 'POST',
80+
body: JSON.stringify({
81+
name: META_MCP_SERVER,
82+
group: OPTIMIZER_GROUP,
83+
url: testServer.url,
84+
transport: 'streamable-http',
85+
env_vars: {
86+
ALLOWED_GROUPS: CUSTOM_GROUP,
87+
},
88+
}),
89+
expectStatus: [200, 201, 202],
90+
})
91+
}
92+
93+
async function waitForOptimizerCleanup(baseUrl: string): Promise<void> {
94+
await expect
95+
.poll(
96+
async () => {
97+
const { json } = await thvFetch<{
98+
groups?: Array<{ name?: string; registered_clients?: string[] }>
99+
}>(baseUrl, '/api/v1beta/groups', { expectStatus: [200] })
100+
const groups = json?.groups ?? []
101+
const optimizerGroup = groups.find((g) => g.name === OPTIMIZER_GROUP)
102+
const customGroup = groups.find((g) => g.name === CUSTOM_GROUP)
103+
return {
104+
optimizerGone: !optimizerGroup,
105+
customHasClient:
106+
customGroup?.registered_clients?.includes(TEST_CLIENT) ?? false,
107+
}
108+
},
109+
// App-side readiness wait can itself take up to TOOLHIVE_READY_MAX_WAIT_MS
110+
// (60s). Give the poll a budget that exceeds that plus cleanup time so
111+
// slow CI runners don't flake even though in practice this completes
112+
// well under a second.
113+
{ timeout: 120_000, intervals: [500, 1000, 2000] }
114+
)
115+
.toEqual({ optimizerGone: true, customHasClient: true })
116+
}
117+
118+
test.describe('MCP Optimizer startup cleanup', () => {
119+
let userDataDir: string
120+
let testServer: TestMcpServer
121+
122+
test.beforeAll(async () => {
123+
bestEffortCliCleanup()
124+
testServer = await startTestMcpServer()
125+
})
126+
127+
test.afterAll(async () => {
128+
await testServer?.stop()
129+
bestEffortCliCleanup()
130+
})
131+
132+
test.beforeEach(() => {
133+
userDataDir = fs.mkdtempSync(
134+
path.join(os.tmpdir(), 'toolhive-e2e-optimizer-cleanup-')
135+
)
136+
})
137+
138+
test.afterEach(() => {
139+
fs.rmSync(userDataDir, { recursive: true, force: true })
140+
})
141+
142+
test('restores clients to the custom group and deletes __mcp-optimizer__ on next launch', async () => {
143+
// Session 1: seed the legacy MCP Optimizer state.
144+
const firstLaunch = await launchApp(userDataDir)
145+
try {
146+
await createGroupViaUi(firstLaunch, CUSTOM_GROUP)
147+
await seedOptimizerState(firstLaunch.baseUrl, testServer)
148+
149+
// Sanity: both groups exist and optimizer has the registered client.
150+
const { json: seeded } = await thvFetch<{
151+
groups?: Array<{ name?: string; registered_clients?: string[] }>
152+
}>(firstLaunch.baseUrl, '/api/v1beta/groups', { expectStatus: [200] })
153+
const seededOptimizer = seeded?.groups?.find(
154+
(g) => g.name === OPTIMIZER_GROUP
155+
)
156+
expect(seededOptimizer?.registered_clients).toContain(TEST_CLIENT)
157+
expect(seeded?.groups?.some((g) => g.name === CUSTOM_GROUP)).toBe(true)
158+
} finally {
159+
await firstLaunch.close()
160+
}
161+
162+
// Session 2: relaunching with the same userDataDir should trigger the
163+
// startup cleanup hook, which restores clients and deletes the group.
164+
const secondLaunch = await launchApp(userDataDir)
165+
try {
166+
await waitForOptimizerCleanup(secondLaunch.baseUrl)
167+
168+
// The meta-mcp workload is deleted as part of ?with-workloads=true.
169+
const { status: workloadStatus } = await thvFetch(
170+
secondLaunch.baseUrl,
171+
`/api/v1beta/workloads/${META_MCP_SERVER}`
172+
)
173+
expect(workloadStatus).toBe(404)
174+
175+
// The user's custom group is preserved.
176+
const { json: finalGroups } = await thvFetch<{
177+
groups?: Array<{ name?: string }>
178+
}>(secondLaunch.baseUrl, '/api/v1beta/groups', { expectStatus: [200] })
179+
expect(finalGroups?.groups?.some((g) => g.name === CUSTOM_GROUP)).toBe(
180+
true
181+
)
182+
} finally {
183+
await secondLaunch.close()
184+
}
185+
})
186+
})

0 commit comments

Comments
 (0)