Skip to content

Commit e85e637

Browse files
authored
fix: [AI-198] prioritize managed engine venv over CWD .venv (#199)
* fix: [AI-198] prioritize managed engine venv over CWD `.venv` in `resolvePython()` `ensureEngine()` installs altimate-engine into the managed venv at `~/.opencode/data/engine/venv/`, but `resolvePython()` checked for a `.venv` in the user's current working directory first. When running from any Python project with its own `.venv`, the bridge would spawn that project's Python — which doesn't have altimate-engine — and fail. Swap the priority so the managed engine venv is checked before the CWD `.venv`. Add a test covering the new ordering. Closes #198 * fix: [AI-196] install `altimate-engine[warehouses]` extra by default The managed engine install was missing the `[warehouses]` optional dependency group, which includes `snowflake-connector-python`, `psycopg2-binary`, `duckdb`, and other database connectors. This caused FinOps tools and warehouse operations to fail with "snowflake-connector-python not installed" errors. Closes #196 * fix: harden engine bootstrap — validate venv, track extras, fix Windows paths, add startup mutex - Validate Python binary exists on disk before trusting manifest (fixes: manifest exists but venv manually deleted → silent failure) - Track installed extras in manifest so version match with different extras (e.g. upgrading from no-extras to `[warehouses]`) triggers reinstall - Use platform-aware `venvPythonBin()` helper for dev/cwd venv paths (fixes: Windows uses `Scripts/python.exe`, not `bin/python`) - Add `pendingStart` mutex to `Bridge.call()` preventing concurrent `start()` calls from spawning duplicate Python processes - Add 25 new tests covering: priority ordering, env var edge cases, extras tracking, manifest validation, Windows paths, startup mutex, concurrent calls * fix: clean up child process on ping failure + use platform-aware paths in tests Address Sentry bot review comments on PR #199: 1. When `start()` spawns a child but `ping` verification fails, kill and clear the `child` handle so subsequent `call()` invocations trigger a restart instead of writing to a broken process and hanging for 30s timeout. 2. Use platform-aware `testVenvPythonBin()` helper in test file for dev/cwd venv detection, matching production `venvPythonBin()` behavior. Fixes incorrect test skip logic on Windows. * fix: address multi-model code review feedback - Remove dead else branch in `ENGINE_INSTALL_SPEC` conditional (constant is always truthy) - Add `extras` field to `engine_started` telemetry event for debugging - Update `ENGINE_INSTALL_SPEC` comment to reflect production usage - Tighten concurrent startup test assertion to verify mutex coalescing (`ensureEngineCalls <= 2` instead of just `>= 1`) * fix: handle spawn errors, venv recovery, and post-mutex child guard Address Gemini review findings: 1. Add `error` event handler on spawned child process to prevent unhandled ENOENT/EACCES from crashing the host process when `resolvePython()` returns a nonexistent or non-executable path. 2. Recreate venv when Python binary is missing even if the venv directory still exists (e.g. user deleted just the binary). 3. Re-check `child` state after awaiting `pendingStart` mutex — the process may have died between startup completing and the secondary caller resuming. * fix: don't increment `restartCount` when process is killed by signal When `child.kill()` is called (e.g. after ping failure), the exit handler fires with `code = null`. The check `code !== 0` treated this as a crash, incrementing `restartCount`. After two ping failures, the bridge would be permanently disabled. Changed to `code !== null && code !== 0` so only actual non-zero exit codes (real crashes) count toward the restart limit.
1 parent 31d8f52 commit e85e637

File tree

4 files changed

+524
-24
lines changed

4 files changed

+524
-24
lines changed

packages/opencode/src/altimate/bridge/client.ts

Lines changed: 48 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,13 @@ import type { BridgeMethod, BridgeMethods } from "./protocol"
1414
import { Telemetry } from "../telemetry"
1515
import { Log } from "../../util/log"
1616

17+
/** Platform-aware path to the python binary inside a venv directory. */
18+
function venvPythonBin(venvDir: string): string {
19+
return process.platform === "win32"
20+
? path.join(venvDir, "Scripts", "python.exe")
21+
: path.join(venvDir, "bin", "python")
22+
}
23+
1724
/** Resolve the Python interpreter to use for the engine sidecar.
1825
* Exported for testing — not part of the public API. */
1926
export function resolvePython(): string {
@@ -22,17 +29,20 @@ export function resolvePython(): string {
2229

2330
// 2. Check for .venv relative to altimate-engine package (local dev)
2431
const engineDir = path.resolve(__dirname, "..", "..", "..", "altimate-engine")
25-
const venvPython = path.join(engineDir, ".venv", "bin", "python")
32+
const venvPython = venvPythonBin(path.join(engineDir, ".venv"))
2633
if (existsSync(venvPython)) return venvPython
2734

28-
// 3. Check for .venv in cwd
29-
const cwdVenv = path.join(process.cwd(), ".venv", "bin", "python")
30-
if (existsSync(cwdVenv)) return cwdVenv
31-
32-
// 4. Check the managed engine venv (created by ensureEngine)
35+
// 3. Check the managed engine venv (created by ensureEngine)
36+
// This must come before the CWD venv check — ensureEngine() installs
37+
// altimate-engine here, so an unrelated .venv in the user's project
38+
// directory must not shadow it.
3339
const managedPython = enginePythonPath()
3440
if (existsSync(managedPython)) return managedPython
3541

42+
// 4. Check for .venv in cwd
43+
const cwdVenv = venvPythonBin(path.join(process.cwd(), ".venv"))
44+
if (existsSync(cwdVenv)) return cwdVenv
45+
3646
// 5. Fallback
3747
return "python3"
3848
}
@@ -45,6 +55,8 @@ export namespace Bridge {
4555
const CALL_TIMEOUT_MS = 30_000
4656
const pending = new Map<number, { resolve: (value: any) => void; reject: (reason: any) => void }>()
4757
let buffer = ""
58+
// Mutex to prevent concurrent start() calls from spawning duplicate processes
59+
let pendingStart: Promise<void> | null = null
4860

4961
export async function call<M extends BridgeMethod>(
5062
method: M,
@@ -53,7 +65,20 @@ export namespace Bridge {
5365
const startTime = Date.now()
5466
if (!child || child.exitCode !== null) {
5567
if (restartCount >= MAX_RESTARTS) throw new Error("Python bridge failed after max restarts")
56-
await start()
68+
if (pendingStart) {
69+
await pendingStart
70+
// Re-check: the process may have died between startup and now
71+
if (!child || child.exitCode !== null) {
72+
throw new Error("Bridge process died during startup")
73+
}
74+
} else {
75+
pendingStart = start()
76+
try {
77+
await pendingStart
78+
} finally {
79+
pendingStart = null
80+
}
81+
}
5782
}
5883
const id = ++requestId
5984
const request = JSON.stringify({ jsonrpc: "2.0", method, params, id })
@@ -141,8 +166,18 @@ export namespace Bridge {
141166
if (msg) Log.Default.error("altimate-engine stderr", { message: msg })
142167
})
143168

169+
child.on("error", (err) => {
170+
Log.Default.error("altimate-engine spawn error", { error: String(err) })
171+
restartCount++
172+
for (const [id, p] of pending) {
173+
p.reject(new Error(`Bridge process failed to spawn: ${err}`))
174+
pending.delete(id)
175+
}
176+
child = undefined
177+
})
178+
144179
child.on("exit", (code) => {
145-
if (code !== 0) restartCount++
180+
if (code !== null && code !== 0) restartCount++
146181
for (const [id, p] of pending) {
147182
p.reject(new Error(`Bridge process exited (code ${code})`))
148183
pending.delete(id)
@@ -154,6 +189,11 @@ export namespace Bridge {
154189
try {
155190
await call("ping", {} as any)
156191
} catch (e) {
192+
// Clean up the spawned process so subsequent call() invocations
193+
// correctly detect !child and trigger a restart instead of writing
194+
// to a non-functional process and hanging until timeout.
195+
child?.kill()
196+
child = undefined
157197
throw new Error(`Failed to start Python bridge: ${e}`)
158198
}
159199
}

packages/opencode/src/altimate/bridge/engine.ts

Lines changed: 19 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -25,12 +25,18 @@ declare const OPENCODE_VERSION: string
2525
// Mutex to prevent concurrent ensureEngine/ensureUv calls from corrupting state
2626
let pendingEnsure: Promise<void> | null = null
2727

28+
/** Pip extras spec for altimate-engine (e.g. "warehouses" → altimate-engine[warehouses]).
29+
* Used in ensureEngine install command and recorded in manifest for upgrade detection. */
30+
export const ENGINE_INSTALL_SPEC = "warehouses"
31+
2832
interface Manifest {
2933
engine_version: string
3034
python_version: string
3135
uv_version: string
3236
cli_version: string
3337
installed_at: string
38+
/** Comma-separated extras that were installed (e.g. "warehouses") */
39+
extras?: string
3440
}
3541

3642
/** Returns path to the engine directory */
@@ -158,7 +164,12 @@ export async function ensureEngine(): Promise<void> {
158164
async function ensureEngineImpl(): Promise<void> {
159165
const manifest = await readManifest()
160166
const isUpgrade = manifest !== null
161-
if (manifest && manifest.engine_version === ALTIMATE_ENGINE_VERSION) return
167+
168+
// Validate both version AND filesystem state — a matching version in the
169+
// manifest is not enough if the venv or Python binary was deleted.
170+
const pythonExists = existsSync(enginePythonPath())
171+
const extrasMatch = (manifest?.extras ?? "") === ENGINE_INSTALL_SPEC
172+
if (manifest && manifest.engine_version === ALTIMATE_ENGINE_VERSION && pythonExists && extrasMatch) return
162173

163174
const startTime = Date.now()
164175

@@ -168,8 +179,9 @@ async function ensureEngineImpl(): Promise<void> {
168179
const dir = engineDir()
169180
const venvDir = path.join(dir, "venv")
170181

171-
// Create venv if it doesn't exist
172-
if (!existsSync(venvDir)) {
182+
// Create venv if it doesn't exist, or recreate if the Python binary is missing
183+
// (e.g. user deleted the binary but left the venv directory intact)
184+
if (!existsSync(venvDir) || !pythonExists) {
173185
Log.Default.info("creating python environment")
174186
try {
175187
execFileSync(uv, ["venv", "--python", "3.12", venvDir], { stdio: "pipe" })
@@ -189,7 +201,8 @@ async function ensureEngineImpl(): Promise<void> {
189201
const pythonPath = enginePythonPath()
190202
Log.Default.info("installing altimate-engine", { version: ALTIMATE_ENGINE_VERSION })
191203
try {
192-
execFileSync(uv, ["pip", "install", "--python", pythonPath, `altimate-engine==${ALTIMATE_ENGINE_VERSION}`], { stdio: "pipe" })
204+
const spec = `altimate-engine[${ENGINE_INSTALL_SPEC}]==${ALTIMATE_ENGINE_VERSION}`
205+
execFileSync(uv, ["pip", "install", "--python", pythonPath, spec], { stdio: "pipe" })
193206
} catch (e: any) {
194207
Telemetry.track({
195208
type: "engine_error",
@@ -212,6 +225,7 @@ async function ensureEngineImpl(): Promise<void> {
212225
uv_version: uvVersion,
213226
cli_version: typeof OPENCODE_VERSION === "string" ? OPENCODE_VERSION : "local",
214227
installed_at: new Date().toISOString(),
228+
extras: ENGINE_INSTALL_SPEC,
215229
})
216230

217231
Telemetry.track({
@@ -220,6 +234,7 @@ async function ensureEngineImpl(): Promise<void> {
220234
session_id: Telemetry.getContext().sessionId,
221235
engine_version: ALTIMATE_ENGINE_VERSION,
222236
python_version: pyVersion,
237+
extras: ENGINE_INSTALL_SPEC,
223238
status: isUpgrade ? "upgraded" : "started",
224239
duration_ms: Date.now() - startTime,
225240
})

packages/opencode/src/altimate/telemetry/index.ts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -153,6 +153,7 @@ export namespace Telemetry {
153153
session_id: string
154154
engine_version: string
155155
python_version: string
156+
extras?: string
156157
status: "started" | "restarted" | "upgraded"
157158
duration_ms: number
158159
}

0 commit comments

Comments
 (0)