Skip to content

Commit 3090d6c

Browse files
fix: prevent port conflicts when launching multiple installs concurrently (#94)
* fix: prevent port conflicts when launching multiple installs concurrently Port reservation and instance detection previously only took effect after full boot (up to 2 minutes), creating a race window where concurrent launches could claim the same port. Main process: - Add in-memory _pendingPorts map to track ports during boot - Check pending reservations in port conflict detection - Pass reserved ports as exclusions to findAvailablePort - Add TOCTOU re-check before reserving to close async gap - Broadcast instance-launching/instance-launch-failed events Renderer: - Track launchingInstances in sessionStore via new IPC events - Update useLocalInstanceGuard to warn about booting instances Tests: 18 new tests across process.test.ts, sessionStore.test.ts, and useLocalInstanceGuard.test.ts Amp-Thread-ID: https://ampcode.com/threads/T-019caeaa-8ead-722d-942d-5aee9307865e Co-authored-by: Amp <amp@ampcode.com> * fix: retry rename on EPERM/EACCES in safe file writes (Windows) On Windows, antivirus or file indexer can briefly lock files after a write, causing EPERM on the atomic rename step. Add retry logic (up to 3 attempts with backoff) to both sync and async writeFileSafe variants. Clean up leftover .tmp files on permanent failure. Amp-Thread-ID: https://ampcode.com/threads/T-019caeaa-8ead-722d-942d-5aee9307865e Co-authored-by: Amp <amp@ampcode.com> * fix: drop busy-wait retry from sync writeFileSafe, keep async only The sync busy-wait blocks the event loop for up to 600ms. Instead, rely on readFileSafe's .bak fallback for recovery. Add a future enhancement note in DESIGN_PROCESS.md to convert sync callers to async. Amp-Thread-ID: https://ampcode.com/threads/T-019caeaa-8ead-722d-942d-5aee9307865e Co-authored-by: Amp <amp@ampcode.com> --------- Co-authored-by: Amp <amp@ampcode.com>
1 parent ac82f30 commit 3090d6c

11 files changed

Lines changed: 446 additions & 17 deletions

File tree

DESIGN_PROCESS.md

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -194,3 +194,21 @@ When reviewing code for modularity, check:
194194
5. **Does a renderer view contain hardcoded conditionals for specific action/source IDs?** Replace with a data-driven pattern (add a property to the schema).
195195
6. **Does a source repeat information already available from its own definition?** Have the framework (ipc.js) inject it.
196196
7. **Can a new source be added by only creating a file in `sources/` and registering it in `sources/index.js`?** If not, something is coupled.
197+
198+
## Future Enhancements
199+
200+
### Convert `writeFileSafe` callers to async
201+
202+
The synchronous `writeFileSafe` in `lib/safe-file.ts` cannot safely retry on
203+
Windows `EPERM`/`EACCES` rename errors (retry would require a busy-wait that
204+
blocks the event loop). The async variant `writeFileSafeAsync` handles this
205+
with proper `setTimeout`-based retries.
206+
207+
Callers that should be migrated to `writeFileSafeAsync`:
208+
- `src/main/settings.ts``set()` function
209+
- `src/main/lib/release-cache.ts``save()` function
210+
- `src/main/lib/models.ts``ensureModelPathsConfig()`
211+
- `src/main/lib/fetch.ts``fetchJSON()` cache write
212+
213+
Currently the sync version relies on `readFileSafe`'s `.bak` fallback to
214+
recover from a failed rename, which is adequate but not ideal.

src/main/lib/ipc.ts

Lines changed: 71 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -255,6 +255,15 @@ let _gpuPromise: Promise<GpuInfo | null> | null = null
255255

256256
const _operationAborts = new Map<string, AbortController>()
257257
const _runningSessions = new Map<string, SessionInfo>()
258+
const _pendingPorts = new Map<number, string>() // port → installationName
259+
260+
function _reservePort(port: number, installationName: string): void {
261+
_pendingPorts.set(port, installationName)
262+
}
263+
264+
function _releasePort(port: number): void {
265+
_pendingPorts.delete(port)
266+
}
258267

259268
function _broadcastToRenderer(channel: string, data: Record<string, unknown>): void {
260269
BrowserWindow.getAllWindows().forEach((win) => {
@@ -1400,41 +1409,83 @@ export function register(callbacks: RegisterCallbacks = {}): void {
14001409
setPortArg(launchCmd as LaunchCmd, actionData.portOverride as number)
14011410
}
14021411

1403-
const existingPids = await findPidsByPort(launchCmd.port!)
1404-
if (existingPids.length > 0) {
1412+
// Check for port conflicts: in-memory pending reservations, OS-level listeners, and disk locks
1413+
const pendingPortOwner = _pendingPorts.get(launchCmd.port!)
1414+
const existingPids = pendingPortOwner ? [] : await findPidsByPort(launchCmd.port!)
1415+
const portOccupied = !!pendingPortOwner || existingPids.length > 0
1416+
1417+
if (portOccupied) {
14051418
const defaults = source.getDefaults ? source.getDefaults() : {}
14061419
const portConflictMode = (inst.portConflict as string | undefined) || (defaults.portConflict as string | undefined) || 'auto'
14071420
const userArgs = ((inst.launchArgs as string | undefined) || '').trim()
14081421
const portIsExplicit = /(?:^|\s)--port\b/.test(userArgs)
14091422

1423+
const reservedPorts = new Set(_pendingPorts.keys())
14101424
let nextPort: number | null = null
14111425
try {
1412-
nextPort = await findAvailablePort('127.0.0.1', launchCmd.port! + 1, launchCmd.port! + 1000)
1426+
nextPort = await findAvailablePort('127.0.0.1', launchCmd.port! + 1, launchCmd.port! + 1000, reservedPorts)
14131427
} catch {}
14141428

14151429
if (portConflictMode === 'auto' && nextPort && !portIsExplicit) {
14161430
sendProgress('launch', { percent: -1, status: i18n.t('launch.portBusyUsing', { old: launchCmd.port!, new: nextPort }) })
14171431
setPortArg(launchCmd as LaunchCmd, nextPort)
14181432
} else {
1419-
const lock = readPortLock(launchCmd.port!)
14201433
let message: string
14211434
let isComfy: boolean
1422-
if (lock) {
1423-
message = i18n.t('errors.portConflictLauncher', { port: launchCmd.port!, name: lock.installationName })
1435+
if (pendingPortOwner) {
1436+
message = i18n.t('errors.portConflictLauncher', { port: launchCmd.port!, name: pendingPortOwner })
14241437
isComfy = true
14251438
} else {
1426-
const info = await getProcessInfo(existingPids[0]!)
1427-
isComfy = looksLikeComfyUI(info)
1428-
const processDesc = info ? info.name : `PID ${existingPids[0]}`
1429-
message = isComfy
1430-
? i18n.t('errors.portConflictComfy', { port: launchCmd.port!, process: processDesc })
1431-
: i18n.t('errors.portConflictOther', { port: launchCmd.port!, process: processDesc })
1439+
const lock = readPortLock(launchCmd.port!)
1440+
if (lock) {
1441+
message = i18n.t('errors.portConflictLauncher', { port: launchCmd.port!, name: lock.installationName })
1442+
isComfy = true
1443+
} else {
1444+
const info = await getProcessInfo(existingPids[0]!)
1445+
isComfy = looksLikeComfyUI(info)
1446+
const processDesc = info ? info.name : `PID ${existingPids[0]}`
1447+
message = isComfy
1448+
? i18n.t('errors.portConflictComfy', { port: launchCmd.port!, process: processDesc })
1449+
: i18n.t('errors.portConflictOther', { port: launchCmd.port!, process: processDesc })
1450+
}
14321451
}
14331452
_operationAborts.delete(installationId)
14341453
return { ok: false, message, portConflict: { port: launchCmd.port, pids: existingPids, isComfy, nextPort } }
14351454
}
14361455
}
14371456

1457+
// Synchronous re-check: another launch may have reserved this port while we
1458+
// were awaiting findPidsByPort / findAvailablePort above (TOCTOU gap).
1459+
const lateConflictOwner = _pendingPorts.get(launchCmd.port!)
1460+
if (lateConflictOwner) {
1461+
const defaults = source.getDefaults ? source.getDefaults() : {}
1462+
const portConflictMode = (inst.portConflict as string | undefined) || (defaults.portConflict as string | undefined) || 'auto'
1463+
const userArgs = ((inst.launchArgs as string | undefined) || '').trim()
1464+
const portIsExplicit = /(?:^|\s)--port\b/.test(userArgs)
1465+
1466+
const reservedPorts = new Set(_pendingPorts.keys())
1467+
let nextPort: number | null = null
1468+
try {
1469+
nextPort = await findAvailablePort('127.0.0.1', launchCmd.port! + 1, launchCmd.port! + 1000, reservedPorts)
1470+
} catch {}
1471+
1472+
if (portConflictMode === 'auto' && nextPort && !portIsExplicit) {
1473+
sendProgress('launch', { percent: -1, status: i18n.t('launch.portBusyUsing', { old: launchCmd.port!, new: nextPort }) })
1474+
setPortArg(launchCmd as LaunchCmd, nextPort)
1475+
} else {
1476+
_operationAborts.delete(installationId)
1477+
return {
1478+
ok: false,
1479+
message: i18n.t('errors.portConflictLauncher', { port: launchCmd.port!, name: lateConflictOwner }),
1480+
portConflict: { port: launchCmd.port, pids: [], isComfy: true, nextPort },
1481+
}
1482+
}
1483+
}
1484+
1485+
// Reserve port eagerly before spawning to prevent concurrent launches from claiming it
1486+
_reservePort(launchCmd.port!, inst.name)
1487+
_broadcastToRenderer('instance-launching', { installationId, installationName: inst.name })
1488+
14381489
const sessionPath = createSessionPath()
14391490
const launchEnv = { ...process.env, __COMFY_CLI_SESSION__: sessionPath }
14401491
const sendOutput = (text: string): void => {
@@ -1517,9 +1568,12 @@ export function register(callbacks: RegisterCallbacks = {}): void {
15171568
if (isPortConflict && portRetries < PORT_RETRY_MAX) {
15181569
portRetries++
15191570
try {
1520-
const retryPort = await findAvailablePort('127.0.0.1', launchCmd.port! + 1, launchCmd.port! + 1000)
1571+
const reservedPorts = new Set(_pendingPorts.keys())
1572+
const retryPort = await findAvailablePort('127.0.0.1', launchCmd.port! + 1, launchCmd.port! + 1000, reservedPorts)
15211573
sendOutput(`\nPort ${launchCmd.port} in use, retrying on port ${retryPort}…\n`)
1574+
_releasePort(launchCmd.port!)
15221575
setPortArg(launchCmd as LaunchCmd, retryPort)
1576+
_reservePort(launchCmd.port!, inst.name)
15231577
return tryLaunch()
15241578
} catch {}
15251579
}
@@ -1530,12 +1584,16 @@ export function register(callbacks: RegisterCallbacks = {}): void {
15301584

15311585
const launchResult = await tryLaunch()
15321586
if (!launchResult.ok) {
1587+
_releasePort(launchCmd.port!)
15331588
_operationAborts.delete(installationId)
1589+
_broadcastToRenderer('instance-launch-failed', { installationId })
15341590
if (launchResult.cancelled) return { ok: false, cancelled: true }
15351591
return { ok: false, message: launchResult.message }
15361592
}
15371593
let { proc } = launchResult
15381594

1595+
// Transition from pending reservation to confirmed session + port lock
1596+
_pendingPorts.delete(launchCmd.port!)
15391597
_operationAborts.delete(installationId)
15401598
const mode = (inst.launchMode as string | undefined) || 'window'
15411599
_addSession(installationId, { proc, port: launchCmd.port!, mode, installationName: inst.name })

src/main/lib/process.test.ts

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
import { describe, it, expect } from 'vitest'
2+
import { findAvailablePort } from './process'
3+
import net from 'net'
4+
5+
describe('findAvailablePort', () => {
6+
it('finds an available port in the given range', async () => {
7+
const port = await findAvailablePort('127.0.0.1', 49200, 49300)
8+
9+
expect(port).toBeGreaterThanOrEqual(49200)
10+
expect(port).toBeLessThanOrEqual(49300)
11+
})
12+
13+
it('skips ports in the excludePorts set', async () => {
14+
const firstPort = await findAvailablePort('127.0.0.1', 49200, 49300)
15+
16+
const excluded = new Set([firstPort])
17+
const result = await findAvailablePort('127.0.0.1', firstPort, 49300, excluded)
18+
19+
expect(result).not.toBe(firstPort)
20+
expect(result).toBeGreaterThanOrEqual(firstPort + 1)
21+
})
22+
23+
it('skips multiple excluded ports', async () => {
24+
const base = 49300
25+
const excluded = new Set([base, base + 1, base + 2])
26+
const result = await findAvailablePort('127.0.0.1', base, base + 100, excluded)
27+
28+
expect(excluded.has(result)).toBe(false)
29+
expect(result).toBeGreaterThanOrEqual(base + 3)
30+
})
31+
32+
it('rejects when all ports in range are excluded', async () => {
33+
const base = 49400
34+
const excluded = new Set([base, base + 1, base + 2])
35+
36+
await expect(
37+
findAvailablePort('127.0.0.1', base, base + 2, excluded)
38+
).rejects.toThrow('No available ports found')
39+
})
40+
41+
it('skips a port that is actually in use', async () => {
42+
const base = 49500
43+
// Bind a port to simulate it being in use
44+
const server = net.createServer()
45+
await new Promise<void>((resolve) => {
46+
server.listen(base, '127.0.0.1', () => resolve())
47+
})
48+
49+
try {
50+
const result = await findAvailablePort('127.0.0.1', base, base + 100)
51+
expect(result).toBeGreaterThan(base)
52+
} finally {
53+
await new Promise<void>((resolve) => server.close(() => resolve()))
54+
}
55+
})
56+
})

src/main/lib/process.ts

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ export interface LaunchCmd {
2323
port: number
2424
}
2525

26-
interface PortLock {
26+
export interface PortLock {
2727
pid: number
2828
installationName: string
2929
timestamp: number
@@ -206,13 +206,17 @@ export function setPortArg(launchCmd: LaunchCmd, port: number): void {
206206
launchCmd.port = port
207207
}
208208

209-
export function findAvailablePort(host: string, startPort: number, endPort: number): Promise<number> {
209+
export function findAvailablePort(host: string, startPort: number, endPort: number, excludePorts?: ReadonlySet<number>): Promise<number> {
210210
return new Promise((resolve, reject) => {
211211
function tryPort(port: number): void {
212212
if (port > endPort) {
213213
reject(new Error(`No available ports found between ${startPort} and ${endPort}`))
214214
return
215215
}
216+
if (excludePorts && excludePorts.has(port)) {
217+
tryPort(port + 1)
218+
return
219+
}
216220
const server = net.createServer()
217221
server.listen(port, host, () => {
218222
server.once("close", () => resolve(port))

src/main/lib/safe-file.ts

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,12 @@ export function writeFileSafe(filePath: string, data: string, backup: boolean =
2828
if (backup) {
2929
try { fs.copyFileSync(filePath, bakPath) } catch {}
3030
}
31-
fs.renameSync(tmpPath, filePath)
31+
try {
32+
fs.renameSync(tmpPath, filePath)
33+
} catch (err) {
34+
try { fs.unlinkSync(tmpPath) } catch {}
35+
throw err
36+
}
3237
}
3338

3439
/**
@@ -66,7 +71,24 @@ export async function writeFileSafeAsync(filePath: string, data: string, backup:
6671
if (backup) {
6772
try { await fs.promises.copyFile(filePath, bakPath) } catch {}
6873
}
69-
await fs.promises.rename(tmpPath, filePath)
74+
// On Windows, antivirus or indexer may briefly lock the file after a write,
75+
// causing EPERM on rename. Retry a few times with a short delay.
76+
const RENAME_RETRIES = 3
77+
const RENAME_DELAY_MS = 100
78+
for (let attempt = 0; ; attempt++) {
79+
try {
80+
await fs.promises.rename(tmpPath, filePath)
81+
return
82+
} catch (err) {
83+
const code = (err as NodeJS.ErrnoException).code
84+
if ((code === 'EPERM' || code === 'EACCES') && attempt < RENAME_RETRIES) {
85+
await new Promise((r) => setTimeout(r, RENAME_DELAY_MS * (attempt + 1)))
86+
continue
87+
}
88+
try { await fs.promises.unlink(tmpPath) } catch {}
89+
throw err
90+
}
91+
}
7092
}
7193

7294
export async function readFileSafeAsync(filePath: string): Promise<string | null> {

src/preload/index.ts

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -106,6 +106,16 @@ const api: ElectronApi = {
106106
ipcRenderer.on('comfy-exited', handler)
107107
return () => ipcRenderer.removeListener('comfy-exited', handler)
108108
},
109+
onInstanceLaunching: (callback) => {
110+
const handler = (_event: IpcRendererEvent, data: unknown) => callback(data as Parameters<typeof callback>[0])
111+
ipcRenderer.on('instance-launching', handler)
112+
return () => ipcRenderer.removeListener('instance-launching', handler)
113+
},
114+
onInstanceLaunchFailed: (callback) => {
115+
const handler = (_event: IpcRendererEvent, data: unknown) => callback(data as Parameters<typeof callback>[0])
116+
ipcRenderer.on('instance-launch-failed', handler)
117+
return () => ipcRenderer.removeListener('instance-launch-failed', handler)
118+
},
109119
onInstanceStarted: (callback) => {
110120
const handler = (_event: IpcRendererEvent, data: unknown) => callback(data as Parameters<typeof callback>[0])
111121
ipcRenderer.on('instance-started', handler)

0 commit comments

Comments
 (0)