Commit aa898b6

author
Lasim
committed
feat: Add detailed process termination procedures and cleanup operations to documentation
1 parent ae45a41 commit aa898b6

1 file changed

Lines changed: 151 additions & 8 deletions

development/satellite/process-management.mdx

@@ -144,16 +144,159 @@ All communication uses newline-delimited JSON following JSON-RPC 2.0 specificati
144144

145145
### Graceful Termination
146146

147-
Termination follows a two-phase approach:
147+
Process termination follows a two-phase graceful shutdown approach to ensure clean process exit and proper resource cleanup.
148+
149+
#### Termination Phases
150+
151+
**Phase 1: SIGTERM (Graceful Shutdown)**
152+
- Send SIGTERM signal to the process
153+
- Process has 10 seconds (default timeout) to shut down gracefully
154+
- Process can complete in-flight operations and clean up resources
155+
- Wait for process to exit voluntarily
156+
157+
**Phase 2: SIGKILL (Force Termination)**
158+
- Triggered only if the process doesn't exit within the timeout period
159+
- Send SIGKILL signal to force immediate termination
160+
- Guaranteed process termination (cannot be caught or ignored)
161+
- Used as last resort for unresponsive processes
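
The two phases above can be sketched as a small helper. This is a minimal illustration, not the satellite's actual code: the `Killable` interface and the `terminateTwoPhase` name are hypothetical, and the only values taken from this page are the SIGTERM → SIGKILL order and the 10-second default timeout.

```typescript
// Hypothetical minimal shape of a child process; only what the sketch needs.
interface Killable {
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
  once(event: "exit", listener: () => void): unknown;
}

type TerminationResult = "graceful" | "forced";

// Sends SIGTERM, waits up to timeoutMs for the exit event, then sends SIGKILL.
function terminateTwoPhase(
  proc: Killable,
  timeoutMs: number = 10_000,
): Promise<TerminationResult> {
  return new Promise((resolve) => {
    let forced = false;

    // Phase 2: runs only if the process has not exited before the timeout.
    const timer = setTimeout(() => {
      forced = true;
      proc.kill("SIGKILL");
    }, timeoutMs);

    // Register the exit listener before signalling so a fast exit is not missed.
    proc.once("exit", () => {
      clearTimeout(timer);
      resolve(forced ? "forced" : "graceful");
    });

    // Phase 1: ask the process to shut down voluntarily.
    proc.kill("SIGTERM");
  });
}
```

The promise resolves with `"graceful"` when the process exits on its own and `"forced"` when SIGKILL was needed, which lets a caller log which path was taken.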
162+
163+
#### Termination Types
164+
165+
The system handles three types of intentional terminations differently:
166+
167+
**1. Manual Termination**
168+
- Triggered by explicit restart or stop commands
169+
- Status set to `'terminating'` before sending signals
170+
- No auto-restart triggered
171+
- Standard graceful shutdown with SIGTERM → SIGKILL
172+
173+
**2. Idle/Dormant Termination**
174+
- Triggered by idle timeout (default: 180 seconds of inactivity)
175+
- Process marked with `isDormantShutdown` flag
176+
- Configuration stored in dormant map for fast respawn
177+
- Tools remain cached for instant availability
178+
- No auto-restart triggered (intentional shutdown)
179+
- See [Idle Process Management](/development/satellite/idle-process-management) for details
180+
181+
**3. Uninstall Termination**
182+
- Triggered when a server is removed from configuration
183+
- Process marked with `isUninstallShutdown` flag
184+
- Complete cleanup: process, dormant config, tools, restart tracking
185+
- No auto-restart triggered (intentional removal)
186+
- Invoked via `removeServerCompletely()` method
187+
188+
#### Crash Detection vs Intentional Shutdown
189+
190+
The system distinguishes between crashes and intentional shutdowns:
191+
192+
**Crash Detection Logic:**
193+
```typescript
// A process is considered crashed only if all of the following hold:
// 1. Exit code is non-null and non-zero (e.g., 1, 143)
// 2. Status is NOT 'terminating'
// 3. NOT marked as intentional shutdown (isDormantShutdown or isUninstallShutdown)
const wasCrash = code !== 0 && code !== null &&
  processInfo.status !== 'terminating' &&
  !processInfo.isDormantShutdown &&
  !processInfo.isUninstallShutdown;
```
203+
204+
**Why This Matters:**
205+
- A process stopped by SIGTERM conventionally reports exit code 143 (128 + 15), which is non-zero
206+
- Without flags, graceful termination would trigger auto-restart
207+
- Flags prevent unwanted restarts for intentional shutdowns
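
To make the effect of the flags concrete, the same condition can be wrapped in a function and evaluated for a few cases. This is a sketch: the `ProcessInfoLike` type, the `wasCrash` function name, and the `'running'` status value are assumptions for illustration; only `'terminating'` and the two flag names come from this page.

```typescript
// Hypothetical subset of the process record used by the crash check.
interface ProcessInfoLike {
  status: string; // 'terminating' is documented; other values are assumed
  isDormantShutdown?: boolean;
  isUninstallShutdown?: boolean;
}

// Same condition as the documented crash detection logic, as a pure function.
function wasCrash(code: number | null, info: ProcessInfoLike): boolean {
  return (
    code !== 0 &&
    code !== null &&
    info.status !== "terminating" &&
    !info.isDormantShutdown &&
    !info.isUninstallShutdown
  );
}

// Exit code 143 with no flags set: treated as a crash.
const unexpectedExit = wasCrash(143, { status: "running" });

// Same exit code during an idle shutdown: not a crash, so no auto-restart.
const dormantExit = wasCrash(143, { status: "running", isDormantShutdown: true });

// Same exit code during a manual stop: not a crash either.
const manualExit = wasCrash(143, { status: "terminating" });
```

Here `unexpectedExit` is `true`, while `dormantExit` and `manualExit` are both `false`.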
208+
209+
#### Cleanup Operations
210+
211+
During termination, the following cleanup operations occur:
212+
213+
1. **Active Request Cancellation**
214+
- All pending JSON-RPC requests are rejected
215+
- Active requests map is cleared
216+
- Clients receive termination error
217+
218+
2. **State Cleanup**
219+
- Remove from processes map (by process ID)
220+
- Remove from processIdsByName map (by installation name)
221+
- Remove from team tracking sets
222+
- Clear dormant config if exists (for uninstall)
223+
224+
3. **Resource Tracking**
225+
- Restart attempts cleared (for uninstall)
226+
- Respawn promises cleared
227+
- Process metrics finalized
228+
229+
4. **Event Emission**
230+
- Emit `processTerminated` internal event
231+
- Emit `processExit` with exit code and signal
232+
- Emit `mcp.server.crashed` if crash detected (Backend event)
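
The ordering of these steps can be sketched roughly as follows. The `ProcessRegistrySketch` class, its field types, and the event payloads are hypothetical; the map names and the `processTerminated` and `processExit` event names are taken from the list above. Team tracking, metrics, and the Backend crash event are left out to keep the sketch short.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical shapes; the real records likely carry more fields.
interface PendingRequest {
  reject(reason: Error): void;
}

interface TrackedProcess {
  processId: string;
  installationName: string;
  activeRequests: Map<string, PendingRequest>;
}

class ProcessRegistrySketch extends EventEmitter {
  processes = new Map<string, TrackedProcess>();
  processIdsByName = new Map<string, string>();

  track(proc: TrackedProcess): void {
    this.processes.set(proc.processId, proc);
    this.processIdsByName.set(proc.installationName, proc.processId);
  }

  cleanup(proc: TrackedProcess, code: number | null, signal: string | null): void {
    // 1. Reject every pending JSON-RPC request, then clear the map.
    for (const pending of proc.activeRequests.values()) {
      pending.reject(new Error(`Process ${proc.installationName} terminated`));
    }
    proc.activeRequests.clear();

    // 2. Remove the process from the lookup maps.
    this.processes.delete(proc.processId);
    this.processIdsByName.delete(proc.installationName);

    // 3. Notify listeners only after state is consistent.
    this.emit("processTerminated", proc);
    this.emit("processExit", proc, code, signal);
  }
}
```

Emitting events last means a listener that inspects the registry never sees a process that is half removed.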
148233

149-
1. **SIGTERM Phase**: Send graceful shutdown signal
150-
2. **SIGKILL Phase**: Force kill if timeout exceeded (default 10s)
234+
#### Complete Server Removal
235+
236+
The `removeServerCompletely()` method provides comprehensive cleanup for server uninstall:
237+
238+
**Method Signature:**
239+
```typescript
async removeServerCompletely(
  installationName: string,
  timeout: number = 10000
): Promise<{ active: boolean; dormant: boolean }>
```
245+
246+
**Operation Flow:**
247+
1. Check for active process
248+
- If found: Set `isUninstallShutdown` flag
249+
- Terminate with graceful shutdown
250+
- Return `active: true`
251+
252+
2. Check for dormant config
253+
- If found: Remove from dormant map
254+
- Return `dormant: true`
255+
256+
3. Clear restart tracking
257+
- Delete restart attempts history
258+
- Prevent any future restart attempts
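
The three steps above could look roughly like this. It is a simplified stand-in, not the real implementation: the `ProcessManagerSketch` class, its map fields, and the injected `terminate` callback are assumptions; only the method name, parameters, return shape, and the `isUninstallShutdown` flag come from this page.

```typescript
// Hypothetical record for an active process.
interface ActiveProcess {
  installationName: string;
  isUninstallShutdown?: boolean;
}

type TerminateFn = (proc: ActiveProcess, timeoutMs: number) => Promise<void>;

class ProcessManagerSketch {
  activeByName = new Map<string, ActiveProcess>();
  dormantConfigs = new Map<string, unknown>();
  restartAttempts = new Map<string, number[]>();
  private terminate: TerminateFn;

  // The termination routine is injected so the sketch does not depend on
  // how signals are actually sent.
  constructor(terminate: TerminateFn) {
    this.terminate = terminate;
  }

  async removeServerCompletely(
    installationName: string,
    timeout: number = 10000,
  ): Promise<{ active: boolean; dormant: boolean }> {
    // 1. Active process: flag it first so its exit is not treated as a crash.
    let active = false;
    const proc = this.activeByName.get(installationName);
    if (proc) {
      proc.isUninstallShutdown = true;
      await this.terminate(proc, timeout);
      this.activeByName.delete(installationName);
      active = true;
    }

    // 2. Dormant config: Map.delete returns true only if an entry existed.
    const dormant = this.dormantConfigs.delete(installationName);

    // 3. Restart tracking: drop history so no restart is attempted later.
    this.restartAttempts.delete(installationName);

    return { active, dormant };
  }
}
```

Setting the flag before terminating matters because the exit handler may run while the termination promise is still pending.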
259+
260+
**Usage Example:**
261+
```typescript
// Called when server removed from configuration
const result = await processManager.removeServerCompletely(
  'sequential-thinking-team-name-abc123'
);

// Result: { active: true, dormant: false }
// - Active process was terminated
// - No dormant config existed
```
271+
272+
**Logging Output:**
273+
```
INFO: Removing server completely: sequential-thinking-team-name-abc123
INFO: Terminating active process: sequential-thinking-team-name-abc123
DEBUG: Sent SIGTERM to sequential-thinking-team-name-abc123
INFO: Process terminated for uninstall (not a crash)
INFO: Server removed completely (active: true, dormant: false)
```
151280

152-
**Cleanup Operations:**
153-
- Cancel all active requests with rejection
154-
- Clear active requests map
155-
- Remove from tracking maps (by ID, by name, by team)
156-
- Emit 'processTerminated' event
281+
#### Termination Timing
282+
283+
**Normal Termination:**
284+
- SIGTERM sent: ~1ms
285+
- Process cleanup: 10-500ms (application-dependent)
286+
- Total time: 11-501ms
287+
288+
**Forced Termination:**
289+
- SIGTERM sent: ~1ms
290+
- Timeout wait: 10,000ms
291+
- SIGKILL sent: ~1ms
292+
- Immediate kill: ~10ms
293+
- Total time: ~10,012ms
294+
295+
**Best Practices:**
296+
- MCP servers should handle SIGTERM gracefully
297+
- Complete in-flight requests within timeout
298+
- Close file handles and network connections
299+
- Exit with code 0 for clean shutdown
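
On the server side, these practices might be implemented along the following lines. This is one possible sketch for a Node.js-based MCP server; the `Closeable` interface and the `createShutdownHandler` helper are hypothetical names, not part of any MCP SDK.

```typescript
// Hypothetical: anything the server must release before exiting
// (HTTP server, database pool, file handles, and so on).
interface Closeable {
  close(): Promise<void> | void;
}

// Builds a SIGTERM handler that closes resources in order, then exits.
// The exit function is injectable so the handler can be exercised in tests.
function createShutdownHandler(
  resources: Closeable[],
  exit: (code: number) => void = (code) => process.exit(code),
): () => Promise<void> {
  let shuttingDown = false;

  return async () => {
    // Ignore repeated signals while a shutdown is already in progress.
    if (shuttingDown) return;
    shuttingDown = true;

    let exitCode = 0;
    for (const resource of resources) {
      try {
        await resource.close();
      } catch {
        // Keep closing the remaining resources, but report the failure.
        exitCode = 1;
      }
    }

    // Exit code 0 signals a clean shutdown to the process manager.
    exit(exitCode);
  };
}
```

A server would register it once at startup, for example `process.once("SIGTERM", createShutdownHandler([httpServer, dbPool]))`, and should make sure the whole sequence fits within the 10-second default timeout.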
157300

158301
## Auto-Restart System
159302
