Commit 19c8aeb

Safeguard check_stopped placement group teardown with robust exception handling to prevent controller crashes
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
1 parent f82eab9 commit 19c8aeb

1 file changed: 4 additions & 0 deletions

python/ray/serve/_private/deployment_state.py
@@ -1547,6 +1547,10 @@ def check_stopped(self) -> bool:
                 logger.debug(
                     f"Placement group for {self._replica_id} was already removed."
                 )
+            except Exception:
+                logger.exception(
+                    f"Unexpected error shutting down placement groups for {self._replica_id}."
+                )
             finally:
                 # Always clear references to prevent memory leaks and dangling state.
                 self._gang_placement_group = None
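
For readers without the surrounding file, the pattern this change completes can be sketched in isolation. The sketch below is not the Ray Serve implementation: `ReplicaSketch`, `PlacementGroupAlreadyRemoved`, and the `remove_fn` callable are hypothetical stand-ins, and the exception type caught by the pre-existing handler is an assumption, since the diff does not show it. Only the overall shape (an expected-error handler, a catch-all that logs, and a `finally` that clears the reference) mirrors the hunk above.

```python
import logging

logger = logging.getLogger("teardown_sketch")


class PlacementGroupAlreadyRemoved(Exception):
    """Hypothetical stand-in for the 'already removed' error (assumption)."""


class ReplicaSketch:
    """Hypothetical stand-in for the replica wrapper; not Ray Serve's class."""

    def __init__(self, replica_id, remove_fn):
        self._replica_id = replica_id
        # Hypothetical teardown callable, injected so the sketch runs without Ray.
        self._remove_fn = remove_fn
        self._gang_placement_group = object()

    def check_stopped(self) -> bool:
        if self._gang_placement_group is None:
            return True
        try:
            self._remove_fn(self._gang_placement_group)
        except PlacementGroupAlreadyRemoved:
            # Expected case: nothing left to tear down.
            logger.debug(
                f"Placement group for {self._replica_id} was already removed."
            )
        except Exception:
            # Unexpected case: log with traceback instead of propagating,
            # so the caller's control loop keeps running.
            logger.exception(
                f"Unexpected error shutting down placement groups for {self._replica_id}."
            )
        finally:
            # Always clear the reference, whichever branch ran.
            self._gang_placement_group = None
        return True


def failing_remove(placement_group):
    """Simulates an unexpected backend failure during teardown."""
    raise RuntimeError("backend unavailable")


replica = ReplicaSketch("replica-1", failing_remove)
stopped = replica.check_stopped()
```

In this sketch the `RuntimeError` is logged with its traceback and swallowed, so `check_stopped` still returns and the reference is cleared. The trade-off is that a broad `except Exception` can hide real bugs, which is why the handler uses `logger.exception` rather than a silent pass.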
