Commit 043775c

fix: fix OOM (#2285)
Signed-off-by: Yuki Huang <yukih@nvidia.com>
1 parent 1109004 commit 043775c

1 file changed

Lines changed: 3 additions & 1 deletion

nemo_rl/models/generation/vllm/vllm_backend.py

@@ -207,6 +207,7 @@ def update_weights_via_ipc_zmq(self) -> bool:
  ipc_handle, list_keys, used_bytes = payload
  buffer = rebuild_cuda_tensor_from_ipc(ipc_handle, self.device.index)
 
+ weight = None
  weights = []
  offset = 0
  for key in list_keys:
@@ -258,7 +259,8 @@ def update_weights_via_ipc_zmq(self) -> bool:
  # copied the data, Python may not garbage collect these view objects immediately.
  # If sender reuses the buffer before GC runs, old views would read corrupted data.
  # Explicit del ensures immediate cleanup before sending ACK.
- del weights, policy_weights, draft_weights, buffer
+ del weight, weights, policy_weights, draft_weights, buffer
+ weight = None
  weights = None
  policy_weights = None
  draft_weights = None
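
The change above suggests that the loop variable `weight` kept a reference to the last view of the IPC buffer after the loop ended, so deleting only the lists left the buffer alive. The following is a minimal sketch of that effect in plain Python, not the NeMo RL code: `FakeBuffer`, `FakeView`, and `buffer_released` are hypothetical stand-ins, and a `weakref` is used to observe whether the buffer was actually freed. Initializing `weight = None` before the loop presumably keeps `del weight` valid even when the key list is empty.

```python
import gc
import weakref


class FakeBuffer:
    """Hypothetical stand-in for the shared buffer rebuilt from the IPC handle."""


class FakeView:
    """Hypothetical stand-in for a tensor view that keeps its buffer alive."""

    def __init__(self, buffer):
        self.buffer = buffer


def buffer_released(also_delete_loop_variable: bool) -> bool:
    """Return True if the buffer is freed after the explicit cleanup."""
    buffer = FakeBuffer()
    buffer_ref = weakref.ref(buffer)  # lets us check liveness without owning it

    weight = None  # pre-initialize so `del weight` works even if the loop is empty
    weights = []
    for _ in range(3):
        weight = FakeView(buffer)  # the loop variable outlives the loop
        weights.append(weight)

    if also_delete_loop_variable:
        del weight, weights, buffer
    else:
        del weights, buffer  # `weight` still references the last view

    gc.collect()
    return buffer_ref() is None


print(buffer_released(also_delete_loop_variable=False))  # False
print(buffer_released(also_delete_loop_variable=True))   # True
```

In the first call the buffer survives because one local name still points at a view of it; in the second call every reference is dropped, so the buffer is reclaimed before the function returns.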
