Commit ec61f24
Free cached GPU memory before AR validation to avoid OOM
With LoRA co-training, the model carries extra parameters and optimizer
states (LoRA A/B + Adam moments), reducing the headroom available for
the validation forward passes. Call torch.cuda.empty_cache() before
validate_ar() to release unused cached allocations without affecting
any live tensors (parameters, optimizer states, gradients).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Ye Yu <yeyu@nvidia.com>
1 parent 5498e9f commit ec61f24
1 file changed
Lines changed: 1 addition & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 260 | 260 | |
| 261 | 261 | |
| 262 | 262 | |
| | 263 | + |
| 263 | 264 | |
| 264 | 265 | |
| 265 | 266 | |
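The text of the added line is not preserved above, so the following is only a minimal sketch of the pattern the commit message describes: calling `torch.cuda.empty_cache()` immediately before validation. The helper names `free_cached_gpu_memory` and `run_validation` are hypothetical and do not come from the repository; the guarded import is an assumption that lets the sketch run on machines without PyTorch or CUDA.

```python
try:
    import torch
except ImportError:  # assumption: allow the sketch to run without PyTorch
    torch = None


def free_cached_gpu_memory() -> bool:
    """Release unused blocks held by PyTorch's CUDA caching allocator.

    Returns True if empty_cache() was called, False if PyTorch or CUDA
    is unavailable. Memory backing live tensors (parameters, optimizer
    states, gradients) is not released by this call.
    """
    if torch is None or not torch.cuda.is_available():
        return False
    torch.cuda.empty_cache()
    return True


def run_validation(validate_fn, *args, **kwargs):
    """Hypothetical wrapper: free cached memory, then run validation.

    validate_fn stands in for a routine such as validate_ar(); its real
    signature is not shown in this commit, so arguments are passed through.
    """
    free_cached_gpu_memory()
    return validate_fn(*args, **kwargs)
```

Only cached blocks that no tensor currently occupies are returned to the driver, so the call is safe to place between a training step and validation. It can lower the chance of an OOM caused by fragmentation or stale cache, but it cannot free memory that live tensors still occupy.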