Commit db8ae12
fix: fall back to torch.cuda.mem_get_info when NVML memory query is unsupported
nvmlDeviceGetMemoryInfo returns NVML_ERROR_NOT_SUPPORTED on DGX Spark
(GB10). Log the error and fall back to torch.cuda.mem_get_info which
works on all CUDA devices.
Signed-off-by: Daniel Bustamante Ospina <dbustamante70@gmail.com>
1 parent fa4ec27 commit db8ae12
1 file changed
Lines changed: 5 additions & 3 deletions
Diff content was not captured; only the line-number columns survive.
Hunk 1 (original lines 12-22, new lines 12-25): lines added at new lines 15, 21 and 22.
Hunk 2 (original lines 84-93, new lines 87-95): original lines 87, 88 and 90 removed; lines added at new lines 90 and 91.
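The fallback described in the commit message can be sketched roughly as follows. This is not the code from the commit, only an illustration under stated assumptions: the helper names `query_gpu_memory`, `nvml_free_total` and `torch_free_total` are hypothetical, the NVML query is assumed to go through the `pynvml` bindings, and catching a broad `Exception` is a simplification (the actual change may catch a narrower NVML error type).

```python
import logging
from typing import Callable, Tuple

logger = logging.getLogger(__name__)

MemoryQuery = Callable[[], Tuple[int, int]]


def query_gpu_memory(nvml_query: MemoryQuery, torch_query: MemoryQuery) -> Tuple[int, int]:
    """Return (free_bytes, total_bytes), preferring NVML and falling back to torch.

    Hypothetical helper: both queries are passed in as callables so the
    fallback logic can be exercised without a GPU.
    """
    try:
        return nvml_query()
    except Exception as exc:  # assumption: e.g. NVML_ERROR_NOT_SUPPORTED on GB10
        logger.warning(
            "NVML memory query failed (%s); falling back to torch.cuda.mem_get_info",
            exc,
        )
        return torch_query()


def nvml_free_total(device_index: int = 0) -> Tuple[int, int]:
    """Query free/total memory through NVML (hypothetical wrapper around pynvml)."""
    import pynvml  # imported lazily so this sketch loads without the package

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        info = pynvml.nvmlDeviceGetMemoryInfo(handle)
        return info.free, info.total
    finally:
        pynvml.nvmlShutdown()


def torch_free_total(device_index: int = 0) -> Tuple[int, int]:
    """Query free/total memory through the CUDA runtime via PyTorch."""
    import torch  # imported lazily for the same reason as pynvml

    return torch.cuda.mem_get_info(device_index)
```

On a machine with a CUDA device, `free, total = query_gpu_memory(nvml_free_total, torch_free_total)` would try NVML first and only call into PyTorch if the NVML query raises.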