Commit 8d2ee36
[CloudRift] Fix NTP clock skew breaking Docker; handle amd-smi 7.x output format
CloudRift VMs boot with an incorrect RTC clock (~1h ahead). When NTP
corrects it backwards, Docker discards container exit events, leaving
containers stuck as ghosts forever. Add NTP sync wait before launching
the shim to prevent this.
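
As a rough illustration of the NTP sync wait described above (not the commit's actual code), the sketch below polls systemd's `NTPSynchronized` property before continuing. The function names, the 60-second budget, and the use of `timedatectl` are assumptions made for this example. The deadline is tracked with a monotonic clock so that the backwards wall-clock correction cannot distort the wait itself.

```python
import subprocess
import time
from typing import Callable


def ntp_synchronized() -> bool:
    """Ask systemd whether the system clock is NTP-synchronized (assumes timedatectl is present)."""
    try:
        result = subprocess.run(
            ["timedatectl", "show", "--property=NTPSynchronized", "--value"],
            capture_output=True,
            text=True,
            timeout=5,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.stdout.strip() == "yes"


def wait_for_ntp_sync(
    is_synced: Callable[[], bool] = ntp_synchronized,
    timeout: float = 60.0,  # hypothetical budget, not taken from the commit
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    monotonic: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the clock is synchronized; return False if the timeout expires first."""
    # A monotonic deadline is unaffected by the wall clock being stepped backwards.
    deadline = monotonic() + timeout
    while True:
        if is_synced():
            return True
        if monotonic() >= deadline:
            return False
        sleep(interval)
```

A caller would run `wait_for_ntp_sync()` before starting the shim and decide whether a `False` result should abort the launch or only be logged.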
Also handle both amd-smi output formats (flat array in ROCm 6.x,
wrapped {"gpu_data": [...]} in ROCm 7.x) and add a 2-minute timeout
to AMD GPU detection to prevent the shim from hanging indefinitely.
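
The two output shapes and the timeout could be handled along these lines. This is a sketch under stated assumptions rather than the shim's implementation: the helper names are hypothetical, and the exact `amd-smi static --json` invocation is assumed for illustration. Only the flat-array versus `{"gpu_data": [...]}` distinction and the 2-minute limit come from the commit message.

```python
import json
import subprocess
from typing import Any


def extract_gpu_entries(raw: str) -> list[dict[str, Any]]:
    """Return the per-GPU records from either amd-smi JSON layout."""
    data = json.loads(raw)
    # ROCm 6.x layout: a flat array of per-GPU objects.
    if isinstance(data, list):
        return data
    # ROCm 7.x layout: the same array wrapped under the "gpu_data" key.
    if isinstance(data, dict) and isinstance(data.get("gpu_data"), list):
        return data["gpu_data"]
    raise ValueError("unrecognized amd-smi JSON layout")


def detect_amd_gpus(timeout_s: float = 120.0) -> list[dict[str, Any]]:
    """Run amd-smi with a hard timeout so detection cannot hang indefinitely."""
    # The command line is an assumption; subprocess.run raises
    # subprocess.TimeoutExpired if the process exceeds timeout_s.
    result = subprocess.run(
        ["amd-smi", "static", "--json"],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return extract_gpu_entries(result.stdout)
```

Keeping the parsing separate from the subprocess call means both layouts can be checked with plain strings, without AMD hardware.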
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent a61cc26 commit 8d2ee36
3 files changed
Lines changed: 55 additions & 8 deletions
File tree
- runner/internal/shim/host
- src/dstack/_internal/core/backends/cloudrift
Hunks (line ranges only; the changed-line text was not captured):

| Hunk | Original lines | New lines | Added | Removed |
|---|---|---|---|---|
| 1 | 10-15 | 10-16 | 1 | 0 |
| 2 | 114-119 | 115-125 | 5 | 0 |
| 3 | 130-138 | 136-162 | 18 | 0 |
| 4 | 158-165 | 182-189 | 2 | 2 |
Lines changed: 14 additions & 5 deletions
Hunks (line ranges only; the changed-line text was not captured):

| Hunk | Original lines | New lines | Added | Removed |
|---|---|---|---|---|
| 1 | 72-83 | 72-87 | 7 | 3 |
| 2 | 97-105 | 101-114 | 7 | 2 |
Hunks (line ranges only; the changed-line text was not captured):

| Hunk | Original lines | New lines | Added | Removed |
|---|---|---|---|---|
| 1 | 73-89 | 73-103 | 15 | 1 |