Commit db4756f
fix: store job_id explicitly to avoid separator collision in app_id parsing
When a role name ends with '_', the app_id string looks like:
experiment___role_name____job_id
Splitting on '___' produces job_id = '_job_id' (spurious leading '_'),
causing the status/cancel/log_iter calls to use a wrong ID and get 404.
Fix: _save_job_dir now stores the actual job_id in the JSON record.
describe(), log_iter(), and _cancel_existing() all read job_id from
the stored record, falling back to app_id.split('___')[-1] for
backwards compatibility with existing saved jobs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>1 parent c89ba4e commit db4756f
2 files changed
Lines changed: 46 additions & 6 deletions
File tree
- nemo_run/run/torchx_backend/schedulers
- test/run/torchx_backend/schedulers
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
161 | 161 | | |
162 | 162 | | |
163 | 163 | | |
164 | | - | |
| 164 | + | |
165 | 165 | | |
166 | 166 | | |
167 | 167 | | |
| |||
173 | 173 | | |
174 | 174 | | |
175 | 175 | | |
176 | | - | |
| 176 | + | |
| 177 | + | |
177 | 178 | | |
178 | 179 | | |
179 | 180 | | |
| |||
191 | 192 | | |
192 | 193 | | |
193 | 194 | | |
| 195 | + | |
194 | 196 | | |
195 | 197 | | |
196 | 198 | | |
| |||
217 | 219 | | |
218 | 220 | | |
219 | 221 | | |
220 | | - | |
| 222 | + | |
221 | 223 | | |
222 | 224 | | |
223 | 225 | | |
| |||
240 | 242 | | |
241 | 243 | | |
242 | 244 | | |
243 | | - | |
| 245 | + | |
244 | 246 | | |
245 | 247 | | |
246 | 248 | | |
| |||
257 | 259 | | |
258 | 260 | | |
259 | 261 | | |
260 | | - | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
261 | 265 | | |
262 | 266 | | |
263 | 267 | | |
| |||
276 | 280 | | |
277 | 281 | | |
278 | 282 | | |
| 283 | + | |
279 | 284 | | |
280 | 285 | | |
281 | 286 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
110 | 110 | | |
111 | 111 | | |
112 | 112 | | |
| 113 | + | |
113 | 114 | | |
114 | 115 | | |
115 | 116 | | |
| |||
132 | 133 | | |
133 | 134 | | |
134 | 135 | | |
| 136 | + | |
135 | 137 | | |
136 | 138 | | |
137 | 139 | | |
| |||
159 | 161 | | |
160 | 162 | | |
161 | 163 | | |
162 | | - | |
| 164 | + | |
163 | 165 | | |
164 | 166 | | |
165 | 167 | | |
| 168 | + | |
166 | 169 | | |
167 | 170 | | |
168 | 171 | | |
| |||
173 | 176 | | |
174 | 177 | | |
175 | 178 | | |
| 179 | + | |
176 | 180 | | |
177 | 181 | | |
178 | 182 | | |
| |||
188 | 192 | | |
189 | 193 | | |
190 | 194 | | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
191 | 223 | | |
192 | 224 | | |
193 | 225 | | |
| |||
207 | 239 | | |
208 | 240 | | |
209 | 241 | | |
| 242 | + | |
210 | 243 | | |
211 | 244 | | |
212 | 245 | | |
| |||
229 | 262 | | |
230 | 263 | | |
231 | 264 | | |
| 265 | + | |
232 | 266 | | |
233 | 267 | | |
234 | 268 | | |
| |||
245 | 279 | | |
246 | 280 | | |
247 | 281 | | |
| 282 | + | |
248 | 283 | | |
249 | 284 | | |
250 | 285 | | |
| |||
0 commit comments