Commit 9c84104
fix(e2e): heal provocation-race wedges in recovery-family scenarios (#148)
* fix(e2e): clear tamper-window wedge in recovery-node-id-mismatch
The provocation step (drbdadm down + sed + up) deliberately races
the satellite's Bug-287 revive. When the two drbdadm invocations
interleave badly, drbdmeta apply-al hits EBUSY on the backing
device and worker-2's kernel slot ends half-configured: disk
attached Inconsistent with AL suspended, peers registered but
never connected (StandAlone). The satellite then classifies the
slot as an operator disconnect (StandAlone + peer-device entries,
the W12 --skip-net guard) and never reconnects it, so the
post-recovery UpToDate wait times out. Seen as a ~30% lane-1
flake; not related to the .res node-id mismatch under test.
Bounce worker-2 with a bare drbdadm down after the provocation
and require kernel-truth UpToDate before applying the SKILL
recipe, so the recovery assertions start from a clean slot.
Validated 12/12 green on the dev stand (previously ~50% fail).
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
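The unconditional heal described in the commit above can be sketched roughly as follows. This is a minimal illustration in Python, not the scenario's actual code: `bounce_and_wait_uptodate`, the injected `run` and `disk_state` callables, and the timeout values are all hypothetical. Only the bare `drbdadm down` call and the UpToDate target come from the commit message. The sketch also assumes the satellite revives the slot on its own after the bounce.

```python
import time
from typing import Callable, Sequence


def bounce_and_wait_uptodate(
    resource: str,
    run: Callable[[Sequence[str]], None],   # hypothetical: executes a command on worker-2
    disk_state: Callable[[str], str],       # hypothetical: reads the kernel's disk state
    timeout_s: float = 120.0,               # assumed budget, not taken from the commit
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Bounce the slot once, then block until the kernel reports UpToDate."""
    # A bare "down" clears the half-configured slot; the satellite is
    # assumed to bring the resource back up by itself afterwards.
    run(["drbdadm", "down", resource])

    deadline = clock() + timeout_s
    while True:
        state = disk_state(resource)
        if state == "UpToDate":
            return
        if clock() >= deadline:
            raise TimeoutError(
                f"{resource}: disk state is {state!r}, expected 'UpToDate'"
            )
        sleep(interval_s)
```

Injecting `run` and `disk_state` keeps the sketch independent of how the test harness reaches the node and of how it reads kernel state.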
* fix(e2e): heal tamper-window wedge in recovery-down-reverses
The scenario's bare drbdadm down provocation can, on rare occasions,
collide with the satellite's immediate revive: drbdmeta apply-al hits
EBUSY on the backing device and the revived slot ends
half-configured (disk Inconsistent, connections StandAlone with
peer-device entries retained). That matches the operator-disconnect
signature, so the satellite never reconnects the slot and the
final convergence wait times out on a provocation artefact, not on
the revive path under test. Seen twice on CI lanes with the
identical signature.
Unlike recovery-node-id-mismatch, the provocation here is already
a single writer, so an unconditional bounce would not reduce the
collision odds and would dilute the convergence assertion. Instead
the heal is conditional: the convergence wait keeps its full
untouched budget, and only a timeout that shows the exact wedge
signature triggers one clean bounce plus a kernel-truth UpToDate
wait before a single re-run of the wait. A regression of the
narrowed skip-net gate does not match the signature and still
fails loudly.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
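The conditional heal described in the commit above could look roughly like the sketch below. It is an assumption-laden illustration in Python, not the scenario's real implementation: `SlotStatus`, `is_wedge_signature`, `converge_with_conditional_heal`, and the three injected callables are hypothetical names. The signature check mirrors the commit text (disk Inconsistent, every connection StandAlone, peer-device entries retained).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SlotStatus:
    """Hypothetical snapshot of one kernel slot."""
    disk_state: str
    connection_states: List[str]
    peer_device_count: int


def is_wedge_signature(status: SlotStatus) -> bool:
    """True only for the exact provocation-artefact wedge."""
    return (
        status.disk_state == "Inconsistent"
        and len(status.connection_states) > 0
        and all(s == "StandAlone" for s in status.connection_states)
        and status.peer_device_count > 0
    )


def converge_with_conditional_heal(
    wait_converged: Callable[[], None],   # raises TimeoutError on failure
    probe: Callable[[], SlotStatus],      # reads the current slot status
    bounce_and_wait: Callable[[], None],  # one clean bounce plus UpToDate wait
) -> bool:
    """Run the convergence wait; heal once if, and only if, it wedged.

    Returns True when the heal path was taken, False otherwise.
    """
    try:
        # First attempt keeps its full, untouched budget.
        wait_converged()
        return False
    except TimeoutError:
        # Any other failure shape is a real regression: fail loudly.
        if not is_wedge_signature(probe()):
            raise
    bounce_and_wait()
    # Single re-run; a second timeout propagates to the caller.
    wait_converged()
    return True
```

Because the heal only fires after a timeout that matches the signature, a timeout with any other slot state still surfaces as a failure.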
---------
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
1 parent c54be3b commit 9c84104
2 files changed
Lines changed: 136 additions & 13 deletions