Commit c49c574
workloadmeta/docker: recover panics in stream() event handlers (quickfix)
The docker workloadmeta collector's stream() goroutine runs forever and
handles two event streams (container + image) by calling
handleContainerEvent / handleImageEvent inline. Neither path had any
panic recovery, so a single panic inside buildCollectorEvent (for
example, one raised by moby/client's JSON decoder while processing a
malformed ContainerInspect response) would bubble up, terminate the
goroutine and crash the whole agent process.
This happened in production during an SMP regression-detector run on
the gpu_check_hopper_fake_nvml experiment (every replicate crashed the
agent after a few hundred seconds, exhausting the 1h10m SMP timeout);
see taskmds/05001 for the full captured stack and context.
Wrap both event handlers in a runWithRecovery helper that:
* logs the panic value and debug.Stack() at ERROR level with enough
  context to identify the offending event (container ID + action for
  containers, action for images),
* drops the offending event,
* lets the stream loop continue processing subsequent events.
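The behaviour listed above can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: the signature of runWithRecovery, its boolean return value, the description string and the use of fmt in place of the agent's logger are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// runWithRecovery is a hypothetical reconstruction of the helper described
// in the commit message. It runs fn and, if fn panics, logs the panic value
// and stack trace, then returns normally. The boolean result is an
// assumption added here so the behaviour is easy to observe.
func runWithRecovery(desc string, fn func()) (recovered bool) {
	defer func() {
		if r := recover(); r != nil {
			recovered = true
			// Stand-in for an ERROR-level log call in the real collector.
			fmt.Printf("ERROR: recovered panic in %s: %v\n%s\n", desc, r, debug.Stack())
		}
	}()
	fn()
	return false
}

func main() {
	// Simulated stream of container event actions; "malformed" panics.
	events := []string{"start", "malformed", "die"}
	handled := 0
	for _, action := range events {
		runWithRecovery("container event action="+action, func() {
			if action == "malformed" {
				panic("simulated decoder panic")
			}
			handled++
		})
	}
	// The offending event is dropped; the other two are still handled.
	fmt.Println("handled events:", handled)
}
```

Because recover() only has an effect inside a deferred function running in the panicking goroutine, the wrapper has to be invoked from within the stream loop itself, once per event, rather than once around the whole loop. Wrapping the whole loop would stop the crash but also end event processing after the first panic.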
This is a quickfix, not a root-cause fix: the underlying panic is
still inside moby/client@v0.4.0's JSON decoder and needs to be fixed
there (or upstream in Go's encoding/json + reflect interaction with
container.InspectResponse). taskmds/05001 tracks that work. The
companion regression test for the agent-visible half of the bug is at
pkg/util/docker/inspect_panic_test.go (commit bc6a6f4).
Three unit tests cover the new helper:
* TestRunWithRecovery_SwallowsPanic: a panicking function returns
  to its caller normally.
* TestRunWithRecovery_NoPanicNoOp: the happy path is unchanged.
* TestRunWithRecovery_SubsequentCallsAfterPanic: the reusability
  property the stream loop actually relies on: a panic in one call
  does not affect the next.
1 parent 14ff781 commit c49c574
2 files changed
Lines changed: 124 additions & 8 deletions
Lines changed: 46 additions & 8 deletions
Lines changed: 78 additions & 0 deletions