Commit 10746a7
fix(envd): fix process cleanup in cgroup test to prevent OOM and data race (#2588)
The cgroup round-trip test spawns `tail /dev/zero` under a
memory-limited cgroup to verify the OOM kill. Three problems caused host
OOM and flaky failures when running with `-count` or `-race`:

Killing only bash left `tail` running as an orphan that ate unbounded
memory. The fix starts the child in its own process group (`Setpgid`)
and kills the entire group on timeout and in `t.Cleanup`. The command
now uses `exec` so bash replaces itself with the child process.

Both `waitForProcess` and `t.Cleanup` called `cmd.Wait()`, causing a
data race. Now `t.Cleanup` only kills, and `waitForProcess` owns the
wait: it drains the goroutine after killing on timeout, so there is no
leak or race.

The second commit replaces `tail /dev/zero` with a perl one-liner that
allocates a fixed 512 MiB and sleeps. If the process escapes cleanup it
holds bounded memory rather than growing until the kernel intervenes.

1 parent 6c0bcb1 commit 10746a7
1 file changed
Lines changed: 18 additions & 2 deletions