feat: add zombie-reaper.py to handle zombie qemu left-overs #588
Martchus
left a comment
The script name should probably reflect that this is about s390x and qemu processes specifically.
I think it is part of normal operation for qemu processes to be in the zombie state for a very short time (until the parent process reads the exit status). The script is unlikely to take action in those cases, but it isn't ideal.
done
hm, not sure if that is realistic but ok :) Made the script reasonably more complicated by doing the following
|
and then, to be safe, check that it isn't a newly started process that has meanwhile reused the same PID ;-)
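The two concerns in this thread (short-lived zombies during normal operation, and PID reuse between checks) could be handled roughly as sketched below. This is not the code from the PR: the grace period, the function names (`parse_stat`, `read_stat`, `is_persistent_zombie`) and the `qemu` name prefix are assumptions made for illustration. It relies only on the documented layout of `/proc/<pid>/stat`, where field 3 is the process state and field 22 is the start time in clock ticks since boot.

```python
import time
from pathlib import Path
from typing import Callable, NamedTuple, Optional


class ProcStat(NamedTuple):
    comm: str       # executable name, truncated by the kernel
    state: str      # single letter, "Z" means zombie
    starttime: int  # field 22 of /proc/<pid>/stat, clock ticks since boot


def parse_stat(text: str) -> ProcStat:
    """Parse the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    head, _, tail = text.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is field 3 (state), so field 22 (starttime) is fields[19].
    return ProcStat(comm=comm, state=fields[0], starttime=int(fields[19]))


def read_stat(pid: int) -> Optional[ProcStat]:
    """Return the parsed stat entry, or None if the process is gone."""
    try:
        return parse_stat(Path(f"/proc/{pid}/stat").read_text())
    except (FileNotFoundError, ProcessLookupError):
        return None


def is_persistent_zombie(
    pid: int,
    grace_seconds: float = 5.0,   # hypothetical value, not from the PR
    name_prefix: str = "qemu",    # hypothetical filter, not from the PR
    reader: Callable[[int], Optional[ProcStat]] = read_stat,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """True only if the same process is a zombie before and after the grace period."""
    first = reader(pid)
    if first is None or first.state != "Z" or not first.comm.startswith(name_prefix):
        return False
    sleep(grace_seconds)  # give the parent time to reap a normal, short-lived zombie
    second = reader(pid)
    # An identical start time means the PID was not recycled by a new process.
    return (
        second is not None
        and second.state == "Z"
        and second.starttime == first.starttime
    )
```

Comparing the start time rather than the PID alone is what guards against acting on an unrelated process that happened to receive the same PID in the meantime.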
43437f6 to 97414bc
done
Motivation: The kernel team requested a crash dump (vmcore) to debug the `exit_mmap` deadlock on s390x. A standard reboot clears the system state but does not preserve the memory state needed for post-mortem analysis.

Design Choices: We replace the `sudo reboot` command with a `sysrq-trigger` kernel panic (`echo c > /proc/sysrq-trigger`). This forces the kernel to trigger kdump, save the vmcore, and then automatically reboot the machine as configured in the kdump settings.

Benefits: Automates the collection of invaluable debug data for the kernel team while maintaining the automated recovery of the openQA hypervisors.

Related issue: https://bugzilla.suse.com/show_bug.cgi?id=1265624
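For illustration, the switch from `sudo reboot` to a sysrq-triggered crash could look like the following in Python. This is a sketch, not the code from the commit: the function name `trigger_crash`, the `dry_run` default and the overridable `trigger_path` are assumptions added so the behaviour can be exercised safely. Writing `c` to `/proc/sysrq-trigger` requires root, and a vmcore is only captured if kdump is configured on the host.

```python
from pathlib import Path

SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")  # standard procfs location


def trigger_crash(trigger_path: Path = SYSRQ_TRIGGER, dry_run: bool = True) -> bool:
    """Write 'c' to the sysrq trigger to force a kernel crash.

    Returns False in dry-run mode without touching the file. On a real
    system the write normally never returns because the kernel panics.
    """
    if dry_run:
        print(f"dry run: would write 'c' to {trigger_path}")
        return False
    # Equivalent to: echo c > /proc/sysrq-trigger (needs root privileges)
    Path(trigger_path).write_text("c")
    return True
```

Defaulting to a dry run is a deliberate choice in this sketch, since a stray call would otherwise take down the hypervisor.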
97414bc to 9b4ae46
Related issue: https://progress.opensuse.org/issues/201144