Conversation
aarongable
left a comment
There was a problem hiding this comment.
I have a few minor comments about the code, but nothing big.
My major concern is that this simply doesn't feel like a prober. It's doing so much work, and has so many opportunities for failure, all of which get wrapped up in big accumulated errors. In my mind, the ideal prober is short, simple, only emits metrics (not meaningful logs), and we have alerts for every single metric it exposes. I understand that it is useful for this to live within boulder-observer, so we don't have to deploy and monitor Yet Another Component, but I'm registering my moderate discomfort with the complexity here.
beautifulentropy
left a comment
There was a problem hiding this comment.
Overall, the structure of this change looks good to me. I do have a few comments, mostly concerning some places where we could panic. I also noticed there isn't a corresponding update to cmd/boulder-observer/README.md.
|
Oops sorry I had one more change to push, the README updates! Thanks for all the great feedback BTW. |
Prior work: letsencrypt/crl-monitor#88
Fixes #8618