Commit 68ebd3f
cosmo_seq: mask RAA229620A Iout overcurrent warnings (#2511)
See #2510 --- evidently, AMD requires that the SVI3 fast sum
overcurrent threshold for VRMs supplying SVI3 power rails to Turin CPUs
be set to 90% of 90% of the part's EDC amperage. This
results in the VRM setting the overcurrent warning bit in `STATUS_IOUT`
and tugging on its fault pin to request attention pretty incessantly
when an IRM Group G CPU is under heavy (but importantly, safe!) load.
AMD suggests that BMCs "mask or ignore" this warning, since the
threshold has to be set to a value that results in it being asserted
during normal operation due to reasons I am no doubt too provincial to
understand. Therefore, this commit makes Hubris mask out the
`STATUS_IOUT` bit for overcurrent warnings on the RAA229620A
`VDDCR_CPU0` and `VDDCR_CPU1` SVI3 regulators. This felt like the best
solution because it (a) shouldn't suppress any fault bits that actually
*do* represent something bad, and (b) unlike ignoring this specific
warning, doesn't result in Hubris taking a bunch of IRQs that it just
ultimately decides to do nothing about.
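
For illustration, the masking described above comes down to setting one bit in the `SMBALERT_MASK` byte associated with `STATUS_IOUT`. The sketch below is not the code from this commit. It assumes the command codes and bit position given in the PMBus specification, and `smbalert_mask_write` is a hypothetical helper that only builds the bytes of the Write Word transaction rather than talking to a device.

```rust
/// PMBus command codes, as given in the PMBus specification.
const SMBALERT_MASK: u8 = 0x1B;
const STATUS_IOUT: u8 = 0x7B;

/// Bit 5 of STATUS_IOUT is the output overcurrent *warning*. The fault
/// bits (bits 7 and 6) are deliberately left alone.
const IOUT_OC_WARNING: u8 = 1 << 5;

/// Hypothetical helper: returns the bytes that follow the device address
/// in an SMBus Write Word to SMBALERT_MASK. The low data byte names the
/// status register being masked and the high data byte is the mask; a
/// mask bit set to 1 stops the matching status bit from raising an alert.
fn smbalert_mask_write(status_cmd: u8, current_mask: u8, bits_to_mask: u8) -> [u8; 3] {
    [SMBALERT_MASK, status_cmd, current_mask | bits_to_mask]
}
```

Calling `smbalert_mask_write(STATUS_IOUT, 0x00, IOUT_OC_WARNING)` yields the bytes `1b 7b 20`. OR-ing into the current mask keeps any bits that were already masked.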
With help from @ericaasen, we have reproduced the warnings by manually
adjusting the threshold to a lower value, as the CPU load testing script
I was running wasn't actually able to make a 9755 hit the 297 amp
threshold. I've validated that masking the relevant bit in the
`SMBALERT_MASK` for the `STATUS_IOUT` register makes the IRQing go away.
So that's good.
I don't _love_ the function I wrote for masking this bit: it's specific
to setting `SMBALERT_MASK` for `STATUS_IOUT` and doesn't work for any
other register. I feel like it ought to be possible to write a single function
for setting `SMBALERT_MASK` for *any* status register, but the
abstractions provided by `pmbus` at time of writing made it difficult to
do so generically --- or perhaps I'm holding it wrong! Anyway, I'd
really like to get the immediate fix in first and then try to come up
with a nicer abstraction for `SMBALERT_MASK` later, since I'd like to
stop the ereport spew in R20 if possible.
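
As a rough idea of what a register-generic version could look like, here is a minimal sketch. It does not use the `pmbus` crate and says nothing about how that crate is structured: `StatusRegister`, `WordWriter`, `mask_alerts`, and `Recorder` are all hypothetical names, and only the command codes come from the PMBus specification. The `Recorder` test double uses `Vec`, which a `no_std` task would replace with a real I2C device.

```rust
const SMBALERT_MASK: u8 = 0x1B;

/// A few of the PMBus status registers that SMBALERT_MASK can target,
/// with their command codes from the PMBus specification.
#[derive(Clone, Copy, Debug, PartialEq)]
enum StatusRegister {
    Vout = 0x7A,
    Iout = 0x7B,
    Input = 0x7C,
    Temperature = 0x7D,
    Cml = 0x7E,
}

/// Hypothetical minimal bus interface: one SMBus Write Word.
trait WordWriter {
    type Error;
    fn write_word(&mut self, command: u8, lo: u8, hi: u8) -> Result<(), Self::Error>;
}

/// Sets the alert mask for any status register. Note that this overwrites
/// the whole mask byte; a real implementation would probably read the
/// current mask first and OR the new bits in.
fn mask_alerts<W: WordWriter>(
    dev: &mut W,
    reg: StatusRegister,
    mask: u8,
) -> Result<(), W::Error> {
    dev.write_word(SMBALERT_MASK, reg as u8, mask)
}

/// Test double that records every write instead of touching hardware.
#[derive(Default)]
struct Recorder {
    writes: Vec<(u8, u8, u8)>,
}

impl WordWriter for Recorder {
    type Error = ();
    fn write_word(&mut self, command: u8, lo: u8, hi: u8) -> Result<(), ()> {
        self.writes.push((command, lo, hi));
        Ok(())
    }
}
```

The point of the enum is that the register being masked becomes ordinary data, so one function covers `STATUS_IOUT`, `STATUS_INPUT`, and the rest.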
In addition, I have made some changes to the `initialize_pmbus_alerts`
(née `initialize_pmbus_warnings`) method in `Vcore`. Since that method
now both sets the input undervoltage warning threshold _and_ masks the
output overcurrent warning, I've changed it to not return a `Result` so
that it cannot accidentally `?` out early without trying to set _all_
the PMBus configurations it tries to set. We probably want to attempt to
set the `SMBALERT_MASK` even if we encountered an amount of I2C Weather
that caused setting the UV threshold for one of the VRMs to fail. So now
we put the error in the ringbuf and keep going there.
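
The "record the error and keep going" shape described above can be sketched as follows. This is an illustration under stated assumptions, not the actual `Vcore` code: the `Regulator` trait, the `Trace` variants, `FakeRail`, and `initialize_alerts` are hypothetical, and a plain `Vec` stands in for the ringbuf.

```rust
/// Hypothetical trace entries; a Vec of these stands in for the ringbuf.
#[derive(Debug, PartialEq)]
enum Trace {
    UvThresholdFailed { rail: &'static str },
    AlertMaskFailed { rail: &'static str },
}

/// Hypothetical per-rail interface with two independent setup steps.
trait Regulator {
    fn name(&self) -> &'static str;
    fn set_vin_uv_warn_limit(&mut self) -> Result<(), ()>;
    fn mask_iout_oc_warning(&mut self) -> Result<(), ()>;
}

/// Returns nothing on purpose: with no `Result` to return, there is no
/// `?` that could skip the remaining steps after an early failure.
fn initialize_alerts<R: Regulator>(rails: &mut [R], log: &mut Vec<Trace>) {
    for rail in rails.iter_mut() {
        if rail.set_vin_uv_warn_limit().is_err() {
            log.push(Trace::UvThresholdFailed { rail: rail.name() });
        }
        // Attempted even if the step above failed.
        if rail.mask_iout_oc_warning().is_err() {
            log.push(Trace::AlertMaskFailed { rail: rail.name() });
        }
    }
}

/// Fake rail whose steps succeed or fail as configured.
struct FakeRail {
    name: &'static str,
    uv_ok: bool,
    mask_ok: bool,
    mask_attempted: bool,
}

impl Regulator for FakeRail {
    fn name(&self) -> &'static str {
        self.name
    }
    fn set_vin_uv_warn_limit(&mut self) -> Result<(), ()> {
        if self.uv_ok { Ok(()) } else { Err(()) }
    }
    fn mask_iout_oc_warning(&mut self) -> Result<(), ()> {
        self.mask_attempted = true;
        if self.mask_ok { Ok(()) } else { Err(()) }
    }
}
```

With a `FakeRail` whose undervoltage step fails, the mask step still runs for that rail and for every rail after it, and the log ends up with a single `UvThresholdFailed` entry.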
Fixes #2510.

1 parent 1754c74 commit 68ebd3f
3 files changed
Lines changed: 95 additions & 21 deletions