Conversation
| report, | ||
| outcome: status::AnalysisOutcome::Unchanged, | ||
| }; | ||
| } |
There was a problem hiding this comment.
The compiler doesn't yell at you about code after this line being unreachable?
There was a problem hiding this comment.
yeah, I wondered the same - feels a little sketchy to have "impossible-to-reach" code down here that's definitely untested, but also this is probably fine for now as long as the build isn't broken?
There was a problem hiding this comment.
Yeah...I couldn't really think of a better way to do this incrementally while the diffing code and the sitrep-generating code both currently don't exist. I suppose we could just not add this part at all, but I wanted to make sure to remember to put it there when the necessary stuff is added. Once we actually have enough code to start trying to produce sitreps, any tests for that will start failing due to this until the diffing code is added, which felt like the right way to make sure we remembered to do that part...
What do you guys think?
There was a problem hiding this comment.
eh. I vote move forward for now. don't want to get hung around the wheel too much, but agree that we should not leave it in this weird intermediate state for too long.
There was a problem hiding this comment.
I would also like to avoid that! :)
| report, | ||
| outcome: status::AnalysisOutcome::Unchanged, | ||
| }; | ||
| } |
There was a problem hiding this comment.
yeah, I wondered the same - feels a little sketchy to have "impossible-to-reach" code down here that's definitely untested, but also this is probably fine for now as long as the build isn't broken?
Depends on #10258. As described in #10260 (and [§8.2 RFD 603]), **only _you_ can prevent inventory collection time travel**: because each Nexus that might participate in FM analysis has its own copy of the inventory which it considers to be current (loaded by its `inventory_load` background task). Since we must allow for the possibility of a particular Nexus being arbitrarily behind, we can easily imagine a scenario where a Nexus attempts to perform FM analysis and generate a sitrep while operating on a copy of the inventory that is actually *older* than the inventory that was used to produce the current sitrep. If this happens, we might incorrectly think that the state of the system has _changed_ from the state in the current sitrep, while actually, we are just operating on an outdated understanding. This branch fixes that. In #10260, I proposed addressing this by making the CTE that inserts a sitrep into the sitrep history only succeed if the inventory collection UUID in that sitrep's metadata is newer than the one in the current sitrep's metadata. This would probably have worked, but putting that check in the already somewhat hairy CTE was not actually necessary. Instead, we can just perform this check *before* actually doing any analysis, which feels a lot nicer. In order to do this, it was necessary to add a query to just go and fetch only the timestamps from an inventory by its UUID, which was not a huge deal. I think it is plausible that the "is strictly newer than" check for inventory collections may be useful elsewhere as well. Fixes #10260 [§8.2 RFD 603]: https://rfd.shared.oxide.computer/rfd/0603#_inventory
Depends on #10255.
This commit does a bit more of the necessary plumbing to allow diagnosis engines defined in the
nexus-fmcrate to run in thefm_analysisbackground task. Now, there's a synchronousnexus_fm::diagnosis::analyzefunction which, given a set of analysis inputs and a sitrep builder, does...something. In the future, this function will run the diagnosis engines for various components, classes of faults, subsystems, and so on, passing them all the new ereports from the inputs and the sitrep builder so that they can make changes to it. Right now, it just fails with the error "FM analysis is not implemented yet", which, technically, means that this PR makes no functional change from the previous state of this code. However, if that function were not to do that, we would then go on and attempt to actually commit the new sitrep. So that's cool. Furthermore, we now also take the analysis report from #10255 and jam it into the bg task output.