fm: minimum viable sitrep equality check by hawkw · Pull Request #10322 · oxidecomputer/omicron

hawkw · 2026-04-24T16:37:56Z

This branch implements the sitrep equality check that I left unimplemented in #10258. Now, when a new sitrep is generated, we check if it represents the same situation as the parent sitrep, and if it does not, we will insert it into the database. This is done using a normal PartialEq implementation. I punted on implementing proper diffing support (see #10259), since it's not actually necessary to test if two sitreps are equivalent.

In order to make this a bit easier to implement correctly, I refactored the nexus_types::fm::Sitrep struct to pull the non-metadata fields (the map of ereports and the map of cases, currently) into a new SitrepData struct, so that the top-level Sitrep now consists of SitrepMetadata and SitrepData. This is intended to make it easier to implement the comparison we need: checking that the sitrep contains identical cases and ereports to the parent, while ignoring the sitrep's ID, timestamp, and nexus ID (as these will differ for otherwise equal sitreps). I decided to factor out the data into its own struct so that the struct can just derive PartialEq, and we can test that the data is equal. This way, we don't have to individually check that the non-metadata fields are equal, which felt error-prone --- we don't want to have to remember to add any new non-metadata fields to the comparison later, since forgetting to do so might result in a non-equal sitrep being accidentally considered unchanged.

mergeconflict

Nice, I like it. Just a few minorish things.

mergeconflict · 2026-04-24T17:02:36Z

    pub async fn fm_sitrep_insert(
        &self,
        opctx: &OpContext,
        sitrep: Sitrep,


Nit: would it make sense to exhaustively destructure sitrep into data and metadata up front here?

mergeconflict · 2026-04-24T17:17:04Z

+    /// A map of all ereports associated with cases in this sitrep.
+    pub ereports_by_id: IdOrdMap<Arc<Ereport>>,


I wonder if we should still have ereports_by_id or, if we really need it, whether it should live somewhere else. I'm thinking about it because the PartialEq is looking at the set of ereports twice: once when traversing the cases, and again in aggregate here. If the two could ever drift somehow, then we'd have weirdness.

Could we just get away with an ereports_by_id() method that lazily computes the map?

So, the map is used for determining which ereports have been included in the sitrep when running the next preparation step, we do need it. I'd prefer not to have to re-build the map every time we do that. I'd also really prefer to not build it when the sitrep is constructed, since the sitrep loader already constructs this map while loading the sitrep parts from the database, here:

omicron/nexus/db-queries/src/db/datastore/fm.rs

Lines 285 to 298 in ed5c46a

// Fetch all ereports assigned to cases in this sitrep. We do this by

// querying the `fm_ereport_in_case` table for all entries with this

// sitrep ID, paginated by the ereport assignment's UUID. This query is

// `INNER JOIN`ed with the `fm_ereport` table to fetch the ereport's

// data for that assignment.

//

// We use the results of this query to populate a map of case UUIDs to

// the map of ereports assigned to that case. Ereports are de-duplicated

// using an additional map of `Arc`ed ereports, to reduce the in-memory

// size of the sitrep when an ereport is assigned to multiple cases. the

// JOINed query *will* potentially load the same ereport multiple times

// in that case, but this is still probably much more efficient than

// issuing a bunch of smaller queries to load ereports individually.

let mut ereports = iddqd::IdOrdMap::<Arc<fm::Ereport>>::new();

It feels pretty sad to build the map once while loading the sitrep, throw it away, and then have a Sitrep::new that immediately iterates over all the cases and re-builds a new copy of the same map we just made while loading and then threw away.

I think it would be fine to move this out of sitrep_data and into a top-level field, so that it's not part of the equality check; you're right that checking it separately from checing the cases should always be superfluous. Then, SitrepData has only one field in it, which feels silly...but the main purpose of that struct was just to ensure that any other non-case top-level sitrep objects go there, and we don't have any of those yet.

Actually, this is making me reconsider. I'm starting to think this data structure split is just in service of being able to derive the desired PartialEq, rather than being semantically meaningful. Like you point out, ereports_by_id is a useful view of the data — it's not metadata, and you don't want to recompute it every time you need it — but we don't really want to include it in our PartialEq implementation either.

How would you feel about just writing impl PartialEq for Sitrep by hand? You'd want to destructure it completely and then just compare the cases; that doesn't seem too bad.

Hmm, lemme think about it. On one hand, you're right that we could write an impl PartialEq for Sitrep that just ignores some fields, and uses exhaustive destructuring to ensure we don't forget to add any new fields that should be part of the equality check. This would be a much less invasive change than adding the SitrepData struct and having to change everything to access fields through it. On the other hand, I felt like there might be cases where we do want the ability to test whether two Sitrep structs are exactly equal, including their metadata --- I nearly offered tests as an example of this, but remembered that most of the tests that roundtrip through the database don't currently use the impl PartialEq for Sitrep due to the timestamps losing precision when read back from CRDB.

In general, I don't love the idea of having "opinionated" PartialEq implementations that skip potentially unequal fields, since it feels like one could just assume that sitrep_1 == sitrep_2 checks that they are exactly the same and not be aware that actually, the equality only checks that the cases are the same. The nice thing about writing sitrep_1.data == sitrep_2.data is that you are explicitly saying "I don't want to compare the metadata here" at the call site, rather than having to remember what the PartialEq impl for the whole struct does. I do have an idea for a potential third option, though; lemme sketch it out for you...

Here's an alternate design for this that makes it explicit when we are comparing the represented state, but also doesn't require changing the struct layout: https://github.com/oxidecomputer/omicron/pull/10325/changes

What do you think?

Sure, that works! Could also just be a plain function, whatever makes the call site look nice.

mergeconflict · 2026-04-24T17:28:34Z

+                    start_time,
+                    end_time,
+                    report,
+                    outcome: status::AnalysisOutcome::Unchanged,


What test coverage do we have for the different AnalysisOutcome variants? For example, if the equality comparison returns false, do we test that the outcome is Committed?

We can't easily test this yet as we never actually perform any analysis and it will always be "analysis failed". Once we are actually performing some analysis, I'll add tests for analysis steps that don't change the sitrep. We could try to refactor this code a bit more so that it's easier to unit test everything that happens after the diagnosis engines actually run, by manually constructing sitreps which should or should not be Unchanged, but I'm not convinced it's worth it.

mergeconflict · 2026-04-24T17:29:59Z

    }
 }

+/// The actual sitrep.


Nit: ehhh what is "actual," actually?

This branch implements the sitrep equality check that I left unimplemented in #10258. Now, when a new sitrep is generated, we check if it represents the same situation as the parent sitrep, and if it does not, we will insert it into the database. This is done using a normal `PartialEq` implementation. I punted on implementing proper diffing support (see #10259), since it's not actually necessary to test if two sitreps are equivalent. This is an alternate design from the one I implemented in #10322. Rather than factoring out the sitrep's state into a new struct, which was a bit awkward, we just add a wrapper type for doing state equivalence comparisons. Closes #10322.

hawkw added 2 commits April 24, 2026 09:25

[fm] split sitrep data from metadata

12d282f

[fm] minimum viable sitrep equality check

42a2fe7

hawkw requested review from mergeconflict and smklein April 24, 2026 16:37

mergeconflict approved these changes Apr 24, 2026

View reviewed changes

hawkw mentioned this pull request Apr 24, 2026

[fm] minimum viable sitrep equality check (alternate version) #10325

Merged

hawkw closed this in #10325 Apr 25, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fm: minimum viable sitrep equality check#10322

fm: minimum viable sitrep equality check#10322
hawkw wants to merge 2 commits intomainfrom
eliza/sitrep-eq

hawkw commented Apr 24, 2026

Uh oh!

mergeconflict left a comment

Uh oh!

mergeconflict Apr 24, 2026

Uh oh!

mergeconflict Apr 24, 2026

Uh oh!

hawkw Apr 24, 2026

Uh oh!

mergeconflict Apr 24, 2026

Uh oh!

hawkw Apr 24, 2026

Uh oh!

hawkw Apr 24, 2026

Uh oh!

mergeconflict Apr 25, 2026

Uh oh!

mergeconflict Apr 24, 2026

Uh oh!

hawkw Apr 24, 2026

Uh oh!

mergeconflict Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		/// A map of all ereports associated with cases in this sitrep.
		pub ereports_by_id: IdOrdMap<Arc<Ereport>>,

	// Fetch all ereports assigned to cases in this sitrep. We do this by
	// querying the `fm_ereport_in_case` table for all entries with this
	// sitrep ID, paginated by the ereport assignment's UUID. This query is
	// `INNER JOIN`ed with the `fm_ereport` table to fetch the ereport's
	// data for that assignment.
	//
	// We use the results of this query to populate a map of case UUIDs to
	// the map of ereports assigned to that case. Ereports are de-duplicated
	// using an additional map of `Arc`ed ereports, to reduce the in-memory
	// size of the sitrep when an ereport is assigned to multiple cases. the
	// JOINed query will potentially load the same ereport multiple times
	// in that case, but this is still probably much more efficient than
	// issuing a bunch of smaller queries to load ereports individually.
	let mut ereports = iddqd::IdOrdMap::<Arc<fm::Ereport>>::new();

Conversation

hawkw commented Apr 24, 2026

Uh oh!

mergeconflict left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants