feat(starfish-core): score integration feature branch#11533
Closed
piotrm50 wants to merge 51 commits into
Closed
Conversation
- Drop spurious increment_invalid_reports_count when scoring is locally disabled. The sender did nothing wrong; only the local node has scoring off, so charging them as having sent an invalid report is semantically wrong. - Replace the verify-vs-process TODO with a one-line invariant comment. The reconfig guard at the top of the process arm already covers the end-of-epoch lock-out, and validation has fully moved into verify_consensus_transaction.
- Make Misbehaviors::from infallible by replacing the inherent method
with the From trait. Every variant maps to Some, so the Option was
misleading.
- Replace bogus expect() template strings (whose {placeholders} never
interpolate) with direct indexing in MisbehaviorCounts::get_value
and get_metric. Rust's built-in panic prints the offending index.
- Add Eq/Copy/Debug derives to Misbehaviors and Eq to Version.
- Delete VersionedMisbehaviorReport::is_valid_version, which has no
callers after the refactor (MisbehaviorConfig::accepts_report
replaces it).
- Mark ConsensusOutputMisbehaviors as #[allow(dead_code)] until the
Starfish wiring lands; today misbehavior_counts() returns vec![].
Both directions of the MisbehaviorCounts <-> wire-format conversion now match on the Misbehaviors enum instead of indexing self.0[0..3]. Reordering ReportedMisbehaviors can no longer silently swap categories — adding or removing a variant becomes a compile error in the V1 arm. - to_report now takes &ReportedMisbehaviors and assigns LegacyReportPayload fields by matching on each row's variant. - The From<&VersionedMisbehaviorReport> impl is replaced by from_report, which builds rows in the order dictated by ReportedMisbehaviors. - Callers in MisbehaviorMonitor and ReportAggregator pass the schema through from MisbehaviorConfig.
The previous load -> merge_max -> store sequence on ArcSwapOption was not atomic: two concurrent process_report calls for the same authority could lose an update (the smaller merged value overwriting a larger one). The single-threaded consensus handler currently shields the aggregator from this race, but the type now enforces the invariant itself rather than relying on caller serialization. rcu retries the closure under contention, so concurrent merges serialize correctly. Reads remain lock-free.
Replace Vec<AtomicU64> with ArcSwap<Vec<u64>> for current_scores. The previous implementation stored each score with a per-cell Relaxed write inside update_scores_v1, so a checkpoint reader could observe a torn mix of old and new values across authorities. A single pointer swap now publishes the whole score vector at once; readers see a consistent snapshot. update_scores_v1 builds the new Vec<u64> locally and stores it in one shot, and current_scores() loads the snapshot and clones it.
Two leftovers from the scorer-refactoring rebase: - consensus_validator.rs imported and constructed MisbehaviorsV1, which was renamed to LegacyReportPayload in 972a704. Update the import and construction site. - consensus_handler.rs passed 8 args to starfish_core::CommittedSubDag::new which only takes 7; the extra vec![] was added in 47e51d4 during the consensus-output API wiring and never trimmed when the constructor shape settled. Drop the trailing argument.
Callers must call MisbehaviorSchemaVersion::accepts_report before constructing MisbehaviorCounts from a report; today the only path is ReportAggregator::process_report, which sits behind validate_report in verify_consensus_transaction. The assertion makes that contract explicit so future call sites or new wire-format versions can't silently produce wrongly-shaped counts.
The comment referenced the previous Vec<AtomicU64> storage; scores are now published as a single ArcSwap<Vec<u64>> snapshot.
…om_consensus_output The previous `.unwrap_or_default()` produced an empty `Vec<u64>` when a locally-tracked `Misbehaviors` variant was missing from the consensus output, violating the documented matrix invariant that each row's length equals `committee_size`. Downstream `get_value`/`get_metric` would then panic and `to_report` would build wire payloads peers reject. Take `committee_size` as a parameter and pre-allocate zero rows so missing categories project to zeros instead of empty rows. Flip the iteration to input -> schema, dropping the `swap_remove` (move-out is now free via the consuming `for` loop) and incidentally dropping silent duplicate-handling. Also document the single-writer assumption on the rate-limit triple in `MisbehaviorMonitor` to flag the implicit invariant for future readers.
The file no longer contains a MisbehaviorConfig struct; it holds the shared data model for the misbehavior subsystem (schema version, categories, counts, conversions). This commit moves the file with no content changes so the rename is detected by git review tools.
… Misbehavior(s)/ReportPayload Replace the in-memory `MisbehaviorCounts(Vec<Vec<u64>>)` with a versioned enum (`MisbehaviorCounts::V1(MisbehaviorCountsV1)`) whose V1 variant is a struct with named per-metric fields. Every operation that touches all metrics — `merge_max`, `to_report`, `from_report`, the median computation in `Scorer::calculate_median_report` — is now exhaustive at compile time via either a named struct literal or a `match` on `Misbehavior`. Adding a metric to the V1 struct produces missing-field errors at the call sites that need updating, replacing the previous documented "append-only" enum order convention. Renames bundled in for clarity: - `LegacyReportPayload` -> `ReportPayloadV1` (the active V1 wire format, not a deprecated one). - `Misbehaviors` -> `Misbehavior`, `ConsensusOutputMisbehaviors` -> `ConsensusOutputMisbehavior` (Rust idiom for tagged-union enums where each variant is a single instance). - Drop the unused `ReportPayload(Vec<Vec<u64>>)` placeholder. Future V2 will add `ReportPayloadV2` as a dedicated named-field struct. - Drop the duplicate `verify_legacy_payload` free function in iota-core; use the existing `ReportPayloadV1::verify` method on the type itself. `Scorer` now stores `schema_version` so it can resolve `Parameters` indices back to `Misbehavior` variants when looking up rows. `Parameters` itself remains positional and is flagged as a follow-up. BCS encoding of `DBReceivedReportsStatePerAuthority.received_metrics` changes from a length-prefixed nested array to an enum tag plus four named `Vec<u64>` fields. The loading path is dead-code-gated and the schema is unreleased; no migration needed.
… out of variants Reshape the type from an enum-with-OnceCell-per-variant into a struct that wraps a versioned `ReportPayload` enum and a single `#[serde(skip)]` digest cache. The digest is logically per-report, not per-payload-version, so it belongs in one place. Wire format is unchanged: BCS still serializes only `payload` (enum tag + the V1 named-field struct). Adding V2 now means appending one variant to `ReportPayload`; no per-variant cache to remember. Update call sites: pattern-match on `report.payload` instead of the report itself; constructors go through `VersionedMisbehaviorReport::new_v1`.
- scorer.rs Parameters: replace removed ReportedMisbehaviors reference with MisbehaviorSchemaVersion::reported_misbehaviors(). - messages_consensus.rs VersionedMisbehaviorReport: spell out the serde(skip)-only invariant rather than asserting layout equivalence to a bare ReportPayload. - messages_consensus.rs ReportPayload: drop the stale // V2(ReportPayloadV2), placeholder; the doc on the enum covers the extension pattern. - messages_consensus.rs ReportPayloadV1: convert to /// doc comment and note that field declaration order is part of the wire format. - misbehavior_monitor.rs rate-limit fields: clarify that the two AtomicU64s form a logical tuple while the AtomicBool is independent. - misbehavior.rs MisbehaviorCounts: clarify wire-format struct field order is load-bearing, in-memory mirror is not. - consensus_output_api.rs ConsensusOutputMisbehavior: explain why the enum is kept while Starfish wiring is pending. - report_aggregator.rs invalid_reports_count: point at validate_report as the bump site for the determinism rationale.
…ches Two related boundary cleanups around MisbehaviorCounts: - Drop the redundant version parameter from to_report and from_report. Each MisbehaviorCounts variant maps 1:1 to a ReportPayload variant, so the version was always derivable from self / report.payload. Removing it makes wire/schema mismatch unrepresentable, which lets the debug_assert! disappear with no loss of safety. Misbehavior gains Hash (used below) and the doc comment is updated to drop the misleading append-only claim — the enum is not serialized; what's load-bearing is the ordering of the schema's reported_misbehaviors() slice. - Make from_consensus_output's projection visible. It previously silently zero-filled missing categories and silently dropped extras. Now it warn-logs three drift conditions: row length not matching committee_size (skipped), duplicate category in the same output (overwritten), and category tracked locally but absent from consensus (zero-filled). No new metrics — operator log surface only. Plus a TODO at scorer.rs's MisbehaviorCounts::V1 destructure clarifying that V2 should branch the whole computation rather than be patched with an if let.
Two distinct failure modes (wrong wire-format variant vs. malformed
payload) used to collapse into a bare bool, leaving the caller's warn
log to say only "Received invalid misbehavior report from X".
Operators investigating drift had no signal to distinguish a peer
running a different schema version from a peer producing
wrongly-shaped vectors.
Introduce ReportValidationError { WrongSchemaVersion, PayloadShape }
with Display, return Result<(), ReportValidationError> from
validate_report, and propagate the reason in the warn at
verify_consensus_transaction.
…rsV1 The Scorer had two structural problems that compounded each other: - Layout mismatch: Scorer's positional Parameters (six Vecs indexed by misbehavior position) didn't mirror the named-field shape of MisbehaviorCountsV1 / ReportPayloadV1. The only reason the indices worked was MisbehaviorSchemaVersion::reported_misbehaviors(), a slice that existed solely to bridge them. - Delegation seam in the wrong type: Scorer had two impl blocks. One delegated via match self.version, the other implemented V1 math directly on Scorer with _v1-suffixed methods. The version enum carried only data; the implementation lived elsewhere. This commit moves the per-version implementation onto the version enum: - ScorerVersion is renamed to VersionedScorer. Variants are full per-version structs (today: ScorerV1) that own their parameters AND their update_scores method. - ScorerV1 has named MetricParams fields mirroring MisbehaviorCountsV1. metric_to_score moves to MetricParams. The score loop iterates (row, params) pairs explicitly via metric_pairs(), no more metric_index: usize, no precomputed minor_indices/minor_weights/ major_indices, no MisbehaviorSchemaVersion::reported_misbehaviors() lookup. - Scorer collapses to a thin shell with one impl block: it owns the ArcSwap-published score vector and the voting-power table, and delegates to self.version.update_scores(...). Net: MisbehaviorSchemaVersion drops reported_misbehaviors() and num_metrics(). Its remaining API is from_protocol + accepts_report — it now versions only the wire format. The from_consensus_output warning loop uses a local const-array of expected V1 categories. Test numerical expectations are unchanged; the refactor is structural.
After commit D, the type's responsibilities shrank to from_protocol + accepts_report — it versions the wire-format report, not a schema. The schema lives in MisbehaviorCountsV1 / ReportPayloadV1 named-field types. "Schema" is misleading; "report version" is what it actually selects. Renames: - MisbehaviorSchemaVersion -> MisbehaviorReportVersion (the type) - schema_version -> report_version (fields and locals) - ReportValidationError::WrongSchemaVersion -> WrongReportVersion - mock_schema_version -> mock_report_version (test helpers)
The cfg(test) calculate_scores_v1 / calculate_median_report bridges on Scorer existed only to keep pre-VersionedScorer tests working without edits. They masked the version split with extra glue: a destructure out of MisbehaviorCounts followed by a match on VersionedScorer that re-destructured into the V1 variant. V1-specific tests now address ScorerV1 directly via ScorerV1::v1_parameters() and call median_report / score_from_median on it. They build MisbehaviorCountsV1 inline rather than wrapping in MisbehaviorCounts::V1(...). When V2 lands its tests live alongside ScorerV2 without touching V1's. Tests of Scorer (initialization, update_scores end-to-end) stay on the outer struct since they exercise the publish path / construction.
Mirror the AuthorityCapabilitiesV1 shape: the report carries its own authority + generation instead of being passed as a 3-tuple alongside ConsensusTransactionKind::MisbehaviorReport. Callers thread one value through, the consensus key reads provenance off the report, and the 'authority matches consensus sender' check stays at the boundary. Wire format is preserved by ordering the new struct fields as authority || payload || generation — BCS encodes a struct as the concatenation of its fields, so the bytes match the legacy 3-tuple. Two new tests in iota-types pin this: - misbehavior_report_wire_format_unchanged: struct-level bytes - misbehavior_report_consensus_kind_wire_format_unchanged: full ConsensusTransactionKind including the variant tag Required because misbehavior reports are already on testnet — any wire drift here would halt a running cluster.
The counter is incremented in two sites in verify_consensus_transaction but nothing reads it: no Prometheus metric, no DBMap persistence (the DBReceivedReportsStatePerAuthority naming and to_serializable machinery notwithstanding), no Scorer input, no outgoing report field. Documenting this in place so the next reader doesn't assume the snapshot infrastructure is load-bearing.
The previous TODO claimed the DBReceivedReportsStatePerAuthority / to_serializable scaffolding was unwired. It is — on this branch — but that's intentional: Olivia's score-integration-persist-scorer-state follow-up branch already contains the DBMap, the snapshot path through ConsensusCommitOutput, and the restore_from_tables() logic. Refactor the comment to record that context so the next reader doesn't try to delete the snapshot machinery thinking it's orphaned.
The in-memory MisbehaviorCounts/MisbehaviorCountsV1 mirrored the wire ReportPayload/ReportPayloadV1 1:1 — same fields, same types, From impls that were pure clones. The wrapper carried no behavior beyond merge_max, zeros, and from_consensus_output, and the metric() accessor was unused. Drop the mirror, rename ReportPayload[V1] to MisbehaviorObservations[V1] in iota-types so the wire type also names the in-memory role, and move the three remaining helpers to free functions in iota-core's misbehavior module. Wire format is unchanged — the BCS-pinning tests in iota-types::messages_consensus::tests still pass byte-for-byte. Also fix a stale comment referring to the "consensus crate" that should read "starfish crate".
A round of post-refactor cleanups identified during review: - Rename outer `Scorer` to `Scoreboard` — it owns the published score vector and voting-power table, the actual math lives in `VersionedScorer`/`ScorerV1`. The thing called "scorer" now actually scores. - Move the `MisbehaviorObservations::V1(_)` unwrap from `ScorerV1::median_report` up into `VersionedScorer::update_scores`. `ScorerV1` now consumes typed V1 reporters directly; when V2 lands the dispatcher's match becomes non-exhaustive and forces a deliberate decision instead of silently dropping V2 reports. - Make `MisbehaviorMonitor::update_from_consensus_output` fold new counts in via `merge_max` under RCU, so the local view is monotonic even if upstream produces a transient dip. Closure is pure (commutative + associative + idempotent), so RCU's may-run-multiple-times semantics are safe. - Drop the `ReceivedReportsState` newtype in `report_aggregator`. Its hand-rolled `Index`/`FromIterator` impls were pure delegation to the inner `Vec`; removing them removes ~14 lines of ceremony. - Simplify `reporters_with_voting_power` by switching to `load_full()` and a plain `for`/`if let`. Drops the explicit `Guard` + `as_ref` + `Arc::clone` dance. - Remove the `misbehavior_monitor_version` protocol-config field (introduced earlier on this branch, not yet shipped) and have `MisbehaviorReportVersion::from_protocol` read `scorer_version` instead. Scorer impl and report wire format now bump together — one knob, simpler model. Decoupling can be reintroduced in a future PR if a concrete upgrade requires it.
Replaces the position-tagged `Vec<(ConsensusOutputMisbehavior, Vec<u64>)>` return shape with a named-field bridge struct `ConsensusOutputMisbehaviorCounts`. The trait impl is now the single bridge between Starfish's internal observation state and IOTA's wire schema; the mapper projects the bridge struct onto whichever `MisbehaviorObservationsVN` is active. Adding a metric is now a deliberate API change, not a position-list extension that could lose duplicates / missing categories silently. Side effects of the new shape: - `Misbehavior` category enum, `ConsensusOutputMisbehavior` enum, `V1_EXPECTED` array, `HashSet<Misbehavior>` dedup tracking, and the duplicate-row / missing-category warn loops are all gone — the conditions they defended against are unrepresentable in the new contract. - The trait method gets a default impl returning the all-empty struct, so unwired Starfish gets correct behavior for free. - A non-empty wrong-length row in `observations_from_consensus_output` now `debug_assert_eq!`s in dev/CI and `error!`s with structured fields in production. Wrong lengths mean consensus and iota-core disagree on committee size — programmer error, not adversarial input — but harmless because the zero-fill projection makes `merge_max(current, zeros)` a no-op. Plus a small batch of follow-up cleanups identified during review: - Drop the unused `_report_version` parameter on `Scoreboard::new`. - `ReportAggregator::validate_report` derives committee size from internal state, removing the "could caller and aggregator disagree" question. - Fix the `MisbehaviorReportVersion::from_protocol` panic message to say "scorer version" — it reads `scorer_version_as_option()`. - Rename `current_local_counts` → `current_local_observations` to match the renamed type.
Contributor
Author
|
Replaced by a fresh umbrella PR off develop — the old base branch carried stale scorer-refactoring commits that already landed in develop. See replacement coming next. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Umbrella PR for the Starfish score-integration work. Aggregates the stacked PRs below into a single feature branch so the full pipeline can be reviewed end-to-end and merged into `develop` once each piece is approved.
This PR is intentionally empty until the first child lands on the feature branch.
Stacked PRs (in merge order)
Tracking
Part of https://github.com/iotaledger/iota-private/issues/277
Notes for reviewers
Test plan