feat(starfish-core,iota-core): Starfish score integration#11569
Open
piotrm50 wants to merge 6 commits into
6eb79be to
bcaa136
Compare
polinikita
approved these changes
May 20, 2026
polinikita
reviewed
May 20, 2026
bcaa136 to
5b01491
Compare
tomxey
approved these changes
May 21, 2026
bingyanglin
approved these changes
May 25, 2026
…stence (#10088)

# Description of change

Introduce misbehavior scoring infrastructure in Starfish and persist metrics in storage. This merges the scope of the former PR #10086 (types + Prometheus metrics) with storage persistence.

## ScoringMetricsStore (formerly #10086)

- **`ScoringMetricsStore`** (`scoring_metrics_store.rs`): holds three sets of per-authority counters (`current_local_metrics_count`, `cached_metrics`, `uncached_metrics`). Uses a named-field `StarfishMisbehaviorCounts` struct with explicit `faulty_blocks_provable`, `faulty_blocks_unprovable`, `missing_proposals`, and `equivocations` fields.
- **Prometheus metrics**: 8 new per-authority metric collectors in `NodeMetrics`. `faulty_blocks_unprovable` uses the `peer` label (not `authority`) since unprovable faults are attributed to the sending peer, not the alleged block author.
- The store is **not wired into `Context`**; it will be passed directly to consumers (`DagState`) when they are wired up in #10127.

## Storage persistence

- `WriteBatch` extended with a `scoring_metrics` field for persisting score updates atomically with other DAG state.
- `DagState::flush()` calls `score_updates_to_write()` (placeholder, returns empty) and includes the result in the write batch.
- Storage layer (`mem_store`, `rocksdb_store`) updated to handle the scoring metrics column.
- Store tests updated.

## Links to any relevant issues

Part of iotaledger/iota-private#277
Part of feature branch #10083
Subsumes #10086

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration tests)
- [ ] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have checked that new and existing unit tests pass locally with my changes

---------

Co-authored-by: Piotr Macek <4007944+piotrm50@users.noreply.github.com>
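To make the named-field counter layout concrete, here is a minimal sketch. The struct name and its four fields come from the commit message above; the `Fault` enum, the `CountsByAuthority` wrapper, and all method names are hypothetical and are not taken from `starfish-core`.

```rust
// Illustrative only: field names follow the commit message; the types,
// methods, and indexing scheme are assumptions, not the starfish-core code.

/// Per-authority misbehavior counters with explicit named fields.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct StarfishMisbehaviorCounts {
    pub faulty_blocks_provable: u64,
    pub faulty_blocks_unprovable: u64,
    pub missing_proposals: u64,
    pub equivocations: u64,
}

/// Hypothetical classification of an observed fault.
#[derive(Clone, Copy, Debug)]
pub enum Fault {
    FaultyBlockProvable,
    FaultyBlockUnprovable,
    MissingProposal,
    Equivocation,
}

/// Hypothetical container: one counter slot per authority index.
pub struct CountsByAuthority {
    counts: Vec<StarfishMisbehaviorCounts>,
}

impl CountsByAuthority {
    pub fn new(committee_size: usize) -> Self {
        Self {
            counts: vec![StarfishMisbehaviorCounts::default(); committee_size],
        }
    }

    /// Records one fault. Returns false if the index is outside the committee.
    pub fn record(&mut self, authority: usize, fault: Fault) -> bool {
        let Some(slot) = self.counts.get_mut(authority) else {
            return false;
        };
        match fault {
            Fault::FaultyBlockProvable => slot.faulty_blocks_provable += 1,
            Fault::FaultyBlockUnprovable => slot.faulty_blocks_unprovable += 1,
            Fault::MissingProposal => slot.missing_proposals += 1,
            Fault::Equivocation => slot.equivocations += 1,
        }
        true
    }

    /// Returns a copy of the counters for one authority, if it exists.
    pub fn get(&self, authority: usize) -> Option<StarfishMisbehaviorCounts> {
        self.counts.get(authority).copied()
    }
}
```

Named fields keep each call site explicit about which counter it touches, which a positional tuple or array would not.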
5724504 to
5d64e8d
Compare
cyberphysic4l
approved these changes
May 29, 2026
…ate (#11531)

Port the runtime misbehavior tracking logic onto the `MisbehaviorStore` types introduced in #10088.

- `MisbehaviorStore` tracks per-authority misbehavior in two buckets (`in_memory` + `persisted`).
- On flush: recompute the in-memory window, accumulate evicted counts, write the persisted snapshot to storage.
- On startup: restore persisted counts from RocksDB and recompute in-memory counts from cached block refs.
- Wire faulty block header detection into `MisbehaviorStore` from every peer receive site (subscriber main + bundle, header synchronizer including the own-last-header fetch, and both commit syncers).
- Source-aware classification: `Subscriber` keeps `UnexpectedAuthority` as Unprovable; commit-chain / fetch-shape errors stay Untracked and belong to a separate commit-sync metric.

Stacks on top of #10088. Supersedes #10127.

fixes iotaledger/iota-private#278
fixes iotaledger/iota-private#280
Part of iotaledger/iota-private#277

- [x] Basic tests (linting, compilation, formatting, unit/integration tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] I have checked that new and existing unit tests pass locally with my changes

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
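The two-bucket flush described above can be sketched roughly as follows. This is a simplified model for a single misbehavior kind, not the `MisbehaviorStore` implementation: the `TwoBucketCounts` type, the round-keyed window, and every method name are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical two-bucket counter for one misbehavior kind.
pub struct TwoBucketCounts {
    /// Per-authority counts keyed by round, for rounds still in the cached window.
    in_memory: BTreeMap<u64, Vec<u64>>,
    /// Per-authority counts accumulated from rounds that were already evicted.
    persisted: Vec<u64>,
}

impl TwoBucketCounts {
    pub fn new(committee_size: usize) -> Self {
        Self {
            in_memory: BTreeMap::new(),
            persisted: vec![0; committee_size],
        }
    }

    /// Records one event for `authority` at `round`; out-of-range indices are ignored.
    pub fn record(&mut self, round: u64, authority: usize) {
        let size = self.persisted.len();
        if authority >= size {
            return;
        }
        let per_round = self.in_memory.entry(round).or_insert_with(|| vec![0; size]);
        per_round[authority] += 1;
    }

    /// Evicts every round below `gc_round` and folds its counts into `persisted`.
    /// A real store would then write `persisted` to storage in the same batch.
    pub fn flush(&mut self, gc_round: u64) {
        // `split_off` keeps keys < gc_round in place and returns keys >= gc_round.
        let retained = self.in_memory.split_off(&gc_round);
        let evicted = std::mem::replace(&mut self.in_memory, retained);
        for counts in evicted.values() {
            for (total, c) in self.persisted.iter_mut().zip(counts) {
                *total += c;
            }
        }
    }

    /// Counts that have been moved to the persisted bucket.
    pub fn persisted_counts(&self) -> &[u64] {
        &self.persisted
    }

    /// Per-authority totals across both buckets.
    pub fn totals(&self) -> Vec<u64> {
        let mut totals = self.persisted.clone();
        for counts in self.in_memory.values() {
            for (total, c) in totals.iter_mut().zip(counts) {
                *total += c;
            }
        }
        totals
    }
}
```

Because a flush only moves counts from one bucket to the other, the combined total per authority is the same before and after eviction in this model.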
…11534)

# Description of change

Stacks on top of #11531. Wires the producer-side misbehavior store (landed in #10088 + #11531) into the consumer in `iota-core` so that the `Scoreboard` / `ReportAggregator` (from #10779) finally receives per-authority misbehavior signal from Starfish.

Each `CommittedSubDag` now carries a per-authority absolute snapshot of `persisted + in_memory` counts from `MisbehaviorStore` at commit time. Consumers diff against their own last-seen state if they want deltas; that responsibility doesn't belong in Starfish, and the sum is invariant across the eviction-time move between buckets, so the snapshot is race-free relative to flush.

Producer (`starfish-core`):

- `MisbehaviorStore::snapshot_totals()`: returns the per-authority absolute totals.
- `CommittedSubDag` gets a new `misbehavior_counts: Vec<MisbehaviorCountsV1>` field, threaded at all three production `CommittedSubDag::new` call sites (`commit_solidifier`, `commit_observer`, `commit_syncer/fast`) via the `DagState`-owned `MisbehaviorStore`.
- `CommittedSubDag` is **not** serialized over the wire (local `CommitConsumer` channel only), so the added `Vec` is an in-process cost only.

Consumer (`iota-core`):

- Overrides `ConsensusOutputAPI::misbehavior_counts()` on `starfish_core::CommittedSubDag` to transpose the per-authority struct-of-fields snapshot into the four per-field vecs expected by `ConsensusOutputMisbehaviorCounts`. Downstream wiring in `consensus_handler` and the report aggregator was already in place and waiting for non-empty data.

Also folds in a standalone fix: `starfish-core` now declares its `iota-sdk-types` `serde` feature explicitly. Workspace builds were masking the missing feature via unification through `iota-types`; `cargo check -p starfish-core` alone fails without it.

## Links to any relevant issues

fixes iotaledger/iota-private#406
Part of iotaledger/iota-private#173

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] I have checked that new and existing unit tests pass locally with my changes

Local verification:

- `cargo ci-clippy`: clean.
- `cargo nextest run -p starfish-core --lib`: new `test_snapshot_totals_sums_persisted_and_in_memory` passes; existing misbehavior tests pass.
- `cargo nextest run -p iota-core --lib consensus_output_api::tests`: new transpose + empty-snapshot tests pass.
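The consumer-side transpose can be illustrated with a small sketch. `CountsPerAuthority` and `CountsPerField` are hypothetical stand-ins for `MisbehaviorCountsV1` and `ConsensusOutputMisbehaviorCounts`; the field layout is assumed to mirror the four counters named in #10088 and may differ from the real types.

```rust
/// Stand-in for the per-authority snapshot entry (layout assumed).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct CountsPerAuthority {
    pub faulty_blocks_provable: u64,
    pub faulty_blocks_unprovable: u64,
    pub missing_proposals: u64,
    pub equivocations: u64,
}

/// Stand-in for the consumer-side shape: one vector per field,
/// each indexed by authority.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct CountsPerField {
    pub faulty_blocks_provable: Vec<u64>,
    pub faulty_blocks_unprovable: Vec<u64>,
    pub missing_proposals: Vec<u64>,
    pub equivocations: Vec<u64>,
}

/// Transposes an array-of-structs snapshot into a struct-of-arrays.
/// An empty snapshot yields four empty vectors.
pub fn transpose(snapshot: &[CountsPerAuthority]) -> CountsPerField {
    let n = snapshot.len();
    let mut out = CountsPerField {
        faulty_blocks_provable: Vec::with_capacity(n),
        faulty_blocks_unprovable: Vec::with_capacity(n),
        missing_proposals: Vec::with_capacity(n),
        equivocations: Vec::with_capacity(n),
    };
    for c in snapshot {
        out.faulty_blocks_provable.push(c.faulty_blocks_provable);
        out.faulty_blocks_unprovable.push(c.faulty_blocks_unprovable);
        out.missing_proposals.push(c.missing_proposals);
        out.equivocations.push(c.equivocations);
    }
    out
}
```

Position `i` in each output vector corresponds to authority `i` in the input snapshot, so the four vectors always have equal length.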
# Description of change

Persists `ReportAggregator` state to RocksDB so per-validator received-reports tallies survive node restarts. Stacks on top of #11534. Refs #10366 (the original implementation by @oliviasaa, which we couldn't reasonably rebase post-Starfish migration).

## What gets persisted

- `DBReceivedReportsStatePerAuthority { received_metrics, invalid_reports_count }` for each authority that has changed state in this epoch.
- New column family on `AuthorityEpochTables`: `received_reports_state: DBMap<u32, DBReceivedReportsStatePerAuthority>`.

## How writes work

When `process_report` mutates the aggregator for authority X, it returns the post-merge snapshot. The consensus handler stashes that snapshot into `ConsensusCommitOutput.report_state_snapshots[X]` (a `BTreeMap`, last-writer-wins). When the quarantine flushes the commit's `DBBatch`, the captured snapshots get written atomically with the rest of the commit's effects.

This means the persisted row written under commit N's batch is **exactly** "aggregator state for that authority after commit N's processing", independent of when quarantine flushes the batch, and matching what `scoreboard.update_scores` saw at end of N. Cross-validator determinism preserved.

## How restore works

`AuthorityPerEpochStore::new` calls `ReportAggregator::restore_from_tables(&tables)` immediately after constructing the aggregator. The existing consensus-handler replay path then reprocesses any committed-but-unflushed commits; merge-max idempotency reproduces the same end state.

## Explicit non-goals

- **Score-update optimization** (recalc only when ≥1 report seen in commit): tracked separately.
- **Verify-time invalid-count durability**. `verify_consensus_transaction`-time `increment_invalid_reports_count` bumps stay in-memory only. They land on disk the next time a `process_report` for the same authority captures a fresh snapshot. Counter is observability-only; replay re-runs verify on the recovery path, so correctness is preserved. Documented in code.
- **Separate replay logic for unflushed commits**. Not needed: the existing consensus-handler recovery path replays for free, and merge-max is idempotent.

## Links to any relevant issues

fixes iotaledger/iota-private#310

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] I have checked that new and existing unit tests pass locally with my changes

Local verification:

- `cargo ci-clippy`: clean.
- `IOTA_SKIP_SIMTESTS=1 cargo nextest run -p iota-core --lib`: 562 tests pass.
- New tests:
  - `test_restore_from_iter_populates_in_memory_state`
  - `test_restore_from_iter_rejects_out_of_range_authority`
  - `test_process_report_returns_post_merge_snapshot`
  - `test_restore_round_trip_through_dbmap`: full persist→restore round trip via `AuthorityEpochTables::open` with a tempdir.
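The write and replay semantics above rest on two properties: a per-commit map where later snapshots overwrite earlier ones, and a merge that takes the maximum so replaying a report changes nothing. The sketch below models both under loud assumptions: the `Aggregator` type, its reporter-to-count layout, and the `process_report` signature are hypothetical and do not reflect the real `ReportAggregator`.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-authority state: highest count seen from each reporter.
pub type AuthoritySnapshot = BTreeMap<u32, u64>;

/// Hypothetical aggregator keyed by the reported authority.
#[derive(Default)]
pub struct Aggregator {
    state: BTreeMap<u32, AuthoritySnapshot>,
}

impl Aggregator {
    /// Merges one report by taking the max with the stored value, then
    /// returns the post-merge snapshot for that authority.
    pub fn process_report(&mut self, authority: u32, reporter: u32, count: u64) -> AuthoritySnapshot {
        let per_authority = self.state.entry(authority).or_default();
        let slot = per_authority.entry(reporter).or_insert(0);
        *slot = (*slot).max(count);
        per_authority.clone()
    }
}

/// Processes a commit's reports and captures one snapshot per touched
/// authority. `BTreeMap::insert` overwrites, so the last snapshot wins.
pub fn process_commit(
    aggregator: &mut Aggregator,
    reports: &[(u32, u32, u64)],
) -> BTreeMap<u32, AuthoritySnapshot> {
    let mut snapshots = BTreeMap::new();
    for &(authority, reporter, count) in reports {
        let snapshot = aggregator.process_report(authority, reporter, count);
        snapshots.insert(authority, snapshot);
    }
    snapshots
}
```

Running `process_commit` twice with the same reports yields the same snapshots in this model, which is the property that lets a replay of committed-but-unflushed commits converge on the same state.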
# Description of change

Adds two per-authority Prometheus gauges in `AuthorityMetrics`, both labelled by hostname and published from `ConsensusHandler` after every consensus commit and seeded at construction so the series exist from epoch start:

- `validator_scoreboard_scores`: sourced from `Scoreboard::current_scores()`. Resolves the gap where Scoreboard scores were computed and consumed by the checkpoint service but not observable to operators.
- `invalid_misbehavior_reports_by_authority`: moved from `starfish-core::NodeMetrics` (where it was registered but never written) to `AuthorityMetrics`, switched to `IntGaugeVec`, and sourced from the new `ReportAggregator::invalid_reports_counts()` snapshot.

Both gauges are gated on `calculate_validator_scores()` and cleared in `reset_on_reconfigure`.

Adds `dev-tools/grafana-local/dashboards/validator-dashboard-v2.json` covering the new gauges alongside the existing starfish misbehavior metrics (provable/unprovable block faults, missing proposals, equivocations). The four misbehavior panels stack `in_memory` and `persisted` series so the sum is the running total. Intended to grow into the replacement for the legacy validator dashboard.

## Links to any relevant issues

Closes iotaledger/iota-private#282

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration tests)
- [ ] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have checked that new and existing unit tests pass locally with my changes
…11562)

# Description of change

Skip the per-commit `Scoreboard::update_scores(&report_aggregator)` recompute when no `MisbehaviorReport` was processed in the commit. Stacks on top of #11561 (which introduces `ConsensusCommitOutput.report_state_snapshots`, the signal we gate on).

## How the gate works

- `ConsensusCommitOutput::has_report_state_changes()` returns `true` iff at least one `process_report` ran during this commit (i.e. `report_state_snapshots` is non-empty).
- In `process_consensus_transactions_and_commit_boundary`, the existing call becomes:

```rust
if self.protocol_config().calculate_validator_scores()
    && output.has_report_state_changes()
{
    self.scoreboard.update_scores(&self.report_aggregator);
}
```

## Why this is safe (cross-validator determinism)

- `Scoreboard::update_scores` is a pure function of `(voting_power, aggregator.reporters_with_voting_power())`. Same inputs → same output across validators.
- All validators see the same `report_state_snapshots` emptiness (consensus is deterministic), so all skip / run in lock-step.
- When all skip, `current_scores` retains its prior `Arc<Vec<u64>>`, which is the same across validators because the previous `update_scores` ran on the same aggregator state.
- `invalid_reports_count` bumps don't affect scores (`reporters_with_voting_power` reads only `received_metrics`), so verify-time bumps that don't mark dirty are also score-irrelevant.

## Startup bootstrap

A freshly constructed `Scoreboard` starts at `[MAX_SCORE; committee_size]`, but a never-restarted peer's scoreboard reflects accumulated updates. Without a fix, the next no-report commit would skip on both, diverging the published vector. So `AuthorityPerEpochStore::new` now calls `scoreboard.update_scores(&report_aggregator)` once, immediately after `restore_from_tables`. If the restored aggregator is empty (fresh epoch), `update_scores` returns `None` and leaves the default, which is correct for the fresh-epoch path too.

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] I have checked that new and existing unit tests pass locally with my changes

Local verification:

- `cargo ci-clippy`: clean.
- `IOTA_SKIP_SIMTESTS=1 cargo nextest run -p iota-core --lib`: 564 tests pass.
- New tests in `scorer.rs`:
  - `test_skipped_update_leaves_current_scores_unchanged`: proves the skip path preserves `current_scores` bit-for-bit.
  - `test_bootstrap_from_restored_aggregator_matches_live_path`: proves a restored-and-bootstrapped scoreboard publishes the same vector as a never-restarted peer's, given the same aggregator state.

## Links to any relevant issues

fixes iotaledger/iota-private#408
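The interaction between the skip gate and the startup bootstrap can be modelled in a few lines. Everything here is a toy under stated assumptions: the scoring rule (`MAX_SCORE` minus a reported count), the value of `MAX_SCORE`, the map-based aggregator, and the `on_commit` helper are invented for illustration. Only the names `update_scores`, `current_scores`, `has_report_state_changes`, and `report_state_snapshots` come from the commit message.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Placeholder value; the real constant is not stated in the source.
pub const MAX_SCORE: u64 = 100;

/// Toy commit output: non-empty snapshots mean a report was processed.
#[derive(Default)]
pub struct CommitOutput {
    pub report_state_snapshots: BTreeMap<u32, u64>,
}

impl CommitOutput {
    pub fn has_report_state_changes(&self) -> bool {
        !self.report_state_snapshots.is_empty()
    }
}

pub struct Scoreboard {
    current: Arc<Vec<u64>>,
}

impl Scoreboard {
    /// A fresh scoreboard starts with every authority at MAX_SCORE.
    pub fn new(committee_size: usize) -> Self {
        Self { current: Arc::new(vec![MAX_SCORE; committee_size]) }
    }

    pub fn current_scores(&self) -> Arc<Vec<u64>> {
        Arc::clone(&self.current)
    }

    /// Toy rule (assumption): score = MAX_SCORE - reported count, saturating.
    /// Returns None and leaves the scores untouched if the aggregator is empty.
    pub fn update_scores(&mut self, aggregator: &BTreeMap<u32, u64>) -> Option<Arc<Vec<u64>>> {
        if aggregator.is_empty() {
            return None;
        }
        let scores: Vec<u64> = (0..self.current.len() as u32)
            .map(|a| MAX_SCORE.saturating_sub(aggregator.get(&a).copied().unwrap_or(0)))
            .collect();
        self.current = Arc::new(scores);
        Some(Arc::clone(&self.current))
    }
}

/// Gate: recompute only when scoring is enabled and the commit carried reports.
/// Returns true if a recompute ran.
pub fn on_commit(
    scoring_enabled: bool,
    output: &CommitOutput,
    scoreboard: &mut Scoreboard,
    aggregator: &BTreeMap<u32, u64>,
) -> bool {
    if scoring_enabled && output.has_report_state_changes() {
        scoreboard.update_scores(aggregator);
        true
    } else {
        false
    }
}
```

In this model a restarted node that rebuilds its aggregator but never calls `update_scores` would keep publishing the all-`MAX_SCORE` default through every no-report commit, which is the divergence the one-shot bootstrap call avoids.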
5d64e8d to
eda8a4d
Compare
Description of change
Umbrella branch landing the Starfish → scoreboard integration on `develop`. Rolls up the following PRs so they ship together:

- feat(starfish-core): introduce ScoringMetricsStore with storage persistence. Versioned `ScoringMetricsStore` with `StorageMisbehaviorCounts` and the underlying RocksDB column family: the persistence substrate everything else builds on.
- feat(starfish-core): implement misbehavior tracking pipeline in DagState. Wires misbehavior tracking through `DagState`: faulty block header detection and classification, peer-vs-author handling, commit-level error classification, and `MisbehaviorStore` shareable via `Arc`. Producer-side plumbing for everything above the consensus layer.
- feat(starfish-core,iota-core): emit misbehavior counts to iota-core. Producer side (`starfish-core`) snapshots per-authority `persisted + in_memory` counts from `MisbehaviorStore` at commit time and threads them through `CommittedSubDag` at all three production call sites (`commit_solidifier`, `commit_observer`, `commit_syncer/fast`). Consumer side (`iota-core`) overrides `ConsensusOutputAPI::misbehavior_counts()` to transpose the struct-of-fields snapshot into the four per-field vectors expected by `ConsensusOutputMisbehaviorCounts`. Downstream wiring in `consensus_handler` / `ReportAggregator` from feat(consensus, iota-core, iota-type): Refactoring of the Scorer module #10779 was already in place and waiting for non-empty data. Also adds an explicit `iota-sdk-types/serde` feature on `starfish-core` (workspace unification through `iota-types` was masking this).
- feat(iota-core): persist ReportAggregator state across restarts. Persists per-validator received-reports tallies to RocksDB via a new `received_reports_state` column family on `AuthorityEpochTables`. Writes use snapshot-at-mutation-time semantics: `process_report` returns the post-merge snapshot, which `ConsensusCommitOutput` captures in `report_state_snapshots` and the quarantine flushes atomically with the rest of the commit's `DBBatch`. Restore via `ReportAggregator::restore_from_tables(&tables)` in `AuthorityPerEpochStore::new`; existing consensus-handler replay reproduces any committed-but-unflushed state thanks to merge-max idempotency. Refs the original implementation by @oliviasaa in feat(iota-core): Persist and restore scorer received reports state across restarts #10366, which couldn't reasonably be rebased post-Starfish migration.
- feat(iota-core): skip scoreboard.update_scores on no-report commits. Gates the per-commit `Scoreboard::update_scores(&report_aggregator)` recompute on `ConsensusCommitOutput::has_report_state_changes()` (true iff ≥1 `process_report` ran in the commit). Cross-validator determinism preserved: `update_scores` is a pure function of `(voting_power, reporters_with_voting_power)`, and all validators see the same snapshot emptiness. Adds a one-shot bootstrap in `AuthorityPerEpochStore::new` so a never-restarted peer and a restored-and-bootstrapped peer publish the same vector given the same aggregator state.

Links to any relevant issues
Part of https://github.com/iotaledger/iota-private/issues/173
Closes:
How the change has been tested
Coverage lives in the component PRs: misbehavior emission transpose + empty-snapshot tests (#11534), persist/restore round trip + post-merge snapshot tests (#11561), skip-path and bootstrap-equivalence tests (#11562). Workspace `cargo ci-clippy` clean; `IOTA_SKIP_SIMTESTS=1 cargo nextest run -p iota-core --lib` passes (562/564 across the stack).
Release Notes
`scoring_metrics` (on the starfish store) and `received_reports_state` (on `AuthorityEpochTables`), used to persist Starfish misbehavior counts and per-validator received-reports tallies across restarts. No migration or operator action required.