
feat(starfish-core,iota-core): Starfish score integration #11569

Open
piotrm50 wants to merge 6 commits into develop from consensus/feat/starfish-score-integration

Conversation

Contributor

@piotrm50 piotrm50 commented May 19, 2026

Description of change

Umbrella branch landing the Starfish → scoreboard integration on develop. Rolls up the following PRs so they ship together:

  • #10088 feat(starfish-core): introduce ScoringMetricsStore with storage persistence. Versioned ScoringMetricsStore with StorageMisbehaviorCounts and the underlying RocksDB column family: the persistence substrate everything else builds on.
  • #11531 feat(starfish-core): implement misbehavior tracking pipeline in DagState. Wires misbehavior tracking through DagState: faulty block header detection and classification, peer-vs-author handling, commit-level error classification, and MisbehaviorStore shareable via Arc. Producer-side plumbing for everything above the consensus layer.
  • #11534 feat(starfish-core,iota-core): emit misbehavior counts to iota-core. Producer side (starfish-core) snapshots per-authority persisted + in_memory counts from MisbehaviorStore at commit time and threads them through CommittedSubDag at all three production call sites (commit_solidifier, commit_observer, commit_syncer/fast). Consumer side (iota-core) overrides ConsensusOutputAPI::misbehavior_counts() to transpose the struct-of-fields snapshot into the four per-field vectors expected by ConsensusOutputMisbehaviorCounts. Downstream wiring in consensus_handler / ReportAggregator from #10779 was already in place and waiting for non-empty data. Also adds an explicit iota-sdk-types/serde feature on starfish-core (workspace unification through iota-types was masking this).
  • #11561 feat(iota-core): persist ReportAggregator state across restarts. Persists per-validator received-reports tallies to RocksDB via a new received_reports_state column family on AuthorityEpochTables. Writes use snapshot-at-mutation-time semantics: process_report returns the post-merge snapshot, which ConsensusCommitOutput captures in report_state_snapshots and the quarantine flushes atomically with the rest of the commit's DBBatch. Restore via ReportAggregator::restore_from_tables(&tables) in AuthorityPerEpochStore::new; existing consensus-handler replay reproduces any committed-but-unflushed state thanks to merge-max idempotency. Refs the original implementation by @oliviasaa in #10366, which couldn't reasonably be rebased post-Starfish migration.
  • #11562 feat(iota-core): skip scoreboard.update_scores on no-report commits. Gates the per-commit Scoreboard::update_scores(&report_aggregator) recompute on ConsensusCommitOutput::has_report_state_changes() (true iff ≥1 process_report ran in the commit). Cross-validator determinism preserved: update_scores is a pure function of (voting_power, reporters_with_voting_power), and all validators see the same snapshot emptiness. Adds a one-shot bootstrap in AuthorityPerEpochStore::new so a never-restarted peer and a restored-and-bootstrapped peer publish the same vector given the same aggregator state.

Links to any relevant issues

Part of https://github.com/iotaledger/iota-private/issues/173

Closes:

How the change has been tested

  • Basic tests (linting, compilation, formatting, unit/integration tests)
  • Patch-specific tests (correctness, functionality coverage)
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that new and existing unit tests pass locally with my changes

Coverage lives in the component PRs: misbehavior emission transpose + empty-snapshot tests (#11534), persist/restore round trip + post-merge snapshot tests (#11561), skip-path and bootstrap-equivalence tests (#11562). Workspace cargo ci-clippy is clean; IOTA_SKIP_SIMTESTS=1 cargo nextest run -p iota-core --lib passes (562 tests at #11561, 564 at #11562).

Release Notes

  • Protocol:
  • Nodes (Validators and Full nodes): Two new RocksDB column families are auto-created on epoch store open — scoring_metrics (on the starfish store) and received_reports_state (on AuthorityEpochTables) — used to persist Starfish misbehavior counts and per-validator received-reports tallies across restarts. No migration or operator action required.
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • gRPC:

@piotrm50 piotrm50 self-assigned this May 19, 2026
@iota-ci iota-ci added the consensus and core-protocol labels May 19, 2026
@piotrm50 piotrm50 force-pushed the consensus/feat/starfish-score-integration branch from 6eb79be to bcaa136 Compare May 20, 2026 10:26
@piotrm50 piotrm50 marked this pull request as ready for review May 20, 2026 10:26
@piotrm50 piotrm50 requested review from a team as code owners May 20, 2026 10:26
@polinikita polinikita self-requested a review May 20, 2026 11:53
Comment thread crates/starfish/core/src/misbehavior_store.rs
@piotrm50 piotrm50 force-pushed the consensus/feat/starfish-score-integration branch from bcaa136 to 5b01491 Compare May 20, 2026 16:30
…stence (#10088)

# Description of change

Introduce misbehavior scoring infrastructure in Starfish and persist
metrics in storage. This merges the scope of the former PR #10086 (types
+ Prometheus metrics) with storage persistence.

## ScoringMetricsStore (formerly #10086)

- **`ScoringMetricsStore`** (`scoring_metrics_store.rs`): holds three
sets of per-authority counters (`current_local_metrics_count`,
`cached_metrics`, `uncached_metrics`). Uses named-field
`StarfishMisbehaviorCounts` struct with explicit
`faulty_blocks_provable`, `faulty_blocks_unprovable`,
`missing_proposals`, `equivocations` fields.
- **Prometheus metrics**: 8 new per-authority metric collectors in
`NodeMetrics`. `faulty_blocks_unprovable` uses `peer` label (not
`authority`) since unprovable faults are attributed to the sending peer,
not the alleged block author.
- The store is **not wired into `Context`** — it will be passed directly
to consumers (`DagState`) when they are wired up in #10127.
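
A minimal sketch of the peer-vs-author attribution described above, assuming `u64` counters and plain `usize` authority indices (both are assumptions; the real field types are not stated here). `record_faulty_block` is a hypothetical helper, not part of `ScoringMetricsStore`:

```rust
/// Named-field counters, mirroring the four fields listed above.
/// The `u64` type is an assumption made for this sketch.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct StarfishMisbehaviorCounts {
    pub faulty_blocks_provable: u64,
    pub faulty_blocks_unprovable: u64,
    pub missing_proposals: u64,
    pub equivocations: u64,
}

/// Hypothetical helper: a provable fault is charged to the alleged block
/// author, an unprovable one to the peer that sent the block.
pub fn record_faulty_block(
    counts: &mut [StarfishMisbehaviorCounts],
    alleged_author: usize,
    sending_peer: usize,
    provable: bool,
) {
    if provable {
        counts[alleged_author].faulty_blocks_provable += 1;
    } else {
        counts[sending_peer].faulty_blocks_unprovable += 1;
    }
}
```

Named fields keep call sites readable (`counts[i].equivocations`) where a positional tuple or array would need a comment to say which slot is which.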

## Storage persistence

- `WriteBatch` extended with `scoring_metrics` field for persisting
score updates atomically with other DAG state.
- `DagState::flush()` calls `score_updates_to_write()` (placeholder,
returns empty) and includes the result in the write batch.
- Storage layer (`mem_store`, `rocksdb_store`) updated to handle scoring
metrics column.
- Store tests updated.

## Links to any relevant issues

Part of iotaledger/iota-private#277
Part of feature branch #10083
Subsumes #10086

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [ ] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have checked that new and existing unit tests pass locally with
my changes

---------

Co-authored-by: Piotr Macek <4007944+piotrm50@users.noreply.github.com>
@piotrm50 piotrm50 force-pushed the consensus/feat/starfish-score-integration branch from 5724504 to 5d64e8d Compare May 29, 2026 11:03
Contributor

@cyberphysic4l cyberphysic4l left a comment


Just reviewed contents of #10088, #11531 and #11534 as I had already reviewed the other two.

piotrm50 and others added 5 commits May 29, 2026 22:35
…ate (#11531)

Port the runtime misbehavior tracking logic onto the `MisbehaviorStore`
types introduced in #10088.

- `MisbehaviorStore` tracks per-authority misbehavior in two buckets
(`in_memory` + `persisted`).
- On flush: recompute the in-memory window, accumulate evicted counts,
write the persisted snapshot to storage.
- On startup: restore persisted counts from RocksDB and recompute
in-memory counts from cached block refs.
- Wire faulty block header detection into `MisbehaviorStore` from every
peer receive site (subscriber main + bundle, header synchronizer
including the own-last-header fetch, and both commit syncers).
- Source-aware classification: `Subscriber` keeps `UnexpectedAuthority`
as Unprovable; commit-chain / fetch-shape errors stay Untracked and
belong to a separate commit-sync metric.
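
The two-bucket flush can be sketched roughly as follows, reduced to one counter for one authority. `TwoBucketCounter`, the round-keyed window, and the `eviction_round` parameter are all hypothetical simplifications; the actual `MisbehaviorStore` recomputes its window from cached block refs:

```rust
use std::collections::BTreeMap;

/// Hypothetical single-counter model of the `in_memory` + `persisted` split.
#[derive(Debug, Default)]
pub struct TwoBucketCounter {
    /// Events still inside the cached window, keyed by round (assumption).
    in_memory: BTreeMap<u32, u64>,
    /// Running total of events already evicted from the window.
    persisted: u64,
}

impl TwoBucketCounter {
    /// Records one misbehavior event observed at `round`.
    pub fn record(&mut self, round: u32) {
        *self.in_memory.entry(round).or_insert(0) += 1;
    }

    /// Evicts every round below `eviction_round`, folds the evicted events
    /// into `persisted`, and returns the value that would be written to storage.
    pub fn flush(&mut self, eviction_round: u32) -> u64 {
        // `split_off` keeps keys < eviction_round in `self.in_memory`
        // and returns the keys >= eviction_round.
        let kept = self.in_memory.split_off(&eviction_round);
        let evicted: u64 = self.in_memory.values().sum();
        self.in_memory = kept;
        self.persisted += evicted;
        self.persisted
    }

    /// Events currently counted in the in-memory window.
    pub fn in_memory_total(&self) -> u64 {
        self.in_memory.values().sum()
    }

    /// Absolute total across both buckets.
    pub fn total(&self) -> u64 {
        self.persisted + self.in_memory_total()
    }
}
```

In this model a flush only moves events between buckets, so `total()` is the same before and after.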

Stacks on top of #10088. Supersedes #10127.

fixes iotaledger/iota-private#278
fixes iotaledger/iota-private#280
Part of iotaledger/iota-private#277

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have checked that new and existing unit tests pass locally with
my changes

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…11534)

# Description of change

Stacks on top of #11531. Wires the producer-side misbehavior store
(landed in #10088 + #11531) into the consumer in `iota-core` so that
the `Scoreboard` / `ReportAggregator` (from #10779) finally receives
per-authority misbehavior signal from Starfish.

Each `CommittedSubDag` now carries a per-authority absolute snapshot of
`persisted + in_memory` counts from `MisbehaviorStore` at commit time.
Consumers diff against their own last-seen state if they want deltas —
that responsibility doesn't belong in Starfish, and the sum is invariant
across the eviction-time move between buckets, so the snapshot is
race-free relative to flush.
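
A consumer that wants deltas could diff successive absolute snapshots along these lines. `DeltaTracker` is a hypothetical name, and the snapshot is flattened to one `u64` per authority for brevity:

```rust
/// Hypothetical consumer-side helper: remembers the last absolute snapshot
/// and turns each new one into per-authority deltas.
#[derive(Debug, Default)]
pub struct DeltaTracker {
    last_seen: Vec<u64>,
}

impl DeltaTracker {
    /// Returns `snapshot[i] - last_seen[i]` per authority (missing entries
    /// count as zero), then stores `snapshot` as the new baseline.
    pub fn observe(&mut self, snapshot: &[u64]) -> Vec<u64> {
        let deltas: Vec<u64> = snapshot
            .iter()
            .enumerate()
            .map(|(i, &now)| {
                let before = self.last_seen.get(i).copied().unwrap_or(0);
                // Saturating: a smaller snapshot is treated as "no change".
                now.saturating_sub(before)
            })
            .collect();
        self.last_seen = snapshot.to_vec();
        deltas
    }
}
```

Because the producer publishes absolute totals, a consumer that misses a commit still recovers the correct cumulative delta on the next one.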

Producer (`starfish-core`):
- `MisbehaviorStore::snapshot_totals()` — returns the per-authority
  absolute totals.
- `CommittedSubDag` gets a new `misbehavior_counts: Vec<MisbehaviorCountsV1>`
  field, threaded at all three production `CommittedSubDag::new` call
  sites (`commit_solidifier`, `commit_observer`, `commit_syncer/fast`)
  via the `DagState`-owned `MisbehaviorStore`.
- `CommittedSubDag` is **not** serialized over the wire (local
  `CommitConsumer` channel only), so the added `Vec` is an in-process
  cost only.

Consumer (`iota-core`):
- Overrides `ConsensusOutputAPI::misbehavior_counts()` on
  `starfish_core::CommittedSubDag` to transpose the per-authority
  struct-of-fields snapshot into the four per-field vecs expected by
  `ConsensusOutputMisbehaviorCounts`. Downstream wiring in
  `consensus_handler` and the report aggregator was already in place
  and waiting for non-empty data.
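
The transpose step might look like the sketch below. `AuthorityCounts` and `PerFieldCounts` are stand-ins for `MisbehaviorCountsV1` and `ConsensusOutputMisbehaviorCounts`; their field names follow the four counters named earlier, but the exact layouts are assumptions:

```rust
/// Stand-in for the per-authority struct-of-fields snapshot (assumed layout).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct AuthorityCounts {
    pub faulty_blocks_provable: u64,
    pub faulty_blocks_unprovable: u64,
    pub missing_proposals: u64,
    pub equivocations: u64,
}

/// Stand-in for the consumer-side shape: one vector per field,
/// indexed by authority (assumed layout).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct PerFieldCounts {
    pub faulty_blocks_provable: Vec<u64>,
    pub faulty_blocks_unprovable: Vec<u64>,
    pub missing_proposals: Vec<u64>,
    pub equivocations: Vec<u64>,
}

/// Transposes an array of structs into a struct of arrays.
/// An empty snapshot yields four empty vectors.
pub fn transpose(snapshot: &[AuthorityCounts]) -> PerFieldCounts {
    PerFieldCounts {
        faulty_blocks_provable: snapshot.iter().map(|c| c.faulty_blocks_provable).collect(),
        faulty_blocks_unprovable: snapshot.iter().map(|c| c.faulty_blocks_unprovable).collect(),
        missing_proposals: snapshot.iter().map(|c| c.missing_proposals).collect(),
        equivocations: snapshot.iter().map(|c| c.equivocations).collect(),
    }
}
```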

Also folds in a standalone fix: `starfish-core` now declares its
`iota-sdk-types` `serde` feature explicitly. Workspace builds were
masking the missing feature via unification through `iota-types`;
`cargo check -p starfish-core` alone fails without it.

## Links to any relevant issues

fixes iotaledger/iota-private#406
Part of iotaledger/iota-private#173

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have checked that new and existing unit tests pass locally with
my changes

Local verification:
- `cargo ci-clippy` — clean.
- `cargo nextest run -p starfish-core --lib` — new
`test_snapshot_totals_sums_persisted_and_in_memory` passes; existing
misbehavior tests pass.
- `cargo nextest run -p iota-core --lib consensus_output_api::tests` —
new transpose + empty-snapshot tests pass.
# Description of change

Persists `ReportAggregator` state to RocksDB so per-validator
received-reports tallies survive node restarts. Stacks on top of #11534.

Refs #10366 (the original
implementation by @oliviasaa, which we couldn't reasonably rebase
post-Starfish migration).

## What gets persisted

- `DBReceivedReportsStatePerAuthority { received_metrics,
invalid_reports_count }` for each authority that has changed state in
this epoch.
- New column family on `AuthorityEpochTables`: `received_reports_state:
DBMap<u32, DBReceivedReportsStatePerAuthority>`.

## How writes work

When `process_report` mutates the aggregator for authority X, it returns
the post-merge snapshot. The consensus handler stashes that snapshot
into `ConsensusCommitOutput.report_state_snapshots[X]` (a `BTreeMap`,
last-writer-wins). When the quarantine flushes the commit's `DBBatch`,
the captured snapshots get written atomically with the rest of the
commit's effects.

This means the persisted row written under commit N's batch is
**exactly** "aggregator state for that authority after commit N's
processing" — independent of when quarantine flushes the batch, and
matching what `scoreboard.update_scores` saw at end of N.
Cross-validator determinism preserved.
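
The last-writer-wins capture can be illustrated with a small model. `CommitOutputSketch`, `capture`, and `rows_to_write` are hypothetical names, and the shape of `received_metrics` is an assumption:

```rust
use std::collections::BTreeMap;

/// Simplified post-merge snapshot for one authority (assumed shape).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ReportStateSnapshot {
    pub received_metrics: Vec<u64>,
    pub invalid_reports_count: u64,
}

/// Hypothetical stand-in for the per-commit output that collects snapshots.
#[derive(Debug, Default)]
pub struct CommitOutputSketch {
    report_state_snapshots: BTreeMap<u32, ReportStateSnapshot>,
}

impl CommitOutputSketch {
    /// Stores the snapshot for `authority`; a later snapshot for the same
    /// authority within the commit replaces the earlier one.
    pub fn capture(&mut self, authority: u32, snapshot: ReportStateSnapshot) {
        self.report_state_snapshots.insert(authority, snapshot);
    }

    /// Rows that would go into the commit's batch, ordered by authority key.
    pub fn rows_to_write(&self) -> Vec<(u32, ReportStateSnapshot)> {
        self.report_state_snapshots
            .iter()
            .map(|(authority, snapshot)| (*authority, snapshot.clone()))
            .collect()
    }
}
```

Since each captured value is already the post-merge state, only the final snapshot per authority needs to reach the batch.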

## How restore works

`AuthorityPerEpochStore::new` calls
`ReportAggregator::restore_from_tables(&tables)` immediately after
constructing the aggregator. The existing consensus-handler replay path
then reprocesses any committed-but-unflushed commits; merge-max
idempotency reproduces the same end state.
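
Why replay on top of restored state is harmless can be seen from a toy element-wise max merge. This is a sketch assuming the aggregator state is a flat `Vec<u64>`; `merge_max` here is illustrative, not the real `process_report`:

```rust
/// Toy merge: each slot keeps the maximum value ever reported for it.
/// Applying the same report twice leaves the state unchanged.
pub fn merge_max(state: &mut Vec<u64>, report: &[u64]) {
    if state.len() < report.len() {
        state.resize(report.len(), 0);
    }
    for (slot, &reported) in state.iter_mut().zip(report) {
        *slot = (*slot).max(reported);
    }
}
```

A node that restores a state already containing some reports and then replays all of them ends in the same state as a node that applied each report exactly once.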

## Explicit non-goals

- **Score-update optimization** (recalc only when ≥1 report seen in
commit) — tracked separately.
- **Verify-time invalid-count durability**.
`verify_consensus_transaction`-time `increment_invalid_reports_count`
bumps stay in-memory only. They land on disk the next time a
`process_report` for the same authority captures a fresh snapshot.
Counter is observability-only; replay re-runs verify on the recovery
path, so correctness is preserved. Documented in code.
- **Separate replay logic for unflushed commits**. Not needed — the
existing consensus-handler recovery path replays for free, and merge-max
is idempotent.

## Links to any relevant issues

fixes iotaledger/iota-private#310

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have checked that new and existing unit tests pass locally with
my changes

Local verification:
- `cargo ci-clippy` — clean.
- `IOTA_SKIP_SIMTESTS=1 cargo nextest run -p iota-core --lib` — 562
tests pass.
- New tests:
  - `test_restore_from_iter_populates_in_memory_state`
  - `test_restore_from_iter_rejects_out_of_range_authority`
  - `test_process_report_returns_post_merge_snapshot`
  - `test_restore_round_trip_through_dbmap`: full persist→restore round
    trip via `AuthorityEpochTables::open` with a tempdir.
# Description of change

Adds two per-authority Prometheus gauges in `AuthorityMetrics`, both
labelled by hostname and published from `ConsensusHandler` after every
consensus commit and seeded at construction so the series exist from
epoch start:

- `validator_scoreboard_scores` — sourced from
`Scoreboard::current_scores()`. Resolves the gap where Scoreboard scores
were computed and consumed by the checkpoint service but not observable
to operators.
- `invalid_misbehavior_reports_by_authority` — moved from
`starfish-core::NodeMetrics` (where it was registered but never written)
to `AuthorityMetrics`, switched to `IntGaugeVec`, and sourced from the
new `ReportAggregator::invalid_reports_counts()` snapshot.

Both gauges are gated on `calculate_validator_scores()` and cleared in
`reset_on_reconfigure`.

Adds `dev-tools/grafana-local/dashboards/validator-dashboard-v2.json`
covering the new gauges alongside the existing starfish misbehavior
metrics (provable/unprovable block faults, missing proposals,
equivocations). The four misbehavior panels stack `in_memory` and
`persisted` series so the sum is the running total. Intended to grow
into the replacement for the legacy validator dashboard.

## Links to any relevant issues

Closes iotaledger/iota-private#282

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [ ] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have checked that new and existing unit tests pass locally with
my changes
…11562)

# Description of change

Skip the per-commit `Scoreboard::update_scores(&report_aggregator)`
recompute when no `MisbehaviorReport` was processed in the commit.
Stacks on top of #11561 (which introduces
`ConsensusCommitOutput.report_state_snapshots` — the signal we gate on).

## How the gate works

- `ConsensusCommitOutput::has_report_state_changes()` returns `true` iff
at least one `process_report` ran during this commit (i.e.
`report_state_snapshots` is non-empty).
- In `process_consensus_transactions_and_commit_boundary`, the existing
call becomes:
  ```rust
  if self.protocol_config().calculate_validator_scores()
      && output.has_report_state_changes()
  {
      self.scoreboard.update_scores(&self.report_aggregator);
  }
  ```
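
For illustration, the gate predicate can be modelled in isolation. `CommitOutputSketch`, `record_report`, and `should_update_scores` are hypothetical names, and the snapshot payload is reduced to a `u64`:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the commit output; only the snapshot map matters here.
#[derive(Debug, Default)]
pub struct CommitOutputSketch {
    report_state_snapshots: BTreeMap<u32, u64>,
}

impl CommitOutputSketch {
    /// Called once per processed report; the payload is a placeholder.
    pub fn record_report(&mut self, authority: u32, snapshot: u64) {
        self.report_state_snapshots.insert(authority, snapshot);
    }

    /// True iff at least one report was processed in this commit.
    pub fn has_report_state_changes(&self) -> bool {
        !self.report_state_snapshots.is_empty()
    }
}

/// The recompute runs only when scoring is enabled and the commit
/// actually changed the aggregator.
pub fn should_update_scores(scoring_enabled: bool, output: &CommitOutputSketch) -> bool {
    scoring_enabled && output.has_report_state_changes()
}
```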

## Why this is safe (cross-validator determinism)

- `Scoreboard::update_scores` is a pure function of `(voting_power,
aggregator.reporters_with_voting_power())`. Same inputs → same output
across validators.
- All validators see the same `report_state_snapshots` emptiness
(consensus is deterministic), so all skip / run in lock-step.
- When all skip, `current_scores` retains its prior `Arc<Vec<u64>>` —
same across validators because the previous `update_scores` ran on the
same aggregator state.
- `invalid_reports_count` bumps don't affect scores
(`reporters_with_voting_power` reads only `received_metrics`), so
verify-time bumps that don't mark dirty are also score-irrelevant.

## Startup bootstrap

A freshly constructed `Scoreboard` starts at `[MAX_SCORE;
committee_size]`, whereas a never-restarted peer's scoreboard reflects its
accumulated updates. Without a fix, the next no-report commit would skip
the recompute on both, and the two peers would publish different
vectors. So `AuthorityPerEpochStore::new` now
calls `scoreboard.update_scores(&report_aggregator)` once, immediately
after `restore_from_tables`. If the restored aggregator is empty (fresh
epoch), `update_scores` returns `None` and leaves the default — correct
for the fresh-epoch path too.
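
The bootstrap argument can be checked on a toy model. Everything below is hypothetical: the `MAX_SCORE` value, the aggregator shape, and the scoring rule (score = `MAX_SCORE` minus reported voting power) are placeholders chosen only to make the example concrete:

```rust
/// Placeholder ceiling; the real `MAX_SCORE` value is not given in the text.
pub const MAX_SCORE: u64 = 100;

/// Toy aggregator: per authority, the voting power that has reported it.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct ToyAggregator {
    pub reported_power: Vec<u64>,
}

/// Toy scoreboard holding the currently published score vector.
#[derive(Debug, PartialEq, Eq)]
pub struct ToyScoreboard {
    pub current_scores: Vec<u64>,
}

impl ToyScoreboard {
    /// A fresh scoreboard publishes `MAX_SCORE` for every authority.
    pub fn new(committee_size: usize) -> Self {
        Self { current_scores: vec![MAX_SCORE; committee_size] }
    }

    /// Pure function of the aggregator. Returns `None` and leaves the
    /// scores untouched when the aggregator holds no reports.
    pub fn update_scores(&mut self, aggregator: &ToyAggregator) -> Option<()> {
        if aggregator.reported_power.iter().all(|&p| p == 0) {
            return None;
        }
        // Placeholder rule, not the real scoring function.
        self.current_scores = aggregator
            .reported_power
            .iter()
            .map(|&p| MAX_SCORE.saturating_sub(p))
            .collect();
        Some(())
    }
}
```

In this model a restored scoreboard differs from the live one until `update_scores` is called once on the restored aggregator, after which the two are equal; an empty aggregator leaves the defaults in place.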

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have checked that new and existing unit tests pass locally with
my changes

Local verification:
- `cargo ci-clippy` — clean.
- `IOTA_SKIP_SIMTESTS=1 cargo nextest run -p iota-core --lib` — 564
tests pass.
- New tests in `scorer.rs`:
  - `test_skipped_update_leaves_current_scores_unchanged`: proves the
    skip path preserves `current_scores` bit-for-bit.
  - `test_bootstrap_from_restored_aggregator_matches_live_path`: proves a
    restored-and-bootstrapped scoreboard publishes the same vector as a
    never-restarted peer's, given the same aggregator state.

## Links to any relevant issues

fixes iotaledger/iota-private#408
@piotrm50 piotrm50 force-pushed the consensus/feat/starfish-score-integration branch from 5d64e8d to eda8a4d Compare May 29, 2026 20:36

Labels

consensus, core-protocol


7 participants