feat(starfish-core): implement misbehavior tracking pipeline in DagState #11531

Merged
piotrm50 merged 15 commits into consensus/feat/starfish-score-integration from protocol-research/feat/score-integration-starfish-dag-state on May 19, 2026

Conversation

Contributor

@piotrm50 piotrm50 commented May 14, 2026

Description of change

Port the runtime misbehavior tracking logic onto the MisbehaviorStore types introduced in #10088.

  • MisbehaviorStore tracks per-authority misbehavior in two buckets (in_memory + persisted).
  • On flush: recompute the in-memory window, accumulate evicted counts, write the persisted snapshot to storage.
  • On startup: restore persisted counts from RocksDB and recompute in-memory counts from cached block refs.
  • Wire faulty block header detection into MisbehaviorStore from every peer receive site (subscriber main + bundle, header synchronizer including the own-last-header fetch, and both commit syncers).
  • Source-aware classification: Subscriber keeps UnexpectedAuthority as Unprovable; commit-chain / fetch-shape errors stay Untracked and belong to a separate commit-sync metric.

Stacks on top of #10088. Supersedes #10127.

Links to any relevant issues

fixes https://github.com/iotaledger/iota-private/issues/278
fixes https://github.com/iotaledger/iota-private/issues/280
Part of https://github.com/iotaledger/iota-private/issues/277

How the change has been tested

  • Basic tests (linting, compilation, formatting, unit/integration tests)
  • Patch-specific tests (correctness, functionality coverage)
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that new and existing unit tests pass locally with my changes

@iota-ci iota-ci added the consensus (Issues related to the Core Consensus team) and core-protocol labels May 14, 2026
@piotrm50 piotrm50 self-assigned this May 14, 2026
@piotrm50 piotrm50 force-pushed the protocol-research/feat/score-integration-starfish-storage branch from d0d4a00 to b53b4fe Compare May 14, 2026 13:44
@piotrm50 piotrm50 force-pushed the protocol-research/feat/score-integration-starfish-dag-state branch from 22ee8a7 to e4ad9e7 Compare May 14, 2026 13:45
piotrm50 added a commit that referenced this pull request May 14, 2026
…ten gauge updates

Address pending review comments on PR #11531:

- Drop `ErrorSource` enum and `classify_for_source`. With Provable-stays-
  Provable across all sources, the only source-specific case was the
  Subscriber's `UnexpectedAuthority` mapping; fold it into
  `classify_block_header_error` so every classifier path goes through one
  function. `record_faulty_block_header` no longer takes a source argument.
- Pass `context: &Context` to `record_faulty_block_header` and bump the
  in_memory gauges (`faulty_blocks_provable_by_authority`,
  `faulty_blocks_unprovable_by_peer` with `source="in_memory"`) on every
  recorded fault. `flush_faulty_block_header_buffer` resets them to zero,
  matching the existing pattern for missing_proposals and equivocations.
- Drop the verify_fetched_headers recording in commit_syncer fast/regular
  paths — those errors are fetch-shape and classify as Untracked today.
  Replace with TODO pointing at the recording entry point for when
  per-header faults become observable there.
- Move `misbehavior_store.reset()` to step 1 of `reinitialize` alongside
  the other in-memory cache clears.
@piotrm50 piotrm50 force-pushed the protocol-research/feat/score-integration-starfish-dag-state branch 3 times, most recently from 898f11d to faa517a Compare May 15, 2026 09:11
@piotrm50 piotrm50 force-pushed the protocol-research/feat/score-integration-starfish-dag-state branch 2 times, most recently from 32db939 to d867aca Compare May 15, 2026 13:01
@piotrm50 piotrm50 marked this pull request as ready for review May 15, 2026 13:09
@piotrm50 piotrm50 requested a review from a team as a code owner May 15, 2026 13:10
Base automatically changed from protocol-research/feat/score-integration-starfish-storage to consensus/feat/starfish-score-integration May 15, 2026 16:35
piotrm50 and others added 12 commits May 15, 2026 18:50
Port the runtime misbehavior tracking logic from PR #10127 onto

MisbehaviorStore tracks per-authority misbehavior in two buckets:
- in_memory: recomputed on each flush from blocks in the DAG cache
- persisted: cumulative counts from evicted blocks, written to RocksDB

On flush, update_misbehavior_counts_on_eviction recomputes the
in-memory window, accumulates newly-evicted counts into persisted,
and returns a snapshot for atomic storage writes.

On startup, persisted counts are restored from storage and in-memory
counts are recomputed from cached block refs.
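
A minimal sketch of the two-bucket bookkeeping described above, not the actual `MisbehaviorStore` code. The `Counts` fields, the `TwoBucketStore` type and the `flush` / `restore` / `total` signatures are assumptions made for illustration; the real store also handles locking, gauges and RocksDB writes, which are left out here.

```rust
/// Hypothetical per-authority counters; the real struct may carry other fields.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Counts {
    pub missing_proposals: u64,
    pub equivocations: u64,
}

impl Counts {
    fn add(&mut self, other: Counts) {
        self.missing_proposals += other.missing_proposals;
        self.equivocations += other.equivocations;
    }
}

/// Two buckets per authority, indexed by authority position in the committee.
#[derive(Debug)]
pub struct TwoBucketStore {
    /// Recomputed on each flush from blocks still in the DAG cache.
    in_memory: Vec<Counts>,
    /// Cumulative counts from evicted blocks.
    persisted: Vec<Counts>,
}

impl TwoBucketStore {
    pub fn new(committee_size: usize) -> Self {
        Self {
            in_memory: vec![Counts::default(); committee_size],
            persisted: vec![Counts::default(); committee_size],
        }
    }

    /// On flush: accumulate newly evicted counts into `persisted`, replace the
    /// in-memory window with counts recomputed from the cache, and return the
    /// persisted snapshot so the caller can write it in the same storage batch.
    /// Both slices are assumed to have one entry per authority.
    pub fn flush(&mut self, evicted: &[Counts], still_cached: &[Counts]) -> Vec<Counts> {
        for (total, gone) in self.persisted.iter_mut().zip(evicted) {
            total.add(*gone);
        }
        self.in_memory = still_cached.to_vec();
        self.persisted.clone()
    }

    /// On startup: persisted counts come from storage, in-memory counts are
    /// recomputed from cached block refs. Overwrites rather than adds.
    pub fn restore(&mut self, persisted: Vec<Counts>, recomputed: Vec<Counts>) {
        self.persisted = persisted;
        self.in_memory = recomputed;
    }

    /// Total observed misbehavior for one authority across both buckets.
    pub fn total(&self, authority: usize) -> Counts {
        let mut sum = self.persisted[authority];
        sum.add(self.in_memory[authority]);
        sum
    }
}
```

In this sketch a block's counts live in exactly one bucket at a time, so moving them from the window to `persisted` on eviction does not change `total`.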

Type renames for consistency:
- StarfishMisbehaviorCounts → CommitteeMisbehaviorCounts (Mutex-based)
- StorageScoringMetrics → MisbehaviorCounts (versioned enum)
- WriteBatch.scoring_metrics → WriteBatch.misbehavior_counts
- scan_scoring_metrics → scan_misbehavior_counts

Add error classification for faulty block headers, ported from the old
Mysticeti consensus crate. Errors are classified as:
- Provable: valid signature + protocol violation (author is guilty)
- Unprovable: can't prove authorship (bad sig, sync path)
- Untracked: not misbehavior (e.g. BlockRejected)

Classification is source-aware: subscriber errors keep provable status,
while synchronizer/commit_syncer errors are downgraded to unprovable.
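
The three-way classification and the source-dependent downgrade described in this commit could look roughly like the sketch below. The `HeaderError` enum is a hypothetical stand-in for the consensus error type: `TooManyAncestors`, `InvalidTransaction`, `WrongEpoch` and `BlockRejected` are named in the commit messages of this PR, while `BadSignature` is an invented variant name. The function signatures are assumptions as well.

```rust
/// Hypothetical subset of header verification errors.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HeaderError {
    TooManyAncestors,
    InvalidTransaction,
    BadSignature, // invented name for "signature did not verify"
    WrongEpoch,
    BlockRejected,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FaultType {
    /// Valid signature plus protocol violation: the author is guilty.
    Provable,
    /// Authorship cannot be proven (bad signature, context-dependent error).
    Unprovable,
    /// Not misbehavior at all.
    Untracked,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ErrorSource {
    Subscriber,
    Synchronizer,
    CommitSyncer,
}

/// Source-independent classification of a single error.
pub fn classify(err: HeaderError) -> FaultType {
    match err {
        HeaderError::TooManyAncestors | HeaderError::InvalidTransaction => FaultType::Provable,
        HeaderError::BadSignature | HeaderError::WrongEpoch => FaultType::Unprovable,
        HeaderError::BlockRejected => FaultType::Untracked,
    }
}

/// Source-aware step as described in this commit: provable faults seen on a
/// sync path are downgraded to unprovable; everything else is unchanged.
pub fn classify_for_source(err: HeaderError, source: ErrorSource) -> FaultType {
    match (classify(err), source) {
        (FaultType::Provable, ErrorSource::Synchronizer | ErrorSource::CommitSyncer) => {
            FaultType::Unprovable
        }
        (fault, _) => fault,
    }
}
```

Keeping the downgrade in a separate function from the base classification makes it easy to remove the source dimension without touching the per-error mapping.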

Add record_faulty_block_header() which buffers events in the in_memory
bucket. On flush, buffered counts are moved to persisted and written
to storage, ensuring faulty block header counts are never lost.

Add new_with_misbehavior_store() constructor to DagState that accepts
an Arc<MisbehaviorStore>, allowing the store to be shared with other
components (authority_service, synchronizers). The existing new()
constructor is kept for backward compatibility (creates its own store).

authority_node now creates the MisbehaviorStore and passes it to
DagState. Add reset() method for fast sync reinitialization.
…rStore

Call record_faulty_block_header from the three detection points:
- authority_service: subscriber path (ErrorSource::Subscriber)
- header_synchronizer: fetched headers path (ErrorSource::Synchronizer)
- commit_syncer: fetch_loop error handler (ErrorSource::CommitSyncer)

For the commit syncer, add misbehavior_store to the shared Inner struct
so both FastCommitSyncer and RegularCommitSyncer can access it.
The fetch_loop's Ok(Err(e)) branch now records the fault before the
error is consumed by the into() conversion to a static label.

Update all test call sites to pass the new misbehavior_store argument
to HeaderSynchronizer::start and the two syncer constructors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add unit tests for the error classification and record_faulty_block_header
pipeline in scoring_metrics_store:

- test_subscriber_provable_errors: protocol-rule violations (TooManyAncestors,
  TooManyTransactions, InvalidTransaction) stay provable via Subscriber
- test_subscriber_unprovable_errors: signature/context errors (WrongEpoch,
  UnexpectedGenesisHeader, UnexpectedAuthority) are unprovable
- test_subscriber_untracked_errors: BlockRejected increments nothing
- test_synchronizer_downgrades_provable_to_unprovable: provable errors
  become unprovable when coming from the synchronizer path
- test_synchronizer_specific_unprovable_errors: UnexpectedNumberOfHeadersFetched
  is directly classified as unprovable
- test_commit_syncer_downgrades_provable_to_unprovable: same downgrade via
  the commit syncer path
- test_commit_syncer_specific_unprovable_errors: NoCommitReceived and
  FetchedTransactionsMismatch are classified as unprovable

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ibution

record_faulty_block_header now takes separate `peer` and `author`
arguments and chooses which to charge based on fault type:
- Provable (valid signature, protocol violation) → charged to `author`
  because the block header itself is cryptographic proof of their fault.
- Unprovable (bad signature, context-dependent) → charged to `peer`
  because we cannot verify the author field.

This fixes a bug in the additional-block-headers subscriber path where
verify() failures on headers authored by any validator were being charged
to the claimed author even when the signature was invalid (unprovable),
meaning we couldn't trust the author field at all.

For synchronizer and commit_syncer paths, all errors are already
downgraded to unprovable, so passing peer==author has no effect but
makes the intent explicit.
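
The attribution rule from this commit message can be sketched as a small pure function. `AuthorityIndex` as a plain `usize` and the name `charged_authority` are assumptions for illustration; the real `record_faulty_block_header` increments counters rather than returning an index.

```rust
/// Hypothetical alias; the real code uses a dedicated index type.
pub type AuthorityIndex = usize;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FaultType {
    Provable,
    Unprovable,
    Untracked,
}

/// Decide who is charged for a faulty header, following the rule in the
/// commit message: provable faults go to the signed author, unprovable
/// faults go to the peer that served the header, untracked faults to nobody.
pub fn charged_authority(
    fault: FaultType,
    peer: AuthorityIndex,
    author: AuthorityIndex,
) -> Option<AuthorityIndex> {
    match fault {
        // The header itself is cryptographic proof against its author.
        FaultType::Provable => Some(author),
        // The author field cannot be trusted, so charge the sender.
        FaultType::Unprovable => Some(peer),
        FaultType::Untracked => None,
    }
}
```

With `peer == author`, as on the synchronizer and commit syncer paths described above, both tracked branches yield the same index.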

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…it syncer

The commit syncer's fetch_loop passes all ConsensusErrors to
record_faulty_block_header, but not all errors from verify_commits
are block header faults. Commit-level errors (MalformedCommit,
UnexpectedStartCommit, UnexpectedCommitSequence, NoCommitReceived,
NotEnoughCommitVotes, TooManyCommitsFromPeer, etc.) are faults about
the commit chain, not block headers, so they should not increment
faulty_blocks_unprovable.

Change classify_commit_syncer_error to return Untracked for these
commit-level errors. Only genuine block header errors that reach the
commit syncer (e.g. invalid voting block header signature via
block_verifier.verify) are classified as Unprovable and counted.

Update test to verify commit-level errors produce no counts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Audit all paths where peer-provided block headers are deserialized or
verified and ensure misbehavior is recorded in every one:

- authority_service: record MalformedHeader on bcs::from_bytes (main
  block + additional bundle headers)
- header_synchronizer: record MalformedHeader in verify_block_headers
  and wire misbehavior_store through start_fetch_own_last_block_header_task
- commit_syncer/mod: record MalformedHeader in verify_commits
- commit_syncer/regular & fast: record errors from verify_fetched_headers
  on the serving peer

Simplify classification: collapse classify_subscriber_error and
classify_sync_error into a single classify_for_source. Commit-chain and
fetch-shape errors (MalformedCommit, UnexpectedNumberOfHeadersFetched,
UnexpectedBlockHeaderForCommit, FetchedTransactionsMismatch, etc.) now
fall through to Untracked via classify_block_header_error's catch-all;
they are protocol-level peer faults that belong to a separate
commit-sync metric, not the per-header faulty_blocks counters.

ErrorSource derives Clone only in test builds.
…ten gauge updates

Address pending review comments on PR #11531:

- Drop `ErrorSource` enum and `classify_for_source`. With Provable-stays-
  Provable across all sources, the only source-specific case was the
  Subscriber's `UnexpectedAuthority` mapping; fold it into
  `classify_block_header_error` so every classifier path goes through one
  function. `record_faulty_block_header` no longer takes a source argument.
- Pass `context: &Context` to `record_faulty_block_header` and bump the
  in_memory gauges (`faulty_blocks_provable_by_authority`,
  `faulty_blocks_unprovable_by_peer` with `source="in_memory"`) on every
  recorded fault. `flush_faulty_block_header_buffer` resets them to zero,
  matching the existing pattern for missing_proposals and equivocations.
- Drop the verify_fetched_headers recording in commit_syncer fast/regular
  paths — those errors are fetch-shape and classify as Untracked today.
  Replace with TODO pointing at the recording entry point for when
  per-header faults become observable there.
- Move `misbehavior_store.reset()` to step 1 of `reinitialize` alongside
  the other in-memory cache clears.
- Gate `DagState::new` behind `#[cfg(test)]` (production uses
  `new_with_misbehavior_store` to share the misbehavior store).
- Drop the unused test-only `MisbehaviorStore::dummy_for_test` and
  `DagState::evicted_rounds` helpers.
- Remove redundant `use bcs;` in `scoring_metrics_store::tests`.
- Use `BTreeMap::keys()` instead of `iter().map(|(k, _)| ..)` in
  DagState's flush trace log.
- Drop redundant clones at `DagState::new` / `RegularCommitSyncer::new`
  call sites where the source values are dropped without further use.
…olve gauges

Replace the generic `get` + `update` combinators on
`CommitteeMisbehaviorCounts` with a fixed set of named methods —
`record_block_fault_provable/unprovable`, `drain_block_faults`,
`add_block_faults`, `set_block_faults`, `set_dag_faults`,
`add_dag_faults`, `snapshot`. The names spell out which counter
family the operation touches (block faults vs DAG-observed faults
like missing proposals / equivocations) and which arithmetic
discipline applies (overwrite vs additive).

Three consequences:

- The flush path no longer has a TOCTOU window. The previous
  `get → check → update(0)` sequence in
  `flush_faulty_block_header_buffer` could lose increments from
  concurrent `record_faulty_block_header` calls on network threads;
  `drain_block_faults` now reads-and-zeroes the counters under a
  single lock acquisition and the Prometheus gauges follow in the
  same critical section. Counter state and gauge readings can no
  longer disagree.

- Restore-from-storage in `initialize_misbehavior_counts` is now
  idempotent for both fault families. The old impl used `=` for
  faulty counts but `+=` for missing/equivocation; the latter would
  silently double if a future caller forgot to `reset()` first.
  Both paths now use `set_*` semantics.

- `(metrics, hostname)` no longer threads through every call site.
  `CommitteeMisbehaviorCounts::new` resolves the four Prometheus
  gauge handles per authority once, stored in a new
  `AuthorityGauges` struct. `MisbehaviorStore::new(&Context)`
  replaces `MisbehaviorStore::new(committee_size)` and gives
  internal methods the context they need at construction time.
  `record_faulty_block_header` and `update_misbehavior_counts_on_eviction`
  drop their per-call `&Context` / `hostname` args. 30+ call sites
  across `authority_service`, `header_synchronizer`,
  `commit_syncer`, `authority_node`, and `dag_state` simplify
  correspondingly.
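
A rough sketch of the first two consequences, assuming a single `Mutex` per authority. The method names come from the commit message above; the `BlockFaults` struct, the field names and the exact signatures are assumptions, and the Prometheus gauge updates that the real code performs in the same critical section are omitted.

```rust
use std::sync::Mutex;

/// Hypothetical counter pair for block-header faults.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct BlockFaults {
    pub provable: u64,
    pub unprovable: u64,
}

/// Per-authority counters guarded by one lock.
#[derive(Debug, Default)]
pub struct AuthorityCounts {
    inner: Mutex<BlockFaults>,
}

impl AuthorityCounts {
    pub fn record_block_fault_provable(&self) {
        self.inner.lock().unwrap().provable += 1;
    }

    pub fn record_block_fault_unprovable(&self) {
        self.inner.lock().unwrap().unprovable += 1;
    }

    /// Read and zero the counters under a single lock acquisition, so a
    /// concurrent `record_*` call lands either before the drain (and is
    /// returned) or after it (and is kept for the next flush).
    pub fn drain_block_faults(&self) -> BlockFaults {
        std::mem::take(&mut *self.inner.lock().unwrap())
    }

    /// Overwrite semantics: calling this twice with the same value is a no-op,
    /// which is what makes restore-from-storage idempotent.
    pub fn set_block_faults(&self, value: BlockFaults) {
        *self.inner.lock().unwrap() = value;
    }

    /// Additive semantics, for accumulating drained counts into a total.
    pub fn add_block_faults(&self, delta: BlockFaults) {
        let mut guard = self.inner.lock().unwrap();
        guard.provable += delta.provable;
        guard.unprovable += delta.unprovable;
    }

    pub fn snapshot(&self) -> BlockFaults {
        *self.inner.lock().unwrap()
    }
}
```

The contrast with a `get`, check, `update(0)` sequence is that those three steps each take the lock separately, so an increment arriving between `get` and `update(0)` would be overwritten.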

Also tightens the `FaultType::Unprovable` doc to cover the
pre-signature `UnexpectedAuthority` case, and the
`flush_faulty_block_header_buffer` test comment to spell out the
second condition under which it returns `None`.
…ily to misbehavior_*

The runtime type was renamed `ScoringMetricsStore` → `MisbehaviorStore`
and the storage type `StorageScoringMetrics` → `MisbehaviorCounts` in
the earlier commit; finish that rename for two name stragglers:

- Move `scoring_metrics_store.rs` → `misbehavior_store.rs` via
  `git mv`, and update the module path everywhere
  (`crate::scoring_metrics_store` → `crate::misbehavior_store`).
- Rename the RocksDB column family string `"scoring_metrics"` →
  `"misbehavior_counts"`. Safe to change because the column family
  was introduced in PR #10088, which has not merged.
…ent imports

`mod misbehavior_store` and the cascading `misbehavior_store::*` use
lines were declared out of alphabetical order, which `cargo +nightly
fmt` (and CI's rustfmt check) flags. No code change.
@piotrm50 piotrm50 force-pushed the protocol-research/feat/score-integration-starfish-dag-state branch from 067b189 to a641ba3 Compare May 15, 2026 17:10
…o::test]

The 4 tests that invoke `Context::new_for_test` (`test_provable_errors`,
`test_unprovable_errors`, `test_untracked_errors`,
`test_provable_fault_charges_both_author_and_serving_peer`) were marked
`#[test]`. `Context::new_for_test` transitively touches tokio runtime
APIs; under plain `cargo test` that resolves benignly, but under
`cargo simtest` msim intercepts tokio and `current_node()` panics at
`msim/src/sim/runtime/context.rs:15` because plain `#[test]` runs
outside any simulator node context.

Matches the existing convention used by other tests in the file that
call `Context::new_for_test` (`test_update_misbehavior_counts_on_eviction_edge_cases`,
`test_metrics_flush_and_recovery`, `test_no_double_counting_on_restart`).
Also declares the `iota-sdk-types` serde feature on starfish-core's own
dependency so the crate builds standalone (workspace builds were
masking this via feature unification through `iota-types`).
Comment thread crates/starfish/core/src/misbehavior_store.rs
@piotrm50 piotrm50 requested a review from a team May 18, 2026 12:11
Keep the counter-and-gauge-always-agree invariant: reset() now zeros
all four Prometheus gauges under the same per-authority lock that
clears the counters.
@piotrm50 piotrm50 merged commit 98bbb57 into consensus/feat/starfish-score-integration May 19, 2026
35 checks passed
@piotrm50 piotrm50 deleted the protocol-research/feat/score-integration-starfish-dag-state branch May 19, 2026 11:33
piotrm50 added a commit that referenced this pull request May 19, 2026
…11534)

# Description of change

Stacks on top of #11531. Wires the producer-side misbehavior store
(landed in #10088 + #11531) into the consumer in `iota-core` so that
the `Scoreboard` / `ReportAggregator` (from #10779) finally receives
per-authority misbehavior signal from Starfish.

Each `CommittedSubDag` now carries a per-authority absolute snapshot of
`persisted + in_memory` counts from `MisbehaviorStore` at commit time.
Consumers diff against their own last-seen state if they want deltas —
that responsibility doesn't belong in Starfish, and the sum is invariant
across the eviction-time move between buckets, so the snapshot is
race-free relative to flush.
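
To illustrate why the absolute snapshot is stable across a flush and how a consumer might derive deltas, here is a small sketch. The `Counts` fields and the free functions `snapshot_totals` and `delta` are hypothetical (the real `snapshot_totals` is a method on `MisbehaviorStore`), and the use of `saturating_sub` is a defensive choice of this example rather than something the PR specifies.

```rust
/// Hypothetical two-field counter struct, one instance per authority.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Counts {
    pub faulty_blocks_provable: u64,
    pub missing_proposals: u64,
}

/// Producer side: per-authority absolute totals, `persisted + in_memory`.
pub fn snapshot_totals(persisted: &[Counts], in_memory: &[Counts]) -> Vec<Counts> {
    persisted
        .iter()
        .zip(in_memory)
        .map(|(p, m)| Counts {
            faulty_blocks_provable: p.faulty_blocks_provable + m.faulty_blocks_provable,
            missing_proposals: p.missing_proposals + m.missing_proposals,
        })
        .collect()
}

/// Consumer side: difference between the current snapshot and the last one
/// seen. Saturating subtraction guards against a snapshot that went backwards.
pub fn delta(last_seen: &[Counts], current: &[Counts]) -> Vec<Counts> {
    current
        .iter()
        .zip(last_seen)
        .map(|(now, before)| Counts {
            faulty_blocks_provable: now
                .faulty_blocks_provable
                .saturating_sub(before.faulty_blocks_provable),
            missing_proposals: now.missing_proposals.saturating_sub(before.missing_proposals),
        })
        .collect()
}
```

Moving counts from `in_memory` to `persisted` changes both inputs of `snapshot_totals` by equal and opposite amounts, so the returned totals are the same before and after the move.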

Producer (`starfish-core`):
- `MisbehaviorStore::snapshot_totals()` — returns the per-authority
  absolute totals.
- `CommittedSubDag` gets a new `misbehavior_counts: Vec<MisbehaviorCountsV1>`
  field, threaded at all three production `CommittedSubDag::new` call
  sites (`commit_solidifier`, `commit_observer`, `commit_syncer/fast`)
  via the `DagState`-owned `MisbehaviorStore`.
- `CommittedSubDag` is **not** serialized over the wire (local
  `CommitConsumer` channel only), so the added `Vec` is an in-process
  cost only.

Consumer (`iota-core`):
- Overrides `ConsensusOutputAPI::misbehavior_counts()` on
  `starfish_core::CommittedSubDag` to transpose the per-authority
  struct-of-fields snapshot into the four per-field vecs expected by
  `ConsensusOutputMisbehaviorCounts`. Downstream wiring in
  `consensus_handler` and the report aggregator was already in place
  and waiting for non-empty data.
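
The transpose step amounts to an array-of-structs to struct-of-arrays conversion. The sketch below assumes the four fields are the two faulty-block counters plus missing proposals and equivocations, based on the counter names used elsewhere in this PR; the struct names and the `transpose` function are hypothetical and do not reflect the actual `ConsensusOutputMisbehaviorCounts` layout.

```rust
/// Hypothetical per-authority snapshot entry (struct of fields).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct AuthorityCounts {
    pub faulty_blocks_provable: u64,
    pub faulty_blocks_unprovable: u64,
    pub missing_proposals: u64,
    pub equivocations: u64,
}

/// Hypothetical consumer-side shape: one vec per field, indexed by authority.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct PerFieldCounts {
    pub faulty_blocks_provable: Vec<u64>,
    pub faulty_blocks_unprovable: Vec<u64>,
    pub missing_proposals: Vec<u64>,
    pub equivocations: Vec<u64>,
}

/// Transpose a per-authority snapshot into four per-field vectors.
/// An empty snapshot yields four empty vectors.
pub fn transpose(snapshot: &[AuthorityCounts]) -> PerFieldCounts {
    let mut out = PerFieldCounts::default();
    for counts in snapshot {
        out.faulty_blocks_provable.push(counts.faulty_blocks_provable);
        out.faulty_blocks_unprovable.push(counts.faulty_blocks_unprovable);
        out.missing_proposals.push(counts.missing_proposals);
        out.equivocations.push(counts.equivocations);
    }
    out
}
```

Each output vector keeps the authority order of the input, so index `i` refers to the same authority in all four vectors.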

Also folds in a standalone fix: `starfish-core` now declares its
`iota-sdk-types` `serde` feature explicitly. Workspace builds were
masking the missing feature via unification through `iota-types`;
`cargo check -p starfish-core` alone fails without it.

## Links to any relevant issues

fixes iotaledger/iota-private#406
Part of iotaledger/iota-private#173

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have checked that new and existing unit tests pass locally with
my changes

Local verification:
- `cargo ci-clippy` — clean.
- `cargo nextest run -p starfish-core --lib` — new
`test_snapshot_totals_sums_persisted_and_in_memory` passes; existing
misbehavior tests pass.
- `cargo nextest run -p iota-core --lib consensus_output_api::tests` —
new transpose + empty-snapshot tests pass.
piotrm50 added a commit that referenced this pull request May 20, 2026
…ate (#11531)

Port the runtime misbehavior tracking logic onto the `MisbehaviorStore`
types introduced in #10088.

- `MisbehaviorStore` tracks per-authority misbehavior in two buckets
(`in_memory` + `persisted`).
- On flush: recompute the in-memory window, accumulate evicted counts,
write the persisted snapshot to storage.
- On startup: restore persisted counts from RocksDB and recompute
in-memory counts from cached block refs.
- Wire faulty block header detection into `MisbehaviorStore` from every
peer receive site (subscriber main + bundle, header synchronizer
including the own-last-header fetch, and both commit syncers).
- Source-aware classification: `Subscriber` keeps `UnexpectedAuthority`
as Unprovable; commit-chain / fetch-shape errors stay Untracked and
belong to a separate commit-sync metric.

Stacks on top of #10088. Supersedes #10127.

fixes iotaledger/iota-private#278
fixes iotaledger/iota-private#280
Part of iotaledger/iota-private#277

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have checked that new and existing unit tests pass locally with
my changes

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>