refactor: break circular dependency over net_processing and dkgsessionhandler #7314
knst wants to merge 12 commits into
Conversation
✅ No Merge Conflicts Detected: This PR currently has no conflicts with other open PRs.
Walkthrough

This PR refactors LLMQ DKG (Distributed Key Generation) handling in Dash by separating network management concerns from context objects. The main changes migrate DKG phase operations out of the session objects: messages are enqueued into per-session pending queues and the phase flow is driven by the new NetDKG handler.

Sequence Diagram(s)

The conditions for generating sequence diagrams are met. This PR introduces significant control flow changes with multi-component interactions across network message handling, phase execution, and context initialization.

sequenceDiagram
participant Node as DKG Node
participant NetDKG as NetDKG Handler
participant SessionMgr as CDKGSessionManager
participant SessionHdlr as CDKGSessionHandler
participant ActiveDKG as ActiveDKGSession
Node->>NetDKG: ProcessMessage(MSG_QUORUM_CONTRIB)
NetDKG->>SessionMgr: route message (llmqType, quorumIndex)
SessionMgr->>SessionHdlr: PushPendingMessage(serialized, hash)
Note over SessionHdlr: Batches message in queue
NetDKG->>NetDKG: HandleDKGRound()
loop per_phase
NetDKG->>SessionHdlr: ProcessPendingMessageBatch()
SessionHdlr->>ActiveDKG: Contribute()/VerifyAndCommit()
ActiveDKG-->>SessionHdlr: std::optional<Message>
SessionHdlr->>NetDKG: RelayInvToParticipants()
end
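The batching step shown in the first diagram (PushPendingMessage followed by a later ProcessPendingMessageBatch) can be sketched as a small thread-safe queue. This is an illustrative stand-in with hypothetical names, not the actual CDKGPendingMessages implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: ProcessMessage pushes serialized payloads into a
// per-session queue; the phase thread later drains them as one batch.
class PendingMessages {
    std::mutex m;
    std::deque<std::string> q;
public:
    void Push(std::string payload) {
        std::lock_guard<std::mutex> lk(m);
        q.push_back(std::move(payload));
    }
    std::vector<std::string> PopBatch(std::size_t max) {
        std::lock_guard<std::mutex> lk(m);
        std::vector<std::string> batch;
        while (!q.empty() && batch.size() < max) {
            batch.push_back(std::move(q.front()));
            q.pop_front();
        }
        return batch;
    }
};
```

Decoupling the network thread (Push) from the phase thread (PopBatch) is what lets message receipt and phase execution proceed independently, as the diagram's loop suggests.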
sequenceDiagram
participant Init as Initialization
participant ActiveCtx as ActiveContext
participant PeerMgr as PeerManager
participant NetDKG as NetDKG Handler
participant Spork as CSporkManager
Init->>ActiveCtx: construct(dmnman, qman, qsnapman, sigman)
Init->>PeerMgr: make(nodeman=active_ctx ? active_ctx->nodeman.get() : nullptr)
Note over PeerMgr: m_nodeman set
Init->>NetDKG: construct(sporkman, dkgsman, qman)
Init->>Spork: IsQuorumDKGEnabled()
Note over NetDKG: Check spork for DKG enabled
Init->>ActiveCtx: Start()
Note over ActiveCtx: Start() takes no connman/peerman args
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
🧹 Nitpick comments (3)
src/active/dkgsession.cpp (1)
106-111: ⚡ Quick win: Move the sent* debug updates to the actual enqueue/broadcast path.

These methods now only build and return a message. Setting sentContributions, sentComplaint, sentJustification, and sentPrematureCommitment here records a successful send before NetDKG has actually serialized and queued/broadcast the payload.

Also applies to: 292-297, 382-387, 539-544
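The suggested ordering (build first, flip the debug flag only after the payload is actually queued) can be sketched with hypothetical stand-in types; none of these names are the real Dash classes:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical miniature of the suggested ordering: the builder only
// constructs the message; the "sent" debug flag is flipped in the send path,
// after the payload has actually been queued.
struct DebugStatus {
    bool sentContributions = false;
};

struct Session {
    DebugStatus debug;
    std::optional<std::string> BuildContribution() {
        // Build and return the message; debug flags are NOT touched here.
        return std::string{"contribution-payload"};
    }
};

struct NetSender {
    std::vector<std::string> queue;
    bool SendContribution(Session& s) {
        auto msg = s.BuildContribution();
        if (!msg) return false;
        queue.push_back(*msg);            // serialize/enqueue first...
        s.debug.sentContributions = true; // ...then record the successful send
        return true;
    }
};
```

With this split, a failed build (empty optional) never leaves a stale "sent" flag behind.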
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/active/dkgsession.cpp` around lines 106-111: the dkgDebugManager.UpdateLocalSessionStatus calls inside the message-builder functions (e.g., setting CDKGDebugSessionStatus::statusBits.sentContributions, sentComplaint, sentJustification, sentPrematureCommitment) must be removed from those builders (the functions that build and return qc/messages) and moved into the actual send path inside NetDKG, i.e., the code that serializes and enqueues/broadcasts the payload. Locate the UpdateLocalSessionStatus calls in the builders and delete them there, then add equivalent UpdateLocalSessionStatus updates immediately after NetDKG performs the serialization/queuing/broadcast so the debug flags reflect a real successful send.

src/llmq/debug.cpp (1)
213-228: 💤 Low value: Optional: make MarkAborted idempotent w.r.t. nTime.

MarkAborted's lambda always returns true, so each call bumps localStatus.nTime even when the session was already marked aborted. MarkPhaseAdvanced already does the right thing (returns changed). For consistency, and to avoid spurious timestamp updates if the helper is invoked more than once on the same aborted session, consider returning a real changed flag.

♻️ Proposed change
```diff
 void CDKGDebugManager::MarkAborted(Consensus::LLMQType llmqType, int quorumIndex)
 {
     UpdateLocalSessionStatus(llmqType, quorumIndex, [&](CDKGDebugSessionStatus& status) {
+        if (status.statusBits.aborted) return false;
         status.statusBits.aborted = true;
         return true;
     });
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/debug.cpp` around lines 213-228: MarkAborted currently always returns true from its UpdateLocalSessionStatus lambda, which forces localStatus.nTime to update on every call; change the lambda in CDKGDebugManager::MarkAborted to compute a changed flag by comparing status.statusBits.aborted with the new value, set status.statusBits.aborted = true, and return that changed flag (i.e., return true only if status.statusBits.aborted was previously false). This makes MarkAborted idempotent like MarkPhaseAdvanced and avoids spurious nTime updates.

src/llmq/net_dkg.cpp (1)
449-482: 💤 Low value: Inconsistent dynamic_cast usage between Start() and Interrupt(); consider tightening shutdown.

Start() uses the throwing reference form (dynamic_cast<ActiveDKGSessionHandler&>) while Interrupt() uses the safe pointer form. Both iterate the same handler set and both early-return on m_active == nullptr, so the invariant is identical and the two should agree.

The reference form also has a small resilience gap: if the cast were ever to throw mid-iteration, the threads already pushed into m_phase_threads would never be joined, because ~NetDKG() only calls DisconnectManagers() (line 254), not Stop(). Either use the pointer form here as well, or have the destructor call Stop() defensively so a partially-initialized state still cleans up.

♻️ Proposed alignment with Interrupt()
```diff
 m_qdkgsman.ForEachHandler([this](CDKGSessionHandler& base) {
-    auto& handler = dynamic_cast<ActiveDKGSessionHandler&>(base);
-    std::string thread_name = strprintf("llmq-%d-%d", std23::to_underlying(handler.params.type), handler.QuorumIndex());
-    m_phase_threads.emplace_back([this, name = std::move(thread_name), &handler] {
-        util::TraceThread(name.c_str(), [this, &handler] { PhaseHandlerThread(handler); });
-    });
+    auto* handler = dynamic_cast<ActiveDKGSessionHandler*>(&base);
+    if (!Assume(handler != nullptr)) return;
+    std::string thread_name = strprintf("llmq-%d-%d", std23::to_underlying(handler->params.type), handler->QuorumIndex());
+    m_phase_threads.emplace_back([this, name = std::move(thread_name), handler] {
+        util::TraceThread(name.c_str(), [this, handler] { PhaseHandlerThread(*handler); });
+    });
 });
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 449 - 482, Start() uses dynamic_cast<ActiveDKGSessionHandler&> which can throw partway through filling m_phase_threads and leave threads unjoined; make Start() mirror Interrupt() by using the non-throwing pointer form (dynamic_cast<ActiveDKGSessionHandler*>) when iterating m_qdkgsman.ForEachHandler so you only create threads for valid handlers and avoid exceptions during the loop, ensuring m_phase_threads remains consistent for later Stop() join; update the lambda in NetDKG::Start to check the pointer, capture it safely, and call PhaseHandlerThread(handler) with the pointer/ref as appropriate.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 757eb414-ab77-46e9-b643-a3f32d98e788
📒 Files selected for processing (25)
- src/Makefile.am
- src/active/context.cpp
- src/active/context.h
- src/active/dkgsession.cpp
- src/active/dkgsession.h
- src/active/dkgsessionhandler.cpp
- src/active/dkgsessionhandler.h
- src/init.cpp
- src/llmq/debug.cpp
- src/llmq/debug.h
- src/llmq/dkgsession.h
- src/llmq/dkgsessionhandler.cpp
- src/llmq/dkgsessionhandler.h
- src/llmq/dkgsessionmgr.cpp
- src/llmq/dkgsessionmgr.h
- src/llmq/net_dkg.cpp
- src/llmq/net_dkg.h
- src/llmq/observer.cpp
- src/llmq/observer.h
- src/llmq/options.cpp
- src/llmq/options.h
- src/net_processing.cpp
- src/net_processing.h
- src/test/util/setup_common.cpp
- test/lint/lint-circular-dependencies.py
💤 Files with no reviewable changes (1)
- test/lint/lint-circular-dependencies.py
Force-pushed from 53be42b to f4b6aae
Actionable comments posted: 3
🧹 Nitpick comments (7)
src/llmq/net_dkg.cpp (6)
100-110: 💤 Low value: Defensive null-check in the fallback verification loop.

In the per-message fallback verification loop (lines 100-110), member is dereferenced at line 106 (member->dmn->pdmnState->pubKeyOperator.Get()) without a null check. The reasoning is that any nodeId whose GetMember() returned nullptr in the first loop was already inserted into ret (line 54) and is filtered by ret.count(nodeId) (line 101). That holds today, but an in-flight membership change between the two GetMember calls (or future refactors) would silently dereference a null pointer here.

A cheap guard would future-proof this:
🛡️ Proposed defensive check
```diff
 auto member = session.GetMember(msg->proTxHash);
+if (!member) {
+    ret.emplace(nodeId);
+    continue;
+}
 bool valid = msg->sig.VerifyInsecure(member->dmn->pdmnState->pubKeyOperator.Get(), msg->GetSignHash());
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 100 - 110, The fallback verification loop can dereference a null member returned by session.GetMember; update the loop that iterates over messages to defensively check the result of session.GetMember(msg->proTxHash) before using member->dmn->pdmnState->pubKeyOperator.Get(): if member is null, insert nodeId into ret and continue, otherwise perform msg->sig.VerifyInsecure(..., msg->GetSignHash()) as before; this prevents a potential NPE if membership changes between the two GetMember calls.
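The guard pattern above, looking up a member and treating a null result as a verification failure rather than dereferencing it, can be sketched generically (hypothetical types, not the real session API):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical miniature of the defensive lookup: a missing member marks the
// sender as bad instead of dereferencing a null pointer.
struct Member { std::string pubkey; };

using MemberMap = std::map<std::string, Member>;

bool VerifyOrFlag(const MemberMap& members, const std::string& proTxHash,
                  int nodeId, std::set<int>& bad)
{
    auto it = members.find(proTxHash);
    if (it == members.end()) {
        bad.insert(nodeId); // member vanished between lookups: flag, don't crash
        return false;
    }
    return !it->second.pubkey.empty(); // stand-in for signature verification
}
```

The design choice matches the proposed diff: a vanished member is treated the same as a failed signature, which keeps the caller's control flow unchanged.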
466-473: 💤 Low value: Stop() is not idempotent w.r.t. unjoinable threads after early failure.

Stop() calls Interrupt() then joins all threads. If any thread in m_phase_threads is in a non-joinable state (e.g., already joined, or a default-constructed std::thread because creation failed), t.join() is skipped via the joinable() guard, which is good. But if a thread is running and throws an unhandled exception before stop is requested, PhaseHandlerThread only catches AbortPhaseException (line 492); any other exception escaping a thread invokes std::terminate. The catch handling in PhaseHandlerThread does not include a generic catch clause.

Consider adding a final catch (const std::exception&) to log and break the loop cleanly, so Stop() always sees joinable but exited threads.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 466 - 473, Stop() can hang on threads that terminated due to unexpected exceptions because PhaseHandlerThread only catches AbortPhaseException; add a broad exception handler inside PhaseHandlerThread's main loop (after the AbortPhaseException catch) that catches const std::exception& (and optionally catch(...) as a fallback), logs the error with context, and breaks/returns cleanly so the std::thread object exits normally and becomes joinable for Stop() to join; reference PhaseHandlerThread, AbortPhaseException, m_phase_threads and Interrupt() when locating where to add the catch and logging.
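The joinability concern can be demonstrated with a minimal worker loop: catching std::exception (in addition to any specific abort exception) lets the thread return normally, so a later join() succeeds instead of the process dying in std::terminate. Names here are illustrative, not the real PhaseHandlerThread:

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>
#include <thread>

std::atomic<bool> g_exited{false};

// Worker loop that survives unexpected exceptions by breaking out cleanly
// (a real implementation would log first), so the std::thread stays joinable
// for a later Stop().
void WorkerLoop() {
    while (true) {
        try {
            throw std::runtime_error("unexpected failure"); // simulated
        } catch (const std::exception&) {
            break; // exit cleanly instead of letting std::terminate fire
        }
    }
    g_exited = true;
}
```

Without the broad catch, the simulated exception would escape the thread function and abort the whole process before Stop() ever ran.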
244-244: 💤 Low value: Redundant temporary in m_active construction.

std::make_unique<ActiveDKG>(ActiveDKG{...}) constructs a temporary ActiveDKG and then move/copy-constructs another inside make_unique. You can forward the args directly:

♻️ Proposed simplification
```diff
-    m_active{std::make_unique<ActiveDKG>(ActiveDKG{dmnman, mn_metaman, dkgdbgman, qblockman, qsnapman, connman})}
+    m_active{std::make_unique<ActiveDKG>(dmnman, mn_metaman, dkgdbgman, qblockman, qsnapman, connman)}
```

This requires ActiveDKG to have an aggregate-equivalent constructor or a matching one; if it currently relies on aggregate brace-init, add an explicit constructor.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` at line 244, The m_active member is being initialized with a redundant temporary via std::make_unique<ActiveDKG>(ActiveDKG{...}); change the call to forward the constructor args directly (e.g. std::make_unique<ActiveDKG>(dmnman, mn_metaman, dkgdbgman, qblockman, qsnapman, connman)) so no extra temporary is created, and if ActiveDKG does not currently have a matching constructor add an explicit constructor on ActiveDKG that accepts (dmnman, mn_metaman, dkgdbgman, qblockman, qsnapman, connman) to allow direct in-place construction.
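The difference is easy to observe with a small type that counts move-constructions: brace-initializing a temporary inside make_unique incurs an extra move that direct argument forwarding avoids. The Ctx type here is a hypothetical stand-in for ActiveDKG:

```cpp
#include <cassert>
#include <memory>

// Counts move-constructions so the extra temporary is observable.
struct Ctx {
    static int moves;
    int id;
    explicit Ctx(int i) : id(i) {}
    Ctx(Ctx&& other) : id(other.id) { ++moves; }
};
int Ctx::moves = 0;

// make_unique<Ctx>(Ctx{42}) materializes a temporary and moves it into the
// heap object; make_unique<Ctx>(42) constructs in place with zero moves.
std::unique_ptr<Ctx> MakeViaTemporary() { return std::make_unique<Ctx>(Ctx{42}); }
std::unique_ptr<Ctx> MakeForwarded()    { return std::make_unique<Ctx>(42); }
```

For a lightweight type the move is negligible; the cleanup is mainly about readability and about not requiring the type to be movable at all.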
216-255: ⚖️ Poor tradeoff: Two near-identical constructors invite drift.

The observer (lines 216-231) and active (lines 233-253) constructors share initialization of m_qdkgsman, m_qman, m_sporkman, m_chainman, m_quorums_watch, the m_qdkgsman.InitializeHandlers(...) call, and the closing m_qman.ConnectManagers(...) call. The only differences are:

- m_active (nullptr vs constructed)
- the lambda type returned by InitializeHandlers (CDKGSessionHandler vs ActiveDKGSessionHandler)

Consider delegating the active constructor to the observer one, or factor common init into a private helper, to keep the two paths (and the ConnectManagers/DisconnectManagers lifecycle) in sync going forward.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 216 - 255, The two NetDKG constructors duplicate common initialization (m_qdkgsman, m_qman, m_sporkman, m_chainman, m_quorums_watch, the InitializeHandlers/ConnectManagers lifecycle) and should be unified: extract the shared setup into a private helper (e.g., InitCommon or InitializeManagers) that takes a std::function<std::unique_ptr<CDKGSessionHandler>(const Consensus::LLMQParams&, int)> or a template/lambda wrapper, or else implement constructor delegation from the active constructor to the observer constructor and only set m_active afterward; update the observer constructor to call the helper (or be the delegated target) and change the active constructor to only construct m_active and supply the ActiveDKGSessionHandler-producing lambda to the shared InitializeHandlers call, ensuring m_qman.ConnectManagers and m_qdkgsman.InitializeHandlers remain in one place.
451-464: ⚡ Quick win: dynamic_cast<ActiveDKGSessionHandler&> at line 458 will throw on type mismatch.

Start() is guarded by m_active == nullptr (line 452), and the active-mode constructor (lines 246-251) installs ActiveDKGSessionHandler instances, so in practice the cast succeeds. However, m_qdkgsman.InitializeHandlers(...) is the only invariant that ties m_active != nullptr to "all handlers are ActiveDKGSessionHandler", and it lives outside this class. If that invariant is ever violated (a future code path that mixes handler types, or a partially initialized state), dynamic_cast on a reference will throw std::bad_cast from inside this lambda chain and may abort the node.

Interrupt() (line 479) already uses the safer pointer form. Consider mirroring that here:

🛡️ Safer cast in Start()
```diff
 m_qdkgsman.ForEachHandler([this](CDKGSessionHandler& base) {
-    auto& handler = dynamic_cast<ActiveDKGSessionHandler&>(base);
-    std::string thread_name = strprintf("llmq-%d-%d", std23::to_underlying(handler.params.type), handler.QuorumIndex());
-    m_phase_threads.emplace_back([this, name = std::move(thread_name), &handler] {
-        util::TraceThread(name.c_str(), [this, &handler] { PhaseHandlerThread(handler); });
-    });
+    auto* handler = dynamic_cast<ActiveDKGSessionHandler*>(&base);
+    if (!handler) return;
+    std::string thread_name = strprintf("llmq-%d-%d", std23::to_underlying(handler->params.type),
+                                        handler->QuorumIndex());
+    m_phase_threads.emplace_back([this, name = std::move(thread_name), handler] {
+        util::TraceThread(name.c_str(), [this, handler] { PhaseHandlerThread(*handler); });
+    });
 });
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 451 - 464, The dynamic_cast in Start() currently uses dynamic_cast<ActiveDKGSessionHandler&> inside the m_qdkgsman.ForEachHandler lambda which will throw std::bad_cast on mismatch; change it to use the pointer form dynamic_cast<ActiveDKGSessionHandler*> (matching Interrupt()) and check for nullptr before using handler: if the cast returns nullptr, log or skip creating the phase thread for that handler instead of dereferencing/throwing, otherwise capture the pointer and pass it to PhaseHandlerThread; update references in the lambda to use the pointer (e.g., handler->QuorumIndex(), PhaseHandlerThread(*handler)) so Start() no longer can throw from a bad_cast.
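The behavioral difference the comment relies on is standard C++: the reference form of dynamic_cast throws std::bad_cast on mismatch, while the pointer form returns nullptr. A minimal demonstration with stand-in types:

```cpp
#include <cassert>
#include <typeinfo>

struct Base { virtual ~Base() = default; };
struct ActiveHandler : Base {};   // stand-in for ActiveDKGSessionHandler
struct PassiveHandler : Base {};  // stand-in for a plain CDKGSessionHandler

// Pointer form: mismatch yields nullptr, so the caller can skip gracefully.
bool IsActive(Base& b) { return dynamic_cast<ActiveHandler*>(&b) != nullptr; }

// Reference form: mismatch throws std::bad_cast.
bool ReferenceCastThrows(Base& b) {
    try {
        (void)dynamic_cast<ActiveHandler&>(b);
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

This is why the pointer form is preferred inside iteration lambdas: a mismatch becomes a skippable condition rather than an exception unwinding through the loop.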
257-294: 💤 Low value: vRecv is moved into pm later; make the rewind/peek pattern explicit.

The flow at lines 291-294 reads llmqType (1 byte) and quorumHash (32 bytes) and then rewinds both, with the goal of leaving vRecv untouched so the entire payload (including header bytes) can be hashed and stored at lines 349-352. This works, but:

- The rewind order (uint256 then uint8_t) is the reverse of the read order; that's intentional for Rewind but easy to break in the future.
- A short comment would make the intent (peek-only) obvious to maintainers.

♻️ Suggested clarifying comment
```diff
+// Peek llmqType + quorumHash without consuming so the full payload can be
+// re-serialized and hashed below at line 349.
 Consensus::LLMQType llmqType;
 uint256 quorumHash;
 vRecv >> llmqType;
 vRecv >> quorumHash;
 vRecv.Rewind(sizeof(uint256));
 vRecv.Rewind(sizeof(uint8_t));
```
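The peek-then-rewind idea generalizes to any cursor-style stream: un-read the fields so the position returns to the start while the caller still learns the header values. This sketch uses a hypothetical Cursor type rather than the real CDataStream:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical cursor-style stream: read advances the position, rewind backs
// it up, mirroring the peek done in NetDKG::ProcessMessage.
struct Cursor {
    std::vector<uint8_t> data;
    std::size_t pos = 0;
    void read(void* out, std::size_t n) {
        std::memcpy(out, data.data() + pos, n);
        pos += n;
    }
    void rewind(std::size_t n) { pos -= n; }
};

// Peek a 1-byte type and a 4-byte hash without consuming the stream.
void PeekHeader(Cursor& c, uint8_t& type, uint32_t& hash) {
    c.read(&type, sizeof(type));
    c.read(&hash, sizeof(hash));
    c.rewind(sizeof(hash)); // undo the reads so the full payload
    c.rewind(sizeof(type)); // remains available for hashing
}
```

After PeekHeader returns, the cursor is back at position 0, so downstream code can hash or move the entire untouched payload.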
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 257-294: the code in NetDKG::ProcessMessage reads llmqType and quorumHash from vRecv and then calls vRecv.Rewind in the reverse order to restore the stream for later hashing; make this intent explicit and less fragile by adding a short comment above the reads/Rewind calls describing that we are only peeking at the header bytes (uint8_t then uint256) and that the subsequent Rewind calls intentionally undo the reads so the full payload remains unchanged for hashing (used later when moving vRecv into pm). Optionally, reorder the Rewind calls to mirror the read order (Rewind(sizeof(uint8_t)) then Rewind(sizeof(uint256))) to reduce future mistakes, while keeping the explanatory comment.

src/llmq/dkgsessionhandler.h (1)
112-121: 💤 Low value: Duplicate public: access specifiers.

Lines 112 and 121 both declare public: with no intervening private:/protected: block, making the second specifier redundant. Consider consolidating into a single section.

♻️ Proposed cleanup
```diff
 public:
     const Consensus::LLMQParams& params;

     // Do not guard these, they protect their internals themselves
     CDKGPendingMessages pendingContributions;
     CDKGPendingMessages pendingComplaints;
     CDKGPendingMessages pendingJustifications;
     CDKGPendingMessages pendingPrematureCommitments;

-public:
     explicit CDKGSessionHandler(const Consensus::LLMQParams& _params);
     virtual ~CDKGSessionHandler();
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/dkgsessionhandler.h` around lines 112 - 121, The class header contains two consecutive "public:" access specifiers (surrounding members like params and CDKGPendingMessages pendingContributions/pendingComplaints/pendingJustifications/pendingPrematureCommitments), making the second one redundant; remove the duplicate "public:" and consolidate these members under a single public section so there is only one "public:" before params and the pending* members.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/llmq/dkgsession.cpp`:
- Around line 627-629: The empty stubs for CDKGSession::FinalizeCommitments()
and CDKGSession::FinalizeSingleCommitment() silently disable producing mineable
commitments in HandleDKGRound; either restore the original finalize logic into
these two methods (port the pre-refactor bodies into
CDKGSession::FinalizeCommitments and CDKGSession::FinalizeSingleCommitment so
they return the actual CFinalCommitment(s) used by
qblockman.AddMineableCommitment(...)), or if this work is intentionally
deferred, replace the stubs with an explicit TODO and a runtime-safe guard: log
an error (process/ulogs) and assert or throw so the failure is visible (do not
silently return {}), referencing the methods by name (FinalizeCommitments and
FinalizeSingleCommitment) and ensuring HandleDKGRound’s expectations are met.
In `@src/llmq/net_dkg.cpp`:
- Around line 384-411: AlreadyHave reports DKG invs seen by observer nodes
(m_quorums_watch) but ProcessGetData currently returns early when m_active ==
nullptr causing watchers to claim possession but refuse getdata; remove the
early "if (m_active == nullptr) return false;" check from NetDKG::ProcessGetData
so that m_qdkgsman.GetContribution/GetComplaint/etc. are dynamically dispatched
(base virtuals will return false in non-serving handlers), and keep or update
the explanatory comment to note that observer handlers will be consulted via
m_qdkgsman even when m_active is null.
- Around line 375-379: The current code calls
m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100) when DoForHandler indicates
no session handler (dispatched == false), which is too punitive; change the
penalty to 10 (i.e., m_peer_manager->PeerMisbehaving(pfrom.GetId(), 10)) or
alternatively remove the ban and replace it with a LogPrint/LogPrintf so
legitimate peers slightly ahead aren't banned—update the branch in NetDKG where
dispatched is checked (the block that logs "NetDKG -- no session handlers for
quorumIndex" and calls PeerMisbehaving) to apply the lower score or silent log
as discussed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d8950d10-63a8-42c9-9f3f-6963acd754d2
📒 Files selected for processing (16)
- src/active/context.cpp
- src/active/context.h
- src/active/dkgsession.cpp
- src/init.cpp
- src/llmq/dkgsession.cpp
- src/llmq/dkgsession.h
- src/llmq/dkgsessionhandler.cpp
- src/llmq/dkgsessionhandler.h
- src/llmq/dkgsessionmgr.cpp
- src/llmq/dkgsessionmgr.h
- src/llmq/net_dkg.cpp
- src/llmq/net_dkg.h
- src/llmq/observer.cpp
- src/llmq/observer.h
- src/llmq/options.cpp
- src/llmq/options.h
🚧 Files skipped from review as they are similar to previous changes (12)
- src/llmq/options.cpp
- src/llmq/observer.cpp
- src/llmq/options.h
- src/llmq/net_dkg.h
- src/llmq/dkgsessionmgr.cpp
- src/active/context.h
- src/llmq/dkgsessionhandler.cpp
- src/llmq/observer.h
- src/llmq/dkgsessionmgr.h
- src/llmq/dkgsession.h
- src/active/dkgsession.cpp
- src/init.cpp
```cpp
std::vector<CFinalCommitment> CDKGSession::FinalizeCommitments() { return {}; }

CFinalCommitment CDKGSession::FinalizeSingleCommitment() { return {}; }
```
Empty FinalizeCommitments() and FinalizeSingleCommitment() will silently break DKG output.
Both implementations return an empty vector and a default-constructed (null) CFinalCommitment respectively. These are the only paths in HandleDKGRound (see src/llmq/net_dkg.cpp lines 522-527 and 594-599) that produce mineable commitments via qblockman.AddMineableCommitment(...). With these stubs:
- FinalizeSingleCommitment() returns {}, which IsNull() reports as true → the branch is skipped and no single-member commitment is ever produced.
- FinalizeCommitments() returns an empty vector → the for loop never executes and no multi-member commitment is ever relayed/mined.
If this is intentional (work-in-progress to be filled in by a follow-up commit/PR), please add a TODO/FIXME and a clear log/assert so the regression is not invisible at runtime; otherwise DKG silently never completes a quorum on this branch.
Could you confirm whether the original (pre-refactor) finalize logic is supposed to land in this PR or in a follow-up? I can help port the previous body if you'd like.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llmq/dkgsession.cpp` around lines 627 - 629, The empty stubs for
CDKGSession::FinalizeCommitments() and CDKGSession::FinalizeSingleCommitment()
silently disable producing mineable commitments in HandleDKGRound; either
restore the original finalize logic into these two methods (port the
pre-refactor bodies into CDKGSession::FinalizeCommitments and
CDKGSession::FinalizeSingleCommitment so they return the actual
CFinalCommitment(s) used by qblockman.AddMineableCommitment(...)), or if this
work is intentionally deferred, replace the stubs with an explicit TODO and a
runtime-safe guard: log an error (process/ulogs) and assert or throw so the
failure is visible (do not silently return {}), referencing the methods by name
(FinalizeCommitments and FinalizeSingleCommitment) and ensuring HandleDKGRound’s
expectations are met.
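As a minimal, self-contained demonstration of the reviewer's claim (using simplified stand-in types, not the real Dash classes), the committed stubs cause both `HandleDKGRound` branches to be dead code — no commitment ever reaches `AddMineableCommitment(...)`:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in: a default-constructed commitment reports IsNull().
struct CFinalCommitment {
    bool IsNull() const { return true; }
};

// Stub behaviour as committed on this branch: nothing is ever produced.
inline std::vector<CFinalCommitment> FinalizeCommitments() { return {}; }
inline CFinalCommitment FinalizeSingleCommitment() { return {}; }

// Mirrors the two HandleDKGRound branches the review points at and counts
// how many commitments would reach AddMineableCommitment().
inline int CountMineableCommitments()
{
    int mined = 0;
    if (const auto fqc = FinalizeSingleCommitment(); !fqc.IsNull()) ++mined; // branch skipped
    for (const auto& fqc : FinalizeCommitments()) { (void)fqc; ++mined; }    // loop never runs
    return mined;
}
```

Under these stubs the count is always zero, which is why the review asks for either the ported bodies or a loud TODO-plus-guard.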
    if (!dispatched) {
        LogPrintf("NetDKG -- no session handlers for quorumIndex [%d]\n", quorumIndex);
        m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100);
        return;
    }
PeerMisbehaving(100) for missing handler may be too punitive.
If DoForHandler returns false because no session handler exists for {llmqType, quorumIndex} (e.g., handlers haven't been initialized yet for this combination, or this node hasn't discovered that quorum is mineable), we instantly ban the peer with score 100. Earlier paths use 10 for "we might be lagging behind" cases (e.g., line 319). If a remote peer is slightly ahead, we'd ban legitimate participants.
Consider lowering to 10, or returning silently with a LogPrint, since the peer's message itself was structurally valid up to this point.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llmq/net_dkg.cpp` around lines 375 - 379, The current code calls
m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100) when DoForHandler indicates
no session handler (dispatched == false), which is too punitive; change the
penalty to 10 (i.e., m_peer_manager->PeerMisbehaving(pfrom.GetId(), 10)) or
alternatively remove the ban and replace it with a LogPrint/LogPrintf so
legitimate peers slightly ahead aren't banned—update the branch in NetDKG where
dispatched is checked (the block that logs "NetDKG -- no session handlers for
quorumIndex" and calls PeerMisbehaving) to apply the lower score or silent log
as discussed.
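A sketch of the suggested softer policy, with hypothetical simplified types standing in for the real `PeerManager` API: a missing handler is treated as "the peer may be slightly ahead" rather than as an instantly bannable offence.

```cpp
#include <cstdio>

// Hypothetical stand-in for the peer scoring machinery; in Dash the score
// threshold 100 leads to a discouragement/ban.
struct PeerScore {
    int score{0};
    void Misbehaving(int howmuch) { score += howmuch; }
    bool Banned() const { return score >= 100; }
};

inline void OnNoSessionHandler(PeerScore& peer, int quorumIndex)
{
    std::printf("NetDKG -- no session handlers for quorumIndex [%d]\n", quorumIndex);
    peer.Misbehaving(10); // was 100: a single miss no longer bans a legitimate peer
}
```

With score 10, only a sustained stream of unmatched messages accumulates toward a ban, matching the earlier "we might be lagging behind" paths.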
    bool NetDKG::AlreadyHave(const CInv& inv)
    {
        switch (inv.type) {
        case MSG_QUORUM_CONTRIB:
        case MSG_QUORUM_COMPLAINT:
        case MSG_QUORUM_JUSTIFICATION:
        case MSG_QUORUM_PREMATURE_COMMITMENT: {
            if (!IsQuorumDKGEnabled(m_sporkman)) return false;
            bool seen = false;
            m_qdkgsman.ForEachHandler([&](CDKGSessionHandler& h) {
                if (seen) return;
                if (h.pendingContributions.HasSeen(inv.hash) || h.pendingComplaints.HasSeen(inv.hash) ||
                    h.pendingJustifications.HasSeen(inv.hash) || h.pendingPrematureCommitments.HasSeen(inv.hash)) {
                    seen = true;
                }
            });
            return seen;
        }
        }
        return false;
    }

    bool NetDKG::ProcessGetData(CNode& pfrom, const CInv& inv, CConnman& connman, const CNetMsgMaker& msgMaker)
    {
        // Default implementations of GetContribution and the other virtual methods
        // return false in observer mode; m_active is only an early exit and does
        // not affect logic.
        if (m_active == nullptr) return false;
Asymmetry between AlreadyHave and ProcessGetData for observer-mode (qwatch) nodes.
AlreadyHave (line 391) only gates on IsQuorumDKGEnabled(m_sporkman) and walks every handler's pending sets — observer/m_quorums_watch nodes will report true for DKG inv hashes they've seen.
ProcessGetData (line 411), however, short-circuits to false whenever m_active == nullptr, even though m_quorums_watch == true nodes do receive and forward DKG messages. The result: a watching node will tell peers "I already have this" via AlreadyHave, but then refuse to serve it via getdata. Peers can be left waiting/timing out instead of fetching from another source.
The inline comment ("default implementations of GetContribution... return false in observer mode") is true for the base CDKGSessionHandler, but the dispatch goes through m_qdkgsman.GetContribution(...) which dynamic-dispatches to the active handler in active mode. For pure m_quorums_watch=true (no m_active) the base virtuals do return false anyway — so the early return is redundant for that case and harmful for the truly-observer case.
Consider either:
- Removing the `m_active == nullptr` early return and relying on the virtual dispatch to return `false` naturally, or
- Symmetrically gating `AlreadyHave` so it returns `false` when `m_active == nullptr`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llmq/net_dkg.cpp` around lines 384 - 411, AlreadyHave reports DKG invs
seen by observer nodes (m_quorums_watch) but ProcessGetData currently returns
early when m_active == nullptr causing watchers to claim possession but refuse
getdata; remove the early "if (m_active == nullptr) return false;" check from
NetDKG::ProcessGetData so that m_qdkgsman.GetContribution/GetComplaint/etc. are
dynamically dispatched (base virtuals will return false in non-serving
handlers), and keep or update the explanatory comment to note that observer
handlers will be consulted via m_qdkgsman even when m_active is null.
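The first option can be sketched with hypothetical simplified types: drop the early exit and let the base-class virtuals decline naturally, so `AlreadyHave` and `ProcessGetData` stay symmetric by construction.

```cpp
#include <cassert>

// Hypothetical stand-ins: the base handler (observer/qwatch path) declines to
// serve DKG data, while the active-mode override serves it.
struct Handler {
    virtual ~Handler() = default;
    virtual bool GetContribution() const { return false; } // base: cannot serve
};

struct ActiveHandler : Handler {
    bool GetContribution() const override { return true; } // active: can serve
};

// No "if (m_active == nullptr) return false;" here — virtual dispatch decides
// whether the request can be served.
inline bool ProcessGetData(const Handler& h)
{
    return h.GetContribution();
}
```

An observer node then simply returns `false` via the base virtual instead of being short-circuited, and active nodes serve data through the override.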
This pull request has conflicts, please rebase.
It exposes the hidden circular dependency and tidies up the list of includes.
- removed the method `CDKGPendingMessages::Misbehaving(NodeId, int, PeerManager&)`; `ProcessPendingMessageBatch` now calls `peerman.Misbehaving(...)` directly
- renamed `PushPendingMessage<Message>(NodeId, Message&, PeerManager&)` to `PushOwnPendingMessage` for a clear distinction of the path with `node=-1` (self-made messages)
…from PeerManager. Re-ordered initialization of `PeerManager` and `ActiveContext` / `ObserverContext`; `PeerManager::make` now takes a `nodeman` raw pointer (or `nullptr`). This resolves several circular dependencies over net_processing and removes several reference-to-`unique_ptr` work-arounds from `PeerManager`.
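The re-ordered initialization can be sketched as follows (hypothetical simplified types; the real classes take many more dependencies): the context is constructed first, so `PeerManager::make` receives a plain raw pointer instead of a reference to a not-yet-initialized `unique_ptr`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the real types.
struct NodeManager {};

struct ActiveContext {
    std::unique_ptr<NodeManager> nodeman = std::make_unique<NodeManager>();
};

struct PeerManager {
    NodeManager* m_nodeman; // raw, non-owning; may be nullptr on non-masternodes
    static std::unique_ptr<PeerManager> make(NodeManager* nodeman)
    {
        return std::unique_ptr<PeerManager>(new PeerManager{nodeman});
    }
};

inline std::unique_ptr<PeerManager> InitPeerManager(const std::unique_ptr<ActiveContext>& active_ctx)
{
    // Construction order is now explicit: the context (if any) exists before
    // PeerManager is made, so there is no dangling reference to fill in later.
    return PeerManager::make(active_ctx ? active_ctx->nodeman.get() : nullptr);
}
```

The raw pointer documents that `PeerManager` does not own `nodeman`, and `nullptr` cleanly models nodes that run without an active context.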
It helps drop the dependency of llmq/dkgsessionhandler on network code.
Force-pushed from f4b6aae to 248ccaf
- moved implementation of `ProcessMessage` and `AlreadyHave` to `NetDKG`
- dropped usages of `MessageProcessingResult` in `CDKGSessionManager`
- introduced a new helper `DoForHandler`
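The `ForEachHandler` / `DoForHandler` helpers might look roughly like this (a sketch with hypothetical simplified types, not the actual Dash code): the manager stays a pure state class, and callers pass callbacks instead of the manager depending on net_processing.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-in for the per-quorum session handler.
struct CDKGSessionHandler {
    int processed{0};
};

class CDKGSessionManager
{
    // Handlers keyed by {llmqType, quorumIndex}.
    std::map<std::pair<int, int>, CDKGSessionHandler> handlers;

public:
    CDKGSessionManager() { handlers[{1, 0}] = {}; }

    // Visit every handler; callers decide what to do with each one.
    void ForEachHandler(const std::function<void(CDKGSessionHandler&)>& f)
    {
        for (auto& [key, h] : handlers) f(h);
    }

    // Returns false when no handler matches, so the caller (e.g. NetDKG)
    // decides the policy for unknown {llmqType, quorumIndex} pairs.
    bool DoForHandler(int llmqType, int quorumIndex,
                      const std::function<void(CDKGSessionHandler&)>& f)
    {
        auto it = handlers.find({llmqType, quorumIndex});
        if (it == handlers.end()) return false;
        f(it->second);
        return true;
    }
};
```

This inversion is what lets `NetDKG` own the p2p policy (misbehaviour scoring, relaying) while `CDKGSessionManager` only owns state.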
Force-pushed from 248ccaf to c1c4e2a
This pull request has conflicts, please rebase.
Issue being fixed or feature implemented
This PR is a continuation of #7247
This PR is not a direct dependency of the kernel project.
This PR aims to resolve the following issues:
The constructor of PeerManager uses references to `unique_ptr`s of multiple objects that will be initialized later, such as:
That's a fragile design that makes multiple assumptions about already-initialized members and their lifetimes.
What was done?
- `CDKGSessionManager` is reduced to a pure state class; it owns the DB and provides 2 new helpers: `ForEachHandler` / `DoForHandler`
- `CDKGSessionHandler` and `ActiveDKGSessionHandler` lose their threading and `ProcessMessage` members
- `MessageProcessingResult` usages are dropped from llmq/ consensus code
- `NetDKG` is introduced, which takes responsibility for p2p communications for DKG work and for running threads

How Has This Been Tested?
Removed circular dependency over `dkgsessionhandler <-> net_processing`

Breaking Changes
N/A
Checklist: