Skip to content

fix(wallets): keep MuSig2 secret nonce inside the signer#113

Merged
Kukks merged 2 commits into
masterfrom
fix/musig2-private-nonce-on-signer
May 29, 2026
Merged

fix(wallets): keep MuSig2 secret nonce inside the signer#113
Kukks merged 2 commits into
masterfrom
fix/musig2-private-nonce-on-signer

Conversation

@Kukks
Copy link
Copy Markdown
Collaborator

@Kukks Kukks commented May 29, 2026

Closes #111.

Summary

The IArkadeWalletSigner / IRemoteSignerTransport contracts used to return MusigPrivNonce from GenerateNonces and require the caller to pass it back into SignMusig. That made the secret nonce cross the signer boundary — for the remote-signer transport it meant secret material round-tripped over the wire (the "remote signer" wasn't actually one); for in-process signers it encouraged callers to hold a MuSig2 secret in their own data structures where accidental reuse leaks the private key (MuSig2 nonce reuse is catastrophic).

This PR reshapes the contracts so:

  • GenerateNonces returns MusigPubNonce only. The signer derives the secret nonce internally, stores it indexed by a caller-supplied sessionId, and returns the public half.
  • SignMusig drops the MusigPrivNonce parameter and gains the same sessionId. The signer looks the secret back up by sessionId and consumes it on use. Throws if no matching nonce was generated.
  • GenerateNonces throws if a nonce is already stored under the same sessionId (generating twice without an intervening SignMusig would orphan secret material).

Why a sessionId rather than fingerprinting the MusigContext

MusigContext.AggregatePubKey is determined solely by the cosigner set + taproot tweak. In a batch tree, multiple transactions can share the same cosigner set + tweak, so their contexts produce identical aggregate pubkeys but different sighashes. The sighash (msg32) is internal in NBitcoin and not externally observable, so the signer can't disambiguate on its own. The caller (TreeSignerSession) is the only one with enough information to produce a session-unique identifier — it passes each tree-node txid.

Why signers are now cached in DefaultWalletProvider

The secret-nonce store lives on the signer instance. DefaultWalletProvider.GetSignerAsync used to instantiate a fresh signer per call, which meant the GenerateNonces call populated one instance's store and SignMusig (different instance) always found an empty one. The cache key is (walletId, secret-hash) so a wallet re-imported with different signing material gets a fresh signer.

Why this matters

The on-the-wire IRemoteSignerTransport shape is what changed semantics most meaningfully: with the old shape, a "remote signer" service exposing SignMusigAsync(walletId, descriptor, context, nonce, ct) was structurally indistinguishable from a key-material proxy — the secret nonce crossed the trust boundary every batch. With the new shape, the secret half never leaves the transport implementation, which is the cryptographic claim of remote signing.

Changes

  • NArk.Abstractions/Wallets/IArkadeWalletSigner.cs, IRemoteSignerTransport.cs: contract reshape + sessionId param + docs explaining the lifecycle and the AggregatePubKey-collision rationale.
  • NArk.Core/Wallet/NSecWalletSigner.cs, HierarchicalDeterministicWalletSigner.cs: per-instance ConcurrentDictionary<string, MusigPrivNonce> keyed by sessionId; SignMusig TryRemoves on consume; GenerateNonces ContainsKey-checks before generating so duplicate-sessionId collisions don't burn entropy.
  • NArk.Core/Wallet/RemoteArkadeWalletSigner.cs: pure passthrough — drops the nonce param, threads through sessionId.
  • NArk.Core/Wallet/DefaultWalletProvider.cs: caches signer instances per (walletId, secret-hash) so the secret-nonce store survives between GenerateNonces and SignMusig.
  • NArk.Core/Batches/TreeSignerSession.cs: _myNonces collapses from Dictionary<uint256, (MusigPrivNonce, MusigPubNonce)> to Dictionary<uint256, MusigPubNonce>; both GenerateNonces and SignMusig calls pass the tree-node txid.ToString() as sessionId.
  • NArk.Tests/NSecWalletSignerParityTests.cs, NArk.Tests.End2End/Wallets/SimpleSeedWallet.cs: updated to new contract.
  • docs/articles/wallets.md: updated remote-signer example + MuSig2 nonce lifecycle section + signer-caching rationale.

IRemoteSignerTransport documents that long-lived transports need an eviction policy (TTL or bounded count) for abandoned nonces — in-process signers get away with remove-on-consume because their lifetime is the cached singleton, but a server-side transport could leak unconsumed entries if a client generates a nonce and never signs.

Breaking change

Yes — IArkadeWalletSigner and IRemoteSignerTransport are public surface. Any external implementation of either must:

  1. Drop the MusigPrivNonce parameter from SignMusig / SignMusigAsync.
  2. Change GenerateNonces / GenerateNoncesAsync return type to MusigPubNonce.
  3. Accept a new string sessionId parameter on both GenerateNonces and SignMusig.
  4. Internalise the secret nonce store with whatever eviction policy fits the transport's lifetime.

Test plan

  • dotnet build NArk.sln — clean
  • dotnet test NArk.Tests — 418 passed, 0 failed
  • E2E CI — covered (was green on initial push; rerunning post review-fix)

Closes #111.

The IArkadeWalletSigner / IRemoteSignerTransport contracts used to return
MusigPrivNonce from GenerateNonces and require the caller to pass it back into
SignMusig. That made the secret nonce cross the signer boundary — for the
remote-signer transport that meant the "remote signer" wasn't actually one
(secret material round-tripped over the wire), and for in-process signers it
encouraged callers to hold a MuSig2 secret in their own data structures where
accidental reuse leaks the private key (MuSig2 nonce reuse is catastrophic).

Reshape the signer contract so:

  - GenerateNonces returns MusigPubNonce only. The signer derives the secret
    nonce internally, stores it indexed by MusigContext.AggregatePubKey
    (the tweaked aggregate is a content fingerprint of the cosigner set +
    taproot tweaks), and returns the public half.
  - SignMusig drops the MusigPrivNonce parameter. The signer looks the secret
    nonce back up by the same context and consumes it on use. Throws if no
    matching nonce was generated for this context.
  - GenerateNonces throws if a nonce is already stored for the same context
    (calling twice without an intervening SignMusig would orphan secret
    material; the user is almost certainly making a mistake).

NSecWalletSigner / HierarchicalDeterministicWalletSigner / SimpleSeedWallet
each hold a per-instance ConcurrentDictionary<string, MusigPrivNonce> keyed by
the hex of MusigContext.AggregatePubKey.ToBytes(); SignMusig TryRemoves on
consume so the store is self-evicting on the happy path. RemoteArkadeWalletSigner
is a pure passthrough, so it just drops the nonce param.

IRemoteSignerTransport now documents that long-lived transports need an
eviction policy (TTL or bounded count) for abandoned nonces — in-process
signers get away with remove-on-consume because their lifetime is the batch
session, but a server-side transport could leak unconsumed entries if a client
generates a nonce and never signs.

TreeSignerSession's _myNonces collapses from
Dictionary<uint256, (MusigPrivNonce, MusigPubNonce)> to
Dictionary<uint256, MusigPubNonce> — the secret half no longer round-trips
through the batch coordinator.

docs/articles/wallets.md updates the remote-signer example and adds a section
on the MuSig2 nonce lifecycle and eviction responsibility.
Copy link
Copy Markdown

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Protocol-Critical Review — Request Changes

Reviewer: Arkana (automated aggressive review)
Verdict: Two critical correctness bugs that will break all MuSig2 signing at runtime. The security motivation is exactly right — secret nonces must not cross the signer boundary — but the implementation has two structural issues that need fixing before merge.


🚨 CRITICAL BUG 1: Signer instances are ephemeral — per-instance nonce store is always empty on SignMusig

Files: NSecWalletSigner.cs, HierarchicalDeterministicWalletSigner.cs
Root cause: DefaultWalletProvider.GetSignerAsync() creates a new signer instance on every call (lines 45-50 of DefaultWalletProvider.cs).

TreeSignerSession calls GetSignerAsync() in three separate methods:

  • CreateMusigContexts() — line 50
  • GenerateNoncesAsync() — line 162
  • SignPartialAsync() — line 195

Each call returns a fresh NSecWalletSigner or HierarchicalDeterministicWalletSigner. The per-instance ConcurrentDictionary<string, MusigPrivNonce> _secNonces is therefore:

  • Populated in instance A during GenerateNoncesAsync
  • Empty in instance B during SignPartialAsync

SignMusig will always throw InvalidOperationException("No secret nonce stored...") in the production DefaultWalletProvider path.

The unit test (NSecWalletSignerParityTests) doesn't catch this because it reuses the same signer instance directly — it never goes through IWalletProvider.

Note: RemoteArkadeWalletSigner is unaffected because it delegates to IRemoteSignerTransport, which IS a long-lived singleton. The remote path is fine. Only in-process signers are broken.

Fix options:

  1. Cache signer instances in DefaultWalletProvider (e.g. ConcurrentDictionary<string, IArkadeWalletSigner>) so the same instance is returned for the same wallet ID within a session.
  2. Move _secNonces to a static ConcurrentDictionary keyed by (walletSecret, contextKey) — similar to the existing _extKeyCache pattern in HierarchicalDeterministicWalletSigner.
  3. Inject a shared INonceStore service that outlives the signer instance.

Option 1 is simplest and matches the existing pattern of _extKeyCache. Option 2 risks leaking nonces across unrelated sessions if eviction isn't tight.


🚨 CRITICAL BUG 2: AggregatePubKey is not unique per transaction — nonce store key collides

Files: NSecWalletSigner.cs:137, HierarchicalDeterministicWalletSigner.cs:72, SimpleSeedWallet.cs:97

The nonce store is keyed by:

private static string ContextKey(MusigContext context) =>
    Convert.ToHexString(context.AggregatePubKey.ToBytes()).ToLowerInvariant();

AggregatePubKey is determined solely by the cosigner set + taproot tweak. In a batch tree, multiple transactions can share the same cosigner set (e.g. the server + same user are cosigners for multiple nodes at the same tree level). After the same _tapscriptMerkleRoot tweak is applied (TreeSignerSession.cs:99-106), these contexts produce identical AggregatePubKey values.

In GenerateNoncesAsync, the loop (TreeSignerSession.cs:170-173 in the PR):

foreach (var (txid, musigContext) in _musigContexts!)
{
    res[txid] = await signer.GenerateNonces(_descriptor, musigContext, cancellationToken);
}

The second call to GenerateNonces with the same AggregatePubKey hits TryAdd returning false and throws:

"A secret nonce is already stored for this MuSig2 context (aggregate pubkey ...)"

This is correct defensive behavior (orphaning a nonce IS bad), but the key isn't granular enough.

Fix: The nonce store key must include something transaction-specific. Options:

  • Key by AggregatePubKey + sighash (the sighash is baked into MusigContext at construction — it's what makes each context unique)
  • Key by a caller-supplied correlation ID (e.g. txid), but this changes the interface
  • If MusigContext exposes the message/sighash, Convert.ToHexString(context.AggregatePubKey.ToBytes() + sighash) would be unique

⚠️ Minor: Nonce generated before TryAdd — leaked on collision

Files: NSecWalletSigner.cs:127-133, HierarchicalDeterministicWalletSigner.cs:63-69

var nonce = context.GenerateNonce(privKey);  // nonce created
var key = ContextKey(context);
if (!_secNonces.TryAdd(key, nonce))          // fails → nonce is abandoned
    throw new InvalidOperationException(...);

If TryAdd fails, the generated nonce is silently dropped. This isn't a security vulnerability (unused nonces don't leak keys), but it means GenerateNonce burned entropy for nothing and the MusigContext object may now be in an indeterminate state. Consider checking for key existence before calling GenerateNonce.


✅ What's good

  • Security model is correct. The secret nonce absolutely should not cross the signer boundary. The PR description is excellent — the "phase 1 → phase 2" narrative is clear and the MuSig2 nonce reuse risk is correctly identified.
  • Interface reshape is clean. Dropping MusigPrivNonce from the public API surface is the right call.
  • TreeSignerSession simplification is correct. Collapsing from (MusigPrivNonce, MusigPubNonce) to MusigPubNonce is the right structural change.
  • Remove-on-consume (TryRemove) is the correct eviction pattern.
  • Docs and XML comments are thorough and accurate.
  • IRemoteSignerTransport security note removal is correct — the limitation no longer applies.

📋 Cross-repo impact

  • btcpay-arkade has an NNark submodule that mirrors these interfaces. Will need a submodule bump after merge.
  • No impact on ts-sdk, go-sdk, rust-sdk — these are independent implementations.

🔒 Protocol-critical flag

This PR touches MuSig2 nonce handling and transaction signing — protocol-critical code. Even after the bugs above are fixed, this requires human review and sign-off before merge. Nonce mismanagement in MuSig2 is catastrophic (private key leak).

Addresses the two CHANGES_REQUESTED items on PR #113:

1. AggregatePubKey isn't unique per signing operation. In a batch tree, multiple
   transactions can share the same cosigner set + taproot tweak, so their
   MusigContexts produce identical AggregatePubKey but different sighashes. The
   sighash is internal to MusigContext and not externally observable, so the
   signer can't disambiguate on its own. The caller — TreeSignerSession — is
   the only one with enough information.

   IArkadeWalletSigner.GenerateNonces / SignMusig and
   IRemoteSignerTransport.GenerateNoncesAsync / SignMusigAsync now take a
   `string sessionId`. The signer's nonce store is keyed by that. Callers pass
   a session-unique identifier; TreeSignerSession passes the tree-node txid.

2. DefaultWalletProvider used to instantiate a fresh signer on every
   GetSignerAsync call. Because the secret-nonce store lives on the signer
   instance, the SignMusig call (different instance) would always find an
   empty store and throw. Cache signers per (walletId + secret hash) so the
   nonce store survives between GenerateNonces and SignMusig.

   The cache key folds in a hash of `wallet.Secret` so a wallet re-imported
   with different signing material gets a fresh signer.

Also tightens GenerateNonces in each in-process signer: now checks
ContainsKey *before* generating, so we don't burn entropy on a duplicate-
sessionId collision (minor concern raised in the review).
@Kukks
Copy link
Copy Markdown
Collaborator Author

Kukks commented May 29, 2026

Thanks for the review — both findings are legit. Fixed in 5ffa65c.

On Bug 1 (signer ephemerality): Cached signer instances in DefaultWalletProvider keyed by (walletId, secret-hash). The secret-hash fold means a wallet re-imported with different signing material gets a fresh signer. Closes the gap where _secNonces populated in instance A was always empty in instance B by the time SignMusig ran.

On Bug 2 (AggregatePubKey collisions in a batch tree): I confirmed MusigContext.msg32 is internal in NBitcoin and not externally observable, so the signer can't disambiguate using only public state. Added a caller-supplied string sessionId to GenerateNonces / SignMusig on both IArkadeWalletSigner and IRemoteSignerTransport. TreeSignerSession passes the tree-node txid; the signer's nonce store is keyed by that. The aggregate-pubkey-based ContextKey helper is gone.

On the minor "nonce burned on collision": Tightened the in-process signers to ContainsKey-check before GenerateNonce, so duplicate-sessionId collisions don't waste entropy. (The post-check TryAdd is still there to defend against a concurrent same-sessionId race, which is a caller bug, but at that point we've already generated; leaving that path as throwing-and-abandoning since the race is essentially never expected.)

Cross-repo impact reminder noted — the BTCPay plugin tracks NNark via submodule, so a submodule bump there will follow after this lands.

Tests: 418 unit tests passing, full sln build clean. E2E exercises the new sessionId flow through TreeSignerSession (the only production caller of the signer Musig API).

Copy link
Copy Markdown

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Follow-up Review — Approve

Reviewer: Arkana (automated aggressive review, iteration 2)
Commit reviewed: 5ffa65c (fix: disambiguate nonce store by sessionId + cache signers)

All three findings from the first review are correctly resolved:


✅ Bug 1 (signer ephemerality) — Fixed

DefaultWalletProvider now caches signer instances in a ConcurrentDictionary<string, IArkadeWalletSigner> via GetOrAdd. The cache key local:{walletId}:{secret.GetHashCode():x} ensures:

  • Same wallet + same secret → same signer instance (nonce store survives between GenerateNonces and SignMusig)
  • Same wallet + different secret (re-import) → new signer instance

GetHashCode() is randomized per process in .NET Core, but that's fine for an in-memory cache — consistency is only needed within a single process lifetime. Collision probability (2^-32) is negligible in practice.

✅ Bug 2 (AggregatePubKey collision) — Fixed

The ContextKey(context) helper keyed by AggregatePubKey is gone. Nonce store is now keyed by caller-supplied string sessionId. TreeSignerSession passes txid.ToString() — unique per tree node by definition. The interface contract (IArkadeWalletSigner, IRemoteSignerTransport) is updated with the new sessionId parameter and comprehensive XML docs explaining why AggregatePubKey alone is insufficient.

✅ Minor (nonce burned on collision) — Fixed

ContainsKey check runs before GenerateNonce in all three signer implementations (NSecWalletSigner, HierarchicalDeterministicWalletSigner, SimpleSeedWallet). The post-generation TryAdd check remains as a belt-and-suspenders guard against a concurrent same-sessionId race (which would be a caller bug).


Verified in this iteration

  • Interface consistency: IArkadeWalletSigner.GenerateNonces returns MusigPubNonce (not MusigPrivNonce) — secret half never leaves the signer. ✅
  • IRemoteSignerTransport mirrors the same sessionId pattern and the old security-limitation doc block is correctly replaced. ✅
  • TreeSignerSession stores only Dictionary<uint256, MusigPubNonce> — no secret nonces in session state. ✅
  • SignPartialAsync passes txid.ToString() matching what was used in GenerateNoncesAsync. ✅
  • Test coverage: NSecWalletSignerParityTests.FullMusig2Flow_WorksWithCorrectParityContext exercises the full generate→sign flow with sessionId. GenerateNonces_WorksWithCorrectParityMusigContext covers the nonce generation path. E2E tests exercise the real TreeSignerSessionDefaultWalletProvider → signer chain. ✅
  • Docs: wallets.md "MuSig2 Nonce Lifecycle" section is clear and accurate. ✅

🔒 Protocol-critical — human sign-off still required

This PR touches MuSig2 nonce handling and transaction signing. The code is correct, but a human must approve before merge. Nonce mismanagement in MuSig2 is catastrophic (private key leak).

Copy link
Copy Markdown

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Follow-up Review — Approve

Reviewer: Arkana (automated aggressive review, iteration 2)
Commit reviewed: 5ffa65c (fix: disambiguate nonce store by sessionId + cache signers)

All three findings from the first review are correctly resolved:


✅ Bug 1 (signer ephemerality) — Fixed

DefaultWalletProvider now caches signer instances in a ConcurrentDictionary<string, IArkadeWalletSigner> via GetOrAdd. The cache key local:{walletId}:{secret.GetHashCode():x} ensures:

  • Same wallet + same secret → same signer instance (nonce store survives between GenerateNonces and SignMusig)
  • Same wallet + different secret (re-import) → new signer instance

GetHashCode() is randomized per process in .NET Core, but that's fine for an in-memory cache — consistency is only needed within a single process lifetime. Collision probability (2^-32) is negligible in practice.

✅ Bug 2 (AggregatePubKey collision) — Fixed

The ContextKey(context) helper keyed by AggregatePubKey is gone. Nonce store is now keyed by caller-supplied string sessionId. TreeSignerSession passes txid.ToString() — unique per tree node by definition. The interface contract (IArkadeWalletSigner, IRemoteSignerTransport) is updated with the new sessionId parameter and comprehensive XML docs explaining why AggregatePubKey alone is insufficient.

✅ Minor (nonce burned on collision) — Fixed

ContainsKey check runs before GenerateNonce in all three signer implementations (NSecWalletSigner, HierarchicalDeterministicWalletSigner, SimpleSeedWallet). The post-generation TryAdd check remains as a belt-and-suspenders guard against a concurrent same-sessionId race (which would be a caller bug).


Verified in this iteration

  • Interface consistency: IArkadeWalletSigner.GenerateNonces returns MusigPubNonce (not MusigPrivNonce) — secret half never leaves the signer. ✅
  • IRemoteSignerTransport mirrors the same sessionId pattern and the old security-limitation doc block is correctly replaced. ✅
  • TreeSignerSession stores only Dictionary<uint256, MusigPubNonce> — no secret nonces in session state. ✅
  • SignPartialAsync passes txid.ToString() matching what was used in GenerateNoncesAsync. ✅
  • Test coverage: NSecWalletSignerParityTests.FullMusig2Flow_WorksWithCorrectParityContext exercises the full generate→sign flow with sessionId. GenerateNonces_WorksWithCorrectParityMusigContext covers the nonce generation path. E2E tests exercise the real TreeSignerSessionDefaultWalletProvider → signer chain. ✅
  • Docs: wallets.md "MuSig2 Nonce Lifecycle" section is clear and accurate. ✅

🔒 Protocol-critical — human sign-off still required

This PR touches MuSig2 nonce handling and transaction signing. The code is correct, but a human must approve before merge. Nonce mismanagement in MuSig2 is catastrophic (private key leak).

@Kukks Kukks merged commit 095ad33 into master May 29, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

security: MuSig2 private nonce leaks across IRemoteSignerTransport boundary

1 participant