fix(wallets): keep MuSig2 secret nonce inside the signer#113
Conversation
Closes #111. The IArkadeWalletSigner / IRemoteSignerTransport contracts used to return MusigPrivNonce from GenerateNonces and require the caller to pass it back into SignMusig. That made the secret nonce cross the signer boundary — for the remote-signer transport that meant the "remote signer" wasn't actually one (secret material round-tripped over the wire), and for in-process signers it encouraged callers to hold a MuSig2 secret in their own data structures where accidental reuse leaks the private key (MuSig2 nonce reuse is catastrophic). Reshape the signer contract so: - GenerateNonces returns MusigPubNonce only. The signer derives the secret nonce internally, stores it indexed by MusigContext.AggregatePubKey (the tweaked aggregate is a content fingerprint of the cosigner set + taproot tweaks), and returns the public half. - SignMusig drops the MusigPrivNonce parameter. The signer looks the secret nonce back up by the same context and consumes it on use. Throws if no matching nonce was generated for this context. - GenerateNonces throws if a nonce is already stored for the same context (calling twice without an intervening SignMusig would orphan secret material; the user is almost certainly making a mistake). NSecWalletSigner / HierarchicalDeterministicWalletSigner / SimpleSeedWallet each hold a per-instance ConcurrentDictionary<string, MusigPrivNonce> keyed by the hex of MusigContext.AggregatePubKey.ToBytes(); SignMusig TryRemoves on consume so the store is self-evicting on the happy path. RemoteArkadeWalletSigner is a pure passthrough, so it just drops the nonce param. IRemoteSignerTransport now documents that long-lived transports need an eviction policy (TTL or bounded count) for abandoned nonces — in-process signers get away with remove-on-consume because their lifetime is the batch session, but a server-side transport could leak unconsumed entries if a client generates a nonce and never signs. TreeSignerSession's _myNonces collapses from Dictionary<uint256, (MusigPrivNonce, MusigPubNonce)> to Dictionary<uint256, MusigPubNonce> — the secret half no longer round-trips through the batch coordinator. docs/articles/wallets.md updates the remote-signer example and adds a section on the MuSig2 nonce lifecycle and eviction responsibility.
There was a problem hiding this comment.
🔴 Protocol-Critical Review — Request Changes
Reviewer: Arkana (automated aggressive review)
Verdict: Two critical correctness bugs that will break all MuSig2 signing at runtime. The security motivation is exactly right — secret nonces must not cross the signer boundary — but the implementation has two structural issues that need fixing before merge.
🚨 CRITICAL BUG 1: Signer instances are ephemeral — per-instance nonce store is always empty on SignMusig
Files: NSecWalletSigner.cs, HierarchicalDeterministicWalletSigner.cs
Root cause: DefaultWalletProvider.GetSignerAsync() creates a new signer instance on every call (lines 45-50 of DefaultWalletProvider.cs).
TreeSignerSession calls GetSignerAsync() in three separate methods:
CreateMusigContexts()— line 50GenerateNoncesAsync()— line 162SignPartialAsync()— line 195
Each call returns a fresh NSecWalletSigner or HierarchicalDeterministicWalletSigner. The per-instance ConcurrentDictionary<string, MusigPrivNonce> _secNonces is therefore:
- Populated in instance A during
GenerateNoncesAsync - Empty in instance B during
SignPartialAsync
SignMusig will always throw InvalidOperationException("No secret nonce stored...") in the production DefaultWalletProvider path.
The unit test (NSecWalletSignerParityTests) doesn't catch this because it reuses the same signer instance directly — it never goes through IWalletProvider.
Note: RemoteArkadeWalletSigner is unaffected because it delegates to IRemoteSignerTransport, which IS a long-lived singleton. The remote path is fine. Only in-process signers are broken.
Fix options:
- Cache signer instances in
DefaultWalletProvider(e.g.ConcurrentDictionary<string, IArkadeWalletSigner>) so the same instance is returned for the same wallet ID within a session. - Move
_secNoncesto astatic ConcurrentDictionarykeyed by(walletSecret, contextKey)— similar to the existing_extKeyCachepattern inHierarchicalDeterministicWalletSigner. - Inject a shared
INonceStoreservice that outlives the signer instance.
Option 1 is simplest and matches the existing pattern of _extKeyCache. Option 2 risks leaking nonces across unrelated sessions if eviction isn't tight.
🚨 CRITICAL BUG 2: AggregatePubKey is not unique per transaction — nonce store key collides
Files: NSecWalletSigner.cs:137, HierarchicalDeterministicWalletSigner.cs:72, SimpleSeedWallet.cs:97
The nonce store is keyed by:
private static string ContextKey(MusigContext context) =>
Convert.ToHexString(context.AggregatePubKey.ToBytes()).ToLowerInvariant();AggregatePubKey is determined solely by the cosigner set + taproot tweak. In a batch tree, multiple transactions can share the same cosigner set (e.g. the server + same user are cosigners for multiple nodes at the same tree level). After the same _tapscriptMerkleRoot tweak is applied (TreeSignerSession.cs:99-106), these contexts produce identical AggregatePubKey values.
In GenerateNoncesAsync, the loop (TreeSignerSession.cs:170-173 in the PR):
foreach (var (txid, musigContext) in _musigContexts!)
{
res[txid] = await signer.GenerateNonces(_descriptor, musigContext, cancellationToken);
}The second call to GenerateNonces with the same AggregatePubKey hits TryAdd returning false and throws:
"A secret nonce is already stored for this MuSig2 context (aggregate pubkey ...)"
This is correct defensive behavior (orphaning a nonce IS bad), but the key isn't granular enough.
Fix: The nonce store key must include something transaction-specific. Options:
- Key by
AggregatePubKey + sighash(the sighash is baked intoMusigContextat construction — it's what makes each context unique) - Key by a caller-supplied correlation ID (e.g. txid), but this changes the interface
- If
MusigContextexposes the message/sighash,Convert.ToHexString(context.AggregatePubKey.ToBytes() + sighash)would be unique
⚠️ Minor: Nonce generated before TryAdd — leaked on collision
Files: NSecWalletSigner.cs:127-133, HierarchicalDeterministicWalletSigner.cs:63-69
var nonce = context.GenerateNonce(privKey); // nonce created
var key = ContextKey(context);
if (!_secNonces.TryAdd(key, nonce)) // fails → nonce is abandoned
throw new InvalidOperationException(...);If TryAdd fails, the generated nonce is silently dropped. This isn't a security vulnerability (unused nonces don't leak keys), but it means GenerateNonce burned entropy for nothing and the MusigContext object may now be in an indeterminate state. Consider checking for key existence before calling GenerateNonce.
✅ What's good
- Security model is correct. The secret nonce absolutely should not cross the signer boundary. The PR description is excellent — the "phase 1 → phase 2" narrative is clear and the MuSig2 nonce reuse risk is correctly identified.
- Interface reshape is clean. Dropping
MusigPrivNoncefrom the public API surface is the right call. TreeSignerSessionsimplification is correct. Collapsing from(MusigPrivNonce, MusigPubNonce)toMusigPubNonceis the right structural change.- Remove-on-consume (
TryRemove) is the correct eviction pattern. - Docs and XML comments are thorough and accurate.
IRemoteSignerTransportsecurity note removal is correct — the limitation no longer applies.
📋 Cross-repo impact
btcpay-arkadehas anNNarksubmodule that mirrors these interfaces. Will need a submodule bump after merge.- No impact on
ts-sdk,go-sdk,rust-sdk— these are independent implementations.
🔒 Protocol-critical flag
This PR touches MuSig2 nonce handling and transaction signing — protocol-critical code. Even after the bugs above are fixed, this requires human review and sign-off before merge. Nonce mismanagement in MuSig2 is catastrophic (private key leak).
Addresses the two CHANGES_REQUESTED items on PR #113: 1. AggregatePubKey isn't unique per signing operation. In a batch tree, multiple transactions can share the same cosigner set + taproot tweak, so their MusigContexts produce identical AggregatePubKey but different sighashes. The sighash is internal to MusigContext and not externally observable, so the signer can't disambiguate on its own. The caller — TreeSignerSession — is the only one with enough information. IArkadeWalletSigner.GenerateNonces / SignMusig and IRemoteSignerTransport.GenerateNoncesAsync / SignMusigAsync now take a `string sessionId`. The signer's nonce store is keyed by that. Callers pass a session-unique identifier; TreeSignerSession passes the tree-node txid. 2. DefaultWalletProvider used to instantiate a fresh signer on every GetSignerAsync call. Because the secret-nonce store lives on the signer instance, the SignMusig call (different instance) would always find an empty store and throw. Cache signers per (walletId + secret hash) so the nonce store survives between GenerateNonces and SignMusig. The cache key folds in a hash of `wallet.Secret` so a wallet re-imported with different signing material gets a fresh signer. Also tightens GenerateNonces in each in-process signer: now checks ContainsKey *before* generating, so we don't burn entropy on a duplicate- sessionId collision (minor concern raised in the review).
|
Thanks for the review — both findings are legit. Fixed in On Bug 1 (signer ephemerality): Cached signer instances in On Bug 2 ( On the minor "nonce burned on collision": Tightened the in-process signers to Cross-repo impact reminder noted — the BTCPay plugin tracks NNark via submodule, so a submodule bump there will follow after this lands. Tests: 418 unit tests passing, full sln build clean. E2E exercises the new |
There was a problem hiding this comment.
✅ Follow-up Review — Approve
Reviewer: Arkana (automated aggressive review, iteration 2)
Commit reviewed: 5ffa65c (fix: disambiguate nonce store by sessionId + cache signers)
All three findings from the first review are correctly resolved:
✅ Bug 1 (signer ephemerality) — Fixed
DefaultWalletProvider now caches signer instances in a ConcurrentDictionary<string, IArkadeWalletSigner> via GetOrAdd. The cache key local:{walletId}:{secret.GetHashCode():x} ensures:
- Same wallet + same secret → same signer instance (nonce store survives between GenerateNonces and SignMusig)
- Same wallet + different secret (re-import) → new signer instance
GetHashCode() is randomized per process in .NET Core, but that's fine for an in-memory cache — consistency is only needed within a single process lifetime. Collision probability (2^-32) is negligible in practice.
✅ Bug 2 (AggregatePubKey collision) — Fixed
The ContextKey(context) helper keyed by AggregatePubKey is gone. Nonce store is now keyed by caller-supplied string sessionId. TreeSignerSession passes txid.ToString() — unique per tree node by definition. The interface contract (IArkadeWalletSigner, IRemoteSignerTransport) is updated with the new sessionId parameter and comprehensive XML docs explaining why AggregatePubKey alone is insufficient.
✅ Minor (nonce burned on collision) — Fixed
ContainsKey check runs before GenerateNonce in all three signer implementations (NSecWalletSigner, HierarchicalDeterministicWalletSigner, SimpleSeedWallet). The post-generation TryAdd check remains as a belt-and-suspenders guard against a concurrent same-sessionId race (which would be a caller bug).
Verified in this iteration
- Interface consistency:
IArkadeWalletSigner.GenerateNoncesreturnsMusigPubNonce(notMusigPrivNonce) — secret half never leaves the signer. ✅ IRemoteSignerTransportmirrors the same sessionId pattern and the old security-limitation doc block is correctly replaced. ✅TreeSignerSessionstores onlyDictionary<uint256, MusigPubNonce>— no secret nonces in session state. ✅SignPartialAsyncpassestxid.ToString()matching what was used inGenerateNoncesAsync. ✅- Test coverage:
NSecWalletSignerParityTests.FullMusig2Flow_WorksWithCorrectParityContextexercises the full generate→sign flow with sessionId.GenerateNonces_WorksWithCorrectParityMusigContextcovers the nonce generation path. E2E tests exercise the realTreeSignerSession→DefaultWalletProvider→ signer chain. ✅ - Docs:
wallets.md"MuSig2 Nonce Lifecycle" section is clear and accurate. ✅
🔒 Protocol-critical — human sign-off still required
This PR touches MuSig2 nonce handling and transaction signing. The code is correct, but a human must approve before merge. Nonce mismanagement in MuSig2 is catastrophic (private key leak).
There was a problem hiding this comment.
✅ Follow-up Review — Approve
Reviewer: Arkana (automated aggressive review, iteration 2)
Commit reviewed: 5ffa65c (fix: disambiguate nonce store by sessionId + cache signers)
All three findings from the first review are correctly resolved:
✅ Bug 1 (signer ephemerality) — Fixed
DefaultWalletProvider now caches signer instances in a ConcurrentDictionary<string, IArkadeWalletSigner> via GetOrAdd. The cache key local:{walletId}:{secret.GetHashCode():x} ensures:
- Same wallet + same secret → same signer instance (nonce store survives between GenerateNonces and SignMusig)
- Same wallet + different secret (re-import) → new signer instance
GetHashCode() is randomized per process in .NET Core, but that's fine for an in-memory cache — consistency is only needed within a single process lifetime. Collision probability (2^-32) is negligible in practice.
✅ Bug 2 (AggregatePubKey collision) — Fixed
The ContextKey(context) helper keyed by AggregatePubKey is gone. Nonce store is now keyed by caller-supplied string sessionId. TreeSignerSession passes txid.ToString() — unique per tree node by definition. The interface contract (IArkadeWalletSigner, IRemoteSignerTransport) is updated with the new sessionId parameter and comprehensive XML docs explaining why AggregatePubKey alone is insufficient.
✅ Minor (nonce burned on collision) — Fixed
ContainsKey check runs before GenerateNonce in all three signer implementations (NSecWalletSigner, HierarchicalDeterministicWalletSigner, SimpleSeedWallet). The post-generation TryAdd check remains as a belt-and-suspenders guard against a concurrent same-sessionId race (which would be a caller bug).
Verified in this iteration
- Interface consistency:
IArkadeWalletSigner.GenerateNoncesreturnsMusigPubNonce(notMusigPrivNonce) — secret half never leaves the signer. ✅ IRemoteSignerTransportmirrors the same sessionId pattern and the old security-limitation doc block is correctly replaced. ✅TreeSignerSessionstores onlyDictionary<uint256, MusigPubNonce>— no secret nonces in session state. ✅SignPartialAsyncpassestxid.ToString()matching what was used inGenerateNoncesAsync. ✅- Test coverage:
NSecWalletSignerParityTests.FullMusig2Flow_WorksWithCorrectParityContextexercises the full generate→sign flow with sessionId.GenerateNonces_WorksWithCorrectParityMusigContextcovers the nonce generation path. E2E tests exercise the realTreeSignerSession→DefaultWalletProvider→ signer chain. ✅ - Docs:
wallets.md"MuSig2 Nonce Lifecycle" section is clear and accurate. ✅
🔒 Protocol-critical — human sign-off still required
This PR touches MuSig2 nonce handling and transaction signing. The code is correct, but a human must approve before merge. Nonce mismanagement in MuSig2 is catastrophic (private key leak).
Closes #111.
Summary
The
IArkadeWalletSigner/IRemoteSignerTransportcontracts used to returnMusigPrivNoncefromGenerateNoncesand require the caller to pass it back intoSignMusig. That made the secret nonce cross the signer boundary — for the remote-signer transport it meant secret material round-tripped over the wire (the "remote signer" wasn't actually one); for in-process signers it encouraged callers to hold a MuSig2 secret in their own data structures where accidental reuse leaks the private key (MuSig2 nonce reuse is catastrophic).This PR reshapes the contracts so:
GenerateNoncesreturnsMusigPubNonceonly. The signer derives the secret nonce internally, stores it indexed by a caller-suppliedsessionId, and returns the public half.SignMusigdrops theMusigPrivNonceparameter and gains the samesessionId. The signer looks the secret back up bysessionIdand consumes it on use. Throws if no matching nonce was generated.GenerateNoncesthrows if a nonce is already stored under the samesessionId(generating twice without an interveningSignMusigwould orphan secret material).Why a sessionId rather than fingerprinting the MusigContext
MusigContext.AggregatePubKeyis determined solely by the cosigner set + taproot tweak. In a batch tree, multiple transactions can share the same cosigner set + tweak, so their contexts produce identical aggregate pubkeys but different sighashes. The sighash (msg32) isinternalin NBitcoin and not externally observable, so the signer can't disambiguate on its own. The caller (TreeSignerSession) is the only one with enough information to produce a session-unique identifier — it passes each tree-node txid.Why signers are now cached in DefaultWalletProvider
The secret-nonce store lives on the signer instance.
DefaultWalletProvider.GetSignerAsyncused to instantiate a fresh signer per call, which meant theGenerateNoncescall populated one instance's store andSignMusig(different instance) always found an empty one. The cache key is(walletId, secret-hash)so a wallet re-imported with different signing material gets a fresh signer.Why this matters
The on-the-wire
IRemoteSignerTransportshape is what changed semantics most meaningfully: with the old shape, a "remote signer" service exposingSignMusigAsync(walletId, descriptor, context, nonce, ct)was structurally indistinguishable from a key-material proxy — the secret nonce crossed the trust boundary every batch. With the new shape, the secret half never leaves the transport implementation, which is the cryptographic claim of remote signing.Changes
NArk.Abstractions/Wallets/IArkadeWalletSigner.cs,IRemoteSignerTransport.cs: contract reshape +sessionIdparam + docs explaining the lifecycle and the AggregatePubKey-collision rationale.NArk.Core/Wallet/NSecWalletSigner.cs,HierarchicalDeterministicWalletSigner.cs: per-instanceConcurrentDictionary<string, MusigPrivNonce>keyed bysessionId;SignMusigTryRemoves on consume;GenerateNoncesContainsKey-checks before generating so duplicate-sessionId collisions don't burn entropy.NArk.Core/Wallet/RemoteArkadeWalletSigner.cs: pure passthrough — drops the nonce param, threads throughsessionId.NArk.Core/Wallet/DefaultWalletProvider.cs: caches signer instances per(walletId, secret-hash)so the secret-nonce store survives betweenGenerateNoncesandSignMusig.NArk.Core/Batches/TreeSignerSession.cs:_myNoncescollapses fromDictionary<uint256, (MusigPrivNonce, MusigPubNonce)>toDictionary<uint256, MusigPubNonce>; bothGenerateNoncesandSignMusigcalls pass the tree-nodetxid.ToString()assessionId.NArk.Tests/NSecWalletSignerParityTests.cs,NArk.Tests.End2End/Wallets/SimpleSeedWallet.cs: updated to new contract.docs/articles/wallets.md: updated remote-signer example + MuSig2 nonce lifecycle section + signer-caching rationale.IRemoteSignerTransportdocuments that long-lived transports need an eviction policy (TTL or bounded count) for abandoned nonces — in-process signers get away with remove-on-consume because their lifetime is the cached singleton, but a server-side transport could leak unconsumed entries if a client generates a nonce and never signs.Breaking change
Yes —
IArkadeWalletSignerandIRemoteSignerTransportare public surface. Any external implementation of either must:MusigPrivNonceparameter fromSignMusig/SignMusigAsync.GenerateNonces/GenerateNoncesAsyncreturn type toMusigPubNonce.string sessionIdparameter on bothGenerateNoncesandSignMusig.Test plan
dotnet build NArk.sln— cleandotnet test NArk.Tests— 418 passed, 0 failed