Skip to content

Feat: add defensive memory allocation for miners/signers#7169

Merged
aaronb-stacks merged 12 commits intostacks-network:developfrom
aaronb-stacks:feat/defensive-memcheck
May 8, 2026
Merged

Feat: add defensive memory allocation for miners/signers#7169
aaronb-stacks merged 12 commits intostacks-network:developfrom
aaronb-stacks:feat/defensive-memcheck

Conversation

@aaronb-stacks
Copy link
Copy Markdown
Contributor

@aaronb-stacks aaronb-stacks commented Apr 24, 2026

Description

This introduces a defense-in-depth technique for miners and signers for memory allocations. During either block assembly or block proposal evaluation, transactions which exceed the (high) bounds for memory consumption are aborted and treated as bad transactions.

This uses a custom wrapper around GlobalAlloc for allocation tracking.

We set two different defaults for miners and signers, so that issues from environment skew can be avoided.

@coveralls
Copy link
Copy Markdown

coveralls commented Apr 24, 2026

Coverage Report for CI Build 25566848419

Warning

Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes.
Quick fix: rebase this PR. Learn more →

Coverage decreased (-0.06%) to 85.653%

Details

  • Coverage decreased (-0.06%) from the base build.
  • Patch coverage: 8 uncovered changes across 3 files (151 of 159 lines covered, 94.97%).
  • 4908 coverage regressions across 99 files.

Uncovered Changes

File Changed Covered %
stacks-node/src/main.rs 4 0 0.0%
stacks-common/src/alloc_tracker.rs 69 66 95.65%
stackslib/src/config/mod.rs 7 6 85.71%

Coverage Regressions

4908 previously-covered lines in 99 files lost coverage.

Top 10 Files by Coverage Loss Lines Losing Coverage Coverage
stackslib/src/chainstate/stacks/index/storage.rs 262 80.55%
stackslib/src/net/inv/epoch2x.rs 213 79.95%
stackslib/src/config/mod.rs 206 69.04%
stackslib/src/chainstate/stacks/db/transactions.rs 203 97.13%
stackslib/src/net/chat.rs 201 93.01%
stackslib/src/chainstate/stacks/miner.rs 195 83.3%
stackslib/src/chainstate/stacks/index/trie_sql.rs 190 69.7%
stackslib/src/core/mempool.rs 169 86.93%
stackslib/src/chainstate/stacks/index/node.rs 161 87.28%
stacks-node/src/nakamoto_node/miner.rs 149 87.34%

Coverage Stats

Coverage Status
Relevant Lines: 219462
Covered Lines: 187976
Line Coverage: 85.65%
Coverage Strength: 19037995.0 hits per line

💛 - Coveralls

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a defense-in-depth mechanism to abort transaction execution when heap allocations exceed configurable thresholds during (1) miner block assembly and (2) signer block proposal validation, using jemalloc per-thread allocation counters when available.

Changes:

  • Add a Clarity VM “abort callback” that is polled during evaluation and can stop execution with a caller-provided reason.
  • Wire per-transaction heap allocation limits into miner block assembly and block proposal validation, with new config options and defaults.
  • Add jemalloc allocation-counter plumbing (tikv-jemalloc-ctl) and tests validating abort behavior.

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
stackslib/src/net/connection.rs Adds connection option for per-tx heap allocation limit during block proposal validation.
stackslib/src/net/api/postblock_proposal.rs Passes config limit into proposal validation and sets per-tx abort callback.
stackslib/src/config/mod.rs Introduces config defaults + new miner/signer memory limit options and TOML parsing.
stackslib/src/clarity_vm/clarity.rs Propagates an abort callback through block/transaction connections into the VM environment.
stackslib/src/chainstate/stacks/miner.rs Applies per-tx abort callback during miner block assembly based on new setting.
stackslib/src/chainstate/stacks/db/mod.rs Exposes set_abort_callback() on ClarityTx to support miner/proposal paths.
stackslib/src/chainstate/nakamoto/miner.rs Implements make_mem_abort_callback() using jemalloc per-thread allocation deltas.
stackslib/Cargo.toml Adds conditional dependency on tikv-jemalloc-ctl for supported targets.
stacks-node/src/tests/mod.rs Registers new mem-abort test module.
stacks-node/src/tests/mem_abort.rs Adds integration tests validating jemalloc-based abort callback behavior.
clarity/src/vm/tests/simple_apply_eval.rs Adds unit tests covering VM abort callback behavior.
clarity/src/vm/mod.rs Polls the abort callback during eval-time checks to stop execution early.
clarity/src/vm/costs/mod.rs Adds (currently misleading) documentation text near cost tracker defaults.
clarity/src/vm/contexts.rs Defines AbortCallback and stores it in GlobalContext; adds setter on environment.
changelog.d/defensive-mem-allocation.added Changelog entry documenting the new config options.
Cargo.toml Adds workspace dependency for tikv-jemalloc-ctl.
Cargo.lock Updates lockfile for new dependency (and lock format version).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread clarity/src/vm/tests/simple_apply_eval.rs Outdated
Comment thread stackslib/src/net/connection.rs Outdated
Comment thread stackslib/src/config/mod.rs Outdated
Comment thread clarity/src/vm/mod.rs Outdated
Comment thread clarity/src/vm/costs/mod.rs Outdated
@cylewitruk-stacks
Copy link
Copy Markdown
Contributor

cylewitruk-stacks commented Apr 27, 2026

I've started looking at this one and threw a Copilot at it for an extra pair of eyes -- feel free to dismiss any false positives.

I have a few high-level questions/concerns around things like:

  • The lock-in to jemalloc (which has limited platform support),
  • The potential skew between (1) platforms (which you try to account for in the miner vs. signer, but only applies to Linux) and (2) jemalloc versions (even on same platform),
  • This counts all allocations and counters for the same transaction across two node releases will most certainly differ somewhat,
  • What this could potentially mean for determinism/consensus? Is it OK that the "rules" will likely differ between active signers/miners at any given point in time? I know the limit here is very conservative and will likely never be hit, so I'm asking more from a hypothetical perspective.

If we're going to go this route, maybe we should consider switching to mimalloc, which has broader platform support (and is also widely known for its performance profile)?

We could also consider using a custom allocator wrapper like I do here (but with TLS counters instead of statics) so that we always have a count available, regardless of allocator -- maybe that would be closer to a "defense in depth" end-goal of a failsafe against malicious (or otherwise) OOM during tx processing?

@aaronb-stacks
Copy link
Copy Markdown
Contributor Author

The lock-in to jemalloc (which has limited platform support),

I don't think this locks-in jemalloc. Rather, it depends on the current usage of jemalloc. If we want to replace jemalloc on those platforms at some point, we'll need to add a new memory tracking mechanism. But I don't think that really locks the choice.

The potential skew between (1) platforms (which you try to account for in the miner vs. signer, but only applies to Linux) and (2) jemalloc versions (even on same platform),

Right -- this is why there's a (quite sizable) buffer between the miner and the signer. The miner should propose a block that "fits" within the signers' limits quite comfortably.

This counts all allocations and counters for the same transaction across two node releases will most certainly differ somewhat.

I'm not quite sure how to interpret this -- you mean that the allocations and counters for the same transaction will differ across releases? I think that's likely true in many cases, but that shouldn't really matter -- if miners and signers are running a release which consumes more memory for a given transaction, we'd still want them to defensively reject transactions that consume lots of memory.

What this could potentially mean for determinism/consensus? Is it OK that the "rules" will likely differ between active signers/miners at any given point in time? I know the limit here is very conservative and will likely never be hit, so I'm asking more from a hypothetical perspective.

Right -- this is structured specifically to avoid consensus problems. This limit only applies during block assembly and proposal evaluation. Once a block is signed and broadcasted to the network, nodes (including the signers and miners themselves) do not apply any limit to the blocks. So this behavior cannot have consensus impacts like chain splits, non-determinism in replays, rejection of old blocks, etc.

If we're going to go this route, maybe we should consider switching to mimalloc, which has broader platform support (and is also widely known for its performance profile)?

I want to avoid questions about switching the allocator in this PR -- it seems like they're fairly separable questions.

We could also consider using a custom allocator wrapper like I do here (but with TLS counters instead of statics) so that we always have a count available, regardless of allocator -- maybe that would be closer to a "defense in depth" end-goal of a failsafe against malicious (or otherwise) OOM during tx processing?

I considered a custom allocator wrapper -- my worry was that it would end up adding overhead to the allocation invocations while requiring unsafe blocks. In this case though, that's probably fine. I'll poke around at re-implementing with that.

@aaronb-stacks
Copy link
Copy Markdown
Contributor Author

Okay -- pushed a version with a custom allocator wrapper -- it turned out to be quite a bit simpler than I expected. I now feel like that's probably the better approach -- thanks @cylewitruk-stacks for the push in this direction !

@aaronb-stacks aaronb-stacks changed the title Feat: add defensive memory allocation for miners/signers using jemalloc Feat: add defensive memory allocation for miners/signers Apr 28, 2026
Copy link
Copy Markdown
Contributor

@cylewitruk-stacks cylewitruk-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better solution imo! This way it becomes much more platform/allocator independent, with remaining skew limited to platform-specific Rust subtleties, Rust compiler versions, and our own changes between releases 💯

Here's my comments so far!

Comment thread clarity/src/vm/tests/simple_apply_eval.rs Outdated
Comment thread clarity/src/vm/contexts.rs Outdated
Comment thread clarity/src/vm/mod.rs Outdated
Comment thread stacks-common/src/alloc_tracker.rs
Comment thread stackslib/src/config/mod.rs Outdated
Comment thread stackslib/src/chainstate/stacks/miner.rs
Comment thread stackslib/src/chainstate/stacks/db/mod.rs Outdated
Comment thread stackslib/src/chainstate/nakamoto/miner.rs Outdated
Comment thread clarity/src/vm/mod.rs Outdated
Comment thread clarity/src/vm/mod.rs
Comment thread stacks-common/src/alloc_tracker.rs Outdated
Comment thread stacks-common/src/alloc_tracker.rs
Comment thread stacks-common/src/alloc_tracker.rs Outdated
Comment thread stacks-common/src/alloc_tracker.rs Outdated
Comment thread stacks-common/src/alloc_tracker.rs
Comment thread stackslib/src/chainstate/stacks/miner.rs Outdated
benjamin-stacks
benjamin-stacks previously approved these changes May 7, 2026
Copy link
Copy Markdown
Contributor

@benjamin-stacks benjamin-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, ship it!

The only thing I think we should add is a regression test that ensures the MemAbort check does happen during mempool sweeps and proposal validation, but does not happen during replay or appending.

That can be a follow-up though, and doesn't have to block this PR from being merged.

Copy link
Copy Markdown
Contributor

@francesco-stacks francesco-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment thread stackslib/src/chainstate/nakamoto/miner.rs
Comment thread stacks-common/src/alloc_tracker.rs
Copy link
Copy Markdown
Contributor

@cylewitruk-stacks cylewitruk-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a few more comments/Q's, but looking good! Feel free to resolve nits if you don't want to address them here and now.

Comment thread clarity/src/vm/contexts.rs
Comment thread stackslib/src/chainstate/nakamoto/miner.rs
Comment thread clarity/src/vm/analysis/errors.rs
Comment thread stacks-node/src/main.rs
Copy link
Copy Markdown
Contributor

@cylewitruk-stacks cylewitruk-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@aaronb-stacks aaronb-stacks added this pull request to the merge queue May 8, 2026
Merged via the queue into stacks-network:develop with commit 5afe380 May 8, 2026
670 of 674 checks passed
@aaronb-stacks aaronb-stacks deleted the feat/defensive-memcheck branch May 8, 2026 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants