Skip to content

Add PubsubRpcLimits decoding count-cap to reduce memory footprint#477

Merged
lucassaldanha merged 20 commits into
libp2p:developfrom
lucassaldanha:feature/fix-oom-issue
May 19, 2026
Merged

Add PubsubRpcLimits decoding count-cap to reduce memory footprint#477
lucassaldanha merged 20 commits into
libp2p:developfrom
lucassaldanha:feature/fix-oom-issue

Conversation

@lucassaldanha
Copy link
Copy Markdown
Collaborator

No description provided.

Addresses the gossipsub OOM amplification reported in
ethereum-bounty/teku#7 / libp2p#6 by enforcing repeated-field count limits in
the pubsub decode pipeline before Rpc.RPC materialisation.
Add Result.Malformed sealed variant so RpcCountFrameDecoder can dispatch
on type rather than inspecting the reason string prefix.
Fix Detekt SwallowedException warning in RpcMessageCountValidator by
including the exception message in the Malformed result string.
… gap

Empty idontwant entries (0x2a 0x00) are the same cheap amplification
gadget as empty publish entries: each materialises an Rpc$ControlIDontWant
allocation. Add rejectEmptyIDontWantEntries (default true) to
PubsubRpcLimits, enforce it in the CTRL_IDONTWANT arm of
RpcMessageCountValidator, enable it in GossipRouter.rpcLimits, and
cover both the rejection and the NONE-is-a-no-op cases in tests.
…fields

Hoist IHAVE/IWANT/GRAFT/PRUNE/IDONTWANT counters out of validateControl
into a shared ControlCounters object allocated once per validateRpc call,
so multiple top-level control fields (valid per proto2 merge semantics)
accumulate into the same counters rather than resetting them each time.
Prevents a peer from splitting crafted control messages across N fields
each just under the configured cap to bypass the pre-decode OOM guard.
Adds two tests to lock in that RpcMessageCountValidator is off by default:
- FloodRouter inherits PubsubRpcLimits.NONE from AbstractRouter, so non-Gossip
  routers see no change in wire behaviour.
- RpcCountFrameDecoder forwards a frame that would otherwise be rejected
  (empty publish entry + extra publishes) when constructed with NONE, proving
  pass-through at the pipeline level — not just at the validator function.
Two changes that together force any future change to rpc.proto to be
acknowledged in RpcMessageCountValidator:

- Switch hardcoded field-number constants to the protobuf-generated
  *_FIELD_NUMBER constants, so renames or removals in rpc.proto break
  compilation.
- Add ACKNOWLEDGED_REPEATED_FIELDS as the single source of truth for the
  repeated fields the validator inspects, plus a coverage test that
  recursively walks every message descriptor reachable from Rpc.RPC and
  asserts the acknowledged set equals the actual set. A new repeated
  field anywhere in the proto fails the test until the author adds it
  (and the corresponding decode-time guard).
@lucassaldanha lucassaldanha changed the title Add PubsubRpcLimits decoding count-cap to reduce memory footpring Add PubsubRpcLimits decoding count-cap to reduce memory footprint May 19, 2026
Copy link
Copy Markdown
Collaborator

@StefanBratanov StefanBratanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, apart from one thing


override fun decode(ctx: ChannelHandlerContext, msg: ByteBuf, out: MutableList<Any>) {
val result = try {
RpcMessageCountValidator.validate(msg, limits)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better when PubsubRpcLimits is NONE we don't walk the buffer at all, it will fail when converting to the objecy anyways in this case

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I'll add it.

When PubsubRpcLimits has no configured caps and no reject-empty flags
(e.g. PubsubRpcLimits.NONE), the validator walks the buffer and always
returns Accepted. Skip the walk entirely on that path — malformed bytes
still surface from the downstream ProtobufDecoder, which already drives
the same wire-exception handling.

A new `PubsubRpcLimits.isNoop` predicate gates the fast-path. Defining
it as a property of the limits (rather than checking equality with NONE)
keeps the fast-path correct even if a caller hand-builds an all-null
PubsubRpcLimits and forces future field additions to be considered.

Adds PubsubRpcLimitsTest covering isNoop semantics and an
RpcCountFrameDecoder test that pipes a truncated frame through with
NONE and asserts it is forwarded — proof that the validator did not run.
@lucassaldanha lucassaldanha merged commit c34cc11 into libp2p:develop May 19, 2026
2 checks passed
@lucassaldanha lucassaldanha deleted the feature/fix-oom-issue branch May 19, 2026 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants