Commit b876712

fix(evnode-fibre): wire sequencer queue cap + lift ingest queue caps
Two runner-side changes paired with the SoloSequencer bound:

- After constructing the SoloSequencer, call SetMaxQueueBytes with 10× the per-block tx budget (10 × 100 MiB = 1000 MiB, roughly 1 GiB, at the current 100 MiB MaxBlobSize). 10× is a deliberate balance: large enough that a short burst above steady-state ingest is absorbed without triggering backpressure, small enough that the worst-case retained bytes fit comfortably under the box's RAM budget alongside the pending cache and the DA in-flight buffers.

- Lift the inMemExecutor's hardcoded ingest caps. txChan and maxBlockTxs were both set to 500 (5 MB at 10 KB per tx, i.e. 500 txs per reaper poll, or about 5K tx/s at a 100 ms poll interval) back when those were the only memory bound on the runner. With the SetMaxQueueBytes cap and the FilterTxs-enforced per-block budget now doing the bounding, the ingest queue can hold a full 100 MiB block's worth of txs (10K slots at 10 KB) without burdening memory, and a single reaper poll can drain that whole batch in one GetTxs call instead of needing 20 cycles. The old cap was the binding constraint at ~5,000 tx/s = 50 MB/s in earlier runs.
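The sizing figures in the message can be checked with a few lines of arithmetic. The sketch below is illustrative only: the identifiers are hypothetical, the 10 KB tx size is assumed to be decimal (10,000 bytes), and the 100 ms reaper interval is taken from the code comments in the diff.

```go
package main

import "fmt"

// Figures are taken from the commit message; all names are illustrative.
const (
	mib               = 1024 * 1024
	maxBlobSize       = 100 * mib // per-block tx budget
	queueMultiplier   = 10        // sequencer queue cap as a multiple of the budget
	txSize            = 10 * 1000 // nominal 10 KB tx (decimal KB is an assumption)
	oldSlots          = 500       // previous txChan / maxBlockTxs value
	newSlots          = 10000     // new txChan / maxBlockTxs value
	reaperPollsPerSec = 10        // one poll every 100 ms
)

// seqQueueBytes returns the byte cap passed to SetMaxQueueBytes.
func seqQueueBytes() int { return queueMultiplier * maxBlobSize }

// pollsToDrain returns how many GetTxs calls are needed to drain `queued`
// txs when each call returns at most `perCall` txs (ceiling division).
func pollsToDrain(queued, perCall int) int { return (queued + perCall - 1) / perCall }

// ingestCeilingBytesPerSec returns the most bytes per second the reaper can
// move when each poll returns at most `perCall` txs.
func ingestCeilingBytesPerSec(perCall int) int { return perCall * reaperPollsPerSec * txSize }

func main() {
	fmt.Printf("sequencer cap: %d bytes = %d MiB\n", seqQueueBytes(), seqQueueBytes()/mib)
	fmt.Printf("old ingest queue: %d bytes\n", oldSlots*txSize)
	fmt.Printf("new ingest queue: %d bytes\n", newSlots*txSize)
	fmt.Printf("polls to drain %d txs at %d per poll: %d\n", newSlots, oldSlots, pollsToDrain(newSlots, oldSlots))
	fmt.Printf("old reaper ceiling: %d bytes/s\n", ingestCeilingBytesPerSec(oldSlots))
}
```

Under these assumptions the cap works out to 1000 MiB (slightly under a true GiB), the old per-poll limit tops out at 50 MB/s, and draining 10K queued txs at 500 per poll takes 20 polls, which matches the numbers quoted above.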
1 parent f8102f9 commit b876712

1 file changed

Lines changed: 27 additions & 14 deletions

tools/celestia-node-fiber/cmd/evnode-fibre/main.go

@@ -322,6 +322,19 @@ func run(cli cliFlags) error {
 
 	executor := newInMemExecutor()
 	sequencer := solo.NewSoloSequencer(logger, []byte(genesis.ChainID), executor)
+	// Cap the sequencer's in-memory queue at 10× the per-block tx
+	// budget. Above this, SubmitBatchTxs returns ErrQueueFull and the
+	// runner's reaper-bridge / tx-ingress applies backpressure (txs
+	// stay in the executor's txChan until the sequencer drains, and
+	// the chan's bound 503's /tx). Without this cap a fast loadgen
+	// (32 vCPU pushing >100 MB/s) outruns the 1 block/s drain and
+	// the queue grows monotonically — observed pre-fix as 24 GB of
+	// retained io.ReadAll bytes in heap snapshots before the daemon
+	// hit the 64 GiB box ceiling and OOM-killed.
+	// Sized at 10× the per-block tx budget (matches SetMaxBlobSize
+	// above; both anchor at the per-blob Fibre cap).
+	const seqQueueBytes = 10 * 100 * 1024 * 1024 // 1 GiB
+	sequencer.SetMaxQueueBytes(seqQueueBytes)
 	daClient := block.NewFiberDAClient(adapter, cfg, logger, 0)
 	p2pClient, err := p2p.NewClient(cfg.P2P, nodeKey.PrivKey, datastore.NewMapDatastore(), genesis.ChainID, logger, nil)
 	if err != nil {
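
The byte cap described in the comment above can be pictured with a small stand-alone queue. This is a hypothetical sketch, not the SoloSequencer implementation: the type byteQueue, its Submit and Drain methods, and the errQueueFull value are invented here for illustration, and the only behavior borrowed from the diff is that a submit which would exceed the cap is rejected with a queue-full error.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errQueueFull stands in for the ErrQueueFull named in the diff comment.
var errQueueFull = errors.New("queue full")

// byteQueue is a hypothetical byte-bounded FIFO of tx batches.
type byteQueue struct {
	mu       sync.Mutex
	maxBytes int // 0 means unbounded
	curBytes int
	batches  [][][]byte
}

// SetMaxQueueBytes sets the cap on total queued tx bytes.
func (q *byteQueue) SetMaxQueueBytes(n int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.maxBytes = n
}

// Submit enqueues a batch, or rejects it whole if it would exceed the cap.
func (q *byteQueue) Submit(batch [][]byte) error {
	size := 0
	for _, tx := range batch {
		size += len(tx)
	}
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.maxBytes > 0 && q.curBytes+size > q.maxBytes {
		return errQueueFull // caller is expected to hold the txs and retry later
	}
	q.batches = append(q.batches, batch)
	q.curBytes += size
	return nil
}

// Drain removes and returns everything queued, as a block producer would.
func (q *byteQueue) Drain() [][][]byte {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := q.batches
	q.batches = nil
	q.curBytes = 0
	return out
}

func main() {
	q := &byteQueue{}
	q.SetMaxQueueBytes(25)
	tx := make([]byte, 10)
	fmt.Println(q.Submit([][]byte{tx, tx})) // 20 bytes queued, under the cap
	fmt.Println(q.Submit([][]byte{tx}))     // would reach 30 bytes, rejected
	q.Drain()
	fmt.Println(q.Submit([][]byte{tx})) // accepted again after the drain
}
```

Rejecting the whole batch keeps the accounting simple; a real sequencer might instead accept a prefix of the batch, which this sketch does not attempt.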
@@ -474,23 +487,23 @@ type inMemExecutor struct {
 	totalTxs atomic.Uint64
 }
 
-// txChan capacity caps in-flight memory: at 10 KB tx and 500 slots
-// we hold ≤ 5 MB queued before /tx blocks the ingress goroutine —
-// which is exactly the backpressure we want against a hot loadgen.
-// Reaper drains every 100 ms into the solo sequencer, which then
-// accumulates batches between block-production ticks; without a tight
-// cap a single block can balloon past the 120 MiB DA blob limit and
-// the rest of the daemon's per-block allocations push the box past
-// its RAM budget within seconds.
+// txChan capacity bounds the HTTP /tx ingest queue. Sized at 10K
+// slots (~100 MiB at 10 KB tx-size) so a 100 ms reaper cycle can
+// absorb a full max-size block's worth of txs without /tx blocking
+// the loadgen. Earlier we used 500 slots (~5 MiB) which forced
+// backpressure at ~5,000 tx/s — that turned txsim into the limiting
+// factor at ~22 MB/s rather than DA upload. With the per-block
+// FilterTxs cap (executor.go:RetrieveBatch via DefaultMaxBlobSize=
+// 100 MiB) and the submitter chunker now enforcing the actual blob
+// budget, the executor doesn't need an extra ingest-side cap.
 //
-// maxBlockTxs caps GetTxs's per-call return so reaper-cycle batches
-// are bounded too. With 500 ≤ 5 MB per block at 10 KB tx-size, we
-// stay an order of magnitude under the DA cap so headers/data signing
-// + envelope cache + retry buffers all fit.
+// maxBlockTxs caps GetTxs's per-call return; pairs with the channel
+// size so a reaper poll can fully drain a 100 MiB-block-worth of
+// queued txs in a single call instead of needing 20× cycles.
 func newInMemExecutor() *inMemExecutor {
 	return &inMemExecutor{
-		txChan:      make(chan []byte, 500),
-		maxBlockTxs: 500,
+		txChan:      make(chan []byte, 10000),
+		maxBlockTxs: 10000,
 	}
 }
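
How txChan and maxBlockTxs interact can be sketched with a bounded channel and a capped, non-blocking drain. This is an assumption-laden illustration rather than the inMemExecutor code: the ingest type and its offer and getTxs methods are hypothetical, and rejecting with a 503 when the channel is full follows the wording of the sequencer comment in this diff, not a confirmed handler implementation.

```go
package main

import (
	"fmt"
	"net/http"
)

// ingest is a hypothetical stand-in for the executor's ingest side.
type ingest struct {
	txChan      chan []byte
	maxBlockTxs int
}

// newIngest sizes the channel and the per-call drain cap identically,
// mirroring how the commit sets both values to 10000.
func newIngest(slots int) *ingest {
	return &ingest{txChan: make(chan []byte, slots), maxBlockTxs: slots}
}

// offer tries to enqueue a tx without blocking and returns the HTTP
// status a /tx handler could send back.
func (e *ingest) offer(tx []byte) int {
	select {
	case e.txChan <- tx:
		return http.StatusAccepted
	default:
		return http.StatusServiceUnavailable // channel full: push back on the client
	}
}

// getTxs drains up to maxBlockTxs queued txs without blocking.
func (e *ingest) getTxs() [][]byte {
	var out [][]byte
	for len(out) < e.maxBlockTxs {
		select {
		case tx := <-e.txChan:
			out = append(out, tx)
		default:
			return out // queue is empty
		}
	}
	return out
}

func main() {
	e := newIngest(3)
	for i := 0; i < 4; i++ {
		fmt.Println(e.offer([]byte{byte(i)})) // the fourth offer finds the channel full
	}
	fmt.Println(len(e.getTxs())) // one call drains everything when the caps match
}
```

When maxBlockTxs is smaller than the channel capacity, a full channel needs several getTxs calls to empty, which is the 20-cycle effect the new comment describes for the old value of 500.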