Skip to content

perf(skl): embed Iterator by value in UniIterator#2288

Open
shaunpatterson wants to merge 1 commit into
dgraph-io:mainfrom
shaunpatterson:perf/skl-uniiterator-embed
Open

perf(skl): embed Iterator by value in UniIterator#2288
shaunpatterson wants to merge 1 commit into
dgraph-io:mainfrom
shaunpatterson:perf/skl-uniiterator-embed

Conversation

@shaunpatterson
Copy link
Copy Markdown

Summary

Halves the per-memtable iterator construction allocation count by embedding the inner Iterator inside UniIterator instead of holding a *Iterator. Previously NewUniIterator allocated twice: once for the inner Iterator (via NewIterator) and once for the UniIterator wrapper. Embedding by value collapses these into a single heap allocation.

Extracts the shared (*Iterator).init helper so future setup added in NewIterator applies uniformly to UniIterator's embedded construction path.

Motivation

In dgraph's posting list rollup, ~6 UniIterators are constructed per cache miss (1 mutable + 5 immutable memtables). This scales to roughly 6 fewer allocs per rollup miss.

Benchmark — BenchmarkRollupKeyIterator

B/op allocs/op
before 826 16
after 817 15

Test plan

  • go test ./skl/... passes
  • go test ./... passes
  • Existing iterator tests cover the embedded value semantics

🤖 Generated with Claude Code

Halves the per-memtable iterator construction allocation count by
embedding the inner Iterator inside UniIterator instead of holding a
*Iterator. Previously NewUniIterator allocated twice: once for the
inner Iterator (via NewIterator) and once for the UniIterator wrapper.
Embedding by value collapses these into a single heap allocation.

Extracts the shared (*Iterator).init helper so future setup added in
NewIterator applies uniformly to UniIterator's embedded construction
path.

Saves 1 alloc and 9 B per UniIterator construction on
BenchmarkRollupKeyIterator:

  before:  826 B/op, 16 allocs/op
  after:   817 B/op, 15 allocs/op

In dgraph's posting list rollup, ~6 of these are constructed per
cache miss (1 mutable + 5 immutable memtables), so this scales
to roughly 6 fewer allocs per rollup miss.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@shaunpatterson shaunpatterson requested a review from a team as a code owner May 26, 2026 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant