Skip to content

perf(table): skip redundant first bytes.Equal in MergeIterator.Next#2281

Open
shaunpatterson wants to merge 2 commits into
dgraph-io:mainfrom
shaunpatterson:perf/merge-next-skip-first-equal
Open

perf(table): skip redundant first bytes.Equal in MergeIterator.Next#2281
shaunpatterson wants to merge 2 commits into
dgraph-io:mainfrom
shaunpatterson:perf/merge-next-skip-first-equal

Conversation

@shaunpatterson
Copy link
Copy Markdown

Summary

MergeIterator.Next is called after setCurrent() has already copied mi.small.key into mi.curKey, so the very first bytes.Equal(mi.small.key, mi.curKey) comparison inside the loop is always true. The same is true after every mi.small.next() + fix() advance that we then re-loop on: at that point we know the previous-small key equals curKey by virtue of how we entered the loop, but we haven't re-checked the new small.key.

Restructure Next to make the first advance unconditional and only test bytes.Equal on subsequent passes. This drops one bytes.Equal call (typically a SIMD-vectorized but still non-trivial library call) from every Next() invocation — and Next is called for every key in a forward scan.

The loop invariants are preserved: after the unconditional first advance, we loop while the new small.key still equals the saved curKey (i.e. another duplicate of the same user-key remains on either child).

Measurement

  • BenchmarkReadMerged: -5.0%
  • Composite (74-benchmark stable subset): -2.36% ns/op, median of 3 runs

Test plan

  • go test -short -race ./table/TestMergeIteratorDuplicate, TestMergeDuplicates (forward + reverse), TestMergeMore, TestMergeIteratorSeek* all pass
  • go vet ./...
  • CI

MergeIterator.Next is at 100% statement coverage under the existing test suite (verified with -coverprofile).

🤖 Generated with Claude Code

Invariant: every public method that leaves MergeIterator Valid ends
with setCurrent(), which sets curKey = small.key. So on entry to
Next(), curKey == small.key byte-for-byte, and the first iteration of
the original duplicate-skip loop's bytes.Equal check is always true —
we always need to advance once before checking for duplicates.

Hoist the first small.next() + fix() out of the loop and keep checking
for duplicates only on subsequent iterations. Saves one bytes.Equal
call per emitted key on the merged-read hot path.

Existing TestMergeDuplicates (forward + reverse) and TestMergeIterator*
cover this code path.

Benchmark impact (M4 Max, median of 3 runs):
- BenchmarkReadMerged: -2.4% (~523M -> ~510M ns)
@shaunpatterson shaunpatterson requested a review from a team as a code owner May 23, 2026 18:08
Calling Next() after a Seek past the end (iterator already invalid) must
be a safe no-op. The early-return guard at the top of Next() previously
had no coverage. Add a focused regression that Seeks past the end then
calls Next(), verifying no panic and Valid()=false.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant