perf(y): same-length fast path in CompareKeys #2283
Open
shaunpatterson wants to merge 2 commits into
When both keys have identical total length, the user-key portion has the same length too, so a single bytes.Compare over the full key buffers is equivalent to comparing user-keys then timestamps. This short-circuits the common case (identical timestamp width on both sides) into one SIMD-vectorized compare instead of two sub-slice + two calls. When lengths differ, fall back to the previous split-compare path.

Adds TestCompareKeys with hand-picked cases that exercise:
- identical keys
- same-length / different user-key
- same-length / same user-key / different timestamps
- different-length user keys (a<ts> vs aa<ts>)
- timestamp tie-break only triggered when user-keys match

And TestCompareKeysFuzz cross-checking 5000 randomized pairs against a reference implementation that always splits user-key from ts.

Measured -1.88% composite ns/op across the 74-benchmark stable subset.
When total lengths differ, the user-key lengths also differ, so the user-key compare can never return 0. The trailing ts tiebreak was dead code after the same-length fast path. Drop it for clarity.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
y.CompareKeys is the dominant comparator for the LSM merge path and is called on every step of every forward/reverse scan that spans multiple iterators.

When both keys have identical total length, the user-key portion has the same length too (since every key carries an 8-byte timestamp suffix), so a single bytes.Compare over the full key buffers is equivalent to comparing user-keys then timestamps. This short-circuits the common case (matching timestamp widths) into one SIMD-vectorized library call instead of two sub-slice operations + two calls.

When lengths differ, falls back to the previous split-compare path.
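The reasoning above can be sketched as follows. This is a minimal illustration, not the code from the diff: the names compareKeys, referenceCompareKeys and keyWithTs are hypothetical, and the suffix encoding used by keyWithTs (a plain big-endian uint64) is an assumption made only so the example can build keys. Both comparators assume every key is at least 8 bytes long.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// tsLen is the width of the timestamp suffix (8 bytes, as stated in the summary).
const tsLen = 8

// keyWithTs is a hypothetical helper that appends an 8-byte suffix to userKey.
// The big-endian encoding is an assumption of this sketch.
func keyWithTs(userKey []byte, ts uint64) []byte {
	out := make([]byte, len(userKey)+tsLen)
	copy(out, userKey)
	binary.BigEndian.PutUint64(out[len(userKey):], ts)
	return out
}

// compareKeys sketches the same-length fast path.
func compareKeys(a, b []byte) int {
	if len(a) == len(b) {
		// Equal total length means equal user-key length, so comparing the
		// whole buffers orders by user-key first and by suffix second.
		return bytes.Compare(a, b)
	}
	// Different total length means different user-key length, so this
	// compare can never return 0 and no timestamp tie-break is needed.
	return bytes.Compare(a[:len(a)-tsLen], b[:len(b)-tsLen])
}

// referenceCompareKeys always splits user-key from timestamp.
func referenceCompareKeys(a, b []byte) int {
	if c := bytes.Compare(a[:len(a)-tsLen], b[:len(b)-tsLen]); c != 0 {
		return c
	}
	return bytes.Compare(a[len(a)-tsLen:], b[len(b)-tsLen:])
}

func main() {
	x := keyWithTs([]byte("a"), 5)
	y := keyWithTs([]byte("aa"), 5)
	// Different lengths: both comparators decide on the user-key alone.
	fmt.Println(compareKeys(x, y), referenceCompareKeys(x, y))
}
```

The fallback branch omits the timestamp compare because keys of different total length cannot have equal user-keys when the suffix width is fixed.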
Measurement
Composite (74-benchmark stable subset): -1.88% ns/op, median of 3 runs. Improvements concentrate in iterator/merge-heavy benchmarks (BenchmarkReadMerged, BenchmarkRead, BenchmarkReadAndBuild).
Test plan
- go test -short -race ./y/ ./table/ (all existing tests pass)
- go vet ./...

New tests in y/y_test.go

TestCompareKeys: 7 hand-picked cases, including:
- different-length user keys (a<ts> vs aa<ts>)
- same user-key / different timestamps (newer should sort higher)
- antisymmetry: CompareKeys(b, a) == -CompareKeys(a, b)

TestCompareKeysFuzz: 5000 randomized pairs cross-checked against a referenceCompareKeys that always splits user-key from ts. ~30% of cases force equal user-key lengths to exercise the fast path; ~20% force matching timestamps to exercise the tie-break.

🤖 Generated with Claude Code
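The randomized cross-check described in the test plan could look roughly like the sketch below. It is not the test from y/y_test.go: fastCompare, referenceCompare, randKey and crossCheck are hypothetical names, the three-letter alphabet and small timestamp range are choices made here to provoke collisions, and the suffix encoding is an assumption. Only the 30% and 20% forcing ratios and the 5000-pair count come from the description.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math/rand"
)

const tsLen = 8 // fixed timestamp suffix width

// fastCompare is a stand-in for the comparator under test.
func fastCompare(a, b []byte) int {
	if len(a) == len(b) {
		return bytes.Compare(a, b)
	}
	return bytes.Compare(a[:len(a)-tsLen], b[:len(b)-tsLen])
}

// referenceCompare always splits user-key from timestamp.
func referenceCompare(a, b []byte) int {
	if c := bytes.Compare(a[:len(a)-tsLen], b[:len(b)-tsLen]); c != 0 {
		return c
	}
	return bytes.Compare(a[len(a)-tsLen:], b[len(b)-tsLen:])
}

// randKey builds a random user-key of the given length from a tiny alphabet
// (so equal user-keys occur often) and appends an 8-byte suffix.
func randKey(r *rand.Rand, userKeyLen int, ts uint64) []byte {
	k := make([]byte, userKeyLen+tsLen)
	for i := 0; i < userKeyLen; i++ {
		k[i] = byte('a' + r.Intn(3))
	}
	binary.BigEndian.PutUint64(k[userKeyLen:], ts)
	return k
}

// crossCheck compares n random pairs and returns the number of disagreements,
// counting a violation of antisymmetry as a disagreement too.
func crossCheck(n int, seed int64) int {
	r := rand.New(rand.NewSource(seed))
	mismatches := 0
	for i := 0; i < n; i++ {
		lenA, lenB := r.Intn(4), r.Intn(4)
		if r.Float64() < 0.3 { // force equal user-key lengths (fast path)
			lenB = lenA
		}
		tsA, tsB := uint64(r.Intn(4)), uint64(r.Intn(4))
		if r.Float64() < 0.2 { // force matching timestamps (tie-break)
			tsB = tsA
		}
		a, b := randKey(r, lenA, tsA), randKey(r, lenB, tsB)
		want := referenceCompare(a, b)
		if fastCompare(a, b) != want || fastCompare(b, a) != -want {
			mismatches++
		}
	}
	return mismatches
}

func main() {
	fmt.Println("mismatches:", crossCheck(5000, 1))
}
```

A fixed seed keeps any failure reproducible; a real test would more likely report the offending pair through testing.T than count mismatches.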