feat: per-key cache miss tracking for arbitrary commands#374
feat: per-key cache miss tracking for arbitrary commands#374fcostaoliveira wants to merge 13 commits into
Conversation
Adds the codegen infrastructure that lets memtier_benchmark statically
resolve key positions and reply shapes for arbitrary Redis commands
(needed by the upcoming per-key miss tracking feature).
- scripts/fetch_commands_meta.sh: resolves a Redis ref (branch/tag/sha)
to a commit, downloads the tarball, extracts src/commands/*.json into
deps/commands_json/, and pins the sha in REDIS_TAG.txt.
- scripts/gen_command_meta.py: parses each JSON, emits a single
command_meta_data.h with static CommandSpec initializers (key_specs as
a tagged union over Index/Keyword x Range/Keynum, reply_shape inferred
from reply_schema with a curated override table for the well-known set).
- command_meta.{h,cpp}: typed surface (BeginSearch, FindKeys, KeySpec,
CommandSpec, ReplyShape) with a case-insensitive lookup() over the
static table. Subcommands are looked up via the canonical
"CONTAINER SUB" form (e.g. "XGROUP CREATE").
Make targets (maintainer-only; `make all` is unaffected):
make commands-meta # default REDIS_VERSION=8.8
make commands-meta REDIS_VERSION=8.6 # override branch/tag/sha
make commands-meta-fetch # JSONs only
make commands-meta-gen # regen .h from JSONs
make commands-meta-check # CI guard for stale .h
Vendored snapshot: redis/redis@8.8 (sha 369cbad8...), 425 commands;
distribution: 342 NotMissable, 37 SingleNullBulk, 9 ArrayPerElementNulls,
6 EmptyCollection, 3 IntegerMembership, 28 Unknown.
End-user builds do not require Python; AM_PATH_PYTHON resolves to ":" if
absent. The 425-file directory is built into a single static array at
codegen time, no runtime JSON parser is linked in.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the existing GET/MGET hit/miss accounting to any --command the
user supplies, with per-key bucket granularity. Driven by the static
command_meta registry (vendored Redis 8.8 commands.json) that resolves
key positions and reply shape at startup.
What gets tracked, by reply shape:
- SingleNullBulk (GET, HGET, ZSCORE, GETEX, GETDEL, LPOP, RPOP, SPOP,
RANDOMKEY, LINDEX): null reply == miss; one bucket.
- ArrayPerElementNulls (MGET, HMGET, ZMSCORE): walk the top-level
mbulk; per-position null mask is the per-key miss vector.
- EmptyCollection (SMEMBERS, LRANGE, HGETALL, HKEYS, HVALS, ZRANGE):
empty array heuristically counted as miss.
- IntegerMembership (EXISTS, SISMEMBER, HEXISTS): integer reply N is
the hit count; first N positions marked as hits.
CLI:
--command-miss-tracking={auto,off} default: auto
JSON output (mb.json) under "ALL STATS":
- Existing per-arbitrary-command entries gain real Hits/sec and
Misses/sec scalars (previously hardcoded to 0.0).
- New nested "Per-Key Misses" section with Total Hits / Total Misses
and key[N] Hits / key[N] Misses for each command that produced
per-position data. Section is omitted entirely when
--command-miss-tracking=off, keeping legacy consumers stable.
Plumbing:
- arbitrary_command resolves spec_key_positions via key_specs evaluation
(Index/Keyword x Range/Keynum, including variadic and EVAL-style
keynum). Mismatched __key__ counts emit a startup warning but don't
fail.
- arbitrary_request carries a back-pointer to the source command so the
reply handler can inspect shape without lookup.
- m_keep_value=true is forced on connections only when ArrayPerElement
Nulls or EmptyCollection commands are configured (SingleNullBulk and
IntegerMembership use the parser's existing scalar counters and pay
no allocation cost).
- Memcached / module / unknown commands resolve to spec=NULL and the
feature stays silently disabled.
Tests: tests/test_arbitrary_command_misses.py covers all four shapes,
the off switch, and a NotMissable command (SET) producing no Per-Key
entry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🛡️ Jit Security Scan Results✅ No security findings were detected in this PR
Security scan by Jit
|
…ader
Two CI failures from the initial PR run:
1. macOS Apple Clang rejected `enum class` case labels with
"integral constant expression must have integral or unscoped
enumeration type" because the compiler defaulted to C++03 mode.
We were relying on Linux GCC's modern default. Add `-std=c++11` to
CXXFLAGS in configure.ac so every build uses the language version
declared in AGENTS.md.
2. The `C++ Format Check` GitHub Action does
find . -name "*.cpp" -o -name "*.h" | grep -v deps/ | grep -v autom4te.cache/
without excluding command_meta_data.h. The local Makefile.am had
carried an exclusion to skip clang-format on the generated header,
which let it drift out of style. Drop the local exclusion (so local
and CI agree) and have `make commands-meta-gen` clang-format its own
output going forward, then re-format the existing checked-in header
to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor Bugbot caught a HIGH severity crash: the per-key miss handler
called el->as_bulk() unconditionally on every top-level reply element,
which assert(0)s when the element is an mbulk_size_el. This fires for
GEOPOS, COMMAND INFO, SORT_RO and any other command whose
reply_schema infers ArrayPerElementNulls (array items oneOf containing
null) but whose non-null items are nested arrays rather than bulks.
Add public is_bulk() / is_mbulk_size() type predicates on mbulk_element
so callers can avoid as_bulk()/as_mbulk_size() asserts when walking
heterogeneous reply arrays. The handler now distinguishes:
- bulk_el with value/value_len > 0 -> hit
- bulk_el with value==NULL/len==0 -> miss (null bulk, $-1)
- mbulk_size_el with non-empty children -> hit (nested array value)
- mbulk_size_el with empty children -> miss (null array, *-1, or
empty array, *0)
GEOPOS replies *1\r\n*-1\r\n on missing keys; the parser materializes
the inner *-1 as an empty mbulk_size_el (indistinguishable from *0),
so empty-array-in-position is the right miss signal there.
New regression test exercises GEOPOS against pre-populated geo keys
across a wider range to assert the walker counts both hits and misses
without crashing.
Reported-by: cursor[bot] (Bugbot review on PR #374)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two issues caught by Cursor Bugbot on commit af91a0f: 1. (Medium) `client.cpp`: ArrayPerElementNulls capped to one element for HMGET. The walk used `request->m_keys` (== spec key count) as the bucket cap, but HMGET / ZMSCORE / HGETDEL / HGETEX have 1 Redis key and N reply elements (one per field/member). Capping to 1 silently discarded all but the first position's hit/miss attribution and broke the aggregate. Fix: size per_key_hit to the reply length directly and update num_keys to match, so bucket count tracks the semantically interesting axis (the field/member positions) rather than the spec key count. MGET continues to work 1:1 because there the spec key count and reply length already match. 2. (Medium) `run_stats.cpp`: lazy resize via `assign()` zeroed accumulated counters whenever the per-key vector size differed from the previous call. Combined with #1's variable-size replies (and future merges across threads with different observed sizes), this would silently discard stats. Fix: switch to `resize()` with grow-only semantics, mirroring the merge() path that already uses resize for the same reason. New regression test exercises HMGET with three fields, only two of which exist on populated hashes; asserts exactly three buckets and that the never-present field reports 0 hits / all misses. Reported-by: cursor[bot] (Bugbot review on PR #374, commit af91a0f) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The redis-py 4.x signature changed `geoadd(name, *args)` -> `geoadd(name, values, nx=False, xx=False, ch=False)`, which made `conn.geoadd(key, lon, lat, member)` raise `DataError: GEOADD allows either 'nx' or 'xx', not both` because the "member" string is interpreted as a flag. CI runs use redis-py 4.x and this is what surfaced as "Exception raised during test execution" on PR #374's UBSan / ASAN / CI test workflows. Use `conn.execute_command("GEOADD", ...)` to stay version-agnostic.
…_key_spec Two more Cursor Bugbot findings on commit 46b9b2d, both legitimate: 1. (Medium) `config_types.cpp`: off-by-one in evaluate_key_spec. Redis commands.json positions are 1-based with position 0 == command name; in our 0-based command_args vector that maps directly to args[N], no -1 offset. The previous `args[idx - 1]` accesses read the wrong slot in the Keyword path (always read the command name on the first iteration) and in the Keynum path (read EVAL's "script" instead of "numkeys", failing strtol and leaving spec_key_positions empty). Range never accessed args[] so the bug was latent. Also fix the Range / Keynum bounds: `last > argc` -> `last >= argc` (clamp to argc-1, the last valid index), `numidx > argc` -> `numidx >= argc`, and `p > argc` -> `p >= argc` for the same reason. Caught by Bugbot; not yet exposed by tests (no test exercises Keyword/Keynum specs). 2. (Low) `client.cpp`: empty-string bulk could be classified as miss. Use `value != NULL` (not `value_len > 0`) so that any non-null bulk slot counts as a hit, including $0 empty strings. Today the redis_protocol parser collapses $0 and $-1 into the same in-memory shape (value=NULL, value_len=0), so the change is a no-op until the parser preserves the distinction; documented inline. Forward- compatible with a future parser fix. Verified locally: EVAL (Keynum path) parses numkeys correctly and runs without crashing. All 8 tests in test_arbitrary_command_misses.py pass. Reported-by: cursor[bot] (Bugbot review on PR #374, commit 46b9b2d) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ot get_hits() Cursor Bugbot finding on commit 2930249 (Medium): response->get_hits() counts every non-null bulk the parser walks. For SingleNullBulk commands that return arrays on success — BLPOP/BRPOP/ BZPOPMAX/BZPOPMIN return [key, value] (get_hits() == 2), and LPOP key COUNT N returns an N-element array (get_hits() == N) — that overcounts hits and underflows misses. Switch SingleNullBulk to look at the top-level RESP type header (carried in m_status) directly: $-1 / *-1 == miss, anything else == hit. Always emit a single bucket because variadic-key blocking commands (BLPOP) put the winning key inside the reply array and we don't currently parse it, so per-key attribution isn't meaningful beyond aggregate hit/miss. Verified locally: GET (1-key, established case) still reports correct hit rate; LPOP key COUNT N (the array-reply scenario) now records one hit per successful pop instead of N. All 8 tests in test_arbitrary_command_misses.py pass. Reported-by: cursor[bot] (Bugbot review on PR #374, commit 2930249) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…size Cursor Bugbot findings on commit 8d40cfc: 1. (Medium) SingleNullBulk null sentinels were $-1 and *-1, but Redis replies *0 (empty array) on missing key for SPOP/SRANDMEMBER when used with the count argument (e.g. `SPOP key 1`). My check misclassified the empty array as a hit. Add *0 to the null-sentinel set. Verified live: SPOP __key__ 1 with 3/10 keys populated and destructive depletion now reports 15 hits / 385 misses (3 keys * 5 items per set = 15 successful pops, rest empty/missing). 2. (Low) per_key_hit was unconditionally pre-resized to num_keys at the top of the if-block, then re-assigned in 3 of the 4 cases — a wasted allocation on every response. Drop the pre-resize and let each case assign() to its actual bucket count once. EmptyCollection and IntegerMembership now do their own assign(num_keys, ...) explicitly (they previously relied on the pre-resize for indexing). Reported-by: cursor[bot] (Bugbot review on PR #374, commit 8d40cfc) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ply isn't an array Cursor Bugbot finding on commit 16f5054 (Medium): In the ArrayPerElementNulls path, when response->get_mbulk_value() returns NULL (top-level reply was +OK / -ERR / a single bulk / or any non-mbulk shape — possible if a misshape inferred ArrayPerElementNulls or for some module commands), the else branch incremented misses by num_keys but left per_key_hit empty. update_arbitrary_op_misses then recorded the aggregate but skipped the per-key vector update, drifting the two counters out of sync. Account uniformly: collapse to a single miss bucket so aggregate and per-key stay consistent regardless of reply shape. Reported-by: cursor[bot] (Bugbot review on PR #374, commit 16f5054) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ault branch Cursor Bugbot finding on commit e0b7272 (Low): The default branch in the ReplyShape switch is unreachable today — miss_tracking_enabled is only set for the four handled miss-bearing shapes — but if a future shape is added without a case, the call to update_arbitrary_op_misses below would silently record zero-valued hits/misses with an empty per_key_hit, drifting the stats without any signal. Guard the call on a non-empty per_key_hit so the intent is explicit and a future unhandled shape becomes a clear no-op rather than a silent "all zero" entry. Reported-by: cursor[bot] (Bugbot review on PR #374, commit e0b7272) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor Bugbot finding on commit ca8a20c (Low): When find_keys.range.limit > 0, the previous formula last = start_pos + total - 1 treated `total` as a contiguous count, which is wrong once step > 1 (positions are strided: start_pos, +step, +2*step, ...). The corrected formula is last = start_pos + (total - 1) * step Also tighten the initial total calculation to be stride-aware: total = (last - start_pos) / step + 1 instead of `last - start_pos + 1`, which previously over-counted slots between the strided positions. Latent today — Bugbot itself notes that no registered Redis command combines limit > 0 with step > 1 (XREAD/XREADGROUP, the only commands using limit, both have step=1) — so this is a defensive correctness fix that future-proofs the evaluator. Reported-by: cursor[bot] (Bugbot review on PR #374, commit ca8a20c) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit dbff900. Configure here.
| num_keys = 1; | ||
| hits = is_null ? 0 : 1; | ||
| misses = is_null ? 1 : 0; | ||
| if (!is_null) per_key_hit[0] = true; |
There was a problem hiding this comment.
Inconsistent NULL status handling between reply shape cases
Low Severity
The SingleNullBulk case treats a NULL status pointer (from response->get_status()) as a cache HIT because the is_null condition short-circuits to false. In contrast, the IntegerMembership case treats NULL status as all MISSES (since n stays 0). This asymmetry means that if the parser ever fails to populate the status for a non-error response, SingleNullBulk silently records phantom hits while IntegerMembership records phantom misses.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit dbff900. Configure here.
Adds four RLTest cases that observably exercise the Bugbot-found bugs that previously had no automated coverage (verified only via local smoke tests): - test_arbitrary_spop_with_count_empty_array_is_miss (#7) Pre-loads 3 sets * 5 members, runs SPOP key COUNT 1; without the *0 null-sentinel fix, every call would record a hit. With the fix only ~15 destructive pops succeed and the rest report misses. - test_arbitrary_blpop_variadic_keys_single_bucket (#6) Pre-loads memtier-1 with enough items for every BLPOP call to return immediately, and verifies "Per-Key Misses" carries exactly key[0] (not key[0..2]). Without the fix, response->get_hits()=2 from the [key, value] reply would mark two of three buckets as hit and one as miss per call. - test_arbitrary_xread_keyword_spec_evaluates_correctly (#4 Keyword) Pre-loads 3 streams; XREAD COUNT 1 STREAMS __key__ 0 across a 10-key range must produce both hits and misses, proving the Keyword begin_search resolved correctly. Without the args[idx] fix, the STREAMS keyword search would land on the wrong slot. - test_arbitrary_eval_keynum_spec_runs_cleanly (#4 Keynum) EVAL return 1 1 __key__ exercises the Keynum begin_search path. EVAL is NotMissable, so we assert no Per-Key entry but Count > 0, proving resolve_command_meta read numkeys from the right slot. Findings #2 (lazy resize), #5 (empty bulk), #8 (alloc hygiene), #9 (mbulk NULL), #10 (default switch) and #11 (Range step>1) remain without dedicated tests because they're either parser-collapsed (#5), defensive/unreachable (#9, #10, #11), or non-functional (#8); documented in PR comments. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Summary
Extends miss/hit accounting (today only available for GET/MGET) to any Redis command supplied via
--command, with per-key bucket granularity. Driven by a vendored snapshot of Redis'src/commands/*.json(8.8) processed at build time into a static C++ registry — no JSON parser linked into the binary, no runtime probe required for the common case.Two commits for review:
build: vendor Redis 8.8 commands.json + command_meta registry— pure infrastructure; introducesmake commands-metatargets and the codegen pipeline. No behavior change.feat: per-key cache miss tracking for arbitrary commands— the actual feature; ~700 lines of C++.Reply-shape coverage
SingleNullBulkArrayPerElementNullsEmptyCollectionIntegerMembershipNotMissable/UnknownCLI
autoenables tracking for any--commandwhose first token (orCONTAINER SUBfor subcommands) resolves to a miss-bearing reply shape in the registry.offkeeps the old JSON shape (no Hits/Misses fields, no Per-Key section) for back-compat with downstream tooling.JSON output (mb.json, under
ALL STATS)Hits/sec/Misses/sec(previously hardcoded to0.0).Per-Key Missessection:Make targets (maintainer-only)
End-user builds do not require Python;
AM_PATH_PYTHONresolves to:if absent.Smoke results
3 keys preloaded (
memtier-1..3), 10-key uniform-random range, 200 reqs × 2 clients (~30% expected hit rate):GET __key__MGET __key__ __key__ __key__EXISTS __key__SMEMBERS __key__GET __key__w/--command-miss-tracking=offImplementation notes
m_keep_value=trueis forced only on connections that need reply-array inspection (ArrayPerElementNulls/EmptyCollection).SingleNullBulkandIntegerMembershipreuse the parser's existing scalarm_hits/m_statusand pay no allocation cost.spec=NULLand the feature is silently disabled — no changes to the memcached path.run_statscarry per-command miss totals and per-key vectors; merged into the printer'srun_statsafter threads join.__key__placeholder count and spec-resolved key count emit a startup warning but do not fail (user might be testing custom modules).Test plan
make format,make format-checkmake commands-meta-check(codegen idempotent)make -j$(nproc)clean./memtier_benchmark --helpshows the new flagtests/test_arbitrary_command_misses.py)🤖 Generated with Claude Code
Note
Medium Risk
Adds non-trivial response-shape parsing and new stats aggregation on the hot arbitrary-command reply path, which could affect correctness/performance for some commands and reply edge cases. Build now vendors a large Redis command metadata snapshot and introduces codegen targets, increasing maintenance surface area.
Overview
Adds per-key hit/miss accounting for
--commandarbitrary Redis commands, expanding beyond the existing GET/MGET-only tracking. At startup, commands are resolved against a new staticcommand_metaregistry (generated from a vendored Rediscommands.jsonsnapshot) to determine key positions and reply “shapes”, and at runtimeclient.cppinspects responses to compute per-command and per-bucket hits/misses (with optional value materialization only when array-walking is needed).Introduces the
command_metasubsystem (command_meta.{h,cpp}+ generatedcommand_meta_data.h) and a maintainer workflow (make commands-meta*, optional Python) to fetch/regen the metadata; the Redis JSON snapshot is committed underdeps/commands_jsonand un-ignored in.gitignore. Stats plumbing is extended viarun_stats::update_arbitrary_op_misses()to accumulate totals and per-key vectors, and shell completion is updated to include the new--command-miss-trackingoption.Reviewed by Cursor Bugbot for commit f4716fd. Bugbot is set up for automated code reviews on this repo. Configure here.