fix: Specia Redis directives can cause pika service to crash#3100
Conversation
|
""" WalkthroughThe updates introduce enhanced range handling and validation for cache and key-value operations. The Changes
Suggested reviewers
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/pika_cache.cc (2)
367-371: Consider removing commented code or documenting the reason for keeping it.The commented-out code should either be removed to avoid confusion or include a comment explaining why it's preserved for reference.
-// Status PikaCache::GetRange(std::string& key, int64_t start, int64_t end, std::string *value) { -// int cache_index = CacheIndex(key); -// std::lock_guard lm(*cache_mutexs_[cache_index]); -// return caches_[cache_index]->GetRange(key, start, end, value); -// } +// Previous implementation delegated directly to cache GetRange - removed due to crash issues
374-406: Manual range implementation looks correct but has performance implications.The new implementation properly handles negative indices and bounds checking, which should prevent crashes. However, there are some considerations:
- Performance impact: Retrieving the full string for every range operation may be inefficient for large strings with small ranges.
- Edge case in line 383: Using
full_value.size()directly asint64_tis safe but consider explicit casting for clarity.Consider adding a size threshold to decide between full retrieval vs. direct range access:
Status PikaCache::GetRange(std::string& key, int64_t start, int64_t end, std::string *value) { int cache_index = CacheIndex(key); std::lock_guard lm(*cache_mutexs_[cache_index]); + // For small ranges on potentially large strings, consider optimized path + // Current implementation prioritizes safety over performance std::string full_value; auto s = caches_[cache_index]->Get(key, &full_value); if (!s.ok()) { return s; } - int64_t strlen = full_value.size(); + int64_t strlen = static_cast<int64_t>(full_value.size());src/pika_kv.cc (1)
1219-1230: Offset validation logic is sound but magic number needs documentation.The validation correctly prevents negative offsets and potential integer overflow, which should help prevent crashes. However, the magic number
9223372036854775757should be explained or defined as a constant.Consider defining the magic number as a named constant:
+ // Maximum string size limit (near INT64_MAX with safety margin) + const int64_t MAX_STRING_SIZE = 9223372036854775757; + //6.9号新增 if (offset_ < 0) { res_.SetRes(CmdRes::kInvalidInt, "offset is out of range"); return; } - const int64_t MAX_STRING_SIZE = 9223372036854775757; value_ = argv_[3]; if (offset_ > MAX_STRING_SIZE - static_cast<int64_t>(value_.size())) { res_.SetRes(CmdRes::kInvalidInt, "string exceeds maximum allowed size"); return; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/pika_cache.cc(1 hunks)src/pika_kv.cc(2 hunks)
🔇 Additional comments (1)
src/pika_kv.cc (1)
1143-1151: Clarify why range size validation is commented out.The commented-out range size validation (512MB limit) should either be removed or documented with the reason for disabling it. If the validation was causing legitimate use cases to fail, consider a higher limit or different approach.
- //6.6号新增 - // 添加最大值限制 - // const int64_t MAX_RANGE_SIZE = 512 * 1024 * 1024; // 512MB - // if (start_ < -MAX_RANGE_SIZE || start_ > MAX_RANGE_SIZE || - // end_ < -MAX_RANGE_SIZE || end_ > MAX_RANGE_SIZE) { - // res_.SetRes(CmdRes::kInvalidInt, "Range index out of bounds"); - // return; - // } + // Range size validation removed - consider if input validation is needed elsewhere
| //6.9号新增的 | ||
| const size_t MAX_RESPONSE_SIZE = 100 * 1024 * 1024; //100MB | ||
| if (substr.size() > MAX_RESPONSE_SIZE) { | ||
| substr.resize(MAX_RESPONSE_SIZE); | ||
| LOG(WARNING) << "GETRANGE result truncated from " << substr.size() | ||
| << " to " << MAX_RESPONSE_SIZE << " bytes"; | ||
| } |
There was a problem hiding this comment.
Response size truncation logic has a bug.
There's a logical error in the truncation warning log - it logs the wrong size value. The substr.size() after truncation will always be MAX_RESPONSE_SIZE, not the original size.
if (s_.ok() || s_.IsNotFound()) {
//6.9号新增的
const size_t MAX_RESPONSE_SIZE = 100 * 1024 * 1024; //100MB
+ size_t original_size = substr.size();
if (substr.size() > MAX_RESPONSE_SIZE) {
substr.resize(MAX_RESPONSE_SIZE);
- LOG(WARNING) << "GETRANGE result truncated from " << substr.size()
+ LOG(WARNING) << "GETRANGE result truncated from " << original_size
<< " to " << MAX_RESPONSE_SIZE << " bytes";
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| //6.9号新增的 | |
| const size_t MAX_RESPONSE_SIZE = 100 * 1024 * 1024; //100MB | |
| if (substr.size() > MAX_RESPONSE_SIZE) { | |
| substr.resize(MAX_RESPONSE_SIZE); | |
| LOG(WARNING) << "GETRANGE result truncated from " << substr.size() | |
| << " to " << MAX_RESPONSE_SIZE << " bytes"; | |
| } | |
| if (s_.ok() || s_.IsNotFound()) { | |
| //6.9号新增的 | |
| const size_t MAX_RESPONSE_SIZE = 100 * 1024 * 1024; //100MB | |
| size_t original_size = substr.size(); | |
| if (substr.size() > MAX_RESPONSE_SIZE) { | |
| substr.resize(MAX_RESPONSE_SIZE); | |
| LOG(WARNING) << "GETRANGE result truncated from " << original_size | |
| << " to " << MAX_RESPONSE_SIZE << " bytes"; | |
| } | |
| // …rest of the code… | |
| } |
🤖 Prompt for AI Agents
In src/pika_kv.cc around lines 1160 to 1166, the warning log after truncating
substr incorrectly logs substr.size() which is the size after truncation, not
the original size. To fix this, store the original size of substr before
resizing, then use that original size in the log message to accurately report
the truncation from the original size to MAX_RESPONSE_SIZE.
| return Status::NotFound("key not exist"); | ||
| } | ||
|
|
||
| // Status PikaCache::GetRange(std::string& key, int64_t start, int64_t end, std::string *value) { |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/integration/string_test.go (3)
304-310: Add the usual error-check before asserting the valueThe surrounding tests always assert that
.Err()isnilbefore checking.Val().
Doing the same here avoids silently passing when the command actually failed.- getRange := client.GetRange(ctx, "key1", 1, 4294967296) - Expect(getRange.Val()).To(Equal("bc")) + getRange := client.GetRange(ctx, "key1", 1, 4294967296) + Expect(getRange.Err()).NotTo(HaveOccurred()) + Expect(getRange.Val()).To(Equal("bc"))
311-317: Tighten the negative-test assertionRight now we only assert some error happened.
Matching the error string (orredis.Err*type) makes the test more robust and guards against unrelated failures.- setRange := client.SetRange(ctx, "key1", 9223372036854775757, "value2") - Expect(setRange.Err()).To(HaveOccurred()) + setRange := client.SetRange(ctx, "key1", 9223372036854775757, "value2") + Expect(setRange.Err()).To(MatchError(ContainSubstring("ERR")))
816-833: Shrink the TTL & replace hardSleepwithEventuallyto speed-up the suiteA fixed 11-second
Sleepslows the integration run noticeably.
Using a shorter TTL (e.g.2s) together with Gomega’sEventuallykeeps the intent while cutting wall-clock time by >80 %.- err := client.SetEx(ctx, "x", "y", 10*time.Second).Err() + err := client.SetEx(ctx, "x", "y", 2*time.Second).Err() @@ - time.Sleep(11 * time.Second) - //sleep 10 second x still exists + Eventually(func() error { + return client.Get(ctx, "x").Err() + }, "3s", "100ms").Should(Equal(redis.Nil)) + + // ensure compaction is harmless in both Pika and vanilla Redis + _ = client.Do(ctx, "compact").Err()This change keeps behaviour identical (key expires and is purged) while keeping the full suite much faster and less brittle.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/pika_cache.cc(1 hunks)src/pika_kv.cc(2 hunks)tests/integration/string_test.go(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/pika_cache.cc
- src/pika_kv.cc
| } | ||
|
|
||
| if (start < 0) start = 0; | ||
| if (end < 0) { |
| } | ||
| if (end >= strlen) end = strlen - 1; | ||
|
|
||
| if (start > end) { |
There was a problem hiding this comment.
这里少了一层判断start > end || strlen == 0
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
include/pika_conf.h (1)
170-173: Make the getterconstfor consistencyAlmost every getter could technically be
const, but addingconsthere at least keeps the new API tidy and signals immutability:- int64_t proto_max_bulk_len() { + int64_t proto_max_bulk_len() const { std::shared_lock l(rwlock_); return proto_max_bulk_len_; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
conf/pika.conf(1 hunks)include/pika_conf.h(2 hunks)src/pika_kv.cc(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- conf/pika.conf
🚧 Files skipped from review as they are similar to previous changes (1)
- src/pika_kv.cc
| int64_t proto_max_bulk_len() { | ||
| std::shared_lock l(rwlock_); | ||
| return proto_max_bulk_len_; | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
proto_max_bulk_len_ defaults to 0 and has no setter – this will break SETRANGE/GETRANGE limits
SetrangeCmd::DoInitial() now checks
offset + value_len > conf->proto_max_bulk_len().
With the member default-initialised to 0 and no setter ever exposed, any non-zero request will trigger the “value is too long” error unless the constructor / Load() explicitly overwrites the field — easy to miss and hard to change at runtime.
Action items
- int64_t proto_max_bulk_len_ = 0;
+ // Default 512 MiB – consistent with Redis `proto-max-bulk-len`
+ int64_t proto_max_bulk_len_ = 512 * 1024 * 1024;and expose a corresponding setter / config-rewrite path:
+ void SetProtoMaxBulkLen(const int64_t value) {
+ std::lock_guard l(rwlock_);
+ TryPushDiffCommands("proto-max-bulk-len", std::to_string(value));
+ proto_max_bulk_len_ = value;
+ }Without this, large-range operations will incorrectly fail in fresh deployments and the limit cannot be tuned dynamically.
Also applies to: 1042-1043
🤖 Prompt for AI Agents
In include/pika_conf.h around lines 170 to 173 and also lines 1042 to 1043, the
proto_max_bulk_len_ member defaults to 0 and lacks a setter, causing all
non-zero length SETRANGE/GETRANGE requests to fail. To fix this, add a setter
method for proto_max_bulk_len_ and ensure it can be updated dynamically through
configuration reload or a dedicated config rewrite path, so the limit can be
properly initialized and adjusted at runtime.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
include/pika_conf.h(3 hunks)src/pika_conf.cc(1 hunks)tests/integration/string_test.go(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- include/pika_conf.h
- tests/integration/string_test.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/pika_conf.cc (1)
src/pstd/src/base_conf.cc (2)
GetConfInt64Human(133-153)GetConfInt64Human(133-133)
| GetConfInt64Human("proto-max-bulk-len", &proto_max_bulk_len_); | ||
| if (proto_max_bulk_len_ <= 0) { | ||
| proto_max_bulk_len_ = 512 * 1024 * 1024; // 512MB | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Add an upper-bound and safeguard against huge user-supplied values
Good to see proto-max-bulk-len being parsed with GetConfInt64Human and defaulting to Redis’ 512 MB.
However, if an operator mis-configures the value (e.g. 10G) nothing prevents the server from accepting it and later trying to allocate >10 GB for a single bulk string, defeating the DoS protection this PR intends to add.
Consider clamping the value to a reasonable hard ceiling (e.g. PIKA_MAX_PROTO_BULK_LEN, 1 GB?) and logging a warning when the user-provided value exceeds that ceiling:
+static const int64_t kMaxProtoBulkLen = 1LL << 30; // 1 GB
GetConfInt64Human("proto-max-bulk-len", &proto_max_bulk_len_);
if (proto_max_bulk_len_ <= 0) {
proto_max_bulk_len_ = 512 * 1024 * 1024; // 512MB
+} else if (proto_max_bulk_len_ > kMaxProtoBulkLen) {
+ LOG(WARNING) << "proto-max-bulk-len capped from " << proto_max_bulk_len_
+ << " to " << kMaxProtoBulkLen;
+ proto_max_bulk_len_ = kMaxProtoBulkLen;
}This keeps memory usage predictable and aligns with the protective spirit of the patch.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| GetConfInt64Human("proto-max-bulk-len", &proto_max_bulk_len_); | |
| if (proto_max_bulk_len_ <= 0) { | |
| proto_max_bulk_len_ = 512 * 1024 * 1024; // 512MB | |
| } | |
| // Hard cap to prevent unbounded bulk allocations (1 GB) | |
| static const int64_t kMaxProtoBulkLen = 1LL << 30; | |
| GetConfInt64Human("proto-max-bulk-len", &proto_max_bulk_len_); | |
| if (proto_max_bulk_len_ <= 0) { | |
| proto_max_bulk_len_ = 512 * 1024 * 1024; // 512MB | |
| } else if (proto_max_bulk_len_ > kMaxProtoBulkLen) { | |
| LOG(WARNING) << "proto-max-bulk-len capped from " << proto_max_bulk_len_ | |
| << " to " << kMaxProtoBulkLen; | |
| proto_max_bulk_len_ = kMaxProtoBulkLen; | |
| } |
🤖 Prompt for AI Agents
In src/pika_conf.cc around lines 369 to 372, the code sets proto_max_bulk_len_
from user input without an upper limit, risking excessive memory allocation. Add
a constant defining a maximum allowed value (e.g., PIKA_MAX_PROTO_BULK_LEN as 1
GB), then clamp proto_max_bulk_len_ to this maximum if the user-supplied value
exceeds it. Also, log a warning message when clamping occurs to inform the
operator about the adjustment.
…mFoundation#3100) * Specia Redis directives can cause pika service to crash * Special Redis directives can cause pika service to crash * Special Redis directives can cause pika service to crash * Add support for reading proto-max-bulk-len from configuration file * optimize boundary checks and empty string handling in GetRange function of pika_cache * Complete the required items for pika_conf.cc and pika_conf.h files --------- Co-authored-by: caiyu <15260903118@163.com>
修复bug #3092:解决导致pika服务崩溃的错误漏洞
执行GETRANGE key1 1 4294967296两次后服务崩溃的原因
执行SETRANGE key1 9223372036854775757 value2后服务崩溃的原因
修改文件部分
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Bug Fixes
Tests