Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
81 commits
Select commit Hold shift + click to select a range
e0d00b6
Bumped zlib version
hammerface Oct 31, 2025
62f71af
Added Raft files, currently copies of PoE files per Dakai's guide.
hammerface Nov 4, 2025
9aed7e7
set PoE script to use pbft's kvtools
hammerface Nov 4, 2025
ba5de1c
Replace view change flag with leader election flag
Nachiket-ML Nov 12, 2025
b8f6799
WIP add initial happy path raft implementation
JoshHuttonCode Nov 12, 2025
6958cdc
WIP some bug fixes
JoshHuttonCode Nov 13, 2025
2940bcf
WIP log statements to try and figure things out
JoshHuttonCode Nov 13, 2025
f8453f3
WIP fix using wrong value for leader id
JoshHuttonCode Nov 13, 2025
38a3e57
WIP more print statements
JoshHuttonCode Nov 13, 2025
03e4304
Fixed errors in build files to support Intellisense
hammerface Nov 13, 2025
cf06093
WIP temp change to broadcast
JoshHuttonCode Nov 13, 2025
71829da
WIP add more prints, leader commits and executes
JoshHuttonCode Nov 14, 2025
0b7ab59
WIP switch back to SendMessage
JoshHuttonCode Nov 14, 2025
258c7db
WIP fixed committing condition
JoshHuttonCode Nov 14, 2025
7d3c979
Added Heartbeat and Elections triggered by timeout
hammerface Nov 20, 2025
d6b9c71
Merge branch 'jim-raft' of github.com:hammerface/incubator-resilientd…
hammerface Nov 20, 2025
a00365e
Merge branch 'jim-raft' into happy_path_raft
hammerface Nov 20, 2025
aae5bbf
Merge branch 'happy_path_raft' into raft
hammerface Nov 20, 2025
df51c82
Update variable names to match the Raft paper
JoshHuttonCode Nov 21, 2025
3f15848
in progress requestVote and requestVoteResponse handlers
hammerface Nov 21, 2025
9754839
Merge branch 'raft' into jim-raft
hammerface Nov 21, 2025
8ae8d03
finished adding leader election logic, it seems buggy though.
hammerface Nov 21, 2025
707f5ea
blocked input from client, set appendEntries handler to always be happy
hammerface Nov 21, 2025
81c9106
testing
hammerface Nov 21, 2025
880287a
testing leader election
hammerface Nov 24, 2025
2108064
staging for merge into raft branch
hammerface Dec 1, 2025
331f3a3
Added logic to redirect client proxy to current leader
hammerface Dec 2, 2025
2a808dc
Fix AppendEntries failure path to back off nextIndex and retry
yhuan331 Dec 2, 2025
6f626ed
redirect to leader working even on later terms
hammerface Dec 3, 2025
810ca7b
Merge branch 'eva' into election-test
hammerface Dec 4, 2025
d8bb72d
seems to kind of work. worried about duplicate entries?
hammerface Dec 4, 2025
cbed212
modified some logging, changed config
hammerface Dec 4, 2025
eb9692f
moved some variables around in header, changed receivePropose to Rece…
hammerface Dec 5, 2025
aa03e2a
Swapped to requests instead of AppendEntries outside RPC logic
hammerface Dec 5, 2025
ddd5e97
implemented LogEntry struct instead of using AppendEntries for everyt…
hammerface Dec 5, 2025
5278d19
Create and send individual AppendEntries to each follower from Receiv…
Nachiket-ML Dec 5, 2025
0f9843d
Setting nextIndex to lastlogindex() - 1 in ReceiveAEResponse fail case
Nachiket-ML Dec 7, 2025
c5c0ac6
Modified CreateAndSendAppendEntries to CreateAndSendAppendEntryMsg an…
Nachiket-ML Dec 7, 2025
1e9e1f2
Fix issues from the previous commit
JoshHuttonCode Dec 7, 2025
1a16d10
WIP catch up transactions to leader's most current log entry
JoshHuttonCode Dec 8, 2025
5aca792
staging for benchmarks
hammerface Dec 8, 2025
1a2bbc1
Added support for multiple entries in AppendEntries RPC
hammerface Dec 15, 2025
cbff3c7
added followerId field to AeFields struct to simplify code
hammerface Dec 15, 2025
a48f29a
revamped follower truncate/append logic
hammerface Dec 16, 2025
b2a7a59
added some comments to Raft::ReceiveAppendEntries method
hammerface Dec 16, 2025
2e75942
Merge remote-tracking branch 'origin/master' into raft
JoshHuttonCode Jan 6, 2026
d706396
update inflight limit code
hammerface Jan 26, 2026
3815da0
Merge branch 'raft' of github.com:hammerface/incubator-resilientdb in…
hammerface Jan 26, 2026
5ee998c
Merge branch 'raft' into raft-catchup
hammerface Jan 26, 2026
da77534
modified config generation script to accomodate changes to config rea…
hammerface Jan 26, 2026
efe3a50
Merge pull request #1 from hammerface/raft-catchup
hammerface Jan 27, 2026
e271230
Merge branch 'master' into raft
JoshHuttonCode Jan 29, 2026
6555213
Add a function for use in later commit
JoshHuttonCode Feb 5, 2026
9e7cfc5
Add tests related to timeouts and heartbeats for Raft
JoshHuttonCode Feb 5, 2026
b67e584
Add initial Raft tests for sending and receiving AppendEntries
JoshHuttonCode Feb 16, 2026
26db304
Update tests, restructure raft tests to support multiple raft test files
JoshHuttonCode Feb 16, 2026
15c50c8
Add tests for ReceiveAppendEntriesResponse()
JoshHuttonCode Feb 16, 2026
7d547c4
Add tests related to voting
JoshHuttonCode Feb 17, 2026
a0c3514
Run clang format and other style changes
JoshHuttonCode Feb 17, 2026
39c9b44
Merge branch 'master' into raft
JoshHuttonCode Feb 17, 2026
ba33b74
Undo some changes unrelated to Raft
JoshHuttonCode Feb 17, 2026
9532864
Merge pull request #2 from JoshHuttonCode/raft
JoshHuttonCode Feb 18, 2026
f863e12
Change log_ to vector<LogEntry>. Add helper functions for future use
JoshHuttonCode Mar 3, 2026
830fe9b
Change LogEntry to store Entry directly
JoshHuttonCode Mar 4, 2026
723ef64
WIP Add RaftRecovery class
JoshHuttonCode Mar 5, 2026
3f5711f
WIP Extract common Recovery code to BaseRecovery
JoshHuttonCode Mar 26, 2026
89ed7fd
WIP Update RaftRecovery and tests, tentatively working
JoshHuttonCode Apr 1, 2026
db4689b
Run clang format
JoshHuttonCode Apr 1, 2026
aa5e923
Change the templating of Recovery
JoshHuttonCode Apr 1, 2026
e53e41d
Reorganize file structure of recovery files
JoshHuttonCode Apr 2, 2026
9bc819c
Update things missed in previous commit
JoshHuttonCode Apr 2, 2026
341ea18
Fix RaftRecovery issue, add test for it, adjust dependencies
JoshHuttonCode Apr 2, 2026
c14204c
Remove access modifier change for test compilation, change whitespace
JoshHuttonCode Apr 2, 2026
7745b38
Address some PR comments, fix broken tests
JoshHuttonCode Apr 8, 2026
4ebb0ef
Improve reliability of RaftRecovery
JoshHuttonCode Apr 20, 2026
3cfafbe
Update changes for tests, change how log is accessed
JoshHuttonCode Apr 20, 2026
a917a8b
Add RaftRecovery to store and recover persistent state
JoshHuttonCode Apr 21, 2026
43660a9
Add Checkpointing, new tests, miscellaneous changes
JoshHuttonCode May 1, 2026
6eabf0e
Merge branch 'master' into raft
JoshHuttonCode May 1, 2026
018c063
Add Snapshotting to Raft
JoshHuttonCode May 1, 2026
db35d9e
Undo change to recovery_ckpt_time_s_
JoshHuttonCode May 1, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 10 additions & 6 deletions WORKSPACE
Original file line number Diff line number Diff line change
Expand Up @@ -174,8 +174,8 @@ boost_deps()
http_archive(
name = "net_zlib_zlib",
build_file = "@com_resdb_nexres//third_party:z.BUILD",
sha256 = "91844808532e5ce316b3c010929493c0244f3d37593afd6de04f71821d5136d9",
strip_prefix = "zlib-1.2.12",
sha256 = "9a93b2b7dfdac77ceba5a558a580e74667dd6fede4585b91eefb60f03b72df23",
strip_prefix = "zlib-1.3.1",
urls = [
"https://zlib.net/fossils/zlib-1.2.12.tar.gz",
"https://downloads.sourceforge.net/project/libpng/zlib/1.2.12/zlib-1.2.12.tar.gz",
Expand Down Expand Up @@ -225,10 +225,14 @@ bind(

http_archive(
name = "com_zlib",
build_file = "@com_resdb_nexres//third_party:zlib.BUILD",
sha256 = "629380c90a77b964d896ed37163f5c3a34f6e6d897311f1df2a7016355c45eff",
strip_prefix = "zlib-1.2.11",
url = "https://github.com/madler/zlib/archive/v1.2.11.tar.gz",
build_file = "@com_resdb_nexres//third_party:z.BUILD",
sha256 = "9a93b2b7dfdac77ceba5a558a580e74667dd6fede4585b91eefb60f03b72df23",
strip_prefix = "zlib-1.3.1",
urls = [
"https://zlib.net/zlib-1.3.1.tar.gz",
"https://zlib.net/fossils/zlib-1.3.1.tar.gz",
"https://github.com/madler/zlib/releases/download/v1.3.1/zlib-1.3.1.tar.gz",
],
)

http_archive(
Expand Down
43 changes: 43 additions & 0 deletions benchmark/protocols/raft/BUILD
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

package(default_visibility = ["//visibility:private"])

load("@bazel_skylib//rules:common_settings.bzl", "bool_flag")

cc_binary(
name = "kv_server_performance",
srcs = ["kv_server_performance.cpp"],
deps = [
"//chain/storage:memory_db",
"//executor/kv:kv_executor",
"//platform/config:resdb_config_utils",
"//platform/consensus/ordering/raft/framework:consensus",
"//service/utils:server_factory",
],
)

cc_binary(
name = "kv_service_tools",
srcs = ["kv_service_tools.cpp"],
deps = [
"//common/proto:signature_info_cc_proto",
"//interface/kv:kv_client",
"//platform/config:resdb_config_utils",
],
)
83 changes: 83 additions & 0 deletions benchmark/protocols/raft/kv_server_performance.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

#include <glog/logging.h>

#include "chain/storage/memory_db.h"
#include "executor/kv/kv_executor.h"
#include "platform/config/resdb_config_utils.h"
#include "platform/consensus/ordering/raft/framework/consensus.h"
#include "platform/networkstrate/service_network.h"
#include "platform/statistic/stats.h"
#include "proto/kv/kv.pb.h"

using namespace resdb;
using namespace resdb::raft;
using namespace resdb::storage;

void ShowUsage() {
printf("<config> <private_key> <cert_file> [logging_dir]\n");
}

std::string GetRandomKey() {
int num1 = rand() % 10;
int num2 = rand() % 10;
return std::to_string(num1) + std::to_string(num2);
}

int main(int argc, char** argv) {
if (argc < 3) {
ShowUsage();
exit(0);
}

// google::InitGoogleLogging(argv[0]);
// FLAGS_minloglevel = google::GLOG_WARNING;

char* config_file = argv[1];
char* private_key_file = argv[2];
char* cert_file = argv[3];

if (argc >= 5) {
auto monitor_port = Stats::GetGlobalStats(5);
monitor_port->SetPrometheus(argv[4]);
}

std::unique_ptr<ResDBConfig> config =
GenerateResDBConfig(config_file, private_key_file, cert_file);

config->RunningPerformance(true);
ResConfigData config_data = config->GetConfigData();

auto performance_consens = std::make_unique<Consensus>(
*config, std::make_unique<KVExecutor>(std::make_unique<MemoryDB>()));
performance_consens->SetupPerformanceDataFunc([]() {
KVRequest request;
request.set_cmd(KVRequest::SET);
request.set_key(GetRandomKey());
request.set_value("helloword");
std::string request_data;
request.SerializeToString(&request_data);
return request_data;
});

auto server =
std::make_unique<ServiceNetwork>(*config, std::move(performance_consens));
server->Run();
}
51 changes: 51 additions & 0 deletions benchmark/protocols/raft/kv_service_tools.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#include <fstream>

#include "common/proto/signature_info.pb.h"
#include "interface/kv/kv_client.h"
#include "platform/config/resdb_config_utils.h"

using resdb::GenerateReplicaInfo;
using resdb::GenerateResDBConfig;
using resdb::KVClient;
using resdb::ReplicaInfo;
using resdb::ResDBConfig;

int main(int argc, char** argv) {
if (argc < 2) {
printf("<config path>\n");
return 0;
}
std::string client_config_file = argv[1];
ResDBConfig config = GenerateResDBConfig(client_config_file);

config.SetClientTimeoutMs(100000);

KVClient client(config);

client.Set("start", "value");
printf("start benchmark\n");
}
6 changes: 4 additions & 2 deletions chain/storage/leveldb.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -228,8 +228,10 @@ bool ResLevelDB::UpdateMetrics() {
return true;
}

bool ResLevelDB::Flush() {
leveldb::Status status = db_->Write(leveldb::WriteOptions(), &batch_);
bool ResLevelDB::Flush(bool should_sync) {
leveldb::WriteOptions opts = leveldb::WriteOptions();
opts.sync = should_sync;
leveldb::Status status = db_->Write(opts, &batch_);
if (status.ok()) {
batch_.Clear();
return true;
Expand Down
2 changes: 1 addition & 1 deletion chain/storage/leveldb.h
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ class ResLevelDB : public Storage {

bool UpdateMetrics();

bool Flush() override;
bool Flush(bool should_sync = false) override;

virtual uint64_t GetLastCheckpoint() override;

Expand Down
2 changes: 1 addition & 1 deletion chain/storage/mock_storage.h
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ class MockStorage : public Storage {
MOCK_METHOD(ItemsType, GetAllItems, (), (override));
MOCK_METHOD(ValuesSeqType, GetAllItemsWithSeq, (), (override));

MOCK_METHOD(bool, Flush, (), (override));
MOCK_METHOD(bool, Flush, (bool should_sync), (override));
};

} // namespace resdb
2 changes: 1 addition & 1 deletion chain/storage/storage.h
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ class Storage {
// Default no-op SQL execution for non-SQL backends.
virtual std::string ExecuteSQL(const std::string& sql_string) { return ""; }

virtual bool Flush() { return true; };
virtual bool Flush(bool should_sync = false) { return true; };

virtual uint64_t GetLastCheckpoint() { return 0; }

Expand Down
4 changes: 2 additions & 2 deletions platform/consensus/ordering/common/algorithm/protocol_base.h
Original file line number Diff line number Diff line change
Expand Up @@ -63,9 +63,9 @@ class ProtocolBase {
}

protected:
int SendMessage(int msg_type, const google::protobuf::Message& msg,
virtual int SendMessage(int msg_type, const google::protobuf::Message& msg,
int node_id);
int Broadcast(int msg_type, const google::protobuf::Message& msg);
virtual int Broadcast(int msg_type, const google::protobuf::Message& msg);
int Commit(const google::protobuf::Message& msg);

bool IsStop();
Expand Down
2 changes: 1 addition & 1 deletion platform/consensus/ordering/common/framework/consensus.h
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ class Consensus : public ConsensusManager {
protected:
int SendMsg(int type, const google::protobuf::Message& msg, int node_id);
int Broadcast(int type, const google::protobuf::Message& msg);
int ResponseMsg(const BatchUserResponse& batch_resp);
virtual int ResponseMsg(const BatchUserResponse& batch_resp);
void AsyncSend();
bool IsStop();

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -52,8 +52,8 @@ PerformanceManager::PerformanceManager(
total_num_ = 0;
replica_num_ = config_.GetReplicaNum();
id_ = config_.GetSelfInfo().id();
primary_ = id_ % replica_num_;
if (primary_ == 0) primary_ = replica_num_;
primary_.store(id_ % replica_num_);
if (primary_ == 0) primary_.store(replica_num_);
local_id_ = 1;
sum_ = 0;
}
Expand All @@ -67,7 +67,17 @@ PerformanceManager::~PerformanceManager() {
}
}

int PerformanceManager::GetPrimary() { return primary_; }
int PerformanceManager::GetPrimary() { return primary_.load(); }

void PerformanceManager::SetPrimary(int id) {
int curr_primary = primary_.load();
while (id != curr_primary) {
if (primary_.compare_exchange_strong(curr_primary, id)) {
LOG(INFO) << "JIM -> " << __FUNCTION__ << ": primary updated to " << id;
return;
}
}
}

int PerformanceManager::NeedResponse() {
return config_.GetMinClientReceiveNum(); // f+1;
Expand All @@ -88,16 +98,17 @@ int PerformanceManager::StartEval() {
return 0;
}
eval_started_ = true;
for (int i = 0; i < 100000000; ++i) {
std::unique_ptr<QueueItem> queue_item = std::make_unique<QueueItem>();
queue_item->context = nullptr;
queue_item->user_request = GenerateUserRequest();
batch_queue_.Push(std::move(queue_item));
if (i == 2000000) {
eval_ready_promise_.set_value(true);
std::thread([&](){
for (int i = 0; i < 100000000; ++i) {
std::unique_ptr<QueueItem> queue_item = std::make_unique<QueueItem>();
queue_item->context = nullptr;
queue_item->user_request = GenerateUserRequest();
batch_queue_.Push(std::move(queue_item));
if (i == 2000000) {
eval_ready_promise_.set_value(true);
}
}
}
LOG(WARNING) << "start eval done";
}).detach();
return 0;
}

Expand Down Expand Up @@ -176,8 +187,8 @@ void PerformanceManager::SendResponseToClient(
if (create_time > 0) {
uint64_t run_time = GetCurrentTime() - create_time;
LOG(ERROR) << "receive current:" << GetCurrentTime()
<< " create time:" << create_time << " run time:" << run_time
<< " local id:" << batch_response.local_id();
<< " create time:" << create_time << " run time:" << run_time
<< " local id:" << batch_response.local_id();
global_stats_->AddLatency(run_time);
}
send_num_--;
Expand All @@ -186,8 +197,8 @@ void PerformanceManager::SendResponseToClient(
// =================== request ========================
int PerformanceManager::BatchProposeMsg() {
LOG(WARNING) << "batch wait time:" << config_.ClientBatchWaitTimeMS()
<< " batch num:" << config_.ClientBatchNum()
<< " max txn:" << config_.GetMaxProcessTxn();
<< " batch num:" << config_.ClientBatchNum()
<< " max txn:" << config_.GetMaxProcessTxn();
std::vector<std::unique_ptr<QueueItem>> batch_req;
eval_ready_future_.get();
bool start = false;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ class PerformanceManager {
virtual ~PerformanceManager();

int StartEval();
void SetPrimary(int id);

int ProcessResponseMsg(std::unique_ptr<Context> context,
std::unique_ptr<Request> request);
Expand Down Expand Up @@ -89,7 +90,7 @@ class PerformanceManager {
std::mutex response_lock_[response_set_size_];
int replica_num_;
int id_;
int primary_;
std::atomic<int> primary_;
std::atomic<int> local_id_;
std::atomic<int> sum_;
};
Expand Down
6 changes: 3 additions & 3 deletions platform/consensus/ordering/pbft/consensus_manager_pbft.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -53,9 +53,9 @@ ConsensusManagerPBFT::ConsensusManagerPBFT(
view_change_manager_(std::make_unique<ViewChangeManager>(
config_, checkpoint_manager_.get(), message_manager_.get(),
system_info_.get(), GetBroadCastClient(), GetSignatureVerifier())),
recovery_(std::make_unique<Recovery>(config_, checkpoint_manager_.get(),
system_info_.get(),
message_manager_->GetStorage())),
recovery_(std::make_unique<PBFTRecovery>(
config_, checkpoint_manager_.get(), system_info_.get(),
message_manager_->GetStorage())),
query_(std::make_unique<Query>(config_, recovery_.get(),
std::move(query_executor))) {
LOG(INFO) << "is running is performance mode:"
Expand Down
4 changes: 2 additions & 2 deletions platform/consensus/ordering/pbft/consensus_manager_pbft.h
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
#include "platform/consensus/ordering/pbft/query.h"
#include "platform/consensus/ordering/pbft/response_manager.h"
#include "platform/consensus/ordering/pbft/viewchange_manager.h"
#include "platform/consensus/recovery/recovery.h"
#include "platform/consensus/recovery/pbft_recovery.h"
#include "platform/networkstrate/consensus_manager.h"

namespace resdb {
Expand Down Expand Up @@ -84,7 +84,7 @@ class ConsensusManagerPBFT : public ConsensusManager {
std::unique_ptr<ResponseManager> response_manager_;
std::unique_ptr<PerformanceManager> performance_manager_;
std::unique_ptr<ViewChangeManager> view_change_manager_;
std::unique_ptr<Recovery> recovery_;
std::unique_ptr<PBFTRecovery> recovery_;
Stats* global_stats_;
std::queue<std::pair<std::unique_ptr<Context>, std::unique_ptr<Request>>>
request_pending_;
Expand Down
Loading