RALE is a distributed consensus and key-value store system written in C. It provides reliable coordination and persistent storage for distributed applications.
librale is the foundation component providing distributed consensus and key-value storage capabilities.
Key Features:
- Consensus Protocol: RALE consensus algorithm for reliable leader election
- Distributed Store: High-performance key-value storage with replication
- Network Layer: TCP/UDP communication with automatic failover
- Thread Safety: Full multi-threading support with proper synchronization
- Memory Management: Safe allocation/deallocation with leak prevention
- Logging: Clean, production-ready logging without colors or icons
API Design:
- Public API: Clean, well-documented interface in librale.h
- Thread Safe: All public functions are thread-safe
- Error Handling: Consistent error codes and detailed reporting
- Configuration: Flexible runtime configuration system
raled is the daemon process that manages cluster membership and coordination.
Responsibilities:
- Cluster Management: Node registration, health monitoring, membership changes
- Leadership: Leader election and failover coordination
- State Management: Persistent cluster state and configuration
- Communication: Inter-node communication and client API endpoints
- Monitoring: Health checks, metrics collection, and status reporting
Architecture:
- Event-Driven: Asynchronous processing with event loops
- Modular Design: Clean separation of concerns across modules
- Configuration: File-based configuration with runtime updates
- Logging: Structured logging with configurable levels
ralectrl is the command-line interface for cluster management and operations.
Commands:
- Node Management: ADD, REMOVE, LIST cluster nodes
- Status Queries: Cluster status, leader information, node health
- Configuration: Runtime configuration updates
- Diagnostics: Debug information and troubleshooting tools
Design:
- Unix Philosophy: Simple, composable commands
- Scriptable: JSON output for automation and integration
- Error Handling: Clear error messages and exit codes
- Help System: Comprehensive help and usage information
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Client    │    │   Client    │    │   Client    │
│ Application │    │ Application │    │ Application │
└──────┬──────┘    └──────┬──────┘    └──────┬──────┘
       │                  │                  │
       └──────────────────┼──────────────────┘
                          │
                   ┌─────────────┐
                   │   librale   │
                   │  (Library)  │
                   └──────┬──────┘
                          │
       ┌──────────────────┼──────────────────┐
       │                  │                  │
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    raled    │    │    raled    │    │    raled    │
│  (Node 1)   │◄──►│  (Node 2)   │◄──►│  (Node 3)   │
└─────────────┘    └─────────────┘    └─────────────┘
       ▲                  ▲                  ▲
       │                  │                  │
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  ralectrl   │    │  ralectrl   │    │  ralectrl   │
│    (CLI)    │    │    (CLI)    │    │    (CLI)    │
└─────────────┘    └─────────────┘    └─────────────┘
The RALE consensus protocol ensures distributed agreement and consistency across cluster nodes.
Process:
- Candidate Selection: A node transitions to candidate state when the leader's heartbeat times out
- Vote Collection: Candidates request votes from cluster members
- Majority Decision: The candidate that receives votes from a majority of nodes becomes leader
- Heartbeat Maintenance: Leader sends regular heartbeats to maintain authority
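The majority decision in the process above can be illustrated with a small sketch. It is not taken from librale: the names `majority_threshold` and `candidate_has_won` are hypothetical, and the code assumes the candidate's own vote is already recorded in the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Smallest number of votes that forms a strict majority of cluster_size. */
static size_t majority_threshold(size_t cluster_size)
{
    assert(cluster_size > 0);
    return cluster_size / 2 + 1;
}

/*
 * Hypothetical helper: votes_granted[i] is true when node i granted its vote
 * in the current term. The candidate's own vote is expected to be included.
 */
static bool candidate_has_won(const bool *votes_granted, size_t cluster_size)
{
    size_t granted = 0;

    for (size_t i = 0; i < cluster_size; i++) {
        if (votes_granted[i])
            granted++;
    }
    return granted >= majority_threshold(cluster_size);
}
```

With three nodes, two granted votes are enough to win. With four nodes, two votes are not, which is why a tied election produces no leader for that term.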
Safety Properties:
- Election Safety: At most one leader per term
- Leader Append-Only: Leaders never overwrite log entries
- Log Matching: Consistent log replication across nodes
- Leader Completeness: Leader contains all committed entries
Mechanism:
- Client Request: Application submits operation to leader
- Log Append: Leader appends entry to local log
- Replication: Leader replicates entry to follower nodes
- Commit: Entry committed when majority acknowledges
- Application: State machine applies committed entries
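The commit step can be sketched as follows, assuming the leader tracks, for each node, the highest log index known to be stored there. The array `match_index`, the constant `MAX_NODES`, and the function name are assumptions made for illustration; the actual RALE implementation may track replication progress differently and may apply additional checks before committing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NODES 16  /* illustrative upper bound on cluster size */

/* qsort comparator that orders uint64_t values from highest to lowest. */
static int cmp_desc(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x < y) - (x > y);
}

/*
 * match_index[i] is the highest log index stored on node i, with the
 * leader's own last index included. Returns the highest index that is
 * stored on a majority of the n nodes.
 */
static uint64_t majority_commit_index(const uint64_t *match_index, size_t n)
{
    uint64_t sorted[MAX_NODES];

    assert(n > 0 && n <= MAX_NODES);
    memcpy(sorted, match_index, n * sizeof(sorted[0]));
    qsort(sorted, n, sizeof(sorted[0]), cmp_desc);

    /* After a descending sort, n/2 + 1 nodes hold at least sorted[n/2]. */
    return sorted[n / 2];
}
```

For match indexes {7, 5, 3} the result is 5: two of the three nodes store entry 5, so everything up to that index can be applied to the state machine.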
The DStore provides a distributed key-value storage layer with strong consistency guarantees.
Components:
- Hash Table: In-memory hash table for fast key lookups
- Persistence: File-based storage for durability
- Replication: Automatic replication across cluster nodes
- Compaction: Background compaction for space efficiency
Operations:
- PUT: Store key-value pairs with versioning
- GET: Retrieve values by key with consistency guarantees
- DELETE: Remove keys with tombstone handling
- LIST: Enumerate keys with prefix filtering
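To show how versioning and tombstones can interact in an in-memory table, here is a minimal sketch. It is not the DStore implementation: the fixed-size table, linear probing, FNV-1a hashing, and every name in it (`struct store`, `store_put`, `store_get`, `store_delete`) are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SLOTS   64   /* fixed capacity; this sketch never resizes */
#define KEY_MAX 32
#define VAL_MAX 64

struct entry {
    bool     used;       /* slot has held a key at some point */
    bool     tombstone;  /* key was deleted; slot stays so probing still works */
    uint64_t version;    /* incremented on every PUT and DELETE */
    char     key[KEY_MAX];
    char     value[VAL_MAX];
};

struct store { struct entry slots[SLOTS]; };  /* zero-initialize before use */

/* FNV-1a hash of a NUL-terminated key. */
static uint32_t hash_key(const char *key)
{
    uint32_t h = 2166136261u;
    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;
    }
    return h;
}

/* Returns the slot holding key, else the first unused slot, else NULL. */
static struct entry *find_slot(struct store *s, const char *key)
{
    uint32_t start = hash_key(key) % SLOTS;
    for (uint32_t i = 0; i < SLOTS; i++) {
        struct entry *e = &s->slots[(start + i) % SLOTS];
        if (!e->used || strcmp(e->key, key) == 0)
            return e;
    }
    return NULL;
}

static bool store_put(struct store *s, const char *key, const char *value)
{
    assert(s != NULL && key != NULL && value != NULL);
    if (strlen(key) >= KEY_MAX || strlen(value) >= VAL_MAX)
        return false;
    struct entry *e = find_slot(s, key);
    if (e == NULL)
        return false;           /* table full */
    if (!e->used) {
        e->used = true;
        e->version = 0;
        strcpy(e->key, key);
    }
    strcpy(e->value, value);
    e->tombstone = false;       /* a PUT revives a deleted key */
    e->version++;
    return true;
}

static const char *store_get(struct store *s, const char *key)
{
    struct entry *e = find_slot(s, key);
    return (e != NULL && e->used && !e->tombstone) ? e->value : NULL;
}

static bool store_delete(struct store *s, const char *key)
{
    struct entry *e = find_slot(s, key);
    if (e == NULL || !e->used || e->tombstone)
        return false;
    e->tombstone = true;        /* mark as deleted instead of freeing the slot */
    e->version++;
    return true;
}
```

Keeping the tombstone rather than clearing the slot lets the delete carry its own version, so a replica that receives operations out of order can tell a stale PUT from a newer DELETE. A background compaction pass would be the natural place to reclaim such slots.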
Strong Consistency: All reads return the most recent write
- Read Operations: Always served from committed state
- Write Operations: Require majority consensus before commit
- Linearizability: Operations appear atomic and instantaneous
Durability: All committed operations survive node failures
- Write-Ahead Logging: Operations logged before application
- Periodic Snapshots: State checkpoints for fast recovery
- Replication: Multiple copies across cluster nodes
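Write-ahead logging means a record reaches the log before the operation is applied to in-memory state. The sketch below shows one way to append and read back length-prefixed records using only the C standard library. The record layout and the names `wal_append` and `wal_next` are assumptions for illustration, not the on-disk format used by RALE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Appends one record as a 4-byte length (host byte order) followed by the
 * payload, then flushes. Returns 0 on success, -1 on failure.
 */
static int wal_append(FILE *wal, const void *data, uint32_t len)
{
    assert(wal != NULL && data != NULL);
    if (fwrite(&len, sizeof len, 1, wal) != 1)
        return -1;
    if (len > 0 && fwrite(data, 1, len, wal) != len)
        return -1;
    /*
     * fflush only hands the bytes to the operating system. A log that must
     * survive power loss would also need fsync(fileno(wal)) on POSIX.
     */
    return fflush(wal) == 0 ? 0 : -1;
}

/*
 * Reads the next record into buf. Returns the payload length, or -1 at the
 * end of the log, on a truncated record, or if the record exceeds cap.
 */
static long wal_next(FILE *wal, void *buf, uint32_t cap)
{
    uint32_t len;

    assert(wal != NULL && buf != NULL);
    if (fread(&len, sizeof len, 1, wal) != 1)
        return -1;
    if (len > cap)
        return -1;
    if (len > 0 && fread(buf, 1, len, wal) != len)
        return -1;
    return (long)len;
}
```

A caller would invoke `wal_append` first and apply the operation only after it returns 0. On restart, looping over `wal_next` until it returns -1 replays the operations in their original order.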
RALE Protocol (Consensus):
- Transport: TCP with heartbeat keepalives
- Message Types: Vote requests, append entries, heartbeats
- Flow Control: Backpressure and congestion management
- Security: Optional TLS encryption and authentication
DStore Protocol (Storage):
- Transport: TCP with connection pooling
- Message Types: PUT, GET, DELETE, LIST operations
- Batching: Operation batching for efficiency
- Compression: Optional message compression
Management Protocol (Control):
- Transport: Unix domain sockets for local control
- Message Types: Admin commands, status queries
- Authentication: Process-based security model
- Format: JSON for human readability
Fully Connected: Every node can communicate with every other node
- Redundancy: Multiple communication paths for fault tolerance
- Load Distribution: Requests distributed across available nodes
- Partition Handling: Graceful degradation during network splits
Node Failure Detection:
- Heartbeat Monitoring: Regular health checks between nodes
- Timeout Management: Configurable timeout values for different scenarios
- Split-Brain Prevention: Quorum-based decisions during partitions
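Heartbeat monitoring and quorum checks combine naturally: a node counts the peers it has heard from recently and acts only if that count, plus itself, is a majority. The sketch below assumes a monotonic millisecond clock; `struct peer`, `peer_is_alive`, and `has_quorum` are hypothetical names, not part of the raled source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct peer {
    uint64_t last_heartbeat_ms;  /* monotonic time of the last heartbeat seen */
};

/* A peer counts as alive if its last heartbeat is within timeout_ms of now. */
static bool peer_is_alive(const struct peer *p, uint64_t now_ms,
                          uint64_t timeout_ms)
{
    assert(p != NULL && now_ms >= p->last_heartbeat_ms);
    return now_ms - p->last_heartbeat_ms <= timeout_ms;
}

/*
 * Returns true when this node plus its live peers form a strict majority of
 * the whole cluster (peer_count + 1 nodes).
 */
static bool has_quorum(const struct peer *peers, size_t peer_count,
                       uint64_t now_ms, uint64_t timeout_ms)
{
    size_t alive = 1;  /* this node */
    size_t cluster_size = peer_count + 1;

    for (size_t i = 0; i < peer_count; i++) {
        if (peer_is_alive(&peers[i], now_ms, timeout_ms))
            alive++;
    }
    return alive >= cluster_size / 2 + 1;
}
```

During a partition, at most one side can satisfy `has_quorum`, so only that side can elect a leader or commit writes. The minority side stops making progress instead of diverging.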
Network Failure Handling:
- Connection Retry: Automatic reconnection with exponential backoff
- Message Queuing: Buffering of messages during temporary outages
- Graceful Degradation: Reduced functionality during partial failures
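Exponential backoff for reconnection can be sketched as below. The function name and its parameters are assumptions for illustration. Production code often adds random jitter as well, so that many nodes do not retry at the same instant.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Delay before retry number `attempt` (0-based): base_ms doubled once per
 * previous attempt, capped at max_ms.
 */
static uint32_t backoff_delay_ms(uint32_t attempt, uint32_t base_ms,
                                 uint32_t max_ms)
{
    uint64_t delay = base_ms;  /* 64-bit so doubling cannot overflow */

    assert(base_ms > 0 && base_ms <= max_ms);
    for (uint32_t i = 0; i < attempt && delay < max_ms; i++)
        delay *= 2;
    return delay > max_ms ? max_ms : (uint32_t)delay;
}
```

With a 100 ms base and a 5000 ms cap, successive attempts wait 100, 200, 400, 800, 1600, 3200 and then 5000 ms.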
Leader Recovery:
- Fast Election: Quick leader election during failures
- Log Repair: Automatic log consistency restoration
- State Synchronization: Catch-up mechanisms for lagging nodes
Data Recovery:
- Log Replay: Recovery from persistent logs
- Snapshot Restoration: Fast recovery from state snapshots
- Incremental Sync: Efficient synchronization of missing data
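Snapshot restoration and log replay fit together as follows: start from the snapshot's state, then apply only the log entries that come after the snapshot's last index. The sketch uses a single integer as a stand-in for the state machine; the structures and the function `recover_state` are hypothetical and are not the RALE recovery code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct snapshot  { uint64_t last_index; int64_t state; };
struct log_entry { uint64_t index;      int64_t delta; };

/*
 * Rebuilds the state from a snapshot plus a log sorted by index. Entries
 * already covered by the snapshot are skipped. The index of the last applied
 * entry is stored in *applied_index.
 */
static int64_t recover_state(const struct snapshot *snap,
                             const struct log_entry *log, size_t n,
                             uint64_t *applied_index)
{
    assert(snap != NULL && applied_index != NULL);

    int64_t  state   = snap->state;
    uint64_t applied = snap->last_index;

    for (size_t i = 0; i < n; i++) {
        if (log[i].index <= applied)
            continue;                          /* covered by the snapshot */
        assert(log[i].index == applied + 1);   /* no gaps allowed */
        state  += log[i].delta;
        applied = log[i].index;
    }
    *applied_index = applied;
    return state;
}
```

Recovery time then depends on the number of entries written since the last snapshot rather than on the full history, which is the reason for taking periodic snapshots.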
Cluster Size: Optimized for 3-7 node clusters
- Read Performance: Scales with cluster size
- Write Performance: Limited by consensus requirements
- Memory Usage: Efficient memory utilization with bounded growth
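The cluster-size trade-off can be worked out from the majority rule, assuming a simple majority quorum as in the replication mechanism (an entry commits once a majority acknowledges). For a cluster of $n$ nodes, let $q(n)$ be the quorum size and $f(n)$ the number of node failures the cluster can tolerate while still committing writes:

```latex
\[
q(n) = \left\lfloor \frac{n}{2} \right\rfloor + 1,
\qquad
f(n) = n - q(n) = \left\lceil \frac{n}{2} \right\rceil - 1
\]

\[
\begin{array}{c|ccc}
n    & 3 & 5 & 7 \\ \hline
q(n) & 2 & 3 & 4 \\
f(n) & 1 & 2 & 3
\end{array}
\]
```

Every write waits for acknowledgements from $q(n) - 1$ followers in addition to the leader, so larger clusters tolerate more failures at the cost of more replication work per write. An even-sized cluster adds no tolerance over the next smaller odd size, since $f(4) = f(3) = 1$.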
- Consensus Operations: 1,000+ operations/second per cluster
- Storage Operations: 10,000+ operations/second per node
- Network Bandwidth: Efficient protocol with minimal overhead

- Leadership Election: Sub-second election times
- Write Latency: <10 ms for local cluster writes
- Read Latency: <1 ms for local reads
- Process-Based: Local control via Unix domain sockets
- Optional TLS: Encrypted inter-node communication
- No Built-in Auth: Designed for trusted network environments

- Admin Operations: Full cluster control via ralectrl
- Application Access: Direct library integration
- Read-Only Access: Future support for read-only clients

- At Rest: Optional encryption for persistent storage
- In Transit: Optional TLS for network communication
- Memory: Secure memory handling with proper cleanup
- Cluster Definition: Node addresses, ports, and roles
- Network Settings: Timeout values, retry policies
- Storage Settings: File paths, compaction policies

- Runtime Updates: Live configuration changes
- Cluster Membership: Dynamic node addition/removal
- Performance Tuning: Runtime optimization parameters
- Consensus Metrics: Term numbers, vote counts, election frequency
- Storage Metrics: Operation latencies, storage utilization
- Network Metrics: Connection status, message rates

- Structured Logging: Clean, parseable log format
- Log Levels: ERROR, WARNING, INFO, DEBUG levels
- Module Identification: Clear component identification

- Node Health: CPU, memory, disk utilization
- Cluster Health: Leadership status, node connectivity
- Application Health: Operation success rates, latencies
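The log levels and module identification described above could be combined in a formatter like the one below. The line layout `LEVEL [module] message`, the `RLOG_` enum names, and `log_format` itself are assumptions for illustration; the actual RALE log format is not specified here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Ordered from most to least severe, matching the four documented levels. */
enum log_level { RLOG_ERROR, RLOG_WARNING, RLOG_INFO, RLOG_DEBUG };

static const char *const level_names[] = { "ERROR", "WARNING", "INFO", "DEBUG" };

/*
 * Formats "LEVEL [module] message" into buf, without colors or icons.
 * Returns 0 if the message is filtered out by threshold, otherwise the
 * length the full line needs (as snprintf does), or a negative value on a
 * formatting error.
 */
static int log_format(char *buf, size_t cap, enum log_level threshold,
                      enum log_level level, const char *module,
                      const char *fmt, ...)
{
    assert(buf != NULL && module != NULL && fmt != NULL);
    assert((int)level >= RLOG_ERROR && level <= RLOG_DEBUG);

    if (level > threshold)
        return 0;  /* less severe than the configured level */

    int n = snprintf(buf, cap, "%s [%s] ", level_names[level], module);
    if (n < 0 || (size_t)n >= cap)
        return n;  /* error, or prefix alone did not fit */

    va_list ap;
    va_start(ap, fmt);
    int m = vsnprintf(buf + n, cap - (size_t)n, fmt, ap);
    va_end(ap);
    return m < 0 ? m : n + m;
}
```

With the threshold set to `RLOG_INFO`, a `RLOG_DEBUG` message is dropped, while a `RLOG_WARNING` from module `consensus` with format `"term %d"` and argument 7 yields `WARNING [consensus] term 7`.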
- Direct Linking: Link librale directly into applications
- API Usage: Use public API for consensus and storage operations
- Thread Safety: Safe for multi-threaded applications

- Daemon Mode: Use raled as a standalone service
- IPC Communication: Communicate via Unix sockets or network
- Language Bindings: Future support for multiple languages

- Embedded: Library embedded in application processes
- Standalone: Separate daemon processes per node
- Containerized: Docker-ready deployment configuration
- Cloud Native: Kubernetes-ready with proper health checks