diff --git a/docs/features/global_cache_pooling.md b/docs/features/global_cache_pooling.md
index 72314567393..aee31b36f0d 100644
--- a/docs/features/global_cache_pooling.md
+++ b/docs/features/global_cache_pooling.md
@@ -48,7 +48,8 @@ Ready-to-use example scripts are available in [examples/cache_storage/](../../..
 |--------|----------|-------------|
 | `run.sh` | Multi-Instance | Two standalone instances sharing cache |
 | `run_03b_pd_storage.sh` | PD Disaggregation | P+D instances with global cache pooling |
-| `run_ha.sh` | High Availability | etcd + multi-master leader election, verifies failover after killing the leader |
+| `run_ha.sh` | High Availability (etcd) | etcd + multi-master leader election, verifies failover after killing the leader |
+| `run_ha_redis.sh` | High Availability (redis) | single redis + multi-master leader election, verifies failover after killing the leader |
 
 ## Prerequisites
 
@@ -287,14 +288,19 @@ curl -X POST "http://0.0.0.0:52700/v1/chat/completions" \
 
 ### Scenario 3: High-Availability (HA) Deployment
 
-A single master is a single point of failure; if it crashes, cluster operations pause. For production, use the **etcd + multi-master** mode: multiple `mooncake_master` instances perform leader election through etcd. When the leader fails, a standby is automatically re-elected, transparently to clients.
+A single master is a single point of failure; if it crashes, cluster operations pause. For production, run multiple `mooncake_master` instances that perform leader election through a coordination backend. When the leader fails, a standby is automatically re-elected, transparently to clients.
+
+Two coordination backends are supported:
+
+- **etcd** (`run_ha.sh`): a 3-node etcd cluster handles leader election and metadata storage.
+- **redis** (`run_ha_redis.sh`): a single redis instance handles lease-based leader election. Use this to avoid introducing etcd as an extra component.
 
 **Architecture:**
 
 ```
             ┌──────────────────────────────────────┐
-            │           etcd cluster (3 nodes)      │
-            │     leader election / metadata store  │
+            │   coordination backend (etcd / redis) │
+            │     leader election (master_view)     │
             └───────────────────┬──────────────────┘
                                 │ election (master_view)
           ┌─────────────────────┼─────────────────────┐
@@ -304,47 +310,30 @@ A single master is a single point of failure; if it crashes, cluster operations
    │ rpc:8081    │       │ rpc:8082    │       │ rpc:8083    │
    │ (leader)    │       │ (standby)   │       │ (standby)   │
    └──────┬──────┘       └─────────────┘       └─────────────┘
-          │  FastDeploy clients discover the current leader via etcd
+          │  FastDeploy clients discover the current leader via the backend
    ┌──────┴───────┐
    ▼              ▼
 server_0      server_1
 ```
 
-#### Prerequisites
-
-**1. Install etcd**
-
-Download and extract etcd (v3.5.30 in this example), then add `etcd` / `etcdctl` to `PATH`:
-
-```bash
-ETCD_VER=v3.5.30
-curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz \
-  -o etcd-${ETCD_VER}-linux-amd64.tar.gz
-tar -xzf etcd-${ETCD_VER}-linux-amd64.tar.gz
-export PATH=$PWD/etcd-${ETCD_VER}-linux-amd64:$PATH
-etcd --version
-```
+#### Build Mooncake from source
 
-**2. Build Mooncake from source (with etcd support)**
+HA mode requires Mooncake built with the matching backend enabled:
 
-HA mode requires Mooncake built with etcd support (`-DSTORE_USE_ETCD=ON -DUSE_ETCD=ON`). Install dependencies first, then build:
+- etcd: `-DSTORE_USE_ETCD=ON -DUSE_ETCD=ON`
+- redis: `-DSTORE_USE_REDIS=ON -DUSE_REDIS=ON` (build dependency: `libhiredis-dev`)
 
 ```bash
-# Download the source
 git clone https://github.com/kvcache-ai/Mooncake.git
 cd Mooncake
-
-# Install system & third-party dependencies
 bash dependencies.sh
 
-# Build C++ components (including mooncake_master, with etcd enabled)
 mkdir -p build && cd build
-cmake .. -DSTORE_USE_ETCD=ON -DUSE_ETCD=ON
+cmake .. -DSTORE_USE_ETCD=ON -DUSE_ETCD=ON   # add -DSTORE_USE_REDIS=ON -DUSE_REDIS=ON for redis
 make -j
 sudo make install
 cd ..
 
-# Build and install the Python wheel
 ./scripts/build_wheel.sh
 pip install mooncake-wheel/dist/*.whl
 ```
@@ -358,9 +347,24 @@ export CU13_BUILD=1
 pip install mooncake-wheel/dist/mooncake_transfer_engine_cuda13-*.whl
 ```
 
-#### HA Client Configuration
+#### Option A: etcd backend (`run_ha.sh`)
+
+**1. Install etcd**
+
+Download and extract etcd (v3.5.30 in this example), then add `etcd` / `etcdctl` to `PATH`:
+
+```bash
+ETCD_VER=v3.5.30
+curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz \
+  -o etcd-${ETCD_VER}-linux-amd64.tar.gz
+tar -xzf etcd-${ETCD_VER}-linux-amd64.tar.gz
+export PATH=$PWD/etcd-${ETCD_VER}-linux-amd64:$PATH
+etcd --version
+```
+
+**2. Client configuration** (`ha_mooncake_config.json`)
 
-In HA mode, both `metadata_server` and `master_server_addr` use the `etcd://` prefix pointing to the etcd cluster; clients discover the current leader through etcd (`ha_mooncake_config.json`):
+Both `metadata_server` and `master_server_addr` use the `etcd://` prefix; clients discover the current leader through etcd:
 
 ```json
 {
@@ -373,45 +377,77 @@ In HA mode, both `metadata_server` and `master_server_addr` use the `etcd://` pr
 }
 ```
 
-#### One-Command Launch & Failover Verification
+**3. Run**
+
+```bash
+cd examples/cache_storage
+bash run_ha.sh
+```
 
-A single self-contained script `examples/cache_storage/run_ha.sh` handles the whole flow — it starts the etcd cluster and the HA master cluster inline (each via a 3-iteration loop), so no separate `start_*.sh` is needed.
+The script starts a 3-node etcd cluster (client ports 12379/22379/32379), 3 HA masters (rpc 8081/8082/8083), and 2 FastDeploy instances; the leader address is written to the etcd key `mooncake-store/mooncake_cluster/master_view`.
 
-Run directly:
+Inspect the current leader manually:
+
+```bash
+etcdctl --endpoints=http://127.0.0.1:12379,http://127.0.0.1:22379,http://127.0.0.1:32379 \
+  get "mooncake-store/mooncake_cluster/master_view" --print-value-only
+```
+
+#### Option B: redis backend (`run_ha_redis.sh`)
+
+**1. Client configuration** (`ha_redis_mooncake_config.json`)
+
+Both `metadata_server` and `master_server_addr` use the `redis://` prefix pointing to the single redis instance:
+
+```json
+{
+  "metadata_server": "redis://127.0.0.1:6399",
+  "global_segment_size": 1000000000,
+  "local_buffer_size": 134217728,
+  "protocol": "rdma",
+  "rdma_devices": "",
+  "master_server_addr": "redis://127.0.0.1:6399"
+}
+```
+
+**2. Run**
 
 ```bash
 cd examples/cache_storage
-bash run_ha.sh
+bash run_ha_redis.sh
 ```
 
-What `run_ha.sh` does:
+The script starts a single redis instance (port 6399), 3 HA masters (rpc 8081/8082/8083) launched with `--ha_backend_type redis --ha_backend_connstring redis://127.0.0.1:6399`, and 2 FastDeploy instances. The `master_view` is stored as a redis hash under the key `mooncake-store/{mooncake_cluster}/master_view`; the leader endpoint is in its `leader_address` field.
+
+Inspect the current leader manually:
+
+```bash
+redis-cli -p 6399 hget "mooncake-store/{mooncake_cluster}/master_view" leader_address
+```
+
+#### What the HA scripts verify
+
+Both scripts run the same flow and verify failover:
 
-1. **Start the etcd cluster**: a loop launches 3 etcd nodes (client ports 12379/22379/32379) forming a raft cluster, after a port check.
-2. **Start 3 HA masters**: a loop launches 3 `mooncake_master` (rpc 8081/8082/8083, metrics 9091/9092/9093), each with `--enable_ha --etcd_endpoints ... --rpc_port ...`, electing one leader via etcd. The leader address is written to the etcd key `mooncake-store/mooncake_cluster/master_view`.
-3. **Start 2 FastDeploy instances**, both joining the same cache pool with `--kvcache-storage-backend mooncake`.
-4. **Verify pooling (before failover)**: warm up prompt **A** on `server_0`, then send the same prompt to `server_1`, which should hit the global cache.
-5. **Kill the leader**: the script reads the current leader's `rpc_port` from etcd, `kill -9`s that process, triggering re-election.
-6. **Verify pooling (after failover)**: once etcd's `master_view` is updated to the new leader, warm up a **brand-new** prompt **B** (never sent before) on `server_0`, then reuse it on `server_1`. Using a fresh prompt ensures the hit on `server_1` can only come from the new leader's global pool, rather than stale local cache from step 4.
+1. Start the coordination backend (etcd cluster / single redis).
+2. Start 3 HA masters; one is elected leader and its address is published to `master_view`.
+3. Start 2 FastDeploy instances, both joining the same cache pool with `--kvcache-storage-backend mooncake`.
+4. **Before failover**: warm up prompt **A** on `server_0`, then send the same prompt to `server_1`, which should hit the global cache.
+5. **Kill the leader**: read the current leader's `rpc_port` from the backend and `kill -9` the `mooncake_master` process that owns it, triggering re-election.
+6. **After failover**: once `master_view` updates to the new leader, warm up a **brand-new** prompt **B** on `server_0`, then reuse it on `server_1`. Using a fresh prompt ensures the hit can only come from the new leader's global pool, not stale local cache from step 4.
 
-> Check the election state manually:
->
-> ```bash
-> # Current leader (rpc_address:rpc_port)
-> etcdctl --endpoints=http://127.0.0.1:12379,http://127.0.0.1:22379,http://127.0.0.1:32379 \
->   get "mooncake-store/mooncake_cluster/master_view" --print-value-only
-> ```
->
-> Per-master roles can be seen in `log_master_1` / `log_master_2` / `log_master_3` (`role=leader` / `role=standby`), and etcd logs in `log_etcd_1` / `log_etcd_2` / `log_etcd_3`.
+Per-master roles can be seen in `log_master_1` / `log_master_2` / `log_master_3` (`role=leader` / `role=standby`).
 
 #### Key HA Master Parameters
 
 | Parameter | Description |
 |-----------|-------------|
 | `--enable_ha` | Enable HA mode |
+| `--ha_backend_type` | Coordination backend: `etcd` (default) or `redis` |
 | `--etcd_endpoints` | etcd endpoints, semicolon separated (when `ha_backend_type=etcd`) |
+| `--ha_backend_connstring` | Backend connection string, e.g. `redis://127.0.0.1:6399` (when `ha_backend_type=redis`) |
 | `--rpc_address` / `--rpc_port` | This master's reachable RPC address and port (must be unique per instance) |
 | `--cluster_id` | Cluster identifier; masters in the same cluster must match |
-| `--root_fs_dir` | Storage root directory in HA mode (unique per instance) |
 
 ## FastDeploy Parameters for Mooncake
 
diff --git a/docs/zh/features/global_cache_pooling.md b/docs/zh/features/global_cache_pooling.md
index 2680a5b393d..4fd15b663f6 100644
--- a/docs/zh/features/global_cache_pooling.md
+++ b/docs/zh/features/global_cache_pooling.md
@@ -48,7 +48,8 @@
 |------|------|------|
 | `run.sh` | 多实例缓存共享 | 两个独立实例共享缓存 |
 | `run_03b_pd_storage.sh` | PD 分离 | P+D 实例配合全局缓存池 |
-| `run_ha.sh` | 高可用（HA） | etcd + 多 Master 选主，杀掉 leader 后验证 failover |
+| `run_ha.sh` | 高可用（etcd） | etcd + 多 Master 选主，杀掉 leader 后验证 failover |
+| `run_ha_redis.sh` | 高可用（redis） | 单 redis + 多 Master 选主，杀掉 leader 后验证 failover |
 
 ## 环境要求
 
@@ -286,14 +287,19 @@ curl -X POST "http://0.0.0.0:52700/v1/chat/completions" \
 
 ### 场景三：高可用（HA）部署
 
-单 Master 是单点，崩溃后集群操作会暂停。生产环境推荐使用 **etcd + 多 Master** 模式：多个 `mooncake_master` 通过 etcd 进行 leader 选举，leader 故障后由备节点自动重新选主，客户端无感切换。
+单 Master 是单点，崩溃后集群操作会暂停。生产环境建议运行多个 `mooncake_master`，通过协调后端进行 leader 选举；leader 故障后由备节点自动重新选主，客户端无感切换。
+
+支持两种协调后端：
+
+- **etcd**（`run_ha.sh`）：3 节点 etcd 集群负责选主与元数据存储。
+- **redis**（`run_ha_redis.sh`）：单个 redis 实例做基于租约（lease）的选主。用它可以避免额外引入 etcd 组件。
 
 **架构图：**
 
 ```
             ┌──────────────────────────────────────┐
-            │            etcd 集群 (3 节点)         │
-            │       leader 选举 / 元数据存储        │
+            │     协调后端 (etcd / redis)           │
+            │       leader 选举 (master_view)       │
             └───────────────────┬──────────────────┘
                                 │ 选主 (master_view)
           ┌─────────────────────┼─────────────────────┐
@@ -303,47 +309,30 @@ curl -X POST "http://0.0.0.0:52700/v1/chat/completions" \
    │ rpc:8081    │       │ rpc:8082    │       │ rpc:8083    │
    │ (leader)    │       │ (standby)   │       │ (standby)   │
    └──────┬──────┘       └─────────────┘       └─────────────┘
-          │  FastDeploy 客户端通过 etcd 发现当前 leader
+          │  FastDeploy 客户端通过协调后端发现当前 leader
    ┌──────┴───────┐
    ▼              ▼
 server_0      server_1
 ```
 
-#### 前置准备
-
-**1. 安装 etcd**
-
-下载并解压 etcd（示例为 v3.5.30），将 `etcd` / `etcdctl` 加入 `PATH`：
-
-```bash
-ETCD_VER=v3.5.30
-curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz \
-  -o etcd-${ETCD_VER}-linux-amd64.tar.gz
-tar -xzf etcd-${ETCD_VER}-linux-amd64.tar.gz
-export PATH=$PWD/etcd-${ETCD_VER}-linux-amd64:$PATH
-etcd --version
-```
+#### 源码编译 Mooncake
 
-**2. 源码编译安装 Mooncake（支持 etcd）**
+HA 模式需要 Mooncake 在编译时开启对应后端：
 
-HA 模式需要 Mooncake 在编译时开启 etcd 支持（`-DSTORE_USE_ETCD=ON -DUSE_ETCD=ON`）。先安装依赖再编译：
+- etcd：`-DSTORE_USE_ETCD=ON -DUSE_ETCD=ON`
+- redis：`-DSTORE_USE_REDIS=ON -DUSE_REDIS=ON`（编译依赖：`libhiredis-dev`）
 
 ```bash
-# 下载源码
 git clone https://github.com/kvcache-ai/Mooncake.git
 cd Mooncake
-
-# 安装系统及第三方依赖
 bash dependencies.sh
 
-# 编译 C++ 组件（含 mooncake_master，开启 etcd）
 mkdir -p build && cd build
-cmake .. -DSTORE_USE_ETCD=ON -DUSE_ETCD=ON
+cmake .. -DSTORE_USE_ETCD=ON -DUSE_ETCD=ON   # redis 后端追加 -DSTORE_USE_REDIS=ON -DUSE_REDIS=ON
 make -j
 sudo make install
 cd ..
 
-# 编译并安装 Python wheel
 ./scripts/build_wheel.sh
 pip install mooncake-wheel/dist/*.whl
 ```
@@ -357,9 +346,24 @@ export CU13_BUILD=1
 pip install mooncake-wheel/dist/mooncake_transfer_engine_cuda13-*.whl
 ```
 
-#### HA 客户端配置
+#### 方式 A：etcd 后端（`run_ha.sh`）
+
+**1. 安装 etcd**
+
+下载并解压 etcd（示例为 v3.5.30），将 `etcd` / `etcdctl` 加入 `PATH`：
+
+```bash
+ETCD_VER=v3.5.30
+curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz \
+  -o etcd-${ETCD_VER}-linux-amd64.tar.gz
+tar -xzf etcd-${ETCD_VER}-linux-amd64.tar.gz
+export PATH=$PWD/etcd-${ETCD_VER}-linux-amd64:$PATH
+etcd --version
+```
+
+**2. 客户端配置**（`ha_mooncake_config.json`）
 
-HA 模式下，`metadata_server` 与 `master_server_addr` 都使用 `etcd://` 前缀指向 etcd 集群，由客户端通过 etcd 发现当前 leader（`ha_mooncake_config.json`）：
+`metadata_server` 与 `master_server_addr` 都使用 `etcd://` 前缀，由客户端通过 etcd 发现当前 leader：
 
 ```json
 {
@@ -372,45 +376,77 @@ HA 模式下，`metadata_server` 与 `master_server_addr` 都使用 `etcd://` 
 }
 ```
 
-#### 一键启动与 failover 验证
+**3. 运行**
+
+```bash
+cd examples/cache_storage
+bash run_ha.sh
+```
 
-单个自包含脚本 `examples/cache_storage/run_ha.sh` 负责整个流程 —— 它在脚本内部用循环分别拉起 etcd 集群和 HA master 集群，不再依赖单独的 `start_*.sh`。
+脚本会启动 3 节点 etcd 集群（client 端口 12379/22379/32379）、3 个 HA Master（rpc 8081/8082/8083）和 2 个 FastDeploy 实例；leader 地址写入 etcd 的 `mooncake-store/mooncake_cluster/master_view`。
 
-直接运行：
+手动查看当前 leader：
+
+```bash
+etcdctl --endpoints=http://127.0.0.1:12379,http://127.0.0.1:22379,http://127.0.0.1:32379 \
+  get "mooncake-store/mooncake_cluster/master_view" --print-value-only
+```
+
+#### 方式 B：redis 后端（`run_ha_redis.sh`）
+
+**1. 客户端配置**（`ha_redis_mooncake_config.json`）
+
+`metadata_server` 与 `master_server_addr` 都使用 `redis://` 前缀指向单个 redis 实例：
+
+```json
+{
+  "metadata_server": "redis://127.0.0.1:6399",
+  "global_segment_size": 1000000000,
+  "local_buffer_size": 134217728,
+  "protocol": "rdma",
+  "rdma_devices": "",
+  "master_server_addr": "redis://127.0.0.1:6399"
+}
+```
+
+**2. 运行**
 
 ```bash
 cd examples/cache_storage
-bash run_ha.sh
+bash run_ha_redis.sh
 ```
 
-`run_ha.sh` 的执行流程：
+脚本会启动单个 redis 实例（端口 6399）、3 个用 `--ha_backend_type redis --ha_backend_connstring redis://127.0.0.1:6399` 拉起的 HA Master（rpc 8081/8082/8083）和 2 个 FastDeploy 实例。master_view 是 redis 中的一个 HASH，key 为 `mooncake-store/{mooncake_cluster}/master_view`。
+
+手动查看当前 leader：
+
+```bash
+redis-cli -p 6399 hget "mooncake-store/{mooncake_cluster}/master_view" leader_address
+```
+
+#### HA 脚本验证的内容
+
+两个脚本流程相同，均验证 failover：
 
-1. **启动 etcd 集群**：端口检查后，用循环拉起 3 个 etcd 节点（client 端口 12379/22379/32379）组成 raft 集群。
-2. **启动 3 个 HA Master**：用循环拉起 3 个 `mooncake_master`（rpc 8081/8082/8083，metrics 9091/9092/9093），每个都带 `--enable_ha --etcd_endpoints ... --rpc_port ...`，通过 etcd 选出一个 leader。leader 地址写入 etcd 的 `mooncake-store/mooncake_cluster/master_view`。
-3. **启动 2 个 FastDeploy 实例**，均以 `--kvcache-storage-backend mooncake` 接入同一缓存池。
-4. **验证池化（failover 前）**：用 prompt **A** 在 `server_0` 预热，再向 `server_1` 发送相同 prompt，应命中全局缓存。
-5. **杀掉 leader**：脚本从 etcd 读取当前 leader 的 `rpc_port`，`kill -9` 对应进程，触发重新选主。
-6. **验证池化（failover 后）**：等待 etcd 中 `master_view` 更新为新 leader 后，用一条**全新的** prompt **B**（failover 前从未发过）在 `server_0` 预热，再在 `server_1` 复用。使用新 prompt 可确保 `server_1` 的命中只能来自新 leader 的全局池，而非步骤 4 残留的本地缓存。
+1. 启动协调后端（etcd 集群 / 单 redis）。
+2. 启动 3 个 HA Master，选出一个 leader 并写入 `master_view`。
+3. 启动 2 个 FastDeploy 实例，均以 `--kvcache-storage-backend mooncake` 接入同一缓存池。
+4. **failover 前**：用 prompt **A** 在 `server_0` 预热，再向 `server_1` 发送相同 prompt，应命中全局缓存。
+5. **杀掉 leader**：从后端读取当前 leader 的 `rpc_port`，`kill -9` 触发重新选主。
+6. **failover 后**：等待 `master_view` 更新为新 leader 后，用一条**全新的** prompt **B** 在 `server_0` 预热，再在 `server_1` 复用。使用新 prompt 可确保命中只能来自新 leader 的全局池，而非步骤 4 残留的本地缓存。
 
-> 单独验证选主状态：
->
-> ```bash
-> # 查看当前 leader（rpc_address:rpc_port）
-> etcdctl --endpoints=http://127.0.0.1:12379,http://127.0.0.1:22379,http://127.0.0.1:32379 \
->   get "mooncake-store/mooncake_cluster/master_view" --print-value-only
-> ```
->
-> 各 Master 角色可在 `log_master_1` / `log_master_2` / `log_master_3` 中查看（`role=leader` / `role=standby`），etcd 日志见 `log_etcd_1` / `log_etcd_2` / `log_etcd_3`。
+各 Master 角色可在 `log_master_1` / `log_master_2` / `log_master_3` 中查看（`role=leader` / `role=standby`）。
 
 #### HA Master 关键参数
 
 | 参数 | 说明 |
 |------|------|
 | `--enable_ha` | 开启 HA 模式 |
+| `--ha_backend_type` | 协调后端：`etcd`（默认）或 `redis` |
 | `--etcd_endpoints` | etcd 端点，分号分隔（`ha_backend_type=etcd` 时） |
+| `--ha_backend_connstring` | 后端连接串，如 `redis://127.0.0.1:6399`（`ha_backend_type=redis` 时） |
 | `--rpc_address` / `--rpc_port` | 该 Master 对外可达的 RPC 地址与端口（每个实例需唯一） |
 | `--cluster_id` | 集群标识，同一集群的 Master 需一致 |
-| `--root_fs_dir` | HA 模式下的存储根目录（每个实例独立） |
 
 ## FastDeploy Mooncake 相关参数
 
diff --git a/examples/cache_storage/README.md b/examples/cache_storage/README.md
index c1b4d05bcb8..6c269ed13fd 100644
--- a/examples/cache_storage/README.md
+++ b/examples/cache_storage/README.md
@@ -16,8 +16,9 @@ bash run.sh
 # PD disaggregation scenario
 bash run_03b_pd_storage.sh
 
-# High-availability (etcd + multi-master + failover) scenario
-bash run_ha.sh
+# High-availability scenario (multi-master + failover)
+bash run_ha.sh         # etcd backend
+bash run_ha_redis.sh   # redis backend
 ```
 
 ## Scripts
@@ -26,11 +27,13 @@ bash run_ha.sh
 |--------|----------|-------------|
 | `run.sh` | Multi-Instance | Two standalone instances sharing cache |
 | `run_03b_pd_storage.sh` | PD Disaggregation | P+D instances with global cache pooling |
-| `run_ha.sh` | High Availability | Self-contained: starts etcd + 3 masters with leader election, then kills the leader and re-verifies pooling with a fresh prompt after re-election |
+| `run_ha.sh` | High Availability (etcd) | Self-contained: starts etcd + 3 masters with leader election, then kills the leader and re-verifies pooling with a fresh prompt after re-election |
+| `run_ha_redis.sh` | High Availability (redis) | Same flow as `run_ha.sh`, but uses a single redis instead of etcd for leader election |
 
 ## Files
 
 - `mooncake_config.json` - Mooncake configuration file (single master)
 - `ha_mooncake_config.json` - Mooncake HA client config (etcd-based master discovery)
+- `ha_redis_mooncake_config.json` - Mooncake HA client config (redis-based master discovery)
 - `utils.sh` - Utility functions for scripts
 - `stop.sh` - Stop all running services
diff --git a/examples/cache_storage/ha_redis_mooncake_config.json b/examples/cache_storage/ha_redis_mooncake_config.json
new file mode 100644
index 00000000000..662bfc6965f
--- /dev/null
+++ b/examples/cache_storage/ha_redis_mooncake_config.json
@@ -0,0 +1,8 @@
+{
+  "metadata_server": "redis://127.0.0.1:6399",
+  "global_segment_size": 1000000000,
+  "local_buffer_size": 134217728,
+  "protocol": "rdma",
+  "rdma_devices": "",
+  "master_server_addr": "redis://127.0.0.1:6399"
+}
diff --git a/examples/cache_storage/run_ha.sh b/examples/cache_storage/run_ha.sh
index dd87dbb7128..ba7ea0a60de 100644
--- a/examples/cache_storage/run_ha.sh
+++ b/examples/cache_storage/run_ha.sh
@@ -65,17 +65,23 @@ wait_for_leader() {
     done
 }
 
-# Kill the mooncake_master process that owns the given rpc_port (leader).
+# Kill the mooncake_master process(es) that own the given rpc_port (leader).
 kill_master_by_rpc_port() {
     local rpc_port=$1
     # match "--rpc_port 8081" or "--rpc_port=8081" on the full command line
-    local pid=$(pgrep -af mooncake_master | grep -E "rpc_port[ =]${rpc_port}([^0-9]|$)" | awk '{print $1}' | head -n1)
-    if [ -n "${pid}" ]; then
-        echo "kill leader master pid=${pid} (rpc_port=${rpc_port})"
-        kill -9 "${pid}" || true
-    else
+    local pids=$(pgrep -af mooncake_master | grep -E "rpc_port[ =]${rpc_port}([^0-9]|$)" | awk '{print $1}')
+    if [ -z "${pids}" ]; then
         echo "⚠️  no mooncake_master process found for rpc_port=${rpc_port}"
+        return
     fi
+    # also collect direct children by ppid, in case a child's cmdline didn't match.
+    local all_pids="${pids}"
+    for p in ${pids}; do
+        local kids=$(pgrep -P "${p}" 2>/dev/null)
+        [ -n "${kids}" ] && all_pids="${all_pids} ${kids}"
+    done
+    echo "kill leader master pids=$(echo ${all_pids} | tr '\n' ' ')(rpc_port=${rpc_port})"
+    kill -9 ${all_pids} 2>/dev/null || true
 }
 
 # Send a chat request to a FastDeploy server.
@@ -135,17 +141,13 @@ check_ports "${master_ports[@]}" || {
 }
 
 for i in 1 2 3; do
-    rm -rf /tmp/mooncake_ha/master${i}
-    mkdir -p /tmp/mooncake_ha/master${i}
     mooncake_master \
         --enable_ha \
         --etcd_endpoints "${ETCD_ENDPOINTS_HA}" \
         --cluster_id "${CLUSTER_ID}" \
         --rpc_address "127.0.0.1" \
         --rpc_port 808${i} \
-        --metrics_port=909${i} \
-        --root_fs_dir=/tmp/mooncake_ha/master${i} \
-        --enable_offload=true > log_master_${i} 2>&1 &
+        --metrics_port=909${i} > log_master_${i} 2>&1 &
 done
 
 echo "waiting for leader election..."
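Review note on `kill_master_by_rpc_port`: the port is taken from the leader address with `${addr##*:}` and matched against each command line with `rpc_port[ =]PORT([^0-9]|$)`. The following is a minimal sketch of that matching, not the script itself. `sample_ps` and `pids_for_rpc_port` are hypothetical names, and the canned process list stands in for `pgrep -af mooncake_master` so the snippet runs without any master process.

```shell
#!/bin/bash
# Canned stand-in for `pgrep -af mooncake_master` output (assumption: PID
# first, then the full command line, as pgrep -a prints it).
sample_ps() {
    cat <<'EOF'
101 mooncake_master --enable_ha --rpc_port 8081 --metrics_port=9091
102 mooncake_master --enable_ha --rpc_port=8082 --metrics_port=9092
103 mooncake_master --enable_ha --rpc_port 80811
EOF
}

# Hypothetical helper: print the PIDs whose command line carries exactly the
# given rpc_port, in either "--rpc_port N" or "--rpc_port=N" form.
pids_for_rpc_port() {
    local rpc_port=$1
    sample_ps | grep -E "rpc_port[ =]${rpc_port}([^0-9]|$)" | awk '{print $1}'
}

# The leader address has the form rpc_address:rpc_port; strip everything up
# to the last colon to get the port.
leader_addr="127.0.0.1:8081"
leader_port="${leader_addr##*:}"
pids_for_rpc_port "${leader_port}"
```

The trailing `([^0-9]|$)` is what keeps port 8081 from also matching a process started with `--rpc_port 80811`.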
diff --git a/examples/cache_storage/run_ha_redis.sh b/examples/cache_storage/run_ha_redis.sh
new file mode 100644
index 00000000000..0eb25e02202
--- /dev/null
+++ b/examples/cache_storage/run_ha_redis.sh
@@ -0,0 +1,260 @@
+#!/bin/bash
+set -e
+# =============================================================================
+# HA Global Cache Pooling test script — REDIS backend (single redis + multi-master + failover)
+# Mirror of run_ha.sh, but replaces the 3-node etcd cluster with a SINGLE redis
+# instance. The 3 mooncake_master instances use redis (lease-based leader
+# election) instead of etcd raft. Motivation: redis avoids introducing etcd as an extra component.
+#
+# Flow (identical to run_ha.sh):
+#   1. start a single redis instance
+#   2. start 3 HA masters (one is elected leader via a redis lease)
+#   3. start 2 FastDeploy instances sharing the global cache pool
+#   4. verify pooling (before failover): warmup on server_0, reuse on server_1
+#   5. kill the leader master, wait for a standby to be re-elected
+#   6. verify pooling (after failover) with a BRAND-NEW prompt
+# =============================================================================
+
+export PYTHONPATH="/workspace/mooncake-test/FastDeploy:$PYTHONPATH"
+export MODEL_NAME="/workspace/models/Ernie-0.3B"
+export MOONCAKE_CONFIG_PATH=./ha_redis_mooncake_config.json
+export FD_DEBUG=1
+
+unset http_proxy && unset https_proxy
+
+echo "begin"
+source ./utils.sh
+
+# ---- topology ---------------------------------------------------------------
+# redis:        client port = 6399 (single instance)
+# master node i: rpc port = 808${i}, metrics port = 909${i}
+REDIS_PORT=6399
+REDIS_SERVER_BIN="$(command -v redis-server || echo /usr/local/redis/bin/redis-server)"
+REDIS_CLI_BIN="$(command -v redis-cli || echo /usr/local/redis/bin/redis-cli)"
+REDIS_CONN="redis://127.0.0.1:${REDIS_PORT}"          # for mooncake_master + client config
+
+CLUSTER_ID="mooncake_cluster"
+# redis master_view key uses a hash-tag {cluster_id} so all related keys land in
+# the same Redis Cluster slot; it is a HASH with fields leader_address/view_version/owner_token.
+MASTER_VIEW_KEY="mooncake-store/{${CLUSTER_ID}}/master_view"
+
+S0_PORT=52700
+S1_PORT=52800
+
+# ---- helpers ----------------------------------------------------------------
+
+# Query redis for the current leader's "rpc_address:rpc_port".
+# The master_view is a redis HASH; the leader endpoint lives in field leader_address.
+# redis-cli prints raw (unquoted) output when piped, so no extra unquoting needed.
+get_leader_addr() {
+    "${REDIS_CLI_BIN}" -p "${REDIS_PORT}" hget "${MASTER_VIEW_KEY}" leader_address 2>/dev/null \
+        | tr -d '[:space:]'
+}
+
+# Wait until a leader is elected and published into redis.
+wait_for_leader() {
+    local timeout=${1:-60}
+    local start_time=$(date +%s)
+    while true; do
+        local leader=$(get_leader_addr)
+        if [ -n "${leader}" ]; then
+            echo "${leader}"
+            return 0
+        fi
+        if [ $(( $(date +%s) - start_time )) -ge ${timeout} ]; then
+            echo ""
+            return 1
+        fi
+        sleep 1
+    done
+}
+
+# Kill the mooncake_master process(es) that own the given rpc_port (leader).
+kill_master_by_rpc_port() {
+    local rpc_port=$1
+    # match "--rpc_port 8081" or "--rpc_port=8081" on the full command line
+    local pids=$(pgrep -af mooncake_master | grep -E "rpc_port[ =]${rpc_port}([^0-9]|$)" | awk '{print $1}')
+    if [ -z "${pids}" ]; then
+        echo "⚠️  no mooncake_master process found for rpc_port=${rpc_port}"
+        return
+    fi
+    # also collect direct children by ppid, in case a child's cmdline didn't match.
+    local all_pids="${pids}"
+    for p in ${pids}; do
+        local kids=$(pgrep -P "${p}" 2>/dev/null)
+        [ -n "${kids}" ] && all_pids="${all_pids} ${kids}"
+    done
+    echo "kill leader master pids=$(echo ${all_pids} | tr '\n' ' ')(rpc_port=${rpc_port})"
+    kill -9 ${all_pids} 2>/dev/null || true
+
+}
+
+# Send a chat request to a FastDeploy server.
+send_request() {
+    local port=$1
+    local content=$2
+    curl -s -X POST "http://0.0.0.0:${port}/v1/chat/completions" \
+      -H "Content-Type: application/json" \
+      -d "{
+        \"messages\": [
+          {\"role\": \"user\", \"content\": \"${content}\"}
+        ],
+        \"max_tokens\": 50,
+        \"stream\": false,
+        \"top_p\": 0
+      }"
+    echo
+}
+
+# ---- 1. start a single redis instance ---------------------------------------
+echo "=== [1/6] start redis ==="
+pkill -9 -f "redis-server .*:${REDIS_PORT}" || true
+sleep 1
+
+check_ports "${REDIS_PORT}" || {
+    echo "❌ redis port ${REDIS_PORT} is in use. Please release it."
+    exit 1
+}
+
+# disable persistence; this is a throwaway coordination store.
+"${REDIS_SERVER_BIN}" --port "${REDIS_PORT}" --save "" --appendonly no \
+    --daemonize no > log_redis 2>&1 &
+sleep 2
+echo "=== redis health check ==="
+"${REDIS_CLI_BIN}" -p "${REDIS_PORT}" ping
+
+# ---- 2. start 3 HA masters (redis backend) ----------------------------------
+echo "=== [2/6] start 3 HA mooncake_master (redis backend) ==="
+pkill -9 -f mooncake_master || true
+sleep 1
+
+master_ports=(8081 8082 8083 9091 9092 9093)
+check_ports "${master_ports[@]}" || {
+    echo "❌ Some master ports are in use. Please release them."
+    exit 1
+}
+
+for i in 1 2 3; do
+    # --ha_backend_type redis + --ha_backend_connstring redis://...
+    mooncake_master \
+        --enable_ha \
+        --ha_backend_type redis \
+        --ha_backend_connstring "${REDIS_CONN}" \
+        --cluster_id "${CLUSTER_ID}" \
+        --rpc_address "127.0.0.1" \
+        --rpc_port 808${i} \
+        --metrics_port=909${i} > log_master_${i} 2>&1 &
+done
+
+echo "waiting for leader election..."
+LEADER_ADDR=$(wait_for_leader 60) || {
+    echo "❌ no leader elected within timeout"
+    exit 1
+}
+echo "✅ current leader: ${LEADER_ADDR}"
+
+# ---- 3. start 2 FastDeploy instances ----------------------------------------
+echo "=== [3/6] start FastDeploy instances ==="
+
+# clean up any lingering FastDeploy services so the ports are free.
+# the api_server runs under gunicorn; killing the gunicorn masters takes the
+# workers down with them.
+pkill -f "gunicorn: master" || true
+sleep 2
+
+rm -rf log_0 log_1
+
+fd_ports=("$S0_PORT" "$S1_PORT")
+check_ports "${fd_ports[@]}" || {
+    echo "❌ Some ports are in use. Please release them."
+    exit 1
+}
+
+# Launch FD server 0
+export CUDA_VISIBLE_DEVICES=6
+export FD_LOG_DIR="log_0"
+mkdir -p ${FD_LOG_DIR}
+echo "server 0 port: ${S0_PORT}"
+
+nohup python -m fastdeploy.entrypoints.openai.api_server \
+       --model ${MODEL_NAME} \
+       --port ${S0_PORT} \
+       --max-model-len 32768 \
+       --max-num-seqs 32 \
+       --kvcache-storage-backend mooncake \
+       >${FD_LOG_DIR}/nohup 2>&1 &
+
+# Launch FD server 1
+export CUDA_VISIBLE_DEVICES=7
+export FD_LOG_DIR="log_1"
+mkdir -p ${FD_LOG_DIR}
+echo "server 1 port: ${S1_PORT}"
+
+nohup python -m fastdeploy.entrypoints.openai.api_server \
+       --model ${MODEL_NAME} \
+       --port ${S1_PORT} \
+       --max-model-len 32768 \
+       --max-num-seqs 32 \
+       --kvcache-storage-backend mooncake \
+       >${FD_LOG_DIR}/nohup 2>&1 &
+
+wait_for_health ${S0_PORT}
+wait_for_health ${S1_PORT}
+# ---- 4. verify pooling before failover (warmup on s0, reuse on s1) ----------
+# msg_a: warmed on server_0, then reused on server_1.
+msg_a="深圳是中国经济实力最强的城市之一。近年来，深圳GDP持续稳步增长，2023年突破3.4万亿元人民币，2024年接近3.7万亿元。长期位居全国城市前列。深圳经济以第二产业和第三产业为主，高端制造业、电子信息产业和现代服务业发达，形成了以科技创新为核心的产业结构。依托华为、腾讯、大疆等龙头企业，深圳在数字经济、人工智能、新能源等领域具有显著优势。同时，深圳进出口总额常年位居全国城市第一，是中国对外开放和高质量发展的重要引擎。深圳持续推进创新驱动发展战略，不断加大研发投入，全社会研发投入占GDP比重长期保持较高水平。深圳拥有完善的创业生态体系，吸引了大量科技企业和创新人才。近年来，深圳积极布局半导体、生物医药、低空经济和智能网联汽车等战略性新兴产业，进一步增强经济增长动能。请总结深圳经济发展的核心优势。"
+
+echo "=== [4/6] verify pooling before failover ==="
+echo ">>> warmup msg_a on server_0 (${S0_PORT})"
+send_request ${S0_PORT} "${msg_a}"
+sleep 5
+echo ">>> reuse msg_a on server_1 (${S1_PORT}), expect cache hit"
+send_request ${S1_PORT} "${msg_a}"
+
+# ---- 5. kill the leader, wait for re-election -------------------------------
+echo "=== [5/6] kill leader and wait for failover ==="
+OLD_LEADER_ADDR=$(get_leader_addr)
+OLD_LEADER_PORT="${OLD_LEADER_ADDR##*:}"
+echo "old leader: ${OLD_LEADER_ADDR} (rpc_port=${OLD_LEADER_PORT})"
+kill_master_by_rpc_port "${OLD_LEADER_PORT}"
+
+echo "waiting for a new leader to be elected..."
+NEW_LEADER_ADDR=""
+start_time=$(date +%s)
+while true; do
+    cur=$(get_leader_addr)
+    if [ -n "${cur}" ] && [ "${cur}" != "${OLD_LEADER_ADDR}" ]; then
+        NEW_LEADER_ADDR="${cur}"
+        break
+    fi
+    if [ $(( $(date +%s) - start_time )) -ge 60 ]; then
+        echo "❌ no new leader elected within timeout"
+        exit 1
+    fi
+    sleep 1
+done
+echo "✅ new leader: ${NEW_LEADER_ADDR} (was ${OLD_LEADER_ADDR})"
+
+# wait for the new leader to finish recovery and reach serving state
+# (and for clients to reconnect) before sending requests; 5s was too short.
+sleep 10
+
+# ---- 6. verify pooling after failover with a BRAND-NEW prompt ---------------
+# Use a different prompt (msg_b) never sent before the failover, so a hit on
+# server_1 proves the cache was written/read through the NEW leader's global
+# pool (not stale local cache from step 4).
+msg_b="人工智能已经成为全球科技竞争的重要方向。近年来，大模型技术快速发展，在自然语言处理、代码生成、多模态理解以及智能代理等领域取得显著突破。越来越多企业开始将人工智能技术应用于客服、办公自动化、内容生成、金融风控和软件开发等场景。与此同时，人工智能的发展也带来了新的挑战，包括算力成本快速上升、训练数据质量参差不齐、模型幻觉问题以及隐私保护需求增强。各国政府正在制定相应监管框架，以平衡技术创新和风险控制之间的关系。未来几年，人工智能有望进一步推动生产力提升，并深刻影响教育、医疗、科研和工业制造等行业的发展模式。请列出人工智能当前面临的主要挑战。"
+
+echo "=== [6/6] verify pooling after failover (new prompt msg_b) ==="
+echo ">>> warmup msg_b on server_0 (${S0_PORT})"
+send_request ${S0_PORT} "${msg_b}"
+sleep 5
+echo ">>> reuse msg_b on server_1 (${S1_PORT}), expect cache hit via new leader"
+send_request ${S1_PORT} "${msg_b}"
+
+echo
+echo "=== HA (redis) test completed ==="
+echo "Check cache hit:  grep -E 'storage_cache_token_num' log_*/cache_storage.log* "
+echo "Master logs:      log_master_1 / log_master_2 / log_master_3"
+echo "Redis log:        log_redis"
+echo "Current leader:   ${REDIS_CLI_BIN} -p ${REDIS_PORT} hget '${MASTER_VIEW_KEY}' leader_address"
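Review note on step 5: the inline re-election loop repeats the polling pattern of `wait_for_leader`, with one extra condition (the leader must differ from the old one). One possible way to share the logic is sketched below. `wait_for_leader_change` is a hypothetical helper name, and `get_leader_addr` is stubbed to read from a file (`LEADER_FILE`, also an assumption) so the snippet runs without redis.

```shell
#!/bin/bash
# Stub for the script's get_leader_addr (assumption: the real one queries
# redis with hget). Here the "leader" is simply the content of a file.
LEADER_FILE="${LEADER_FILE:-/tmp/leader_addr.$$}"
get_leader_addr() {
    cat "${LEADER_FILE}" 2>/dev/null | tr -d '[:space:]'
}

# Hypothetical helper: poll once per second until a non-empty leader that
# differs from $1 is published, or until $2 seconds (default 60) have passed.
# Passing an empty $1 waits for the first election.
wait_for_leader_change() {
    local old=$1
    local timeout=${2:-60}
    local start_time cur
    start_time=$(date +%s)
    while true; do
        cur=$(get_leader_addr)
        if [ -n "${cur}" ] && [ "${cur}" != "${old}" ]; then
            echo "${cur}"
            return 0
        fi
        if [ $(( $(date +%s) - start_time )) -ge "${timeout}" ]; then
            return 1
        fi
        sleep 1
    done
}
```

With such a helper, the initial wait would be `wait_for_leader_change "" 60` and the failover wait `wait_for_leader_change "${OLD_LEADER_ADDR}" 60`.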