68 changes: 52 additions & 16 deletions docs/en/install/global_dr.mdx
Refer to the following documentation to complete installation:

### Step 3: Enable etcd Synchronization \{#etcd_sync}

1. Before installing the plugin, create the `etcd-sync-active-cluster-token` Secret in the standby global cluster under the `cpaas-system` namespace. The Secret must store a bearer token for accessing the active global cluster API server under the data key `token`.

```bash
# Run this command on the standby cluster.
ACTIVE_CLUSTER_TOKEN='<paste-your-token-here>'
kubectl -n cpaas-system create secret generic etcd-sync-active-cluster-token \
--from-literal=token="${ACTIVE_CLUSTER_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
```
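
To confirm the Secret holds the intended value, remember that Kubernetes stores Secret data base64-encoded. The sketch below simulates the encode/decode round trip locally with a placeholder value; the `kubectl` read-back in the comment is the real-world check.

```shell
# Kubernetes stores Secret values base64-encoded. To read the token back:
#   kubectl -n cpaas-system get secret etcd-sync-active-cluster-token \
#     -o jsonpath='{.data.token}' | base64 -d
# The same round trip, simulated locally with a placeholder value:
token='example-bearer-token'
encoded=$(printf '%s' "$token" | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"   # prints "example-bearer-token"
```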

2. When applicable, configure the load balancer to forward port `2379` to control plane nodes of the corresponding cluster. ONLY TCP mode is supported; forwarding on L7 is not supported.

:::info
Port forwarding through a load balancer is not required.
If direct access from the standby cluster to the active global cluster is available, specify the etcd addresses via **Active Global Cluster ETCD Endpoints**.
:::
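
When port `2379` is reached directly, **Active Global Cluster ETCD Endpoints** is a list of `https://<node>:2379` URLs, and IPv6 addresses must be bracketed inside the URL. The helper below is an illustrative sketch for assembling that list from control-plane IPs (`build_etcd_endpoints` is a hypothetical name, not a shipped tool):

```shell
# Build an etcd endpoint list from control-plane IPs.
# IPv6 literals must be wrapped in brackets inside a URL.
build_etcd_endpoints() {
  out=""
  for ip in "$@"; do
    case $ip in
      *:*) out="${out}https://[$ip]:2379," ;;   # IPv6 literal
      *)   out="${out}https://$ip:2379," ;;
    esac
  done
  printf '%s\n' "${out%,}"   # trim the trailing comma
}

build_etcd_endpoints 192.168.1.10 192.168.1.11 fd00::10
# prints https://192.168.1.10:2379,https://192.168.1.11:2379,https://[fd00::10]:2379
```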

3. Access the **standby global cluster** Web Console using its VIP, and switch to **Administrator** view.
4. Navigate to **Marketplace > Cluster Plugins**, select the `global` cluster.
5. Find **<Term name="product" /> etcd Synchronizer**, click **Install**, and configure these parameters:

* Set **Active Global Cluster VIP** to the VIP of the active global cluster.
* When not forwarding port `2379` through the load balancer, configure **Active Global Cluster ETCD Endpoints** correctly.
* Set **Standby Cluster ETCD Endpoints** to the standby cluster etcd address. Use the default value unless the local etcd service is exposed through a different endpoint.
* Set **Active Global Cluster Token Secret** to `etcd-sync-active-cluster-token`.
* Use the default value of **Data Check Interval**.
* Leave **Print detail logs** disabled unless troubleshooting.

During installation, the system runs the `etcd-sync-bootstrap` Job before the `etcd-sync` Deployment starts. The plugin installation continues only after the Job prepares `remote-etcd-ca`, `remote-etcd-issuer`, and `remote-etcd-client`.

Verify the bootstrap Job and runtime resources:

```bash
kubectl get job -n cpaas-system etcd-sync-bootstrap
kubectl logs -n cpaas-system job/etcd-sync-bootstrap
kubectl get secret -n cpaas-system remote-etcd-ca
kubectl get issuer -n cpaas-system remote-etcd-issuer
kubectl get certificate -n cpaas-system remote-etcd-client
kubectl get secret -n cpaas-system remote-etcd-client
```
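
The Job may take a short while to prepare these resources. Rather than rerunning the commands by hand, a generic retry loop can poll until a probe succeeds; this is a sketch with a stub probe (`wait_for` is a hypothetical helper), and in practice the probe would be one of the `kubectl get` commands above:

```shell
# Retry a probe command up to N times with a 1-second pause between tries.
# In practice the probe would be e.g.:
#   kubectl get secret -n cpaas-system remote-etcd-client
wait_for() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

wait_for 3 true && echo "resource ready"   # prints "resource ready"
```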

Verify the sync Pods are running on the standby cluster and identify the current leader:

```bash
kubectl get po -n cpaas-system -l app=etcd-sync
kubectl get lease -n cpaas-system etcd-sync-mirror
leader_pod=$(kubectl get lease -n cpaas-system etcd-sync-mirror -o jsonpath='{.spec.holderIdentity}')
kubectl logs -n cpaas-system "$leader_pod" | grep -E "Acquired leader lease|Start Sync update"
```

If resources with `ownerReference` dependencies need to be resynchronized, recreate the current leader Pod after `Start Sync update` appears:

```bash
leader_pod=$(kubectl get lease -n cpaas-system etcd-sync-mirror -o jsonpath='{.spec.holderIdentity}')
kubectl delete po -n cpaas-system "$leader_pod"
```
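
If the lease momentarily has no holder (for example, while leadership changes hands), the jsonpath expands to an empty string and the `kubectl delete po` call fails with a confusing error. A small guard avoids that; `require_value` below is a hypothetical helper, shown with a stub Pod name:

```shell
# Fail fast when a required value is empty, instead of passing "" to kubectl.
require_value() {
  name=$1
  value=$2
  if [ -z "$value" ]; then
    echo "error: $name is empty; retry once a leader holds the lease" >&2
    return 1
  fi
  printf '%s\n' "$value"
}

# Real usage would be:
#   leader_pod=$(kubectl get lease -n cpaas-system etcd-sync-mirror \
#     -o jsonpath='{.spec.holderIdentity}')
#   require_value leader_pod "$leader_pod" \
#     && kubectl delete po -n cpaas-system "$leader_pod"
require_value leader_pod "etcd-sync-7c9d-abcde"   # prints the stub Pod name
```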

Check sync status:
```bash
mirror_svc=$(kubectl get svc -n cpaas-system etcd-sync-monitor -o jsonpath='{.spec.clusterIP}')
ipv6_regex="^[0-9a-fA-F:]+$"
if [[ $mirror_svc =~ $ipv6_regex ]]; then
mirror_host="[$mirror_svc]"
else
mirror_host="$mirror_svc"
fi
curl -g "http://${mirror_host}/check"
```

**Output explanation:**

* `LOCAL ETCD missed keys`: Keys exist in the Primary but are missing from the standby. Often caused by GC due to resource order during sync. Restart the current etcd-sync leader Pod to fix;
* `LOCAL ETCD surplus keys`: Extra keys exist only in the standby cluster. Confirm with ops team before deleting these keys from the standby.
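
For unattended monitoring, these two labels can be grepped out of the `/check` response body. The classifier below is a sketch that assumes the labels appear verbatim in the response (`classify_check` is a hypothetical name); in real use, pipe `curl -g "http://${mirror_host}/check"` into it:

```shell
# Classify a /check response body read from stdin.
# Returns 0 when neither problem label is present, 1 otherwise.
classify_check() {
  body=$(cat)
  status=0
  case $body in
    *"LOCAL ETCD missed keys"*)
      echo "missed keys: restart the etcd-sync leader Pod"
      status=1 ;;
  esac
  case $body in
    *"LOCAL ETCD surplus keys"*)
      echo "surplus keys: review with ops before deleting"
      status=1 ;;
  esac
  return $status
}

# Real usage: curl -g "http://${mirror_host}/check" | classify_check
printf 'no problems reported\n' | classify_check && echo "in sync"   # prints "in sync"
```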

If the following components are installed, restart their services:
Regularly check sync status on the standby cluster:

```bash
mirror_svc=$(kubectl get svc -n cpaas-system etcd-sync-monitor -o jsonpath='{.spec.clusterIP}')
ipv6_regex="^[0-9a-fA-F:]+$"
if [[ $mirror_svc =~ $ipv6_regex ]]; then
mirror_host="[$mirror_svc]"
else
mirror_host="$mirror_svc"
fi
curl -g "http://${mirror_host}/check"
```

If any keys are missing or surplus, follow the instructions in the output to resolve them.
45 changes: 38 additions & 7 deletions docs/en/upgrade/upgrade_global_cluster.mdx
After the standby global cluster has reached the desired version, run the remaining steps.

Before reinstalling the plugin, verify that port `2379` is forwarded correctly from both global-cluster VIPs to their control plane nodes when that forwarding mode is used. Port forwarding through a load balancer is not required if the standby global cluster can access the active global cluster directly.

Before reinstalling the plugin, create or update the `etcd-sync-active-cluster-token` Secret in `cpaas-system`. The Secret must store a bearer token for accessing the active global cluster API server under the data key `token`. Reference this Secret through the **Active Global Cluster Token Secret** parameter. Legacy plain-token configuration remains only as a compatibility fallback and is not recommended.

```bash
# Run this command on the standby cluster.
ACTIVE_CLUSTER_TOKEN='<paste-your-token-here>'
kubectl -n cpaas-system create secret generic etcd-sync-active-cluster-token \
--from-literal=token="${ACTIVE_CLUSTER_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
```
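
Pasted tokens easily pick up stray whitespace or newlines, which make the bearer token invalid. A pre-check such as this hypothetical `validate_token` helper catches that before the Secret is created (the token shown is a placeholder):

```shell
# Reject tokens that are empty or contain whitespace (common paste errors).
validate_token() {
  t=$1
  if [ -z "$t" ]; then
    echo "error: token is empty" >&2
    return 1
  fi
  case $t in
    *[[:space:]]*)
      echo "error: token contains whitespace or a newline" >&2
      return 1 ;;
  esac
  echo "token looks well-formed"
}

validate_token 'abc.def.ghi'   # prints "token looks well-formed"
```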

To reinstall the plugin:

1. Access the **standby global cluster** Web Console through its VIP and switch to **Administrator** view.
2. Navigate to **Marketplace > Cluster Plugins** and select the `global` cluster.
3. Find **<Term name="product" /> etcd Synchronizer** and click **Install**.

When you configure the plugin:

- Set **Active Global Cluster VIP** to the VIP of the active global cluster.
- When port `2379` is not forwarded through a load balancer, set **Active Global Cluster ETCD Endpoints** correctly.
- Set **Standby Cluster ETCD Endpoints** to the standby cluster etcd address. Use the default value unless the local etcd service is exposed through a different endpoint.
- Set **Active Global Cluster Token Secret** to `etcd-sync-active-cluster-token`.
- Use the default value of **Data Check Interval**.
- Leave **Print detail logs** disabled unless you are troubleshooting.

During reinstallation, the system runs the `etcd-sync-bootstrap` Job before the `etcd-sync` Deployment starts. The release continues only after the Job prepares `remote-etcd-ca`, `remote-etcd-issuer`, and `remote-etcd-client`.

Verify the bootstrap Job and runtime resources:

```bash
kubectl get job -n cpaas-system etcd-sync-bootstrap
kubectl logs -n cpaas-system job/etcd-sync-bootstrap
kubectl get secret -n cpaas-system remote-etcd-ca
kubectl get issuer -n cpaas-system remote-etcd-issuer
kubectl get certificate -n cpaas-system remote-etcd-client
kubectl get secret -n cpaas-system remote-etcd-client
```
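
The four resources above can be verified in one pass with a helper that runs a probe against each name and reports the missing ones. This is a sketch with a stub probe (`missing_resources` and `probe_ok` are hypothetical names); in real use the probe would wrap `kubectl get <kind> -n cpaas-system`:

```shell
# Report which of the listed names fail the probe command.
# Usage: missing_resources <probe-cmd> name...
missing_resources() {
  probe=$1
  shift
  missing=""
  for name in "$@"; do
    if ! "$probe" "$name" >/dev/null 2>&1; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all present"
}

# Stub probe that "finds" everything, standing in for:
#   kubectl get secret -n cpaas-system "$name"
probe_ok() { true; }
missing_resources probe_ok remote-etcd-ca remote-etcd-issuer remote-etcd-client
# prints "all present"
```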

Verify the sync Pods are running on the standby global cluster and identify the current leader:

```bash
kubectl get po -n cpaas-system -l app=etcd-sync
kubectl get lease -n cpaas-system etcd-sync-mirror
leader_pod=$(kubectl get lease -n cpaas-system etcd-sync-mirror -o jsonpath='{.spec.holderIdentity}')
kubectl logs -n cpaas-system "$leader_pod" | grep -E "Acquired leader lease|Start Sync update"
```

If resources with `ownerReference` dependencies need to be resynchronized, recreate the current leader Pod after `Start Sync update` appears:

```bash
leader_pod=$(kubectl get lease -n cpaas-system etcd-sync-mirror -o jsonpath='{.spec.holderIdentity}')
kubectl delete po -n cpaas-system "$leader_pod"
```

Check sync status:
```bash
mirror_svc=$(kubectl get svc -n cpaas-system etcd-sync-monitor -o jsonpath='{.spec.clusterIP}')
ipv6_regex="^[0-9a-fA-F:]+$"
if [[ $mirror_svc =~ $ipv6_regex ]]; then
mirror_host="[$mirror_svc]"
else
mirror_host="$mirror_svc"
fi
curl -g "http://${mirror_host}/check"
```

Output interpretation:

- `LOCAL ETCD missed keys`: Keys exist in the primary global cluster but are missing from the standby. This often resolves after restarting the current `etcd-sync` leader Pod.
- `LOCAL ETCD surplus keys`: Keys exist in the standby global cluster but not in the primary. Review these with your operations team before deleting them.

After verification succeeds, remove any remaining legacy plain-token configuration from the plugin settings or release values.

If the active-cluster token or the remote `etcd-ca` changes later, run the plugin upgrade or reinstall workflow again so that the `etcd-sync-bootstrap` hook refreshes the runtime credentials and certificates.

</Steps>

## Related Documentation