## Scenario 4: One node disappears from the cluster
This is the case when one node becomes unavailable due to a power outage, hardware failure, kernel panic, a mysqld crash, `kill -9` on the mysqld PID, and so on.
The two remaining nodes notice that the connection to node A is down and start trying to reconnect to it. After several timeouts, node A is removed from the cluster. Quorum is preserved (two out of three nodes are up), so no service disruption happens. After node A is restarted, it rejoins the cluster automatically (as described in [Scenario 1: Node A is gracefully stopped](#scenario-1-node-a-is-gracefully-stopped)).
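As a quick check (a minimal sketch; it assumes the `mysql` client can connect with your configured credentials), you can confirm on a surviving node that the cluster still has quorum:

```shell
# Run on one of the remaining nodes; expect cluster_size = 2 and status = Primary
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'"
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'"
```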
## Scenario 5: Two nodes disappear from the cluster
Two nodes are not available, and the remaining node (node C) cannot form the quorum alone. The cluster switches to a non-primary mode, in which MySQL refuses to serve any SQL queries. In this state, the `mysqld` process on node C is still running and accepts connections, but any statement that touches data fails with an error.
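For illustration, a data query on the isolated node fails with an error similar to the following (the table name is a hypothetical example):

```{.text .no-copy}
mysql> SELECT * FROM test.t1;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use
```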
In this case, you cannot be sure that all nodes are consistent with each other. The `safe_to_bootstrap` variable is set to `0` on every node and cannot be used to identify which node has the last transaction committed.
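To confirm, you can inspect `grastate.dat` on each node (the path below assumes the default data directory):

```shell
# Shows "safe_to_bootstrap: 0" on every node in this situation
grep safe_to_bootstrap /var/lib/mysql/grastate.dat
```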
!!! warning "Risk of split-brain"
    Setting `safe_to_bootstrap: 1` on a node without first confirming that the node has the highest recovered position can cause split-brain and data loss. Always run the validation step below on every node and bootstrap only from the node with the highest seqno.
### Validation step: recover and record position on every node
On each node that was part of the cluster, run `mysqld` with the `--wsrep-recover` option so that the server prints the recovered position and exits (the server does not stay running):
```shell
mysqld --wsrep-recover
```
In the output, find the line that reports the recovered position in the form `UUID:seqno`:
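For illustration, the line looks similar to the following (the UUID and seqno here are placeholder values; yours will differ):

```{.text .no-copy}
... WSREP: Recovered position: 37bb872a-ad73-11e3-85bc-7e9b1ab8b7c1:15
```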
Run the command on every node and record the UUID and seqno from each. Use a table like the following so that you can compare and choose the correct bootstrap candidate:
| Node (hostname or label) | UUID | seqno |
|--------------------------|------|-------|
| node1                    |      |       |
| node2                    |      |       |
| node3                    |      |       |
!!! warning "When highest seqno is not safe to use"
    The procedure below assumes that you have access to every node that was in the cluster and that the recovered positions are trustworthy. If either assumption is false, bootstrapping from the node with the highest seqno can permanently destroy data.

    * Access to all nodes: If a node is unreachable (for example, in another datacenter or still down), you cannot assume that the highest seqno you see is the true cluster state. The missing node may have had a higher seqno. Bootstrap only after you have run `mysqld --wsrep-recover` on every member and recorded the result.

    * Trustworthiness of the "highest" node: A node can report a higher seqno but have corrupt or incomplete data. This can happen after a partition (the node was in a minority and applied writes that were never committed cluster-wide), a write-ahead log or disk failure (the node reported a seqno that was not fully persisted), or an unclean shutdown. Bootstrapping from that node forces the rest of the cluster to sync to that state, and the cluster then permanently drops or overwrites the transactions that existed only on the other nodes. If you suspect that the "highest" node was partitioned, had storage or write-ahead log issues, or you cannot verify its history, do not bootstrap from it without expert guidance or a verified backup strategy. Prefer [Get help from Percona](get-help.md) or your support channel when in doubt.
If you have verified all nodes and trust the node with the greatest seqno, that node is the intended bootstrap candidate. If two nodes show the same UUID and seqno, either can be used.
### Bootstrap step: set safe_to_bootstrap and start the first node
Only on the node that has the highest seqno from the validation step (and only after the caveats above are satisfied), set `safe_to_bootstrap` to 1 in that node’s `grastate.dat` file so that the file contains the following line:
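```{.text .no-copy}
...
safe_to_bootstrap: 1
...
```

Then, bootstrap from this node: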
```shell
# On the chosen node only:
systemctl start mysql@bootstrap.service
```
After a clean shutdown in the future, you can bootstrap from the node which is marked as safe in the `grastate.dat` file (where `safe_to_bootstrap: 1`).
In recent Galera versions, the option [`pc.recovery`](wsrep-provider-index.md#pcrecovery) (enabled by default) saves the cluster state into a file named `gvwstate.dat` on each member node. As the name of the option suggests (pc stands for primary component), it saves the state only for a cluster that is in the PRIMARY state. An example of the file's content may look like this:
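The values below are illustrative placeholders (the UUIDs and view number will differ on your nodes):

```{.text .no-copy}
my_uuid: 76de8ad9-2aac-11e4-8089-d27fd06893b9
#vwbeg
view_id: 3 6c821ecc-2aac-11e4-85a5-56fe513c651f 2
bootstrap: 0
member: 6c821ecc-2aac-11e4-85a5-56fe513c651f 0
member: 76de8ad9-2aac-11e4-8089-d27fd06893b9 0
#vwend
```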
Also, the Galera replication model strictly enforces data consistency: once an inconsistency is detected, a node that cannot execute a row change statement due to a data difference performs an emergency shutdown, and the only way to bring it back into the cluster is via a full [SST](glossary.md#sst).
Based on material from Percona Database Performance Blog
This article is based on the blog post [Galera replication - how to recover a PXC cluster :octicons-link-external-16:](https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/) by *Przemysław Malkowski*.