 -[OSD pods are failing to start](#osd-pods-are-failing-to-start)
--[Symptoms](#symptoms-3)
--[Investigation](#investigation-3)
--[Solution](#solution-4)
 -[OSD pods are not created on my devices](#osd-pods-are-not-created-on-my-devices)
--[Symptoms](#symptoms-4)
--[Investigation](#investigation-4)
--[Solution](#solution-5)
 -[Node hangs after reboot](#node-hangs-after-reboot)
--[Symptoms](#symptoms-5)
--[Investigation](#investigation-5)
--[Solution](#solution-6)
 -[Using multiple shared filesystem (CephFS) is attempted on a kernel version older than 4.7](#using-multiple-shared-filesystem-cephfs-is-attempted-on-a-kernel-version-older-than-47)
--[Symptoms](#symptoms-6)
--[Solution](#solution-7)
 -[Set debug log level for all Ceph daemons](#set-debug-log-level-for-all-ceph-daemons)
 -[Activate log to file for a particular Ceph daemon](#activate-log-to-file-for-a-particular-ceph-daemon)
 -[A worker node using RBD devices hangs up](#a-worker-node-using-rbd-devices-hangs-up)
--[Symptoms](#symptoms-7)
--[Investigation](#investigation-6)
--[Solution](#solution-8)
 -[Too few PGs per OSD warning is shown](#too-few-pgs-per-osd-warning-is-shown)
--[Symptoms](#symptoms-8)
--[Solution](#solution-9)
 -[LVM metadata can be corrupted with OSD on LV-backed PVC](#lvm-metadata-can-be-corrupted-with-osd-on-lv-backed-pvc)
--[Symptoms](#symptoms-9)
--[Solution](#solution-10)
 -[OSD prepare job fails due to low aio-max-nr setting](#osd-prepare-job-fails-due-to-low-aio-max-nr-setting)
-[Recover from corruption (v1.6.0-v1.6.7)](#recover-from-corruption-v160-v167)
 -[Operator environment variables are ignored](#operator-environment-variables-are-ignored)
--[Symptoms](#symptoms-11)
--[Investigation](#investigation-7)
--[Solution](#solution-12)
 -[The cluster is in an unhealthy state or fails to configure when LimitNOFILE=infinity in containerd](#the-cluster-is-in-an-unhealthy-state-or-fails-to-configure-when-limitnofileinfinity-in-containerd)
--[Symptoms](#symptoms-12)
--[Solution](#solution-13)

 Many of these problem cases are hard to summarize down to a short phrase that adequately describes the problem. Each problem will start with a bulleted list of symptoms. Keep in mind that all symptoms may not apply depending on the configuration of Rook. If the majority of the symptoms are seen there is a fair chance you are experiencing that problem.
@@ -94,15 +52,15 @@ After you verify the basic health of the running pods, next you will want to run
 * Other artifacts:
 * The monitors that are expected to be in quorum: `kubectl -n <cluster-namespace> get configmap rook-ceph-mon-endpoints -o yaml | grep data`

-####Tools in the Rook Toolbox
+### Tools in the Rook Toolbox

 The [rook-ceph-tools pod](ceph-toolbox.md) provides a simple environment to run Ceph tools. Once the pod is up and running, connect to the pod to execute Ceph commands to evaluate that current state of the cluster.

 ```console
 kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}') -- bash
 ```

-####Ceph Commands
+### Ceph Commands

 Here are some common commands to troubleshoot a Ceph cluster:
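
The toolbox command in the hunk above selects the pod with `{.items[*].metadata.name}`, which prints every matching pod name separated by spaces, so the nested `kubectl exec` would receive several names if more than one pod carried the label. A minimal sketch of a wrapper that takes only the first match is shown here. The function names `toolbox_pod` and `toolbox_exec` are hypothetical and not part of Rook, and the `rook-ceph` default namespace is assumed from the command above.

```shell
#!/bin/sh
# Hypothetical helpers; the names and defaults are illustrative, not part of Rook.

# Print the name of the first pod labelled app=rook-ceph-tools.
# $1: namespace (assumed default: rook-ceph, as in the command above)
toolbox_pod() {
  kubectl -n "${1:-rook-ceph}" get pod -l "app=rook-ceph-tools" \
    -o jsonpath='{.items[0].metadata.name}'
}

# Run a single command inside the toolbox pod.
# $1: namespace, remaining arguments: the command to run
toolbox_exec() {
  ns="$1"
  shift
  # Stop early if the lookup fails or returns nothing.
  pod="$(toolbox_pod "$ns")" || return 1
  [ -n "$pod" ] || { echo "no toolbox pod found in namespace $ns" >&2; return 1; }
  kubectl -n "$ns" exec "$pod" -- "$@"
}
```

With these defined, something like `toolbox_exec rook-ceph ceph status` would run one Ceph command without opening an interactive shell; add `-it` to the `exec` call if a terminal is needed.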