Commit c54be3b

kvaps and claude authored
fix(e2e): make rwx-ganesha dump_diag actually capture the failure surface (#144)
The wait-Ready diagnostics had three blind spots that made BUG-028 triage rely on re-running the scenario by hand:

- the blockstor-side dump exec'd `linstor r l` inside the blockstor-controller image, which ships no linstor binary, so that section always failed with 'executable file not found'. Read the RD/Resource CRDs instead: same state, no in-pod binary needed.
- `kubectl logs ds/linstor-csi-node` only prints one pod of the DaemonSet; loop over every linstor-csi-node pod, all containers.
- the NFS-Ganesha publish side was invisible: dump every linstor-csi-nfs-server pod (all containers), the drbd-reactor promoter ConfigMap, and the EndpointSlices of svc linstor-csi-nfs.

Diagnostics only: no timeout or assertion changes.

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
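
The DaemonSet point can be tried out without a cluster. The sketch below is not part of the commit: it replaces `kubectl` with a hypothetical shell function whose pod names (`pod/node-a` and so on) and log text are invented, and it writes to stdout rather than stderr so the result is easy to capture. It only illustrates the difference between targeting `ds/...` and looping over `get pods -o name`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for kubectl so the loop can be exercised offline.
# Pod names and log text are invented for this dry run.
kubectl() {
    local arg
    case " $* " in
        *" get pods "*)
            printf 'pod/node-a\npod/node-b\npod/node-c\n' ;;
        *" logs "*)
            for arg in "$@"; do
                case "$arg" in
                    ds/*)  echo "[pod/node-a] log line" ;;  # DaemonSet target: one pod only
                    pod/*) echo "[$arg] log line" ;;        # explicit pod target
                esac
            done ;;
    esac
}

# Same loop shape as the diff: list pods by label, then fetch logs per pod.
dump_node_logs() {
    local pod
    for pod in $(kubectl -n piraeus-datastore get pods \
            -l app.kubernetes.io/component=linstor-csi-node \
            -o name 2>/dev/null); do
        echo "--- $pod ---"
        kubectl -n piraeus-datastore logs "$pod" --all-containers \
            --prefix --tail=80 || true
    done
}

single=$(kubectl -n piraeus-datastore logs ds/linstor-csi-node --tail=80)
looped=$(dump_node_logs)
printf 'ds target:\n%s\n\nper-pod loop:\n%s\n' "$single" "$looped"
```

With the stub in place the `ds/` call yields a single log line while the loop yields one header and one log line per pod, which mirrors the gap the commit message describes.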
1 parent 8f70323 commit c54be3b

1 file changed

Lines changed: 46 additions & 6 deletions

tests/e2e/rwx-ganesha.sh

@@ -78,12 +78,52 @@ dump_diag() {
     echo "----- diag ($label): linstor-csi-controller log tail -----" >&2
     kubectl -n piraeus-datastore logs deploy/linstor-csi-controller \
         -c linstor-csi --tail=80 2>&1 >&2 || true
-    echo "----- diag ($label): linstor-csi-node logs (per worker) -----" >&2
-    kubectl -n piraeus-datastore logs ds/linstor-csi-node \
-        -c linstor-csi --tail=80 --prefix=true 2>&1 >&2 || true
-    echo "----- diag ($label): linstor r l (via blockstor controller) -----" >&2
-    kubectl -n blockstor-system exec deploy/blockstor-controller -- \
-        linstor r l 2>&1 >&2 || true
+    # `kubectl logs ds/...` follows only ONE pod ("Found 3 pods, using
+    # pod/..."), hiding the workers that actually hit the failure. Loop
+    # over every pod explicitly, all containers.
+    echo "----- diag ($label): linstor-csi-node logs (every pod, all containers) -----" >&2
+    for pod in $(kubectl -n piraeus-datastore get pods \
+            -l app.kubernetes.io/component=linstor-csi-node \
+            -o name 2>/dev/null); do
+        echo "--- $pod ---" >&2
+        kubectl -n piraeus-datastore logs "$pod" --all-containers \
+            --prefix --tail=80 2>&1 >&2 || true
+    done
+    # The NFS-Ganesha export pods and their in-pod drbd-reactor promoter
+    # are the usual suspects when consumers see `mount.nfs: Connection
+    # refused`: if the promoter never promotes the backing DRBD resource,
+    # ganesha never starts listening on any of them.
+    echo "----- diag ($label): linstor-csi-nfs-server logs (every pod, all containers) -----" >&2
+    for pod in $(kubectl -n piraeus-datastore get pods \
+            -l app.kubernetes.io/component=linstor-csi-nfs-server \
+            -o name 2>/dev/null); do
+        echo "--- $pod ---" >&2
+        kubectl -n piraeus-datastore logs "$pod" --all-containers \
+            --prefix --tail=120 2>&1 >&2 || true
+    done
+    echo "----- diag ($label): drbd-reactor promoter ConfigMap -----" >&2
+    kubectl -n piraeus-datastore get cm linstor-csi-nfs-server-reactor-config \
+        -o yaml 2>&1 >&2 || true
+    echo "----- diag ($label): EndpointSlices for svc linstor-csi-nfs -----" >&2
+    kubectl -n piraeus-datastore get endpointslices \
+        -l kubernetes.io/service-name=linstor-csi-nfs -o yaml 2>&1 >&2 || true
+    # blockstor-side view of the volume. The previous shape exec'd
+    # `linstor r l` inside the blockstor-controller image, which ships no
+    # `linstor` binary — that dump always died with "executable file not
+    # found in \$PATH". The blockstor CRDs carry the same RD/Resource
+    # state without needing any in-pod binary or a port-forward.
+    echo "----- diag ($label): blockstor RD + Resource CRDs -----" >&2
+    kubectl get resourcedefinitions.blockstor.cozystack.io 2>&1 >&2 || true
+    if [[ -n "${PV:-}" ]]; then
+        kubectl get "resourcedefinitions.blockstor.cozystack.io/$PV" \
+            -o yaml 2>&1 >&2 || true
+        for res in $(kubectl get resources.blockstor.cozystack.io \
+                -o name 2>/dev/null | grep -F "$PV" || true); do
+            kubectl get "$res" -o yaml 2>&1 >&2 || true
+        done
+    else
+        kubectl get resources.blockstor.cozystack.io 2>&1 >&2 || true
+    fi
 }
 
 cleanup() {
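
A detail worth checking when reading the `2>&1 >&2` suffix used throughout `dump_diag`: bash applies redirections left to right, so that order appears to leave both streams on the caller's stdout, while the `echo ... >&2` headers go to stderr. The sketch below is an illustration of shell semantics only, not a statement about the author's intent; `emit` is a hypothetical stand-in for a `kubectl` call, and if the harness captures both streams together the difference would not be visible.

```shell
#!/usr/bin/env bash
# emit is a hypothetical stand-in for any command that writes to both streams.
emit() {
    echo "from-stdout"
    echo "from-stderr" >&2
}

tmp=$(mktemp -d)

# Order used in the diff: fd 2 is first pointed at the current stdout, then
# fd 1 is pointed at fd 2, which by now is that same stdout.
{ emit 2>&1 >&2; } >"$tmp/a.out" 2>"$tmp/a.err"

# Reversed order: fd 1 is pointed at stderr first, then fd 2 follows it.
{ emit >&2 2>&1; } >"$tmp/b.out" 2>"$tmp/b.err"

a_out=$(cat "$tmp/a.out"); a_err=$(cat "$tmp/a.err")
b_out=$(cat "$tmp/b.out"); b_err=$(cat "$tmp/b.err")
rm -rf "$tmp"

# Show each stream's content with newlines replaced by commas.
printf '2>&1 >&2  stdout=[%s] stderr=[%s]\n' "${a_out//$'\n'/,}" "${a_err//$'\n'/,}"
printf '>&2 2>&1  stdout=[%s] stderr=[%s]\n' "${b_out//$'\n'/,}" "${b_err//$'\n'/,}"
```

If the goal is for the dumped output to land on stderr next to its section headers, `>&2 2>&1` (or a plain `>&2`, since stderr is already stderr) would be the order to reach for.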

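
The `grep -F "$PV" || true` filter in the Resource loop can be looked at in isolation. This is a minimal sketch under stated assumptions: the object names below are invented (the real naming scheme of blockstor Resource objects is not shown in the commit), and `list_resources` is a hypothetical replacement for `kubectl get resources.blockstor.cozystack.io -o name`. It shows that `-F` matches the PV name as a literal substring and that a miss simply produces an empty loop.

```shell
#!/usr/bin/env bash
# Hypothetical object names, shaped like `kubectl get ... -o name` output.
list_resources() {
    printf '%s\n' \
        'resources.blockstor.cozystack.io/pvc-aaa.worker-1' \
        'resources.blockstor.cozystack.io/pvc-aaa.worker-2' \
        'resources.blockstor.cozystack.io/pvc-bbb.worker-1'
}

# Print the names containing $1 as a literal substring; nothing if none match.
matching_resources() {
    list_resources | grep -F -- "$1" || true
}

# Count loop iterations, using the same for-in-$(...) shape as the diff.
count_matches() {
    local n=0 res
    for res in $(matching_resources "$1"); do
        n=$((n + 1))
    done
    echo "$n"
}

hits=$(count_matches 'pvc-aaa')     # two invented resources carry this name
none=$(count_matches 'pvc-zzz')     # no match: the loop body never runs
dotted=$(count_matches 'pvc-a.a')   # '.' is literal under -F, so no match
echo "pvc-aaa=$hits pvc-zzz=$none pvc-a.a=$dotted"
```

Because the match is a substring match, a PV name that happens to be a prefix of another would select both sets of objects; with generated `pvc-...` names that is unlikely, but it is the trade-off of filtering on `-o name` output rather than on a label.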