@@ -164,6 +164,94 @@ oc get volumesnapshotcontent
164164oc describe pvc <pvc-name> -n openstack
165165```
166166
167+ ### Data Mover restore stuck with WaitForFirstConsumer storage (LVM)
168+
169+ **Known issue:** When using the OADP Data Mover (`snapshotMoveData: true`) with a
170+ StorageClass that has `volumeBindingMode: WaitForFirstConsumer` (e.g., LVM/topolvm),
171+ the PVC restore at order 00 deadlocks. The data mover waits for the PVC to be
172+ consumed by a pod before downloading data, but with WaitForFirstConsumer the PVC
173+ does not bind until a pod references it. Because PVCs are restored before any workload
174+ pods exist, neither side can make progress.
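To check whether a cluster is exposed to this condition, it can help to list the StorageClasses that use delayed binding. The following is a minimal sketch: the function name `wffc_storage_classes` is hypothetical, and it only filters `NAME MODE` lines that you pipe into it.

```bash
# Print the names of StorageClasses whose volumeBindingMode is WaitForFirstConsumer.
# Reads "NAME MODE" lines on stdin, one StorageClass per line.
# The function name is illustrative and not part of OADP or oc.
wffc_storage_classes() {
  awk '$2 == "WaitForFirstConsumer" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get storageclass --no-headers \
#     -o custom-columns=NAME:.metadata.name,MODE:.volumeBindingMode | wffc_storage_classes
```

Any PVC in the backup that uses one of the printed StorageClasses is a candidate for the deadlock when restored through the Data Mover.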
175+
176+ **Symptoms:**
177+ - Restore stuck in `WaitingForPluginOperations` phase
178+ - DataDownload CRs in `Accepted` or `<none>` phase, never progressing
179+ - PVCs in `Pending` state with event: "waiting for first consumer to be created"
180+ - Node-agent logs: `error to wait target PVC consumed ... context deadline exceeded`
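The symptoms above can be confirmed from the command line. This is a sketch under assumptions: the helper name `stalled_datadownloads` is hypothetical, the namespaces match those used elsewhere in this section, and the event reason `WaitForFirstConsumer` is the one Kubernetes normally attaches to such PVCs.

```bash
# Print DataDownloads that have not started moving data.
# Reads "NAME PHASE" lines on stdin; custom-columns renders an unset phase as <none>.
# The function name is illustrative and not part of OADP or Velero.
stalled_datadownloads() {
  awk '$2 == "Accepted" || $2 == "<none>" || $2 == "" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get datadownloads -n openshift-adp --no-headers \
#     -o custom-columns=NAME:.metadata.name,PHASE:.status.phase | stalled_datadownloads
#
# PVC events that point at delayed binding:
#   oc get events -n openstack --field-selector reason=WaitForFirstConsumer
```

If the first command keeps printing the same names while the PVC events repeat, the restore is most likely in the deadlock described here rather than simply slow.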
181+
182+ **Upstream issues:**
183+ - [velero #7561](https://github.com/vmware-tanzu/velero/issues/7561): WaitForFirstConsumer incompatibility
184+ - [velero #8044](https://github.com/vmware-tanzu/velero/issues/8044): Enhancement proposal
185+ - [velero #9343](https://github.com/vmware-tanzu/velero/issues/9343): Topology-aware storage
186+
187+ **Workaround:** Create temporary pods that reference the pending PVCs to trigger
188+ binding. The data mover will then proceed with the download.
189+
190+ ```bash
191+ # 1. List pending PVCs
192+ oc get pvc -n openstack --no-headers | awk '$2 == "Pending" {print $1}'
193+
194+ # 2. List available nodes
195+ oc get nodes -l node-role.kubernetes.io/worker --no-headers -o custom-columns=NAME:.metadata.name
196+
197+ # 3. Create a dummy pod for each PVC, targeting a specific node.
198+ # Distribute PVCs across nodes for balanced storage usage.
199+ # With LVM, the PVC will be provisioned on the node the pod targets.
200+ create_dummy_pod() {
201+   local pvc_name=$1
202+   local node_name=$2
203+   local ns=${3:-openstack}
204+   local pod_name="pvc-consumer-${pvc_name}"
205+   # Truncate pod name to 63 chars (k8s limit)
206+   pod_name="${pod_name:0:63}"
207+   cat <<EOF | oc apply -f -
208+ apiVersion: v1
209+ kind: Pod
210+ metadata:
211+   name: ${pod_name}
212+   namespace: ${ns}
213+ spec:
214+   nodeName: ${node_name}
215+   containers:
216+   - name: pause
217+     image: registry.k8s.io/pause:3.9
218+     volumeMounts:
219+     - name: data
220+       mountPath: /mnt/data
221+   volumes:
222+   - name: data
223+     persistentVolumeClaim:
224+       claimName: ${pvc_name}
225+ EOF
226+   echo "Created pod ${pod_name} on ${node_name} for PVC ${pvc_name}"
227+ }
228+
229+ # Example: distribute PVCs round-robin across the worker nodes
230+ NODES=($(oc get nodes -l node-role.kubernetes.io/worker --no-headers -o custom-columns=NAME:.metadata.name))
231+ PVCS=($(oc get pvc -n openstack --no-headers | awk '$2 == "Pending" {print $1}'))
232+ for i in "${!PVCS[@]}"; do
233+   node_idx=$(( i % ${#NODES[@]} ))
234+   create_dummy_pod "${PVCS[$i]}" "${NODES[$node_idx]}"
235+ done
236+
237+ # 4. Wait for PVCs to bind
238+ oc get pvc -n openstack -w
239+
240+ # 5. Check DataDownload progress; repeat until every phase is Completed
241+ oc get datadownloads -n openshift-adp -o custom-columns=NAME:.metadata.name,PHASE:.status.phase,BYTES:.status.progress.totalBytes
242+
243+ # 6. Delete dummy pods after all DataDownloads are Completed
244+ for pvc in "${PVCS[@]}"; do
245+   pod_name="pvc-consumer-${pvc}"
246+   pod_name="${pod_name:0:63}"
247+   oc delete pod "${pod_name}" -n openstack --ignore-not-found
248+ done
249+ ```
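One edge case in the workaround is the truncation to 63 characters: if the cut lands on a `-` or `.`, the resulting pod name no longer ends with an alphanumeric character, and Kubernetes rejects it. A minimal sketch of a safer name builder follows. The helper `consumer_pod_name` is hypothetical and would have to be used in both the create and delete steps so the names match.

```bash
# Build the dummy pod name for a PVC, keeping the 63-character cut used in the
# workaround but trimming any trailing '-' or '.' that the cut leaves behind.
# The function name is illustrative, not part of the original script.
consumer_pod_name() {
  local name="pvc-consumer-$1"
  name="${name:0:63}"
  # Kubernetes names must end with an alphanumeric character.
  while [[ "$name" == *[-.] ]]; do
    name="${name%?}"
  done
  printf '%s\n' "$name"
}

# Example call with a short PVC name.
consumer_pod_name "mysql-db"
```

Note that two long PVC names sharing the same first 50 characters would still map to the same pod name after truncation; in that case the names need to be disambiguated by hand.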
250+
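Because the dummy pods must stay in place until the data has been downloaded, it can be useful to gate the cleanup step on DataDownload status. The sketch below assumes a hypothetical helper `all_downloads_completed` and an arbitrary 30-second polling interval. It checks every DataDownload in the namespace, including those from earlier restores.

```bash
# Succeed only if every phase read from stdin (one per line) is "Completed".
# Empty input counts as "not completed" so a missing CR list does not pass.
# The function name is illustrative and not part of OADP or Velero.
all_downloads_completed() {
  local phase seen=0
  while read -r phase; do
    seen=1
    [[ "$phase" == "Completed" ]] || return 1
  done
  (( seen == 1 ))
}

# Polling loop (requires a logged-in oc session; interval is an assumption):
#   until oc get datadownloads -n openshift-adp --no-headers \
#           -o custom-columns=PHASE:.status.phase | all_downloads_completed; do
#     sleep 30
#   done
```

A DataDownload that ends in a failed state would keep this loop running indefinitely, so interrupt it and inspect the CR if progress stops.
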
251+ **Note:** This workaround is not needed when restoring from local CSI snapshots
252+ (without data mover). The deadlock only affects cross-cluster or disaster recovery restores
253+ where PVC data is downloaded from the BackupStorageLocation (S3/MinIO).
254+
167255### Database restore issues
168256
169257``` bash