Commit bb5d4db

abays and claude authored and committed
[cifmw_backup_restore] Add Swift data verification to e2e workflow
Upload a random file to Swift before backup and verify it can be downloaded with a matching checksum after restore. This catches Swift data loss caused by xattr issues during OADP DataMover/kopia PVC backup and restore cycles.

Closes: OSPRH-30250
Related-Issue: #OSPRH-29818
Signed-off-by: Andrew Bays <abays@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 7e2aa33 commit bb5d4db

6 files changed

Lines changed: 182 additions & 0 deletions

docs/dictionary/en-custom.txt

Lines changed: 4 additions & 0 deletions
@@ -233,6 +233,7 @@ fultonj
 fusco
 fwcybtb
 Galera
+GaleraBackup
 gapped
 genericcloud
 genindex
@@ -369,6 +370,8 @@ maxdepth
 mcs
 mellanox
 metallb
+MiB
+MinIO
 metalsmith
 mgmt
 minclient
@@ -680,6 +683,7 @@ wljewmjozmzawlzasdje
 wljewmtozmzawlzasdje
 workstream
 xargs
+xattr
 xdg
 xoc
 xpath

roles/cifmw_backup_restore/README.md

Lines changed: 13 additions & 0 deletions
@@ -53,6 +53,19 @@ OpenShift cluster.
 * `cifmw_backup_restore_pin_pvcs`: (Boolean) Enable PVC-to-node pinning during restore for WaitForFirstConsumer storage classes. Defaults to `false`.
 * Post-EDPM **Neutron–OVN** steps follow [user guide Step 12](https://github.com/openstack-k8s-operators/dev-docs/blob/main/backup-restore/user-guide.md#step-12-verify-and-sync-neutron-to-ovn): run `neutron-ovn-db-sync-util` in `log` mode first (`neutron-dist.conf`, `neutron.conf`, `neutron.conf.d`). **Repair** runs if `cifmw_backup_restore_ovn_db` is `false` (no OVN NB/SB file backup was taken), or if log-mode stdout/stderr contains a `WARNING` line—Neutron reports drift that way while still exiting 0. If OVN file backup/restore was enabled and log output has no `WARNING` lines, repair is skipped as redundant.
 
+### End-to-end orchestration (e2e.yml)
+
+* `cifmw_backup_restore_install_deps`: (Boolean) Install MinIO, OADP, and GaleraBackup CRs. Defaults to `true`.
+* `cifmw_backup_restore_create_workload`: (Boolean) Create a test VM with floating IP before backup. Defaults to `true`.
+* `cifmw_backup_restore_run_backup`: (Boolean) Run the backup step. Defaults to `true`.
+* `cifmw_backup_restore_run_cleanup`: (Boolean) Run the cleanup step. Defaults to `true`.
+* `cifmw_backup_restore_run_restore`: (Boolean) Run the restore step. Defaults to `true`.
+* `cifmw_backup_restore_run_post_tempest`: (Boolean) Run tempest validation after restore. Defaults to `false`.
+* `cifmw_backup_restore_test_swift_data`: (Boolean) Upload a random file to Swift before backup and verify it can be downloaded after restore. Catches Swift data loss caused by xattr issues (OSPRH-29818). Defaults to `true`.
+* `cifmw_backup_restore_swift_test_container`: (String) Swift container name for the test object. Defaults to `backup-test-container`.
+* `cifmw_backup_restore_swift_test_object`: (String) Object name for the test file. Defaults to `backup-test-object`.
+* `cifmw_backup_restore_swift_test_file_size_bytes`: (Integer) Size of the random test file in bytes. Defaults to `1048576` (1 MiB).
+
 ### Cleanup
 
 * `cifmw_backup_restore_cleanup_ctlplane`: (Boolean) Delete control-plane resources during cleanup. Defaults to `true`.
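
The e2e variables documented in this README hunk are ordinary role variables, so they can presumably be overridden from an extra-vars file. A minimal sketch follows; the file location, the container name `my-backup-check`, and the commented-out playbook name are hypothetical, and only the variable names come from the README.

```shell
#!/bin/sh
# Sketch: write an extra-vars file that overrides the Swift test defaults.
# The file location is an arbitrary choice for illustration.
set -eu

overrides_file="${TMPDIR:-/tmp}/swift-test-overrides.yml"

cat > "$overrides_file" <<'EOF'
---
# Keep the Swift round-trip check enabled (this is already the default).
cifmw_backup_restore_test_swift_data: true
# Use a distinct container name and a 4 MiB object instead of 1 MiB.
cifmw_backup_restore_swift_test_container: "my-backup-check"
cifmw_backup_restore_swift_test_file_size_bytes: 4194304
EOF

echo "Wrote $overrides_file"

# Hypothetical invocation; substitute the playbook that runs e2e.yml:
#   ansible-playbook <your-playbook>.yml -e "@$overrides_file"
```

Setting `cifmw_backup_restore_test_swift_data` to `false` in the same file would skip both the upload before backup and the verification after restore, since both tasks are gated on that variable.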

roles/cifmw_backup_restore/defaults/main.yml

Lines changed: 4 additions & 0 deletions
@@ -33,6 +33,10 @@ cifmw_backup_restore_run_backup: true
 cifmw_backup_restore_run_cleanup: true
 cifmw_backup_restore_run_restore: true
 cifmw_backup_restore_run_post_tempest: false
+cifmw_backup_restore_test_swift_data: true
+cifmw_backup_restore_swift_test_container: "backup-test-container"
+cifmw_backup_restore_swift_test_object: "backup-test-object"
+cifmw_backup_restore_swift_test_file_size_bytes: 1048576
 
 # Passthrough to update role when creating the test workload (prefix matches update role, not this role)
 cifmw_update_ping_test: true

roles/cifmw_backup_restore/tasks/e2e.yml

Lines changed: 16 additions & 0 deletions
@@ -75,6 +75,13 @@
     name: update
     tasks_from: create_instance.yml
 
+# ========================================
+# Step 2.5: Upload Swift test data
+# ========================================
+- name: Upload Swift test data before backup
+  ansible.builtin.include_tasks: swift_data_upload.yml
+  when: cifmw_backup_restore_test_swift_data | bool
+
 # ========================================
 # Step 3: Create backup
 # ========================================
@@ -261,6 +268,15 @@
       Workload validation passed: instance reachable via FIP {{ _instance_fip.stdout }},
       stop/start successful, ping after restart OK
 
+# ========================================
+# Step 6.5: Validate Swift data after restore
+# ========================================
+- name: Validate Swift data after restore
+  ansible.builtin.include_tasks: swift_data_verify.yml
+  when:
+    - cifmw_backup_restore_test_swift_data | bool
+    - cifmw_backup_restore_run_restore | bool
+
 # ========================================
 # Step 7: Post-restore tempest validation
 # ========================================

roles/cifmw_backup_restore/tasks/swift_data_upload.yml

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+---
+# Copyright Red Hat, Inc.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+# Upload a random test file to Swift before backup.
+# After restore, swift_data_verify.yml downloads the file and checks its
+# checksum to confirm Swift data survived the backup/restore cycle.
+
+- name: Set openstackclient exec prefix
+  ansible.builtin.set_fact:
+    _os_exec: >-
+      oc exec -t openstackclient -n {{ cifmw_backup_restore_namespace }} --
+
+- name: Generate random test file in openstackclient pod
+  ansible.builtin.shell: |
+    {{ _os_exec }} dd if=/dev/urandom of=/tmp/{{ cifmw_backup_restore_swift_test_object }} \
+      bs=1 count={{ cifmw_backup_restore_swift_test_file_size_bytes }} 2>/dev/null
+  changed_when: true
+
+- name: Calculate checksum of test file
+  ansible.builtin.shell: |
+    set -o pipefail
+    {{ _os_exec }} md5sum /tmp/{{ cifmw_backup_restore_swift_test_object }} | awk '{print $1}'
+  register: _swift_test_checksum_result
+  changed_when: false
+
+- name: Store checksum for post-restore verification
+  ansible.builtin.set_fact:
+    _swift_test_checksum: "{{ _swift_test_checksum_result.stdout }}"
+
+- name: Create Swift container
+  ansible.builtin.shell: |
+    {{ _os_exec }} openstack container create {{ cifmw_backup_restore_swift_test_container }}
+  changed_when: true
+
+- name: Upload test object to Swift
+  ansible.builtin.shell: |
+    {{ _os_exec }} openstack object create {{ cifmw_backup_restore_swift_test_container }} \
+      /tmp/{{ cifmw_backup_restore_swift_test_object }} \
+      --name {{ cifmw_backup_restore_swift_test_object }}
+  changed_when: true
+
+- name: Verify upload succeeded
+  ansible.builtin.shell: |
+    set -o pipefail
+    {{ _os_exec }} openstack object show \
+      {{ cifmw_backup_restore_swift_test_container }} \
+      {{ cifmw_backup_restore_swift_test_object }} -f json | jq -r '."content-length"'
+  register: _swift_upload_check
+  changed_when: false
+
+- name: Display Swift test data info
+  ansible.builtin.debug:
+    msg: >-
+      Swift test data uploaded: container={{ cifmw_backup_restore_swift_test_container }},
+      object={{ cifmw_backup_restore_swift_test_object }},
+      size={{ _swift_upload_check.stdout }} bytes,
+      md5={{ _swift_test_checksum }}
+
+- name: Remove local temp file from pod
+  ansible.builtin.shell: |
+    {{ _os_exec }} rm -f /tmp/{{ cifmw_backup_restore_swift_test_object }}
+  changed_when: true
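
The upload task creates the test file with `dd` and `bs=1`, which copies one byte per read/write pair. As a side note rather than a change made in this commit, the sketch below shows that `head -c` yields a file of the same size. It runs locally instead of through `oc exec`, uses a smaller size to stay quick, and whether `head` is available in the `openstackclient` image is an assumption.

```shell
#!/bin/sh
# Local comparison of two ways to create a random file of a fixed size.
set -eu

size_bytes=65536
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Same shape as the task's command: one byte per block.
dd if=/dev/urandom of="$workdir/dd-file" bs=1 count="$size_bytes" 2>/dev/null

# Alternative: head reads in larger chunks and stops after size_bytes.
head -c "$size_bytes" /dev/urandom > "$workdir/head-file"

# tr strips the padding some wc implementations add around the count.
dd_size=$(wc -c < "$workdir/dd-file" | tr -d ' ')
head_size=$(wc -c < "$workdir/head-file" | tr -d ' ')
echo "dd: $dd_size bytes, head: $head_size bytes"
```

Both files end up with the requested size; the difference is only in how many system calls it takes to get there.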

roles/cifmw_backup_restore/tasks/swift_data_verify.yml

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+---
+# Copyright Red Hat, Inc.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+# Verify Swift test data survived the backup/restore cycle.
+# Downloads the object uploaded by swift_data_upload.yml and compares
+# its checksum against the value recorded before backup.
+
+- name: Set openstackclient exec prefix
+  ansible.builtin.set_fact:
+    _os_exec: >-
+      oc exec -t openstackclient -n {{ cifmw_backup_restore_namespace }} --
+
+- name: Verify Swift container exists after restore
+  ansible.builtin.shell: |
+    {{ _os_exec }} openstack container show {{ cifmw_backup_restore_swift_test_container }}
+  changed_when: false
+
+- name: Download test object from Swift
+  ansible.builtin.shell: |
+    {{ _os_exec }} openstack object save \
+      {{ cifmw_backup_restore_swift_test_container }} \
+      {{ cifmw_backup_restore_swift_test_object }} \
+      --file /tmp/{{ cifmw_backup_restore_swift_test_object }}.restored
+  changed_when: true
+
+- name: Calculate checksum of restored file
+  ansible.builtin.shell: |
+    set -o pipefail
+    {{ _os_exec }} md5sum /tmp/{{ cifmw_backup_restore_swift_test_object }}.restored | awk '{print $1}'
+  register: _swift_restored_checksum
+  changed_when: false
+
+- name: Verify checksum matches
+  ansible.builtin.assert:
+    that:
+      - _swift_restored_checksum.stdout == _swift_test_checksum
+    fail_msg: >-
+      Swift data verification FAILED: checksum mismatch.
+      Expected {{ _swift_test_checksum }}, got {{ _swift_restored_checksum.stdout }}.
+      This indicates Swift data was lost or corrupted during backup/restore
+      (possibly due to xattr loss — see OSPRH-29818).
+    success_msg: >-
+      Swift data verification PASSED: checksum {{ _swift_test_checksum }} matches.
+
+- name: Clean up Swift test data
+  ansible.builtin.shell: |
+    {{ _os_exec }} openstack object delete \
+      {{ cifmw_backup_restore_swift_test_container }} \
+      {{ cifmw_backup_restore_swift_test_object }}
+    {{ _os_exec }} openstack container delete {{ cifmw_backup_restore_swift_test_container }}
+  changed_when: true
+  failed_when: false
+
+- name: Remove restored temp file from pod
+  ansible.builtin.shell: |
+    {{ _os_exec }} rm -f /tmp/{{ cifmw_backup_restore_swift_test_object }}.restored
+  changed_when: true
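
The assertion in this file passes only when the downloaded bytes hash to the value recorded before backup. The following local sketch illustrates that behavior with an intact copy and a truncated copy. It is not the role's code: `cp` and `head -c` stand in for the Swift round trip, and the helper names `checksum_of` and `verify` are hypothetical.

```shell
#!/bin/sh
# Local illustration of checksum-based verification; no Swift involved.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Print only the digest column of md5sum, as the tasks do with awk.
checksum_of() {
    md5sum "$1" | awk '{print $1}'
}

# Before "backup": create test data and record its checksum.
head -c 4096 /dev/urandom > "$workdir/object"
expected=$(checksum_of "$workdir/object")

# After "restore", case 1: the data came back intact.
cp "$workdir/object" "$workdir/intact"
# After "restore", case 2: half of the data was lost.
head -c 2048 "$workdir/object" > "$workdir/truncated"

# Compare a file's checksum with the recorded one.
verify() {
    if [ "$(checksum_of "$1")" = "$expected" ]; then
        echo PASSED
    else
        echo FAILED
    fi
}

intact_result=$(verify "$workdir/intact")
truncated_result=$(verify "$workdir/truncated")
echo "intact: $intact_result, truncated: $truncated_result"
```

The comparison only works because the expected digest is captured before the backup and carried forward, which is the role `_swift_test_checksum` plays between the upload and verify task files.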
