Commit 5ed0fab

stuggi and claude committed
Remove last-applied-configuration annotation during restore
Add removal of kubectl.kubernetes.io/last-applied-configuration annotation to all OADP restore operations. This annotation can become very large and cause restore failures.

The Problem:
- kubectl.kubernetes.io/last-applied-configuration stores entire resource spec
- For large resources (OpenStackControlPlane with many services), this can exceed etcd limits
- OADP restore fails with "annotation too large" or API server errors
- Observed in production: annotation size issues during restore

The Solution:
- Use OADP resourceModifiers to remove the annotation during restore
- Applied to ALL restore orders (00, 10, 20, 30, 40, 60)
- Consistent pattern everywhere using conditions: {}

Implementation:
- Updated "OwnerReference Handling" section → "OwnerReference and Annotation Handling"
- Added explanation of last-applied-configuration size issue
- Updated all example Restore CRs to include both patches:
  1. Remove ownerReferences
  2. Remove kubectl.kubernetes.io/last-applied-configuration
- Updated Phase 3 manual restore examples (all orders)
- Updated Key Points section to mention metadata cleanup
- Created CURRENT_JQ_HANDLING.md documenting all current jq transformations

resourceModifiers pattern (used everywhere):
```yaml
resourceModifiers:
- conditions: {}  # Match all resources
  patches:
  - operation: remove
    path: "/metadata/ownerReferences"
  - operation: remove
    path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
```

Benefits:
- Prevents restore failures due to annotation size limits
- Removes stale client-side apply tracking
- Resource gets fresh annotation on next kubectl apply
- Consistent with current backup playbook behavior (already removes this)

Note:
- JSONPatch path escaping: kubectl.kubernetes.io/last-applied-configuration → /metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration (tilde followed by 1 escapes the forward slash)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent b5f3421 commit 5ed0fab

2 files changed

Lines changed: 228 additions & 16 deletions

docs/dev/CURRENT_JQ_HANDLING.md

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
1+
# Current jq-Based Backup/Restore Metadata Handling
2+
3+
This document summarizes all metadata and field manipulations currently done via jq in the backup/restore playbooks.
4+
5+
## 1. Metadata Fields Removed During Backup
6+
7+
All resources have these fields removed during backup:
8+
9+
```bash
10+
jq 'del(.items[].metadata.uid,
11+
        .items[].metadata.resourceVersion,
12+
        .items[].metadata.creationTimestamp,
13+
        .items[].metadata.managedFields,
14+
        .items[].metadata.annotations."kubectl.kubernetes.io/last-applied-configuration",
15+
        .metadata)'
16+
```
17+
18+
**Fields:**
19+
- `.metadata.uid` - Cluster-unique identifier (auto-assigned by Kubernetes)
20+
- `.metadata.resourceVersion` - Optimistic concurrency control version
21+
- `.metadata.creationTimestamp` - Resource creation time
22+
- `.metadata.managedFields` - Server-side apply field tracking
23+
- `.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration"` - Client-side apply annotation
24+
- `.metadata` (root level) - List metadata
25+
26+
**Why:**
27+
- These are Kubernetes-managed fields that get new values on restore
28+
- Including them causes conflicts during restore
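
For readers who prefer to see the same cleanup outside of jq, here is a minimal Python sketch of the filter above. The function name `strip_backup_metadata` and the constant names are hypothetical and do not appear in the playbooks; the field names are taken from the jq expression.

```python
import copy

# Per-item metadata fields deleted by the jq filter above.
ITEM_METADATA_FIELDS = ("uid", "resourceVersion", "creationTimestamp", "managedFields")
LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"


def strip_backup_metadata(resource_list: dict) -> dict:
    """Return a copy of a List object with cluster-managed metadata removed.

    Hypothetical helper: mirrors the jq del(...) expression, not playbook code.
    """
    cleaned = copy.deepcopy(resource_list)
    cleaned.pop("metadata", None)  # root-level list metadata
    for item in cleaned.get("items", []):
        metadata = item.get("metadata") or {}
        for field in ITEM_METADATA_FIELDS:
            metadata.pop(field, None)
        annotations = metadata.get("annotations")
        if isinstance(annotations, dict):
            annotations.pop(LAST_APPLIED, None)
    return cleaned
```

The input is copied rather than modified in place, so the original export stays available for comparison.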
29+
30+
## 2. ownerReferences Removal
31+
32+
Most resources have ownerReferences removed:
33+
34+
```bash
35+
jq 'del(.items[].metadata.ownerReferences, ...)'
36+
```
37+
38+
**Why:**
39+
- UID mismatch: Backed-up ownerReferences contain OLD UIDs
40+
- Restored owners get NEW UIDs
41+
- Causes orphaned resources
42+
- **STATUS: ADDRESSED** in webhook design using OADP resourceModifiers
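
The UID mismatch described above can be illustrated with a small Python sketch. The helper `stale_owner_references` and the idea of passing in a set of live UIDs are assumptions made for illustration; the playbooks do not perform this check.

```python
def stale_owner_references(resource: dict, live_uids: set) -> list:
    """Return the ownerReferences whose uid is not present in the cluster.

    Hypothetical helper: live_uids would be collected from the restored
    cluster; after a restore, owners carry new UIDs, so every backed-up
    reference is expected to show up here.
    """
    refs = (resource.get("metadata") or {}).get("ownerReferences") or []
    return [ref for ref in refs if ref.get("uid") not in live_uids]
```

A resource with no ownerReferences yields an empty list, which matches the case where nothing needs to be stripped.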
43+
44+
**Exceptions:**
45+
- OpenStackControlPlane: ownerReferences NOT removed (has no owner)
46+
- NetworkAttachmentDefinitions: ownerReferences NOT removed (why?)
47+
- RabbitMQUser: ownerReferences NOT removed during backup, filtered during restore
48+
49+
## 3. Status Field Removal
50+
51+
GaleraBackup has status removed:
52+
53+
```bash
54+
jq 'del(.items[].status, ...)'
55+
```
56+
57+
**Why:**
58+
- Status is runtime state, not desired state
59+
- Velero strips status by default anyway
60+
61+
## 4. Filtering by ownerReferences
62+
63+
### DNSData (Backup)
64+
```bash
65+
jq '.items |= map(select(.metadata.ownerReferences == null or (.metadata.ownerReferences | length) == 0))'
66+
```
67+
**Why:** Only back up user-created DNSData (no ownerReferences)
68+
69+
### Secrets (Restore)
70+
```bash
71+
jq '.items |= map(
72+
  select((.metadata.name | startswith("rabbitmq-")) | not) |
73+
  select(.metadata.labels."service-cert" | not) |
74+
  select(
75+
    (.metadata.ownerReferences == null) or
76+
    (.metadata.name | startswith("rootca-")) or
77+
    (.metadata.name | contains("-db-password")) or
78+
    (.metadata.name | contains("-dbpassword"))
79+
  )
80+
)'
81+
```
82+
83+
**Includes:**
84+
- User-provided secrets (no ownerReferences)
85+
- CA certificates (rootca-*)
86+
- Database passwords (*-db-password, *-dbpassword)
87+
88+
**Excludes:**
89+
- rabbitmq-* secrets
90+
- Secrets with label "service-cert"
91+
- Other operator-managed secrets
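
The include and exclude rules above can be restated as a single predicate. The following Python sketch is an illustration only; the function name `keep_secret_on_restore` is hypothetical. It follows jq semantics, where `not` is true only for `null` and `false`, so a `service-cert` label with any other value (including an empty string) excludes the Secret.

```python
def keep_secret_on_restore(secret: dict) -> bool:
    """Mirror the jq Secret filter: True means the Secret is restored.

    Hypothetical helper, not part of the playbooks.
    """
    metadata = secret.get("metadata") or {}
    name = metadata.get("name", "")
    labels = metadata.get("labels") or {}

    # Exclusion 1: rabbitmq-* secrets are never restored.
    if name.startswith("rabbitmq-"):
        return False

    # Exclusion 2: any Secret carrying a "service-cert" label.
    label = labels.get("service-cert")
    if label is not None and label is not False:
        return False

    # Inclusion: user-provided (no ownerReferences field), CA, or DB password.
    return (
        metadata.get("ownerReferences") is None
        or name.startswith("rootca-")
        or "-db-password" in name
        or "-dbpassword" in name
    )
```

Note that an empty `ownerReferences` list is not `null`, so such a Secret is kept only if its name matches one of the three name rules.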
92+
93+
### ConfigMaps (Restore)
94+
```bash
95+
jq '.items |= map(select(.metadata.ownerReferences == null))'
96+
```
97+
**Includes:** Only user-provided ConfigMaps (no ownerReferences)
98+
99+
### RabbitMQUser (Restore)
100+
```bash
101+
jq '.items |= map(select(.metadata.ownerReferences == null))'
102+
```
103+
**Includes:** Only user-created RabbitMQUser (no ownerReferences)
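
The filters in this section test for "no ownerReferences" in two slightly different ways: the DNSData filter also accepts an empty list, while the ConfigMap and RabbitMQUser filters accept only a missing or `null` field. A minimal Python sketch of both checks, with hypothetical function names, makes the difference visible:

```python
def is_ownerless_strict(resource: dict) -> bool:
    """ConfigMap / RabbitMQUser style: ownerReferences must be absent or null."""
    return (resource.get("metadata") or {}).get("ownerReferences") is None


def is_ownerless_lenient(resource: dict) -> bool:
    """DNSData style: absent, null, or an empty list all count as ownerless."""
    refs = (resource.get("metadata") or {}).get("ownerReferences")
    return refs is None or len(refs) == 0
```

A resource with `ownerReferences: []` passes the lenient check but not the strict one, which may matter when verifying coverage of the webhook labels.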
104+
105+
## 5. Secret Type Filtering (Backup)
106+
107+
```bash
108+
jq '.items |= map(select(.type != "kubernetes.io/dockercfg" and .type != "kubernetes.io/service-account-token"))'
109+
```
110+
111+
**Excludes:**
112+
- `kubernetes.io/dockercfg` - Image pull secrets (auto-generated)
113+
- `kubernetes.io/service-account-token` - Service account tokens (auto-generated)
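
As a point of comparison for the "Needs Discussion" item on Secret type filtering, the same exclusion can be sketched in Python. The names `AUTO_GENERATED_SECRET_TYPES` and `filter_backup_secrets` are hypothetical; the two type strings come from the jq filter above.

```python
# Secret types excluded from backup by the jq filter above.
AUTO_GENERATED_SECRET_TYPES = frozenset(
    {"kubernetes.io/dockercfg", "kubernetes.io/service-account-token"}
)


def filter_backup_secrets(secret_list: dict) -> dict:
    """Return a shallow copy of a Secret list without auto-generated types."""
    kept = [
        secret
        for secret in secret_list.get("items", [])
        if secret.get("type") not in AUTO_GENERATED_SECRET_TYPES
    ]
    return {**secret_list, "items": kept}
```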
114+
115+
## 6. Apply Strategy (Restore)
116+
117+
Resources are applied differently based on annotations:
118+
119+
```bash
120+
if [ -n "${has_annotation}" ]; then
121+
  oc apply -f "${temp_file}"
122+
else
123+
  oc apply --server-side=true -f "${temp_file}"
124+
fi
125+
```
126+
127+
**Logic:**
128+
- Has `kubectl.kubernetes.io/last-applied-configuration` → use `oc apply` (client-side)
129+
- No annotation → use `oc apply --server-side=true`
130+
131+
**Why:**
132+
- Server-side apply for resources never applied with kubectl
133+
- Client-side apply for resources that have last-applied-configuration
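
The branch above can be expressed as a small decision function. This Python sketch only builds the command line and does not run it; the function name `apply_command` is hypothetical, and it assumes `has_annotation` is derived from the presence of the annotation key, which the snippet above does not show.

```python
from typing import List

LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"


def apply_command(resource: dict, path: str) -> List[str]:
    """Choose client-side or server-side apply for one resource file.

    Assumption: the annotation key being present is what has_annotation means.
    """
    annotations = (resource.get("metadata") or {}).get("annotations") or {}
    if LAST_APPLIED in annotations:
        return ["oc", "apply", "-f", path]
    return ["oc", "apply", "--server-side=true", "-f", path]
```

Returning the argument list instead of executing it keeps the decision easy to test separately from cluster access.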
134+
135+
## 7. Staged Deployment Annotation (Restore)
136+
137+
OpenStackControlPlane gets staged annotation added during restore:
138+
139+
```bash
140+
jq '.items[0].metadata.annotations["core.openstack.org/deployment-stage"] = "infrastructure-only"'
141+
```
142+
143+
**Why:** Prevents full deployment until database/RabbitMQ are restored
144+
145+
**STATUS: ADDRESSED** in webhook design using OADP resourceModifiers
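
For reference, the jq assignment in this section corresponds roughly to the following Python sketch. The function name `add_stage_annotation` is hypothetical. One difference from jq is worth noting: this version raises `IndexError` on an empty `items` list, whereas the jq expression would create the missing element.

```python
import copy

STAGE_ANNOTATION = "core.openstack.org/deployment-stage"


def add_stage_annotation(resource_list: dict, stage: str = "infrastructure-only") -> dict:
    """Return a copy of the list with the stage annotation set on items[0]."""
    staged = copy.deepcopy(resource_list)
    metadata = staged["items"][0].setdefault("metadata", {})
    annotations = metadata.get("annotations") or {}
    annotations[STAGE_ANNOTATION] = stage
    metadata["annotations"] = annotations
    return staged
```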
146+
147+
---
148+
149+
## Summary: What Needs Handling in OADP Design
150+
151+
### ✅ Already Addressed
152+
1. **ownerReferences removal** - Using OADP resourceModifiers with `conditions: {}`
153+
2. **Staged deployment annotation** - Using OADP resourceModifiers
154+
3. **last-applied-configuration annotation** - Using OADP resourceModifiers to remove (can be too large and cause failures)
155+
156+
### ⚠️ Needs Discussion
157+
1. **Secret type filtering** - Exclude dockercfg and service-account-token types
158+
2. **Apply strategy** - Can OADP handle server-side vs client-side apply?
159+
3. **Filtering by ownerReferences** - Handled by webhook labels, but need to verify coverage
160+
161+
### ✅ OADP Handles Automatically
162+
1. **uid, resourceVersion, creationTimestamp, managedFields** - Kubernetes auto-assigns
163+
2. **status** - Velero strips by default
164+
3. **List metadata** - Velero handles

docs/dev/backup-restore-webhook-design.md

Lines changed: 64 additions & 16 deletions
@@ -491,15 +491,19 @@ This means:
491491
- Operator-created ConfigMaps → Not labeled → Not restored, recreated by operator ✅
492492
- All CRs with annotations → Labeled by webhook → Restored ✅
493493
494-
#### OwnerReference Handling
494+
#### OwnerReference and Annotation Handling
495495
496-
**The Problem:**
496+
**The Problems:**
497497
498-
When OADP restores resources, each resource gets a NEW UID (UIDs are cluster-unique identifiers). However, backed-up ownerReferences contain OLD UIDs from the original cluster. This causes:
498+
When OADP restores resources from backup, several metadata fields can cause issues:
499499
500-
1. **Orphaned resources**: Restored resource has ownerReference with old UID that doesn't match the new owner's UID
501-
2. **Broken ownership chain**: Kubernetes doesn't recognize the ownership relationship
502-
3. **Potential data loss**: Operators might try to delete/recreate PVCs when they don't recognize them as owned resources
500+
**1. OwnerReferences with Stale UIDs:**
501+
502+
Each resource gets a NEW UID on restore (UIDs are cluster-unique identifiers). However, backed-up ownerReferences contain OLD UIDs from the original cluster. This causes:
503+
504+
- **Orphaned resources**: Restored resource has ownerReference with old UID that doesn't match the new owner's UID
505+
- **Broken ownership chain**: Kubernetes doesn't recognize the ownership relationship
506+
- **Potential data loss**: Operators might try to delete/recreate PVCs when they don't recognize them as owned resources
503507
504508
**Example:**
505509
```yaml
@@ -517,9 +521,29 @@ metadata:
517521
# - Operator might delete/recreate PVC → DATA LOSS!
518522
```
519523

524+
**2. last-applied-configuration Annotation Too Large:**
525+
526+
The `kubectl.kubernetes.io/last-applied-configuration` annotation stores the entire resource specification from the last `kubectl apply`. This can:
527+
528+
- **Exceed size limits**: Very large resources fail to restore due to annotation size
529+
- **Cause API server errors**: the API server rejects objects whose annotations total more than 256 KiB, and etcd caps the size of each stored object
530+
- **Be unnecessary**: Resource will get new annotation on next apply
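
To check whether a given resource is at risk, the total annotation size can be measured before backup or restore. The Python sketch below assumes the Kubernetes API server limit of 256 KiB on the combined byte length of all annotation keys and values; the function names are hypothetical and this check is not part of the design.

```python
LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"

# Assumed limit: 256 KiB across all annotation keys and values of one object.
TOTAL_ANNOTATION_SIZE_LIMIT = 256 * 1024


def total_annotation_size(annotations: dict) -> int:
    """Sum the UTF-8 byte lengths of every annotation key and value."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in annotations.items()
    )


def exceeds_annotation_limit(resource: dict) -> bool:
    """True if the resource's annotations are larger than the assumed limit."""
    annotations = (resource.get("metadata") or {}).get("annotations") or {}
    return total_annotation_size(annotations) > TOTAL_ANNOTATION_SIZE_LIMIT
```

Because the limit applies to the sum over all annotations, a large `last-applied-configuration` value can push an object over the limit even when every other annotation is small.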
531+
520532
**The Solution:**
521533

522-
Use OADP `resourceModifiers` to **strip ALL ownerReferences** during restore. Operators will adopt resources during reconciliation and set correct ownerReferences with new UIDs.
534+
Use OADP `resourceModifiers` to **strip ownerReferences and large annotations** from ALL resources during restore:
535+
536+
```yaml
537+
resourceModifiers:
538+
- conditions: {}  # Match all resources
539+
  patches:
540+
  # Remove ownerReferences (operators will adopt during reconciliation)
541+
  - operation: remove
542+
    path: "/metadata/ownerReferences"
543+
  # Remove last-applied-configuration (can be too large)
544+
  - operation: remove
545+
    path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
546+
```
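
The `~1` in the second path is JSON Pointer escaping (RFC 6901): `/` inside a key is written as `~1` and `~` as `~0`. The Python sketch below shows how the path is derived from the annotation key and what a strict JSON Patch `remove` does. The helper names are hypothetical and the `remove_member` function handles object keys only. Under RFC 6902 a `remove` fails when the target does not exist; whether OADP `resourceModifiers` tolerate a missing path with `conditions: {}` is not established here and would need to be verified.

```python
def escape_pointer_token(token: str) -> str:
    """Escape one JSON Pointer reference token (RFC 6901): '~' first, then '/'."""
    return token.replace("~", "~0").replace("/", "~1")


def unescape_pointer_token(token: str) -> str:
    """Reverse the escaping: '~1' first, then '~0'."""
    return token.replace("~1", "/").replace("~0", "~")


def annotation_pointer(key: str) -> str:
    """Build the JSON Pointer for a metadata annotation key."""
    return "/metadata/annotations/" + escape_pointer_token(key)


def remove_member(document: dict, pointer: str) -> None:
    """Strict 'remove' for object members: raises KeyError if the path is missing."""
    tokens = [unescape_pointer_token(t) for t in pointer.split("/")[1:]]
    parent = document
    for token in tokens[:-1]:
        parent = parent[token]
    del parent[tokens[-1]]
```

For example, `annotation_pointer("kubectl.kubernetes.io/last-applied-configuration")` produces exactly the path used in the patch above.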
523547
524548
**Applied to ALL restore orders** for simplicity and safety - no need to think about which resources need it.
525549
@@ -539,13 +563,15 @@ spec:
539563
openstack.org/backup-restore: "true"
540564
openstack.org/backup-restore-order: "10"
541565
restorePVs: false # Don't restore PVCs in this order
542-
# CRITICAL: Remove ownerReferences to prevent orphaned resources
566+
# CRITICAL: Remove problematic metadata to prevent issues
543567
# Operators will adopt resources and set correct ownerReferences during reconciliation
544568
resourceModifiers:
545569
- conditions: {} # Match all resources
546570
patches:
547571
- operation: remove
548572
path: "/metadata/ownerReferences"
573+
- operation: remove
574+
path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
549575
---
550576
# Restore Order 20: TLS Issuers
551577
apiVersion: velero.io/v1
@@ -560,21 +586,25 @@ spec:
560586
openstack.org/backup-restore: "true"
561587
openstack.org/backup-restore-order: "20"
562588
restorePVs: false
563-
# Remove ownerReferences from all resources
589+
# Remove problematic metadata from all resources
564590
resourceModifiers:
565591
- conditions: {} # Match all resources
566592
patches:
567593
- operation: remove
568594
path: "/metadata/ownerReferences"
595+
- operation: remove
596+
path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
569597
---
570598
# And so on for each restore order...
571-
# NOTE: ALL restore orders use resourceModifiers to strip ownerReferences
599+
# NOTE: ALL restore orders use resourceModifiers to strip ownerReferences and large annotations
572600
```
573601

574602
**Key Points:**
575603
- **Backup**: All user resources in namespace (all Secrets, ConfigMaps, CRs) - complete snapshot
576604
- **Restore**: Only resources with `openstack.org/backup-restore: "true"` label - selective filtering
577-
- **OwnerReferences Removed**: All restore orders use `resourceModifiers` to strip ownerReferences (prevents orphaned resources, operators adopt during reconciliation)
605+
- **Metadata Cleanup**: All restore orders use `resourceModifiers` to remove:
606+
- `ownerReferences` - Prevents orphaned resources (operators adopt during reconciliation)
607+
- `kubectl.kubernetes.io/last-applied-configuration` - Can be too large and cause restore failures
578608
- **Webhooks**: Add restore labels to user-provided resources (no ownerReferences)
579609
- **Operators**: Recreate their own Secrets/ConfigMaps on reconciliation (not restored from backup)
580610

@@ -1152,12 +1182,14 @@ spec:
11521182
openstack.org/backup-restore: "true"
11531183
openstack.org/backup-restore-order: "00"
11541184
restorePVs: true # CSI snapshots
1155-
# Remove ownerReferences to prevent orphaned resources (operators will adopt during reconciliation)
1185+
# Remove problematic metadata (operators will adopt during reconciliation)
11561186
resourceModifiers:
11571187
- conditions: {} # Match all resources
11581188
patches:
11591189
- operation: remove
11601190
path: "/metadata/ownerReferences"
1191+
- operation: remove
1192+
path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
11611193
EOF
11621194
11631195
# Wait for completion (CSI snapshot restore may take time)
@@ -1178,12 +1210,14 @@ spec:
11781210
openstack.org/backup-restore: "true"
11791211
openstack.org/backup-restore-order: "10"
11801212
restorePVs: false
1181-
# Remove ownerReferences to prevent orphaned resources
1213+
# Remove problematic metadata
11821214
resourceModifiers:
11831215
- conditions: {} # Match all resources
11841216
patches:
11851217
- operation: remove
11861218
path: "/metadata/ownerReferences"
1219+
- operation: remove
1220+
path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
11871221
EOF
11881222
11891223
# Wait for completion
@@ -1204,6 +1238,14 @@ spec:
12041238
openstack.org/backup-restore: "true"
12051239
openstack.org/backup-restore-order: "20"
12061240
restorePVs: false
1241+
# Remove problematic metadata
1242+
resourceModifiers:
1243+
- conditions: {} # Match all resources
1244+
patches:
1245+
- operation: remove
1246+
path: "/metadata/ownerReferences"
1247+
- operation: remove
1248+
path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
12071249
EOF
12081250
12091251
# Wait for completion
@@ -1225,11 +1267,13 @@ spec:
12251267
openstack.org/backup-restore-order: "30"
12261268
restorePVs: false
12271269
resourceModifiers:
1228-
# Remove ownerReferences from all resources
1270+
# Remove problematic metadata from all resources
12291271
- conditions: {} # Match all resources
12301272
patches:
12311273
- operation: remove
12321274
path: "/metadata/ownerReferences"
1275+
- operation: remove
1276+
path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
12331277
# Add staged deployment annotation to OpenStackControlPlane
12341278
- conditions:
12351279
groupResource: openstackcontrolplanes.core.openstack.org
@@ -1261,12 +1305,14 @@ spec:
12611305
openstack.org/backup-restore: "true"
12621306
openstack.org/backup-restore-order: "40"
12631307
restorePVs: false
1264-
# Remove ownerReferences from all resources
1308+
# Remove problematic metadata
12651309
resourceModifiers:
12661310
- conditions: {} # Match all resources
12671311
patches:
12681312
- operation: remove
12691313
path: "/metadata/ownerReferences"
1314+
- operation: remove
1315+
path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
12701316
EOF
12711317
12721318
oc wait --for=jsonpath='{.status.phase}'=Completed \
@@ -1314,12 +1360,14 @@ spec:
13141360
openstack.org/backup-restore: "true"
13151361
openstack.org/backup-restore-order: "60"
13161362
restorePVs: false
1317-
# Remove ownerReferences from all resources
1363+
# Remove problematic metadata
13181364
resourceModifiers:
13191365
- conditions: {} # Match all resources
13201366
patches:
13211367
- operation: remove
13221368
path: "/metadata/ownerReferences"
1369+
- operation: remove
1370+
path: "/metadata/annotations/kubectl.kubernetes.io~1last-applied-configuration"
13231371
EOF
13241372
13251373
oc wait --for=jsonpath='{.status.phase}'=Completed \
