Commit 779a71f

stuggi and claude committed
Add customizable restore order for core Kubernetes resources
Problem: Hardcoded restore order "1" for all Secrets/ConfigMaps is too inflexible. Different resources may need different restore orders (e.g., CA secrets in order 1, service secrets in order 5).

Changes:
1. Webhook respects existing labels (doesn't overwrite user customization)
2. Added section on manual labeling for immediate customization
3. Designed OpenStackBackupConfig CRD for future configuration
4. Updated helper function to check existing labels before applying defaults
5. Added precedence: Manual labels > Config CRD > Hardcoded defaults

Implementation phases:
- Phase 1: Manual labeling (available immediately)
- Phase 4: CRD-based configuration with OpenStackBackupConfig

Benefits:
- Users can pre-label resources with custom restore orders
- Future: centralized configuration via CRD
- No hardcoded assumptions about restore order
- Flexible per-deployment customization

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent 949bf65 commit 779a71f

1 file changed

Lines changed: 176 additions & 18 deletions

docs/dev/backup-restore-webhook-design.md

@@ -150,8 +150,24 @@ func labelResourceForRestoreIfUserProvided(ctx context.Context, namespace, kind,
 	}

 	labels["openstack.org/backup-restore"] = "true"
-	labels["openstack.org/backup-category"] = "all"
-	labels["openstack.org/backup-restore-order"] = "1" // Secrets/ConfigMaps always order 1
+
+	// Set category if not already set (allows user override)
+	if labels["openstack.org/backup-category"] == "" {
+		labels["openstack.org/backup-category"] = "all"
+	}
+
+	// Set default restore order if not already set (allows user override)
+	if labels["openstack.org/backup-restore-order"] == "" {
+		// Default restore order by resource type
+		// Users can pre-label resources to customize order
+		switch kind {
+		case "Secret", "ConfigMap":
+			labels["openstack.org/backup-restore-order"] = "1"
+		default:
+			labels["openstack.org/backup-restore-order"] = "1"
+		}
+	}
+
 	obj.SetLabels(labels)

 	// Update the resource
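
The hunk above only defaults the category and restore-order labels when they are absent. A minimal, self-contained sketch of that defaulting step follows. It assumes a plain `map[string]string` in place of the real object's labels, and the function name `applyRestoreDefaults` and the constant names are hypothetical, not taken from the commit.

```go
package main

import "fmt"

const (
	labelRestore  = "openstack.org/backup-restore"
	labelCategory = "openstack.org/backup-category"
	labelOrder    = "openstack.org/backup-restore-order"
)

// applyRestoreDefaults is a hypothetical stand-in for the label handling in
// the helper shown in the diff. It always marks the resource for restore, but
// fills in category and order only when the user has not set them.
func applyRestoreDefaults(kind string, labels map[string]string) map[string]string {
	if labels == nil {
		labels = map[string]string{}
	}
	labels[labelRestore] = "true"

	if labels[labelCategory] == "" {
		labels[labelCategory] = "all"
	}
	if labels[labelOrder] == "" {
		switch kind {
		case "Secret", "ConfigMap":
			labels[labelOrder] = "1"
		default:
			// Same value today, as in the diff; kept as a separate branch
			// so per-kind defaults can diverge later.
			labels[labelOrder] = "1"
		}
	}
	return labels
}

func main() {
	// Unlabeled resource: defaults are applied.
	fresh := applyRestoreDefaults("Secret", nil)
	fmt.Println(fresh[labelCategory], fresh[labelOrder]) // all 1

	// Pre-labeled resource: user values are preserved.
	custom := applyRestoreDefaults("Secret", map[string]string{
		labelCategory: "controlplane",
		labelOrder:    "5",
	})
	fmt.Println(custom[labelCategory], custom[labelOrder]) // controlplane 5
}
```

Checking each label individually means a resource that carries only a custom order still receives the default category.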
@@ -245,6 +261,91 @@ spec:

 **Key Point**: Webhooks add `openstack.org/backup-restore: "true"` labels to resources that need restore. OADP restore uses these labels for selective restore, even though the backup contains everything.

+## Customizing Restore Order for Core Resources
+
+### Manual Labeling (Available Immediately)
+
+Users can pre-label Secrets, ConfigMaps, PVCs, and cert-manager resources to customize their restore order. The webhook respects existing labels and won't overwrite them.
+
+**Example: CA secret restored in order 1, service secret in order 5**
+
+```bash
+# CA certificate secret (restored early)
+oc label secret openstack-ca-cert \
+  openstack.org/backup-restore=true \
+  openstack.org/backup-category=all \
+  openstack.org/backup-restore-order=1 \
+  -n openstack
+
+# Service-specific secret (restored after infrastructure)
+oc label secret nova-cell1-config \
+  openstack.org/backup-restore=true \
+  openstack.org/backup-category=controlplane \
+  openstack.org/backup-restore-order=5 \
+  -n openstack
+```
+
+**How it works:**
+1. User creates and labels resource with desired restore order
+2. Webhook checks if resource already has `openstack.org/backup-restore: "true"`
+3. If yes, webhook skips labeling (preserves user's custom order)
+4. If no, webhook applies default labels
+
+### Configuration via CRD (Future - Phase 4)
+
+When the Golang controller is implemented, restore order defaults can be configured via CRD:
+
+```yaml
+apiVersion: core.openstack.org/v1beta1
+kind: OpenStackBackupConfig
+metadata:
+  name: backup-config
+  namespace: openstack
+spec:
+  # Default restore orders for core Kubernetes resources
+  restoreDefaults:
+    secrets:
+      category: "all"
+      order: "1"
+    configmaps:
+      category: "all"
+      order: "1"
+    persistentvolumeclaims:
+      category: "all"
+      order: "8"
+    issuers: # cert-manager Issuer
+      category: "all"
+      order: "2"
+    networkattachmentdefinitions:
+      category: "all"
+      order: "1"
+
+  # Custom overrides for specific resources
+  customOrders:
+    - resource:
+        kind: Secret
+        name: openstack-ca-cert
+      category: "all"
+      order: "1"
+    - resource:
+        kind: Secret
+        name: nova-cell1-config
+      category: "controlplane"
+      order: "5"
+```
+
+**Benefits of CRD-based configuration:**
+- Centralized configuration for all restore order defaults
+- Easy to customize per deployment
+- No need to manually label every resource
+- Can be backed up and restored along with other CRs
+
+**Implementation approach:**
+1. Webhook reads OpenStackBackupConfig CR to get default orders
+2. Applies configured defaults instead of hardcoded values
+3. Still respects existing labels (manual overrides take precedence)
+4. Fallback to hardcoded defaults if no config CR exists
+
 ## Restore Order

 The restore sequence is critical for maintaining dependencies between resources.
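
The proposed precedence (manual labels > OpenStackBackupConfig > hardcoded defaults) can be sketched as a small resolver. This is an illustration under stated assumptions, not the planned controller code: the types `RestoreDefault` and `BackupConfig`, the function `resolveRestoreLabels`, and keying the config by resource name (`"secrets"`, `"persistentvolumeclaims"`) are all hypothetical, loosely mirroring `spec.restoreDefaults` from the YAML in the hunk above. The per-resource `customOrders` list is left out for brevity.

```go
package main

import "fmt"

const (
	labelCategory = "openstack.org/backup-category"
	labelOrder    = "openstack.org/backup-restore-order"
)

// RestoreDefault holds one category/order pair (hypothetical type).
type RestoreDefault struct {
	Category string
	Order    string
}

// BackupConfig loosely mirrors spec.restoreDefaults of the proposed
// OpenStackBackupConfig CR (hypothetical type, keyed by resource name).
type BackupConfig struct {
	RestoreDefaults map[string]RestoreDefault
}

// hardcodedDefault is the fallback used when neither labels nor config apply.
var hardcodedDefault = RestoreDefault{Category: "all", Order: "1"}

// resolveRestoreLabels applies the precedence:
// existing labels > config CR defaults > hardcoded defaults.
func resolveRestoreLabels(resource string, existing map[string]string, cfg *BackupConfig) RestoreDefault {
	result := hardcodedDefault

	// Config CR overrides hardcoded values, if a CR exists at all.
	if cfg != nil {
		if d, ok := cfg.RestoreDefaults[resource]; ok {
			if d.Category != "" {
				result.Category = d.Category
			}
			if d.Order != "" {
				result.Order = d.Order
			}
		}
	}

	// Manual labels win over everything (reading a nil map is safe in Go).
	if v := existing[labelCategory]; v != "" {
		result.Category = v
	}
	if v := existing[labelOrder]; v != "" {
		result.Order = v
	}
	return result
}

func main() {
	cfg := &BackupConfig{RestoreDefaults: map[string]RestoreDefault{
		"persistentvolumeclaims": {Category: "all", Order: "8"},
	}}

	fmt.Println(resolveRestoreLabels("secrets", nil, nil))                // hardcoded fallback
	fmt.Println(resolveRestoreLabels("persistentvolumeclaims", nil, cfg)) // config default
	fmt.Println(resolveRestoreLabels("persistentvolumeclaims",
		map[string]string{labelOrder: "3"}, cfg)) // manual label wins
}
```

Swapping the last two steps of the resolver would give the alternative precedence (config over manual labels) that the open questions mention.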
@@ -366,24 +467,47 @@ Use case: Full restore of all labeled resources (default)

 ### Phase 1: Webhook & CRD Annotations (No Controller)

-**Goal**: Automatic labeling of resources for backup
+**Goal**: Automatic labeling of resources for restore

 **Changes:**
 1. Add CRD annotations to all operator CRDs
-2. Implement mutating webhook in openstack-operator
-3. Deploy webhook configuration
-4. Test that resources get labeled on creation
+2. Implement mutating webhook in openstack-operator (reuse ValidateCreate pattern)
+3. Implement generic helper function in lib-common (respects existing labels)
+4. Deploy webhook configuration
+5. Test that resources get labeled on creation

 **Backward Compatibility**: Existing Ansible backup/restore continues to work

+**Features:**
+- Automatic labeling of user-provided resources (no ownerReferences)
+- Respects existing labels (allows manual customization)
+- Default restore order based on resource type
+- Works on both Create and Update (handles existing environments)
+
 **Testing:**
 ```bash
-# Create a test secret
+# Test 1: Automatic labeling with defaults
 oc create secret generic test-secret --from-literal=foo=bar -n openstack

-# Verify label was added
+# Verify default labels were added
 oc get secret test-secret -n openstack -o jsonpath='{.metadata.labels}'
 # Should show: openstack.org/backup-restore: "true", openstack.org/backup-restore-order: "1"
+
+# Test 2: Manual override (pre-label before webhook runs)
+oc create secret generic custom-secret \
+  --from-literal=foo=bar \
+  -n openstack \
+  --dry-run=client -o yaml | \
+  oc label -f - --local \
+  openstack.org/backup-restore=true \
+  openstack.org/backup-category=controlplane \
+  openstack.org/backup-restore-order=5 \
+  --dry-run=client -o yaml | \
+  oc apply -f -
+
+# Verify custom labels were preserved
+oc get secret custom-secret -n openstack -o jsonpath='{.metadata.labels}'
+# Should show: openstack.org/backup-restore: "true", openstack.org/backup-restore-order: "5"
 ```

 ### Phase 2: OADP Backup (No Controller)
@@ -522,9 +646,34 @@ oc annotate openstackcontrolplane openstack-galera-network-isolation \

 ### Phase 4: Golang Controller (Full Automation)

-**Goal**: Full automation with OpenStackBackupRestore CRD and controller
+**Goal**: Full automation with controller and CRDs
+
+**New CRDs:**

-**New CRD:**
+**OpenStackBackupConfig** - Configure restore order defaults
+```yaml
+apiVersion: core.openstack.org/v1beta1
+kind: OpenStackBackupConfig
+metadata:
+  name: backup-config
+  namespace: openstack
+spec:
+  restoreDefaults:
+    secrets:
+      category: "all"
+      order: "1"
+    configmaps:
+      category: "all"
+      order: "1"
+    persistentvolumeclaims:
+      category: "all"
+      order: "8"
+    issuers:
+      category: "all"
+      order: "2"
+```
+
+**OpenStackBackupRestore** - Execute backup/restore operations
 ```yaml
 apiVersion: core.openstack.org/v1beta1
 kind: OpenStackBackupRestore
@@ -596,23 +745,32 @@ status:
    - Is this acceptable?
    - Can we optimize for specific resource types?

-3. **Label vs Annotation for Restore Order**: Should `restore-order` be a label or annotation?
-   - Label: Can be used in OADP selector
-   - Annotation: Cleaner, but need controller to copy to label
-
-4. **Database Restore Automation**: Should controller exec into pods or require manual intervention?
+3. **Database Restore Automation**: Should controller exec into pods or require manual intervention?
    - Automated mode: Controller execs into pods
    - Manual mode: User runs commands

-5. **PVC Labeling**: How do PVCs get the backup label?
+4. **PVC Labeling**: How do PVCs get the backup label?
    - Service operators add label when creating PVCs?
    - Separate webhook for PVCs?
    - Manual labeling required?

-6. **Webhook Scope**: Should webhook run in openstack-operator or separate deployment?
-   - openstack-operator: Simpler deployment
+5. **Webhook Scope**: Should webhook run in openstack-operator or separate deployment?
+   - openstack-operator: Simpler deployment (reuse existing webhooks)
    - Separate: Cleaner separation of concerns

+6. **OpenStackBackupConfig Scope**: Should the config CR be namespace-scoped or cluster-scoped?
+   - Namespace-scoped: Different configs per OpenStack deployment
+   - Cluster-scoped: Single config for all OpenStack deployments
+
+7. **Default vs Custom Order Precedence**: How should the order precedence work?
+   - Current proposal: Manual labels > OpenStackBackupConfig > Hardcoded defaults
+   - Alternative: OpenStackBackupConfig > Manual labels > Hardcoded defaults
+
+8. **Webhook Update Logic**: Should webhook update resources on every reconcile?
+   - Only on Create: Simpler, but doesn't handle label removal
+   - On Create and Update: Handles label changes, but more update operations
+   - Current proposal: On Create and Update (ValidateCreate + ValidateUpdate)
+
 ## Next Steps

 1. Sketch detailed implementation for Phase 1 (webhook)
