| 1 | +# Drift Detection and External Deletion Handling |
| 2 | + |
| 3 | +ORC can periodically reconcile resources to detect and correct configuration drift — changes made to OpenStack resources outside of ORC's control. This feature also detects when managed resources have been deleted directly from OpenStack and recreates them automatically. |
| 4 | + |
| 5 | +## Enabling Drift Detection |
| 6 | + |
| 7 | +Drift detection is disabled by default. Enable it per-resource by setting `spec.resyncPeriod`: |
| 8 | + |
| 9 | +```yaml |
| 10 | +apiVersion: openstack.k-orc.cloud/v1alpha1 |
| 11 | +kind: Network |
| 12 | +metadata: |
| 13 | + name: critical-network |
| 14 | +spec: |
| 15 | + cloudCredentialsRef: |
| 16 | + secretName: openstack-clouds |
| 17 | + cloudName: openstack |
| 18 | + managementPolicy: managed |
| 19 | + resyncPeriod: 1h # Re-check OpenStack every hour |
| 20 | + resource: |
| 21 | + description: Critical application network |
| 22 | +``` |
| 23 | + |
| 24 | +The `resyncPeriod` field accepts any Go duration string: `10m`, `1h`, `24h`, etc. |
| 25 | + |
| 26 | +**Default:** `0` (disabled). When disabled, ORC only reconciles resources in response to spec changes or controller restarts. |
| 27 | + |
| 28 | +!!! note |
| 29 | + |
| 30 | + Conservative resync periods (e.g., `1h` or `10h`) are recommended in production to avoid excessive OpenStack API calls. |
| 31 | + |
| 32 | +## How It Works |
| 33 | + |
| 34 | +After a resource reaches a stable state (`Progressing=False`), ORC schedules a reconciliation after the configured `resyncPeriod`. On each resync: |
| 35 | + |
| 36 | +1. ORC fetches the current state of the OpenStack resource. |
| 37 | +2. For **managed** resources: if drift is detected, ORC updates the resource to match the Kubernetes spec. |
| 38 | +3. For **unmanaged** resources: ORC refreshes `status.resource` to reflect the current OpenStack state, but makes no changes. |
| 39 | +4. The next resync is scheduled. |
| 40 | + |
| 41 | +A small random jitter ([0%, +20%]) is applied to `resyncPeriod` to spread reconciliations and avoid thundering-herd effects. |
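
To make the jitter window concrete, the following is a minimal sketch and not ORC's implementation. The function names `jitter_bounds` and `jittered_delay` are hypothetical, and the uniform draw is an assumption; the text above only states the [0%, +20%] range.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ORC) illustrating a [0%, +20%] jitter window.

# Print the smallest and largest delay, in seconds, for a base period in seconds.
jitter_bounds() {
  period=$1
  echo "$period $(( period + period / 5 ))"
}

# Print one delay drawn uniformly from that window (uniformity is an assumption).
jittered_delay() {
  awk -v p="$1" 'BEGIN { srand(); printf "%d\n", p + rand() * p * 0.2 }'
}

jitter_bounds 3600    # 1h -> prints "3600 4320"
jittered_delay 3600   # prints a value between 3600 and 4320
```

With `resyncPeriod: 1h`, each resync would therefore land between 60 and 72 minutes after the previous one.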
| 42 | + |
| 43 | +!!! note |
| 44 | + |
| 45 | + Resources in a terminal error state (`Progressing=False` with reason `InvalidConfiguration` or `UnrecoverableError`) are **not** periodically resynced. Terminal errors require manual intervention to resolve. |
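
The eligibility rules described in this section can be sketched as a small predicate. This is an illustration only: `resync_eligible` is a hypothetical helper, and the `Success` reason used in the example calls is a placeholder, not a value confirmed by this document.

```shell
#!/bin/sh
# Hypothetical helper (not part of ORC): would this resource be periodically resynced?
# Arguments: Progressing condition status, Progressing condition reason, resyncPeriod.
resync_eligible() {
  progressing=$1
  reason=$2
  period=$3
  case "$period" in
    ""|0|0s) return 1 ;;                                  # resync disabled (default)
  esac
  [ "$progressing" = "False" ] || return 1                # not yet in a stable state
  case "$reason" in
    InvalidConfiguration|UnrecoverableError) return 1 ;;  # terminal error
  esac
  return 0
}

# "Success" is a placeholder reason for a healthy resource.
resync_eligible False Success 1h && echo "resync scheduled"
resync_eligible False UnrecoverableError 1h || echo "terminal error: no resync"
```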
| 46 | + |
| 47 | +## Tracking Sync Status |
| 48 | + |
| 49 | +Every ORC resource has a `status.lastSyncTime` field that records when ORC last successfully reconciled with OpenStack: |
| 50 | + |
| 51 | +```bash |
| 52 | +kubectl get network critical-network -o jsonpath='{.status.lastSyncTime}' |
| 53 | +# 2026-02-03T10:30:00Z |
| 54 | +``` |
| 55 | + |
| 56 | +ORC persists this timestamp in the Kubernetes status. After a controller restart, it uses `lastSyncTime` to determine when the next resync should occur, preventing a thundering herd of reconciliations on startup. |
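
One way to picture the restart behavior is as simple arithmetic on `lastSyncTime`. The sketch below assumes the next resync is due at `lastSyncTime + resyncPeriod` and ignores jitter; the exact calculation ORC performs is not specified here, and `seconds_until_resync` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper (not part of ORC): seconds left before the next resync is due.
# Arguments: last sync time (epoch seconds), resync period (seconds), current time (epoch seconds).
seconds_until_resync() {
  last_sync=$1
  period=$2
  now=$3
  remaining=$(( last_sync + period - now ))
  if [ "$remaining" -lt 0 ]; then
    remaining=0   # already overdue: resync as soon as possible
  fi
  echo "$remaining"
}

seconds_until_resync 1000 3600 2800   # prints 1800: half the period is still left
seconds_until_resync 1000 3600 9000   # prints 0: the resync is overdue
```

Because each resource carries its own `lastSyncTime`, the remaining waits differ from resource to resource, which is what spreads the load after a restart.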
| 57 | + |
| 58 | +## External Deletion Handling |
| 59 | + |
| 60 | +When a resource is deleted directly from OpenStack (bypassing ORC), the behavior depends on how ORC originally obtained the resource. |
| 61 | + |
| 62 | +### ORC-Created Resources (Managed, Not Imported) |
| 63 | + |
| 64 | +If you created the resource through ORC's `spec.resource` field, ORC **recreates** it automatically: |
| 65 | + |
| 66 | +1. ORC detects the resource is missing from OpenStack (the ID stored in `status.id` no longer exists). |
| 67 | +2. ORC clears `status.id`. |
| 68 | +3. On the next reconcile, ORC creates a new OpenStack resource. |
| 69 | +4. The new resource ID is stored in `status.id`. |
| 70 | + |
| 71 | +The ORC object continues to exist and becomes `Available=True` again once the resource is recreated. |
| 72 | + |
| 73 | +```yaml |
| 74 | +# This type of resource will be recreated if deleted from OpenStack |
| 75 | +spec: |
| 76 | + managementPolicy: managed |
| 77 | + resyncPeriod: 10m # Enable resync to detect deletion quickly |
| 78 | + resource: # Resource was created by ORC |
| 79 | + description: My application network |
| 80 | +``` |
| 81 | + |
| 82 | +!!! warning |
| 83 | + |
| 84 | + Recreation produces a new OpenStack resource with a **new ID**. Any OpenStack resources (outside ORC) that referenced the old ID will need to be updated manually. |
| 85 | + |
| 86 | +### Imported Resources (Terminal Error) |
| 87 | + |
| 88 | +If you imported an existing resource using `spec.import`, ORC reports a **terminal error** when the resource is deleted from OpenStack: |
| 89 | + |
| 90 | +- `Available=False` |
| 91 | +- `Progressing=False` |
| 92 | +- Condition reason: `UnrecoverableError` |
| 93 | +- Message: `resource has been deleted from OpenStack` |
| 94 | + |
| 95 | +ORC does **not** recreate imported resources because it did not create them originally, and creating a new, empty resource would not restore what was lost. |
| 96 | + |
| 97 | +```yaml |
| 98 | +# This type of resource enters terminal error if deleted from OpenStack |
| 99 | +spec: |
| 100 | + managementPolicy: managed |
| 101 | + import: |
| 102 | + id: "12345678-1234-1234-1234-123456789abc" # Was imported by ID |
| 103 | +``` |
| 104 | + |
| 105 | +```yaml |
| 106 | +# This type also enters terminal error if deleted from OpenStack |
| 107 | +spec: |
| 108 | + managementPolicy: unmanaged |
| 109 | + import: |
| 110 | + filter: |
| 111 | + name: public # Was imported by filter |
| 112 | +``` |
| 113 | + |
| 114 | +To recover: manually recreate the OpenStack resource and update the ORC object's `spec.import.id` to the new resource ID, or delete and recreate the ORC object. |
| 115 | + |
| 116 | +### Summary Table |
| 117 | + |
| 118 | +| Resource Type | How Obtained | External Deletion Behavior | |
| 119 | +|--------------|--------------|---------------------------| |
| 120 | +| Managed, ORC-created | `spec.resource` | **Recreated** automatically | |
| 121 | +| Managed, imported by ID | `spec.import.id` | **Terminal error** | |
| 122 | +| Managed, imported by filter | `spec.import.filter` | **Terminal error** | |
| 123 | +| Unmanaged | `spec.import.*` | **Terminal error** | |
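
The table reduces to a single rule: only a managed resource defined through `spec.resource` is recreated. A minimal sketch of that rule follows; `deletion_behavior` and its output strings are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of ORC) mirroring the summary table.
# Arguments: managementPolicy, and the spec field that supplied the resource
# ("resource" for spec.resource, "import" for spec.import.*).
deletion_behavior() {
  policy=$1
  origin=$2
  if [ "$policy" = "managed" ] && [ "$origin" = "resource" ]; then
    echo "recreated"
  else
    echo "terminal-error"
  fi
}

deletion_behavior managed resource    # prints "recreated"
deletion_behavior managed import      # prints "terminal-error"
deletion_behavior unmanaged import    # prints "terminal-error"
```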
| 124 | + |
| 125 | +## Verifying Recreation Occurred |
| 126 | + |
| 127 | +When an ORC-created resource is recreated after external deletion, `status.id` changes to reflect the new OpenStack resource ID. Monitor this to detect recreation events: |
| 128 | + |
| 129 | +```bash |
| 130 | +# Record the current ID |
| 131 | +ORIGINAL_ID=$(kubectl get network my-network -o jsonpath='{.status.id}') |
| 132 | +echo "Original ID: $ORIGINAL_ID" |
| 133 | + |
| 134 | +# ... some time later, check if it changed ... |
| 135 | +CURRENT_ID=$(kubectl get network my-network -o jsonpath='{.status.id}') |
| 136 | +if [ "$ORIGINAL_ID" != "$CURRENT_ID" ]; then |
| 137 | + echo "Resource was recreated! New ID: $CURRENT_ID" |
| 138 | +fi |
| 139 | +``` |
| 140 | + |
| 141 | +You can also watch the resource for status changes: |
| 142 | + |
| 143 | +```bash |
| 144 | +kubectl get network my-network -w |
| 145 | +``` |
| 146 | + |
| 147 | +During recreation, you will observe: |
| 148 | + |
| 149 | +1. `Available=False`, `Progressing=True` — ORC is recreating the resource |
| 150 | +2. `Available=True`, `Progressing=False` — Recreation complete, `status.id` has new value |
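
If you script around these two states, a small classifier can make the intent explicit. This is a sketch under assumptions: `recreation_phase` and its labels are hypothetical, and the commented `kubectl` lines show one plausible way to read the condition values using JSONPath filters.

```shell
#!/bin/sh
# Hypothetical helper (not part of ORC): label the condition pair seen during recreation.
# Arguments: Available condition status, Progressing condition status.
recreation_phase() {
  case "$1/$2" in
    False/True) echo "recreating" ;;   # ORC is recreating the resource
    True/False) echo "ready" ;;        # recreation complete
    *)          echo "other" ;;        # any other combination, e.g. a terminal error
  esac
}

# Reading the live values requires a cluster, so these lines are left commented out:
# available=$(kubectl get network my-network -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
# progressing=$(kubectl get network my-network -o jsonpath='{.status.conditions[?(@.type=="Progressing")].status}')
# recreation_phase "$available" "$progressing"

recreation_phase False True    # prints "recreating"
recreation_phase True False    # prints "ready"
```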
| 151 | + |
| 152 | +## Implications for Dependent Resources |
| 153 | + |
| 154 | +OpenStack enforces referential integrity for most resource relationships (e.g., a Network cannot be deleted while Subnets exist). If an external deletion manages to bypass these constraints (e.g., direct database manipulation), the behavior of dependent ORC resources follows these rules: |
| 155 | + |
| 156 | +### If a Parent Resource Is Recreated |
| 157 | + |
| 158 | +When a parent resource (e.g., Network) is recreated by ORC, dependent resources that reference it (e.g., Subnets) detect the parent as available again but may encounter errors when OpenStack rejects operations referencing the old parent ID. **Manual intervention may be required** to recreate dependent resources against the new parent. |
| 159 | + |
| 160 | +### If a Parent Resource Enters Terminal Error |
| 161 | + |
| 162 | +When a parent resource enters terminal error: |
| 163 | + |
| 164 | +- **Dependent resources waiting on it** (e.g., a Subnet waiting for its Network): ORC will not proceed — it waits until the parent becomes available again. The dependent is not itself in an error state; it is just waiting. |
| 165 | +- **Dependent resources already created**: ORC continues managing them normally. If ORC attempts to update a dependent resource that references a deleted parent in OpenStack, the behavior depends on what OpenStack returns for that operation. |
| 166 | + |
| 167 | +!!! warning |
| 168 | + |
| 169 | + If a parent resource is externally deleted in a way that bypasses OpenStack's referential integrity checks, the resulting state may require manual cleanup of both the parent and dependent resources. This is an unusual operational scenario and not specific to drift detection. |
| 170 | + |
| 171 | +## Interaction with `managementPolicy: unmanaged` |
| 172 | + |
| 173 | +Unmanaged resources are never modified by ORC. With `resyncPeriod` set, ORC will periodically refresh `status.resource` to reflect the current OpenStack state. However, if the OpenStack resource is deleted, ORC will report a terminal error — it does not recreate unmanaged resources under any circumstances. |
| 174 | + |
| 175 | +```yaml |
| 176 | +spec: |
| 177 | + managementPolicy: unmanaged |
| 178 | + resyncPeriod: 1h # Refresh status every hour, but never modify OpenStack |
| 179 | + import: |
| 180 | + id: "12345678-1234-1234-1234-123456789abc" |
| 181 | +``` |
| 182 | + |
| 183 | +## Drift Detection Without Resync |
| 184 | + |
| 185 | +Even with `resyncPeriod: 0` (the default, disabled), ORC will still detect external deletion when another event triggers reconciliation — for example, when you make a spec change or the controller restarts. The recreation or terminal error behavior is the same; the difference is only in how quickly ORC detects the deletion. |
| 186 | + |
| 187 | +!!! tip |
| 188 | + |
| 189 | + If you want rapid detection of external deletions for critical resources, set a short `resyncPeriod` (e.g., `10m`). |