## Changelog
Changes to the committed resource reservations feature:
- Commitments usage API uses the Postgres database instead of calling Nova
- Refinements to the committed resource alerts
- Added optional `VideoRAMMiB` field to the commitments usage API
- Check hypervisor resources against reservations
- Remove unused database connection and unused databaseSecretRef
configuration in cortex-nova bundle
Changes related to the Placement API shim:
- Expose metrics that measure downstream and upstream request duration
and the number of errors
- Add alerts based on the exposed metrics
- Use controller-runtime logging consistently and provide traceability
- Add e2e tests checking all endpoints of the passthrough
Changes to the failover reservations feature:
- Track the state of reservations and expose metrics
Changes regarding infrastructure dashboard and ops:
- Add VMware commitments KPI
- Add hypervisor family label to flavor usage KPI
- Fix links in Cortex alerts
- External dependency upgrades
docs/architecture.md: 29 additions, 0 deletions
@@ -29,3 +29,32 @@ Cortex receives the list of possible hosts and their weights from Nova. It then
29
29
30
30
> [!NOTE]
31
31
> Since Nova does not support calling an external service by default, this functionality needs to be added, as in [SAP's fork of Nova](https://github.com/sapcc/nova/blob/stable/2023.2-m3/nova/scheduler/external.py).
32
+
33
+
## Placement API Shim
34
+
35
+
[Placement](https://github.com/openstack/placement) is OpenStack's resource inventory service. It provides an API to query the inventory of resources in the OpenStack cloud, such as compute nodes, their available resources, and the current resource usage. In the OpenStack realm, Placement is used by [Nova](https://github.com/openstack/nova) to carry out virtual machine scheduling, as well as by [Neutron](https://github.com/openstack/neutron) for network resource allocation.
36
+
37
+
As part of the [CobaltCore](https://cobaltcore-dev.github.io/docs/) stack, we provide a Placement-like API shim, which translates requests from Nova and Neutron to the [Hypervisor CRD](https://github.com/cobaltcore-dev/openstack-hypervisor-operator) based on the KVM stack provided by [IronCore](https://ironcore.dev/), [Gardener](https://gardener.cloud/) and [Garden Linux](https://gardenlinux.io/). This means that, instead of managing resource inventories in Placement's database, the Hypervisor CRD is used to track resource allocations and hypervisor capabilities.
38
+
39
+
### Passthrough
40
+
41
+
Placement maintains hypervisors of various kinds, not only KVM but also, for example, [Ironic](https://github.com/openstack/ironic) or VMware vCenter Servers. However, only KVM hypervisors can be managed by the Cortex Placement API Shim. This means that when Nova or Neutron asks for VMware or Ironic resource providers, the shim needs to forward the request to another Placement instance. We call this the passthrough, and it looks like this:
After the API receives a request, it is processed in one of two ways, depending on the kind of endpoint that was requested:
56
+
57
+
1. **Aggregated forwarding**: For requests that ask for a list of resource providers, such as `GET /resource_providers`, the shim needs to forward the request to both the KVM translation and the passthrough. The responses from both sides are then aggregated and returned to the caller.
58
+
2. **Per-request forwarding**: For requests that ask for a specific resource provider, such as `GET /resource_providers/{uuid}`, the shim needs to determine if the requested resource provider is managed by the KVM translation or the passthrough. This can be done by checking the UUID of the resource provider against a list of known KVM resource providers. If it is a KVM resource provider, the request is forwarded to the translation; otherwise, it is forwarded to the OpenStack Placement instance.
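
The two forwarding modes above can be sketched roughly as follows. This is a minimal illustration in Python, not the shim's actual code: `ShimRouter`, `Backend`, `known_kvm_uuids`, and the fallback for other endpoints are all hypothetical names and assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical backend signature: takes a request path and returns a list of
# resource-provider dicts. It is not taken from the Cortex code base.
Backend = Callable[[str], list[dict]]


@dataclass
class ShimRouter:
    kvm_translation: Backend  # serves providers backed by the Hypervisor CRD
    passthrough: Backend      # forwards to the upstream OpenStack Placement
    known_kvm_uuids: set[str] = field(default_factory=set)

    def handle(self, path: str) -> list[dict]:
        parts = [p for p in path.split("/") if p]
        if parts == ["resource_providers"]:
            # Aggregated forwarding: ask both sides and merge the results.
            return self.kvm_translation(path) + self.passthrough(path)
        if len(parts) >= 2 and parts[0] == "resource_providers":
            # Per-request forwarding: route by the provider's UUID.
            if parts[1] in self.known_kvm_uuids:
                return self.kvm_translation(path)
            return self.passthrough(path)
        # Assumption: any other endpoint goes to upstream Placement unchanged.
        return self.passthrough(path)
```

Keeping the set of known KVM UUIDs separate from the backends means the routing decision needs no upstream call, which is one plausible reason to check the UUID first.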
59
+
60
+
The translation layer is responsible for translating the requests and responses between the OpenStack Placement API and the Hypervisor CRD. This includes mapping resource provider attributes, inventory, and allocations to the corresponding fields in the Hypervisor CRD.
Controller -->|update Spec/Status.Allocations| Res
90
+
Res -->|watch| C[Controller]
91
+
HV -->|watch - instance changes| C
92
+
Res -->|periodic safety-net requeue| C
93
+
C -->|update Spec/Status.Allocations| Res
92
94
```
93
95
94
96
| Component | Event | Timing | Action |
95
97
|-----------|-------|--------|--------|
96
98
|**Scheduling Pipeline**| VM Create, Migrate, Resize | Immediate | Add VM to `Spec.Allocations`|
97
-
|**Controller**| Reservation CRD updated |`committedResourceRequeueIntervalGracePeriod` (default: 1 min) | Verify new VMs via Nova API; update `Status.Allocations`|
98
-
|**Controller**| Periodic check |`committedResourceRequeueIntervalActive` (default: 5 min) | Verify established VMs via Hypervisor CRD; remove gone VMs from `Spec.Allocations`|
99
+
|**Controller**| Reservation CRD updated |`committedResourceRequeueIntervalGracePeriod` (default: 1 min) | Defer verification for new VMs still spawning; update `Status.Allocations`|
100
+
|**Controller**| Hypervisor CRD updated (VM appeared/disappeared) | Immediate (event-driven) | Verify allocations via Hypervisor CRD; remove gone VMs from `Spec.Allocations`|
101
+
|**Controller**| Periodic safety-net |`committedResourceRequeueIntervalActive` (default: 5 min) | Same as above; catches any missed events |
99
102
100
103
**Allocation fields**:
101
104
- `Spec.Allocations`: Expected VMs (written by the scheduling pipeline on placement)
102
105
- `Status.Allocations`: Confirmed VMs (written by the controller after verifying the VM is on the expected host)
103
106
104
107
**VM allocation state diagram**:
105
108
106
-
The controller uses two sources to verify VM allocations, depending on how recently the VM was placed:
107
-
- **Nova API**: used during the grace period (`committedResourceAllocationGracePeriod`, default: 15 min) where the VM may still be starting up; provides real-time host assignment
108
-
- **Hypervisor CRD**: used for established allocations; reflects the set of instances the hypervisor operator observes on the host
109
+
The controller uses the **Hypervisor CRD** as the sole source of truth for VM allocation verification:
110
+
- **Hypervisor CRD**: used for all allocation checks; reflects the set of instances the hypervisor operator observes on the host
SpecOnly --> Confirmed : found on HV CRD after grace period
121
+
SpecOnly --> [*] : not on HV CRD after grace period
122
+
Confirmed --> [*] : not on HV CRD
126
123
```
127
124
128
125
**Note**: VM allocations need not consume all resources of a reservation slot. A reservation with 128 GB may hold VMs totaling only 96 GB if that fits the project's needs. Conversely, allocations may exceed the reservation's capacity (e.g., after a VM resize).
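
As a small worked example of the note above, the headroom of a reservation slot can be computed as reserved capacity minus the sum of its VM sizes. The helper `slot_headroom_gb` and the individual VM sizes are illustrative assumptions, not values or names from Cortex.

```python
def slot_headroom_gb(reservation_gb, vm_sizes_gb):
    """Return reserved memory minus allocated memory.

    A positive result is unused capacity in the slot; a negative result
    means the allocations exceed the reservation (e.g. after a resize).
    """
    return reservation_gb - sum(vm_sizes_gb)


# Illustrative sizes only: two VMs totaling 96 GB in a 128 GB reservation.
under_used = slot_headroom_gb(128, [64, 32])
# The 32 GB VM is resized to 96 GB, pushing the total past the reservation.
over_capacity = slot_headroom_gb(128, [64, 96])
```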
@@ -185,10 +182,12 @@ The controller watches Reservation CRDs and performs two types of reconciliation
185
182
186
183
**Placement** - Finds hosts for new reservations (calls scheduler API)
187
184
188
-
**Allocation Verification** - Tracks VM lifecycle on reservations. VMs take time to appear on a host after scheduling, so new allocations are verified more frequently via the Nova API for real-time status, while established allocations are verified via the Hypervisor CRD:
189
-
- New VMs (within `committedResourceAllocationGracePeriod`, default: 15 min): checked via Nova API every `committedResourceRequeueIntervalGracePeriod` (default: 1 min)
190
-
- Established VMs: checked via Hypervisor CRD every `committedResourceRequeueIntervalActive` (default: 5 min)
191
-
- Missing VMs: removed from `Spec.Allocations` after Nova API confirms 404
185
+
**Allocation Verification** - Tracks VM lifecycle on reservations. The controller uses the Hypervisor CRD as the sole source of truth, with two triggers:
186
+
- New VMs (within `committedResourceAllocationGracePeriod`, default: 15 min): verification deferred — VM may still be spawning; requeued every `committedResourceRequeueIntervalGracePeriod` (default: 1 min)
187
+
- Established VMs: verified reactively when the Hypervisor CRD changes (VM appeared or disappeared in `Status.Instances`), with `committedResourceRequeueIntervalActive` (default: 5 min) as a safety-net fallback
188
+
- Missing VMs: removed from `Spec.Allocations` when not found on the Hypervisor CRD after the grace period
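
The verification rules above can be sketched as a single classification pass. This is a simplified Python illustration under stated assumptions: the function name, the dict-of-timestamps shape for `Spec.Allocations`, and the choice to defer every VM inside the grace period (even one already visible on the Hypervisor CRD) are assumptions made for this example, not the controller's actual implementation.

```python
from datetime import datetime, timedelta

# Default of committedResourceAllocationGracePeriod as stated in the text.
GRACE_PERIOD = timedelta(minutes=15)


def verify_allocations(spec_allocations, hypervisor_instances, now,
                       grace_period=GRACE_PERIOD):
    """Classify each VM listed in Spec.Allocations.

    spec_allocations: dict of VM UUID -> time it was added (hypothetical shape).
    hypervisor_instances: set of VM UUIDs observed on the Hypervisor CRD.
    Returns (confirmed, pending, removed) as lists of UUIDs in sorted order.
    """
    confirmed, pending, removed = [], [], []
    for vm, placed_at in sorted(spec_allocations.items()):
        if now - placed_at < grace_period:
            # New VM: it may still be spawning, so verification is deferred.
            pending.append(vm)
        elif vm in hypervisor_instances:
            # Established VM seen on the host: goes to Status.Allocations.
            confirmed.append(vm)
        else:
            # Established VM not on the host: dropped from Spec.Allocations.
            removed.append(vm)
    return confirmed, pending, removed
```

In this sketch the same function would run for both triggers, the Hypervisor CRD watch event and the periodic safety-net requeue, since both read the same source of truth.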