2026-05-07 — #814
Non-breaking changes:
- Add `ProjectQuota` CRD with per-resource, per-AZ quota breakdown and PAYG (pay-as-you-go) calculation support (#796)
- Add `FlavorGroupCapacity` CRD and background capacity controller that pre-computes per-flavor VM slot capacity for each (flavor group × AZ) pair on a configurable interval (#728)
- Report capacity from `FlavorGroupCapacity` CRDs in `POST /commitments/v1/report-capacity`, replacing placeholder zeros with real values; stale CRDs report last-known capacity
- Move CommittedResource usage computation from the API handler into a dedicated reconciler that persists results in CRD status, making usage data available to both the LIQUID API and quota controller (#800)
- Add KVM OS version as a label to KVM host capacity metrics (#810)
- Add KVM project usage metrics (running VMs and resource usage per project/flavor) (#803)
- Add `domain_id` and name to vmware project capacity metrics (#802)
- Include `domain_id` in vmware project commitment KPI (#806)
- Add weighing explainer for scheduling decisions, surfacing per-host scoring rationale (#808)
- Move KVM host capacity metric into infrastructure plugins package (#809)
- Remove deprecated per-compute infrastructure KPIs (`flavor_running_vms`, `host_running_vms`, `resource_capacity_kvm`) (#807)
- Rename hypervisor `ClusterRoleBinding` objects to avoid `roleRef` conflicts on redeploy (#804)
- Move bundle-specific RBAC templates from the library chart into individual bundle charts (`cortex-ironcore`, `cortex-pods`) (#797)
- Move webhook templates from library chart back into `cortex-nova` bundle (reverts earlier move) (#805)
- Fix: add `identity-domains` as a KPI dependency
- Fix: remove `ignoreAllocations` from kvm-report-capacity pipeline to unblock deployment against older admission webhook (#812)
- Fix: suppress nova scheduling alerts on transient `no such host` DNS errors
- Replace `testlib.Ptr` helper with native `new()` across test files (#801)
Includes updated chart cortex v0.0.47.
Non-breaking changes:
- Add Prometheus datasource for KVM project usage metrics
- Add KVM project usage KPI CRD templates
- Add KVM project utilization KPI CRD templates
- Update `cortex-nova` RBAC to grant permissions for `FlavorGroupCapacity` and `ProjectQuota` CRDs
2026-05-04 — #793
Non-breaking changes:
- Fix capacity filter to correctly account for multi-VM CommittedResource reservation slots — confirmed VMs are now summed (not just the last one), blocks are clamped to zero when confirmed exceeds slot size, and spec-only VMs larger than remaining slot are fully covered
- Expose `prometheusDatasourceControllerParallelReconciles` config option to allow parallel reconciles in the Prometheus datasource controller, reducing initial sync latency
- Remove `Conf` field from PrometheusDatasourceReconciler; config is now loaded internally via `conf.GetConfig` during `SetupWithManager`
- Add operator-controlled per-resource-type config (`flavorGroupResourceConfig`) for committed resources, replacing runtime derivation from flavor group metadata; supports wildcard (*) catch-all for unknown groups
- Propagate `AnnotationCreatorRequestID` from the change-commitments API to the CommittedResource CRD and through the reservation controller for end-to-end request tracing
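
The capacity filter fix in this release can be pictured with a small sketch. It covers only the "sum all confirmed VMs" and "clamp to zero" parts; the handling of spec-only VMs is not modeled. The function name `remainingBlock` and the use of plain `int64` sizes are assumptions for illustration, not the actual filter code.

```go
package main

import "fmt"

// remainingBlock is a hypothetical helper: given the size of a reservation
// slot and the sizes of the VMs already confirmed in it, it returns how much
// of the slot would still be blocked on the host.
func remainingBlock(slotSize int64, confirmed []int64) int64 {
	var used int64
	for _, size := range confirmed {
		used += size // sum every confirmed VM, not only the last one
	}
	if used >= slotSize {
		return 0 // clamp: confirmed usage at or above slot size blocks nothing
	}
	return slotSize - used
}

func main() {
	fmt.Println(remainingBlock(16, []int64{4, 8}))  // 4
	fmt.Println(remainingBlock(16, []int64{12, 8})) // 0
}
```
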
Includes updated chart cortex v0.0.46.
Non-breaking changes:
- Remove all committed resource related Prometheus alerts (info API, change API, usage API, capacity API, and syncer alerts)
- Add `flavorGroupResourceConfig` to cortex-nova values.yaml with a wildcard default that sets `hasCapacity: true` for ram, cores, and instances
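
A wildcard catch-all like the one described for `flavorGroupResourceConfig` is typically resolved with an exact-match lookup followed by a fallback to `*`. The sketch below illustrates that pattern; the `ResourceConfig` type, the nested map layout, and the `configFor` function are assumptions, and only the `hasCapacity` key and the `*` wildcard come from the notes above.

```go
package main

import "fmt"

// ResourceConfig is a hypothetical per-resource-type entry.
type ResourceConfig struct {
	HasCapacity bool
}

// configFor returns the config for a flavor group, falling back to the "*"
// entry when the group has no explicit configuration.
func configFor(cfg map[string]map[string]ResourceConfig, group string) (map[string]ResourceConfig, bool) {
	if rc, ok := cfg[group]; ok {
		return rc, true
	}
	rc, ok := cfg["*"]
	return rc, ok
}

func main() {
	// Mirrors the wildcard default described for values.yaml.
	cfg := map[string]map[string]ResourceConfig{
		"*": {
			"ram":       {HasCapacity: true},
			"cores":     {HasCapacity: true},
			"instances": {HasCapacity: true},
		},
	}
	rc, ok := configFor(cfg, "some-unknown-group")
	fmt.Println(ok, rc["ram"].HasCapacity)
}
```
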
2026-05-04 — #779
Non-breaking changes:
- Add CommittedResource CRD definition and controller that watches CommittedResource objects and manages child Reservation CRUD
- Add `AllowRejection` field to CommittedResourceSpec for controlling placement failure behavior
- Add vmware project utilization KPI tracking instances per project/flavor and capacity per host
- Move vmware resource commitments KPI to new infrastructure plugins package with shared utilities
- Move vmware host capacity KPI to infrastructure plugins package
- Add basic support for flavor groups for failover reservation with consolidation weigher
- Add `useFlavorGroupResources` values.yaml key for cortex-nova (default: false)
- Update external dependencies (controller-runtime v0.24.0, go-sqlite3 v1.14.44, zap v1.28.0)
- Alert only on new VM faults (avoid re-alerting on historical faults)
Breaking changes:
- Remove `traits.static` values.yaml key and Helm-managed static traits ConfigMap template; traits are now fully managed by the shim at runtime via a single ConfigMap
Non-breaking changes:
- Add per-request feature mode override via `X-Cortex-Feature-Mode` header
- Refactor /traits API to single-ConfigMap model with reusable Syncer interface pattern
- Implement feature-gated /resource_classes API with ConfigMap storage (passthrough, hybrid, crd modes)
- Add ResourceClassSyncer for periodic upstream sync into local ConfigMap
- Add `resourceClasses.configMapName` values.yaml key for configuring the resource classes ConfigMap name
- Support traits and aggregates endpoints per resource provider with three feature modes (passthrough, hybrid, crd)
- Exercise all three feature modes in placement shim e2e tests
- Fix nil pointer panic in feature mode override guard
Breaking changes:
- Upgrade PostgreSQL from 17.9 to 18.3; resource names now include a `-v{major}` suffix for zero-downtime upgrades (e.g., `cortex-nova-postgresql-v18`). After deploy, operators must remove old StatefulSets and PVCs manually.
Non-breaking changes:
- Add versioned resource naming with `cortex-postgres.versionedFullname` helper for zero-downtime PG major upgrades
- Add `major` values.yaml key (default: "18") to control version suffix
- Set PGDATA to subdirectory to avoid lost+found conflict
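
The versioned naming scheme amounts to appending `-v{major}` to the base resource name. The real helper is a Helm template; the Go function below is only an equivalent sketch of that rule, and its name and signature are assumptions.

```go
package main

import "fmt"

// versionedFullname appends the PostgreSQL major version as a "-v{major}"
// suffix to the base resource name.
func versionedFullname(fullname, major string) string {
	return fmt.Sprintf("%s-v%s", fullname, major)
}

func main() {
	fmt.Println(versionedFullname("cortex-nova-postgresql", "18")) // cortex-nova-postgresql-v18
}
```

Because the suffix changes with each major version, old and new StatefulSets can exist side by side during an upgrade, which is why the old resources have to be removed manually afterwards.
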
Includes updated charts cortex v0.0.45 and cortex-postgres v0.6.0.
Non-breaking changes:
- Reorganize KPI CRD templates for infrastructure dashboard metrics
- Add `useFlavorGroupResources` values.yaml key for failover reservations (default: false)
- Restructure committedResource config keys into nested objects (`committedResourceReservationController`, `committedResourceController`, `committedResourceAPI`)
- Add `committedResourceSyncInterval` config key for syncer reconciliation interval
Includes updated chart cortex-shim v0.1.0.
Breaking changes:
- Remove `traits.static` values.yaml key (inherited from cortex-shim breaking change)
Non-breaking changes:
- Add `resourceClasses.configMapName` values.yaml key
Non-breaking changes:
- Fix bump-artifact workflow to handle concurrent changes on main with concurrency groups and freshness checks
- Add reusable `bump-chart.sh` script for CI chart version bumps
- Add pull-request-creator Claude agent
- Add changelog update command and workflow for release PRs
- Add linting workflow for scaffold completeness checks
- Make /release Claude command idempotent
- Don't run helm-lint workflow when release PR is in draft
- Update actions/setup-python action to v6
- Fix stale documentation: traits model, pipeline name, and API path