[MWG-1605] feat: add cube and gpu support via templates #351
@jriedel-ionos I want to add an additional e2e test for cubes, could you please set the variable
// when using templates (cubes or gpu servers) we cannot delete the boot volume
// the whole server must be deleted at once
if !deleteVolumes && bootVolumeID != nil && (server.Properties != nil && server.Properties.TemplateUuid == nil) {
For this change I am not 100% sure that it works as expected.
When using templates you must delete the whole server, including the boot volume, at once; you cannot detach or delete the boot volume by itself.
But we also don't want to delete the volumes attached for PVCs.
In testing I noticed that CAPI (or CAPIC, not sure) waits until all PVCs are detached, so I could perform a node rebuild/deletion without losing the PVC volumes. But I am not sure whether this is guaranteed.
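To make the condition under discussion easier to reason about, here is a minimal, self-contained sketch of the guard as a pure function. The `server` and `serverProperties` types are hypothetical stand-ins that model only the `TemplateUuid` field used in the diff, and the function name `shouldDeleteBootVolumeSeparately` is an assumption for illustration, not code from this PR.

```go
package main

import "fmt"

// serverProperties is a hypothetical stand-in for the SDK type;
// only the field referenced by the condition is modelled.
type serverProperties struct {
	TemplateUuid *string
}

// server is a hypothetical stand-in for the SDK server object.
type server struct {
	Properties *serverProperties
}

// shouldDeleteBootVolumeSeparately mirrors the condition from the diff:
// the boot volume is deleted on its own only when volumes are not deleted
// together with the server, a boot volume is known, and the server was
// not created from a template (cube or gpu).
// Note that a server with nil Properties also yields false.
func shouldDeleteBootVolumeSeparately(deleteVolumes bool, bootVolumeID *string, s server) bool {
	return !deleteVolumes && bootVolumeID != nil &&
		s.Properties != nil && s.Properties.TemplateUuid == nil
}

func main() {
	bootID := "boot-volume-id"
	templateID := "template-id"

	plain := server{Properties: &serverProperties{}}
	templated := server{Properties: &serverProperties{TemplateUuid: &templateID}}

	fmt.Println("plain server:", shouldDeleteBootVolumeSeparately(false, &bootID, plain))
	fmt.Println("template-backed server:", shouldDeleteBootVolumeSeparately(false, &bootID, templated))
}
```

The sketch only covers the boot volume decision. Whether PVC volumes are always detached before the server is deleted is the open question above and is not modelled here.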
Note to reviewers: I have this running on our team's sandbox here: https://github.com/ionos-cloud/mwg-deployment/tree/main/projects/sandbox-cluster/capi/templates
Pull request overview
This PR adds support for provisioning IONOS Cloud CUBE and GPU Kubernetes nodes via server templates, including new clusterctl templates, CRD/schema updates, and tests (with e2e coverage using CUBE as a cheaper proxy for GPU template behavior).
Changes:
- Add `templateID` plus new server/disk types (CUBE/GPU, DAS) to the API types and CRDs, including validation rules.
- Update server reconciliation to set template-backed server properties correctly and handle template-specific boot volume constraints.
- Add new cluster templates (cube/gpu) and extend e2e coverage with a CUBE flavor test.
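The summary mentions validation rules for `templateID` without spelling them out. The following is a minimal sketch of what such a rule could look like in plain Go. It assumes that `templateID` must be set for CUBE and GPU and left empty for other server types. The PR's actual rules are expressed as CRD validation annotations and may differ. `ENTERPRISE` is assumed here only as an example of a non-template server type, and all identifiers are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

type serverType string

const (
	serverTypeEnterprise serverType = "ENTERPRISE" // assumed example of a non-template type
	serverTypeCube       serverType = "CUBE"
	serverTypeGPU        serverType = "GPU"
)

var (
	errTemplateRequired  = errors.New("templateID is required for CUBE and GPU servers")
	errTemplateForbidden = errors.New("templateID must not be set for non-template servers")
)

// usesTemplate reports whether the server type is provisioned from a template.
func usesTemplate(t serverType) bool {
	return t == serverTypeCube || t == serverTypeGPU
}

// validateTemplateID enforces the assumed rule: template-backed types need
// a templateID, all other types must leave it empty.
func validateTemplateID(t serverType, templateID string) error {
	switch {
	case usesTemplate(t) && templateID == "":
		return errTemplateRequired
	case !usesTemplate(t) && templateID != "":
		return errTemplateForbidden
	default:
		return nil
	}
}

func main() {
	fmt.Println(validateTemplateID(serverTypeCube, ""))
	fmt.Println(validateTemplateID(serverTypeGPU, "some-template-id"))
	fmt.Println(validateTemplateID(serverTypeEnterprise, "some-template-id"))
}
```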
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| internal/service/cloud/server.go | Applies template-specific server/boot-volume property rules and skips boot-volume deletion for template-backed servers. |
| internal/service/cloud/server_test.go | Adds unit tests for CUBE/GPU template provisioning and template deletion behavior. |
| api/v1alpha1/ionoscloudmachine_types.go | Introduces templateID, new ServerType values (CUBE/GPU), and DAS disk type + validation annotations. |
| api/v1alpha1/ionoscloudmachine_types_test.go | Adds/extends validation tests for new server types and templateID rules. |
| config/crd/bases/infrastructure.cluster.x-k8s.io_ionoscloudmachines.yaml | Updates CRD schema/enum/validations for template-backed server types and DAS. |
| config/crd/bases/infrastructure.cluster.x-k8s.io_ionoscloudmachinetemplates.yaml | Same as above for machine templates CRD. |
| templates/cluster-template-cube.yaml | Adds clusterctl flavor template for CUBE servers using templateID (and DAS). |
| templates/cluster-template-gpu.yaml | Adds clusterctl flavor template for GPU servers using templateID. |
| test/e2e/data/infrastructure-ionoscloud/cluster-template-cube.yaml | Adds e2e cluster template for the cube flavor. |
| test/e2e/config/ionoscloud.yaml | Registers the new e2e template and adds IONOSCLOUD_CUBE_TEMPLATE_ID variable. |
| test/e2e/capic_test.go | Adds an e2e QuickStartSpec covering the cube flavor. |
| .github/workflows/e2e.yaml | Plumbs IONOSCLOUD_CUBE_TEMPLATE_ID into the e2e workflow environment. |
| docs/quickstart.md | Documents new server types and the new cube/gpu templates and variables. |
| docs/custom-image.md | Documents EFI/UEFI requirements for GPU usage and updated build guidance. |
| envfile.example | Adds example env vars for cube/gpu template IDs. |
| go.mod, go.sum | Bumps github.com/ionos-cloud/sdk-go/v6 to v6.3.6. |
Edit: this essentially means that cubes should not be used as control plane nodes.
For #351 I need the cube template id added to the e2e tests. For the tests to run I need to merge this to main because I am coming from a fork.



What is the purpose of this pull request/Why do we need it?
We need Kubernetes nodes with GPUs.
Description of changes:
GPU servers are similar to cubes: both use templates.
Therefore I added support for templates, and since CUBE is just another server type, I added support for it alongside GPU.
Added an e2e test that uses cubes, because testing with GPUs requires an image with UEFI and would also be too expensive.
Checklist: