Support creating sole tenancy nodes #1410
region = var.gcp_region
distribution_policy_zones = [var.gcp_zone]

target_size = var.isolated_client_cluster_size < var.isolated_client_cluster_size_max ? null : var.isolated_client_cluster_size
Bug: target_size can be null with no autoscaling policy defined
The target_size logic for the isolated_client_pool instance group manager is misconfigured. When isolated_client_cluster_size is less than isolated_client_cluster_size_max, target_size becomes null. This implies autoscaling, but no autoscaling policies are defined, which can lead to the instance group manager failing or behaving unexpectedly.
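A minimal sketch of one way to address this, assuming the pool is meant to have a fixed size rather than autoscale: always pass an explicit target_size. The variable name isolated_client_cluster_target_size comes from the PR summary; the resource names, the description text, and the instance template reference below are hypothetical, and var.gcp_region and var.gcp_zone are assumed to be declared elsewhere, as in the diff.

```hcl
variable "isolated_client_cluster_target_size" {
  type        = number
  description = "Fixed number of instances in the isolated client pool."
}

resource "google_compute_region_instance_group_manager" "isolated_client_pool" {
  name               = "isolated-client-pool" # hypothetical name
  base_instance_name = "isolated-client"      # hypothetical name

  region                    = var.gcp_region
  distribution_policy_zones = [var.gcp_zone]

  # Set an explicit size, since no autoscaler is defined to manage this group.
  target_size = var.isolated_client_cluster_target_size

  version {
    # Hypothetical reference; the PR's actual template name may differ.
    instance_template = google_compute_instance_template.isolated_client.id
  }
}
```

If autoscaling is the intent instead, the alternative would be to leave target_size unset and attach a google_compute_region_autoscaler to the group.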
ValentaTomas left a comment:
Let's tweak the default setup so that, ideally, the node type is defined but it has size 0.
scheduling {
  on_host_maintenance = "MIGRATE"
}
Bug: Missing node_affinities for sole-tenant scheduling
The scheduling block for sole tenant instances is missing the required node_affinities configuration. Instances created from this template won't be scheduled on the sole tenant node group (google_compute_node_group.client), defeating the purpose of sole tenancy. The scheduling block should include node_affinities that reference the node group to ensure instances are placed on the dedicated sole tenant nodes.
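A hedged sketch of what the suggested scheduling block could look like. It assumes affinity is expressed through the compute.googleapis.com/node-group-name label that Compute Engine attaches to sole-tenant node groups; the template resource name is hypothetical, and the other required template arguments are omitted, so this is a fragment rather than a complete resource.

```hcl
resource "google_compute_instance_template" "isolated_client" { # hypothetical resource name
  # Machine type, disks, and network interface omitted for brevity.

  scheduling {
    on_host_maintenance = "MIGRATE"

    # Place instances only on nodes that belong to the sole-tenant node group.
    node_affinities {
      key      = "compute.googleapis.com/node-group-name"
      operator = "IN"
      values   = [google_compute_node_group.client.name]
    }
  }
}
```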
don't need it to validate
a node group cannot have a zero size
Putting this back into draft, as us-west1 has no available sole-tenant n1 resources that also have local SSDs.

We will reopen after we have support for the new machine types.
This will let us isolate ourselves from noisy neighbors
Note
Adds an isolated client sole-tenant GCE node pool with new variables wired through Terraform, relaxes the Google provider constraint, and introduces a CI job to validate IaC.
- Add iac/provider-gcp/nomad-cluster/nodepool-client-isolated.tf (google_compute_node_template, google_compute_node_group, google_compute_instance_template, google_compute_region_instance_group_manager).
- Add client_node_type, isolated_client_cluster_target_size, and isolated_client_cluster_disk_count variables with defaults in iac/provider-gcp/variables.tf; plumb through iac/provider-gcp/main.tf to ./nomad-cluster and declare in iac/provider-gcp/nomad-cluster/variables.tf.
- Relax the google provider constraint to ~> 6 in iac/provider-gcp/main.tf.
- iac/provider-gcp/variables.tf.
- Add permissions: contents: read and new validate-iac job in .github/workflows/pr-tests.yml to run terraform init -backend=false and terraform validate in iac/provider-gcp.

Written by Cursor Bugbot for commit a45dda8.