This example demonstrates how to use the GKE cluster and node pools Terraform modules to create a complete Google Kubernetes Engine setup. It creates:
- A GKE cluster with private nodes
- Three different node pools:
  - Default pool: General-purpose workloads with autoscaling (1-3 nodes)
  - High-memory pool: Memory-intensive workloads with taints (0-2 nodes)
  - Spot pool: Cost-effective batch workloads using spot instances (0-5 nodes)
Prerequisites:
- GCP Project: You need a GCP project with the following APIs enabled:
  - Kubernetes Engine API
  - Compute Engine API
  - IAM API
- Authentication: Configure authentication using one of:
  - gcloud auth application-default login
  - Service account key file
  - Workload Identity (for GKE)
- Terraform: Install Terraform >= 1.0
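The three APIs listed above can also be enabled from Terraform rather than the console. The following is a minimal sketch using the Google provider's `google_project_service` resource; the variable name `project_id` and the resource label `required` are assumptions for illustration and are not taken from this example's modules.

```hcl
# Sketch only: "project_id" is an assumed variable name.
variable "project_id" {
  type        = string
  description = "ID of the GCP project that will host the cluster"
}

resource "google_project_service" "required" {
  for_each = toset([
    "container.googleapis.com", # Kubernetes Engine API
    "compute.googleapis.com",   # Compute Engine API
    "iam.googleapis.com",       # IAM API
  ])

  project = var.project_id
  service = each.value

  # Leave the API enabled when this resource is destroyed.
  disable_on_destroy = false
}
```

Keeping `disable_on_destroy = false` avoids switching off an API that other resources in the project may still depend on.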
To deploy the example:
- Copy the example configuration:
    cp terraform.tfvars.example terraform.tfvars
- Edit terraform.tfvars:
  - Replace your-gcp-project-id with your actual GCP project ID
  - Adjust other values as needed (region, cluster name, etc.)
- Initialize Terraform:
    terraform init
- Plan the deployment:
    terraform plan
- Apply the configuration:
    terraform apply
- Connect to your cluster:
    gcloud container clusters get-credentials my-gke-cluster --location us-central1 --project your-gcp-project-id
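For the edit step above, a terraform.tfvars might look roughly like the sketch below. The variable names `project_id`, `region` and `cluster_name` are assumptions; terraform.tfvars.example is the authoritative list of names. The values are the ones used in the connection command above.

```hcl
# terraform.tfvars (sketch; variable names are assumed, check
# terraform.tfvars.example for the names this example actually uses)
project_id   = "your-gcp-project-id"
region       = "us-central1"
cluster_name = "my-gke-cluster"
```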
By default, this example uses the default VPC network. For production use, consider:
- Creating a custom VPC network
- Using VPC-native networking with secondary IP ranges
- Configuring firewall rules appropriately
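As an illustration of the first two points, a custom VPC with secondary ranges for pods and services could be declared roughly as follows. This is a sketch, not part of the example: the resource names, range names and CIDR blocks are assumptions chosen so that they do not overlap.

```hcl
# Sketch only: names and CIDR ranges are illustrative assumptions.
resource "google_compute_network" "gke" {
  name                    = "gke-network"
  auto_create_subnetworks = false # manage subnets explicitly
}

resource "google_compute_subnetwork" "gke" {
  name          = "gke-subnet"
  region        = "us-central1"
  network       = google_compute_network.gke.id
  ip_cidr_range = "10.0.0.0/20" # primary range, used by nodes

  # Lets private nodes reach Google APIs without external IPs.
  private_ip_google_access = true

  # Secondary ranges make the cluster VPC-native (alias IPs).
  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "10.4.0.0/14"
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "10.8.0.0/20"
  }
}
```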
The example creates a private cluster with:
- Private nodes (no external IP addresses)
- Public endpoint (can be changed to private)
- Master authorized networks for access control
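At the provider level, these three settings correspond to blocks on the `google_container_cluster` resource. The sketch below shows those blocks directly; the module used in this example may expose them through its own variables, and the control-plane range, authorized CIDR and display name are placeholder assumptions.

```hcl
# Sketch only: CIDR values and names are placeholders.
resource "google_container_cluster" "private" {
  name     = "my-gke-cluster"
  location = "us-central1"

  # Node pools are managed separately, so drop the default one.
  remove_default_node_pool = true
  initial_node_count       = 1

  # Private clusters must be VPC-native; an empty block lets GKE pick ranges.
  ip_allocation_policy {}

  private_cluster_config {
    enable_private_nodes    = true  # nodes get no external IP addresses
    enable_private_endpoint = false # keep the public endpoint reachable
    master_ipv4_cidr_block  = "172.16.0.0/28"
  }

  # Only these source ranges may reach the public endpoint.
  master_authorized_networks_config {
    cidr_blocks {
      cidr_block   = "203.0.113.0/24" # documentation range, replace it
      display_name = "office"
    }
  }
}
```

Setting `enable_private_endpoint = true` would switch the endpoint to private, in which case the cluster is only reachable from inside the VPC or a connected network.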
The example includes three node pools with different characteristics:
- Default Pool (e2-medium):
  - General workloads
  - Auto-scaling 1-3 nodes
  - Standard persistent disk
- High-Memory Pool (e2-highmem-2):
  - Memory-intensive workloads
  - Auto-scaling 0-2 nodes
  - SSD persistent disk
  - Tainted for specific workloads
- Spot Pool (e2-standard-2):
  - Batch/fault-tolerant workloads
  - Auto-scaling 0-5 nodes
  - Spot instances for cost savings
  - Tainted for spot workloads
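To make the spot pool's characteristics concrete, here is roughly what such a pool looks like as a plain `google_container_node_pool` resource. This is a sketch of the underlying provider resource, not the module's actual code; the pool name, cluster reference and taint key are assumptions.

```hcl
# Sketch only: pool name, cluster reference and taint key are assumed.
resource "google_container_node_pool" "spot" {
  name     = "spot-pool"
  cluster  = "my-gke-cluster"
  location = "us-central1"

  # For a regional cluster these limits apply per zone.
  autoscaling {
    min_node_count = 0
    max_node_count = 5
  }

  node_config {
    machine_type = "e2-standard-2"
    spot         = true # Spot VMs can be reclaimed at any time

    # Only pods with a matching toleration are scheduled on this pool.
    taint {
      key    = "workload-type"
      value  = "spot"
      effect = "NO_SCHEDULE"
    }
  }
}
```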
You can add additional node pools by extending the node_pools variable:
    node_pools = {
      # ... existing pools ...
      "gpu-pool" = {
        machine_type = "n1-standard-2"
        disk_size_gb = 100
        gpu_config = {
          type  = "nvidia-tesla-t4"
          count = 1
        }
        autoscaling = {
          min_node_count = 0
          max_node_count = 2
        }
        taints = [
          {
            key    = "nvidia.com/gpu"
            value  = "true"
            effect = "NO_SCHEDULE"
          }
        ]
      }
    }
You can customize various cluster settings:
    # Enable different features
    enable_workload_identity = true
    network_policy_enabled   = true
    release_channel          = "RAPID" # or "REGULAR", "STABLE"
    # Configure maintenance window
    maintenance_start_time = "02:00" # 2:00 AM
    # Add resource labels
    resource_labels = {
      environment = "production"
      team        = "platform"
      cost_center = "engineering"
    }
After successful deployment, you'll get outputs including:
- Cluster name and endpoint
- Node pool information
- kubectl connection command
- Network configuration details
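As an illustration of the kubectl connection command output, an output of that kind could be assembled from input variables as in the sketch below. The output name and the three variable names are hypothetical; the example's real outputs may be named and built differently.

```hcl
# Sketch only: output and variable names are hypothetical.
variable "project_id" { type = string }
variable "region" { type = string }
variable "cluster_name" { type = string }

output "kubectl_connection_command" {
  description = "Command that fetches kubectl credentials for the cluster"
  value       = "gcloud container clusters get-credentials ${var.cluster_name} --location ${var.region} --project ${var.project_id}"
}
```

After `terraform apply`, a value like this can be read back with `terraform output`.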
To destroy all resources:
    terraform destroy
Security considerations:
- Master Authorized Networks: Configure appropriate CIDR blocks
- Private Nodes: Use private nodes for better security
- Workload Identity: Enable for secure pod-to-GCP-service authentication
- Network Policies: Enable for pod-to-pod traffic control
- Node Image: Use COS (Container-Optimized OS) for security updates
Cost optimization:
- Spot Instances: Use for fault-tolerant workloads
- Autoscaling: Configure appropriate min/max node counts
- Right-sizing: Choose appropriate machine types
- Preemptible Nodes: Older alternative to spot instances for cost savings; they run for at most 24 hours
The cluster is configured with:
- Google Cloud Logging integration
- Google Cloud Monitoring integration
- Cluster and node metrics collection
Common issues and solutions:
- API not enabled: Enable required GCP APIs
- Quota exceeded: Check GCP quotas for compute resources
- Network connectivity: Verify firewall rules and network configuration
- Authentication: Ensure proper GCP credentials are configured