Development Cluster with L4 Spot GPUs

Creates a minimal GKE cluster with a single L4 GPU spot node pool — ideal for development and testing SIE (Search Inference Engine) workloads at low cost.

What this example creates

Resource	Configuration
GKE cluster	Private nodes, Cloud NAT, Workload Identity
GPU node pool	1x NVIDIA L4 per node (g2-standard-8), spot VMs, scale 0-5
CPU node pool	e2-standard-4, scale 1-3 (system workloads)
Artifact Registry	Docker repository for SIE images
NAP	Node Auto-Provisioning enabled (auto-creates pools as needed)

Estimated cost: ~$0.50/hr while a GPU node is running. When the GPU pool is scaled to zero, GPU cost is $0/hr; the GKE management fee and the always-on CPU node pool (minimum one e2-standard-4 node) still apply.
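
To turn the hourly figure into a rough budget, multiply it by the hours a GPU node is expected to run. The sketch below assumes the approximate $0.50/hr rate quoted above and counts GPU node time for a single node only; the function name and its arguments are illustrative, not part of this example.

```shell
# Hypothetical helper: rough GPU node cost for a given usage pattern.
# Arguments: hours per day, number of days, optional hourly rate (default 0.50).
# Excludes the GKE management fee and any non-GPU resources.
estimate_gpu_cost() {
  local hours_per_day="${1:-0}"
  local days="${2:-0}"
  local rate_per_hour="${3:-0.50}"
  awk -v h="$hours_per_day" -v d="$days" -v r="$rate_per_hour" \
    'BEGIN { printf "%.2f\n", h * d * r }'
}
```

For example, `estimate_gpu_cost 8 22` estimates eight hours a day over 22 working days.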

Usage

export TF_VAR_project_id="your-gcp-project-id"

terraform init
terraform plan
terraform apply

After apply, deploy SIE via Helm:

# Configure kubectl
eval "$(terraform output -raw kubectl_command)"

# Install SIE (gateway, workers, KEDA, Prometheus, Grafana)
helm upgrade --install sie-cluster oci://ghcr.io/superlinked/charts/sie-cluster --version 0.3.4 \
  -f values-gke.yaml \
  --create-namespace -n sie \
  --set serviceAccount.annotations."iam\.gke\.io/gcp-service-account"="$(terraform output -raw workload_identity_annotation)"
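
After the Helm release is installed, it can help to confirm that pods are scheduling and whether a GPU node has come up. This is a minimal sketch, assuming `kubectl` is already configured by the step above; `cloud.google.com/gke-accelerator` is the node label GKE applies to GPU nodes, and the function name is illustrative. Because the GPU pool scales from zero, an empty node list is expected until a workload requests a GPU.

```shell
# Hypothetical helper: show SIE pods and any L4 GPU nodes currently running.
# Argument: namespace (defaults to "sie", the namespace used in the Helm command).
check_sie_deployment() {
  local namespace="${1:-sie}"
  # Pods created by the Helm release.
  kubectl get pods -n "$namespace"
  # GPU nodes, selected by the accelerator label GKE sets on them.
  kubectl get nodes -l cloud.google.com/gke-accelerator=nvidia-l4
}
```

Run it as `check_sie_deployment` after the Helm command completes.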

Variables

Variable	Default	Description
`project_id`	— (required)	Your GCP project ID
`region`	`us-central1`	GCP region
`cluster_name`	`sie-dev`	Cluster name
`create_artifact_registry`	`true`	Create a Docker registry for SIE images
`deployer_service_account`	`""`	Service account email (for CI/CD; optional for interactive use)

Outputs

Output	Description
`cluster_name`	GKE cluster name
`kubectl_command`	Run this to configure kubectl
`artifact_registry_url`	URL for pushing Docker images
`workload_identity_annotation`	Annotation for Helm service account
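
As an illustration of how `artifact_registry_url` might be used, the sketch below tags and pushes a locally built image. It assumes the output has the form `REGION-docker.pkg.dev/PROJECT/REPOSITORY` with no URL scheme, that Docker and `gcloud` are installed, and that `create_artifact_registry` was left at `true`. The function names and the `sie-server:dev` image name are hypothetical.

```shell
# Extract the registry host (text before the first "/") from a repository URL.
# Assumes the URL has no "https://" prefix.
registry_host() {
  local url="${1:-}"
  echo "${url%%/*}"
}

# Hypothetical helper: push a local image to the repository created by Terraform.
# Argument: local image reference, e.g. sie-server:dev (illustrative name).
push_image() {
  local local_image="${1:-}"
  local repo_url
  repo_url="$(terraform output -raw artifact_registry_url)" || return 1

  # Let Docker authenticate to Artifact Registry through gcloud.
  gcloud auth configure-docker "$(registry_host "$repo_url")" --quiet || return 1

  # Re-tag the local image under the repository path, then push it.
  docker tag "$local_image" "$repo_url/$local_image" || return 1
  docker push "$repo_url/$local_image"
}
```

Run `push_image sie-server:dev` from the directory where `terraform apply` was run, so that `terraform output` can read the state.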

Customizing

Change region:

export TF_VAR_region="europe-west4"

Use on-demand instead of spot (more reliable, higher cost):

Override `gpu_node_pools` in a `terraform.tfvars` file:

gpu_node_pools = [
  {
    name           = "l4-ondemand"
    machine_type   = "g2-standard-8"
    gpu_type       = "nvidia-l4"
    gpu_count      = 1
    min_node_count = 0
    max_node_count = 5
    spot           = false
  }
]

Prerequisites

- GCP project with billing enabled
- GPU quota for `nvidia-l4` in your region (check: `gcloud compute regions describe REGION --format="table(quotas.filter(metric:NVIDIA))"`)
- APIs enabled: `container.googleapis.com`, `compute.googleapis.com`
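
The two APIs listed above can be enabled from the command line before running `terraform apply`. This is a minimal sketch, assuming the `gcloud` CLI is installed and authenticated with permission to enable services; the function name `enable_required_apis` is illustrative and not part of this example.

```shell
# Hypothetical helper: enable the APIs this example needs in one project.
# Argument: GCP project ID.
enable_required_apis() {
  local project_id="${1:-}"
  if [ -z "$project_id" ]; then
    echo "usage: enable_required_apis PROJECT_ID" >&2
    return 1
  fi
  gcloud services enable \
    container.googleapis.com \
    compute.googleapis.com \
    --project "$project_id"
}
```

With the variable from the Usage section exported, call it as `enable_required_apis "$TF_VAR_project_id"`.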

Cleanup

terraform destroy
