| 1 | +--- |
| 2 | +title: Karpenter |
| 3 | +weight: 2 |
| 4 | +--- |
| 5 | + |
| 6 | +[Karpenter](https://github.com/kubernetes-sigs/karpenter) automatically launches just the right compute resources to handle your cluster's applications. However, it is built to follow the scheduling decisions of kube-scheduler, so it may make incorrect provisioning decisions when the InftyAI scheduler is in the mix. |
| 7 | + |
| 8 | +We forked the Karpenter project and recompiled the Karpenter image for cloud providers like AWS; you can find the details in [this proposal](https://github.com/InftyAI/llmaz/blob/main/docs/proposals/106-spot-instance-karpenter/README.md). This document provides the steps to install and configure the customized Karpenter in an EKS cluster. |
| 9 | + |
| 10 | +## How to use |
| 11 | + |
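| 12 | +Run the following commands in the same terminal session, because later steps reuse the same environment variables (for example `CLUSTER_NAME`, `KARPENTER_NAMESPACE`, and `KARPENTER_VERSION`). |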
| 13 | + |
| 14 | +### Create a cluster and add Karpenter |
| 15 | + |
| 16 | +Please refer to [Getting Started with Karpenter](https://docs.aws.amazon.com/eks/latest/userguide/getting-started-eksctl.html) to create a cluster and add Karpenter. |
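
The later steps expand `${CLUSTER_NAME}`, `${KARPENTER_NAMESPACE}`, `${KARPENTER_VERSION}`, and `${ALIAS_VERSION}`, so an unset variable would end up as an empty string in the applied manifests. As a minimal sketch, the helper below reports which of them are missing in the current shell. The function name `check_llmaz_karpenter_env` is hypothetical and is not part of llmaz or Karpenter.

```shell
# Hypothetical helper (not part of llmaz or Karpenter): list the variables
# used later in this guide that are unset or empty in the current shell.
check_llmaz_karpenter_env() {
  missing=0
  for name in CLUSTER_NAME KARPENTER_NAMESPACE KARPENTER_VERSION ALIAS_VERSION; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  # Returns 0 when everything is set, 1 otherwise.
  return "$missing"
}
```

Calling `check_llmaz_karpenter_env` before the `helm upgrade` and `kubectl apply` steps gives a quick sanity check; a non-zero return status means at least one variable still needs to be exported.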
| 17 | + |
| 18 | +### Install the GPU operator |
| 19 | + |
| 20 | +```shell |
| 21 | +helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \ |
| 22 | + && helm repo update |
| 23 | +helm install --wait --generate-name \ |
| 24 | + -n gpu-operator --create-namespace \ |
| 25 | + nvidia/gpu-operator \ |
| 26 | + --version=v25.3.0 |
| 27 | +``` |
| 28 | + |
| 29 | +### Install llmaz with InftyAI scheduler enabled |
| 30 | + |
| 31 | +Please refer to [installation](../getting-started/installation.md). |
| 32 | + |
| 33 | +### Configure Karpenter with customized image |
| 34 | + |
| 35 | +We need to bind the `karpenter-core-llmaz` cluster role, which grants read access to `openmodels`, to the `karpenter` service account, and then switch the Karpenter image to the customized one. |
| 36 | + |
| 37 | +```shell |
| 38 | +cat <<EOF | envsubst | kubectl apply -f - |
| 39 | +apiVersion: rbac.authorization.k8s.io/v1 |
| 40 | +kind: ClusterRoleBinding |
| 41 | +metadata: |
| 42 | +  name: karpenter-core-llmaz |
| 43 | +roleRef: |
| 44 | +  apiGroup: rbac.authorization.k8s.io |
| 45 | +  kind: ClusterRole |
| 46 | +  name: karpenter-core-llmaz |
| 47 | +subjects: |
| 48 | +- kind: ServiceAccount |
| 49 | +  name: karpenter |
| 50 | +  namespace: ${KARPENTER_NAMESPACE} |
| 51 | +--- |
| 52 | +apiVersion: rbac.authorization.k8s.io/v1 |
| 53 | +kind: ClusterRole |
| 54 | +metadata: |
| 55 | +  name: karpenter-core-llmaz |
| 56 | +rules: |
| 57 | +- apiGroups: ["llmaz.io"] |
| 58 | +  resources: ["openmodels"] |
| 59 | +  verbs: ["get", "list", "watch"] |
| 60 | +EOF |
| 61 | + |
| 62 | +helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \ |
| 63 | + --set "settings.clusterName=${CLUSTER_NAME}" \ |
| 64 | + --set "settings.interruptionQueue=${CLUSTER_NAME}" \ |
| 65 | + --set controller.resources.requests.cpu=1 \ |
| 66 | + --set controller.resources.requests.memory=1Gi \ |
| 67 | + --set controller.resources.limits.cpu=1 \ |
| 68 | + --set controller.resources.limits.memory=1Gi \ |
| 69 | + --wait \ |
| 70 | + --set controller.image.repository=inftyai/aws-karpenter \ |
| 71 | + --set "controller.image.tag=${KARPENTER_VERSION}" \ |
| 72 | + --set controller.image.digest="" |
| 73 | +``` |
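
After the upgrade, it can be worth confirming that both changes took effect. The sketch below wraps two standard `kubectl` commands in a helper. The function name `verify_karpenter_llmaz` is hypothetical, and the Deployment name `karpenter` is an assumption based on the Helm release name used above.

```shell
# Hypothetical helper; assumes the Helm release created a Deployment named
# "karpenter" in ${KARPENTER_NAMESPACE} (an assumption of this sketch).
verify_karpenter_llmaz() {
  ns="${KARPENTER_NAMESPACE:?KARPENTER_NAMESPACE must be set}"

  # Asks the API server whether the karpenter service account may list
  # OpenModel resources; "yes" is expected once the binding is applied.
  kubectl auth can-i list openmodels.llmaz.io \
    --as="system:serviceaccount:${ns}:karpenter"

  # Prints the controller image; it is expected to point at the
  # inftyai/aws-karpenter repository after the upgrade.
  kubectl get deployment karpenter -n "${ns}" \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}
```

Run `verify_karpenter_llmaz` in the same terminal so that `KARPENTER_NAMESPACE` is still set.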
| 74 | + |
| 75 | +## Basic Example |
| 76 | + |
| 77 | +1. Create a GPU node pool |
| 78 | + |
| 79 | +```shell |
| 80 | +cat <<EOF | envsubst | kubectl apply -f - |
| 81 | +apiVersion: karpenter.k8s.aws/v1 |
| 82 | +kind: EC2NodeClass |
| 83 | +metadata: |
| 84 | +  name: llmaz-demo # You can change the name to a more meaningful one; keep it aligned with the node pool's nodeClassRef. |
| 85 | +spec: |
| 86 | +  amiSelectorTerms: |
| 87 | +  - alias: al2023@${ALIAS_VERSION} |
| 88 | +  blockDeviceMappings: |
| 89 | +  # The default volume size of the selected AMI is 20Gi, which is not enough for kubelet to pull |
| 90 | +  # the images and run the workloads, so we map a larger volume to the root device. |
| 91 | +  # You can increase the volume size according to your actual needs. |
| 92 | +  - deviceName: /dev/xvda |
| 93 | +    ebs: |
| 94 | +      deleteOnTermination: true |
| 95 | +      volumeSize: 50Gi |
| 96 | +      volumeType: gp3 |
| 97 | +  role: KarpenterNodeRole-${CLUSTER_NAME} # replace with your cluster name |
| 98 | +  securityGroupSelectorTerms: |
| 99 | +  - tags: |
| 100 | +      karpenter.sh/discovery: ${CLUSTER_NAME} # replace with your cluster name |
| 101 | +  subnetSelectorTerms: |
| 102 | +  - tags: |
| 103 | +      karpenter.sh/discovery: ${CLUSTER_NAME} # replace with your cluster name |
| 104 | +--- |
| 105 | +apiVersion: karpenter.sh/v1 |
| 106 | +kind: NodePool |
| 107 | +metadata: |
| 108 | +  name: llmaz-demo-gpu-nodepool # You can change the name to a more meaningful one. |
| 109 | +spec: |
| 110 | +  disruption: |
| 111 | +    budgets: |
| 112 | +    - nodes: 10% |
| 113 | +    consolidateAfter: 5m |
| 114 | +    consolidationPolicy: WhenEmptyOrUnderutilized |
| 115 | +  limits: # You can change the limits to match your actual needs. |
| 116 | +    cpu: 1000 |
| 117 | +  template: |
| 118 | +    spec: |
| 119 | +      expireAfter: 720h |
| 120 | +      nodeClassRef: |
| 121 | +        group: karpenter.k8s.aws |
| 122 | +        kind: EC2NodeClass |
| 123 | +        name: llmaz-demo |
| 124 | +      requirements: |
| 125 | +      - key: kubernetes.io/arch |
| 126 | +        operator: In |
| 127 | +        values: |
| 128 | +        - amd64 |
| 129 | +      - key: kubernetes.io/os |
| 130 | +        operator: In |
| 131 | +        values: |
| 132 | +        - linux |
| 133 | +      - key: karpenter.sh/capacity-type |
| 134 | +        operator: In |
| 135 | +        values: |
| 136 | +        - spot |
| 137 | +      - key: karpenter.k8s.aws/instance-family |
| 138 | +        operator: In |
| 139 | +        values: # replace with the GPU-capable instance families you want to use |
| 140 | +        - g4dn |
| 141 | +        - g5g # g5g instances are arm64 (Graviton), so they match only if arm64 is also allowed for kubernetes.io/arch above |
| 142 | +      taints: |
| 143 | +      - effect: NoSchedule |
| 144 | +        key: nvidia.com/gpu |
| 145 | +        value: "true" |
|  | +EOF |
| 146 | +``` |
| 147 | + |
| 148 | +2. Deploy a model with flavors |
| 149 | + |
| 150 | +```shell |
| 151 | +cat <<EOF | kubectl apply -f - |
| 152 | +apiVersion: llmaz.io/v1alpha1 |
| 153 | +kind: OpenModel |
| 154 | +metadata: |
| 155 | +  name: qwen2-0--5b |
| 156 | +spec: |
| 157 | +  familyName: qwen2 |
| 158 | +  source: |
| 159 | +    modelHub: |
| 160 | +      modelID: Qwen/Qwen2-0.5B-Instruct |
| 161 | +  inferenceConfig: |
| 162 | +    flavors: |
| 163 | +    # The g5g instance family in AWS provides the t4g GPU type. |
| 164 | +    # We define the instance family in a node pool such as llmaz-demo-gpu-nodepool. |
| 165 | +    - name: t4g |
| 166 | +      limits: |
| 167 | +        nvidia.com/gpu: 1 |
| 168 | +      # The flavor name is not recognized by Karpenter, so we need to specify the |
| 169 | +      # instance-gpu-name via nodeSelector to match the t4g GPU type when the node is |
| 170 | +      # provisioned by Karpenter from multiple node pools. |
| 171 | +      # |
| 172 | +      # When you only have a single node pool to provision the GPU instance and the node pool |
| 173 | +      # only has one GPU type, it is okay to omit the nodeSelector. But in practice, |
| 174 | +      # it is better to specify the nodeSelector to make the provisioned node more predictable. |
| 175 | +      # |
| 176 | +      # The available node labels for selecting the target GPU device are listed below: |
| 177 | +      # karpenter.k8s.aws/instance-gpu-count |
| 178 | +      # karpenter.k8s.aws/instance-gpu-manufacturer |
| 179 | +      # karpenter.k8s.aws/instance-gpu-memory |
| 180 | +      # karpenter.k8s.aws/instance-gpu-name |
| 181 | +      nodeSelector: |
| 182 | +        karpenter.k8s.aws/instance-gpu-name: t4g |
| 183 | +    # The g4dn instance family in AWS provides the t4 GPU type. |
| 184 | +    # We define the instance family in a node pool such as llmaz-demo-gpu-nodepool. |
| 185 | +    - name: t4 |
| 186 | +      limits: |
| 187 | +        nvidia.com/gpu: 1 |
| 188 | +      # The flavor name is not recognized by Karpenter, so we need to specify the |
| 189 | +      # instance-gpu-name via nodeSelector to match the t4 GPU type when the node is |
| 190 | +      # provisioned by Karpenter from multiple node pools. |
| 191 | +      # |
| 192 | +      # When you only have a single node pool to provision the GPU instance and the node pool |
| 193 | +      # only has one GPU type, it is okay to omit the nodeSelector. But in practice, |
| 194 | +      # it is better to specify the nodeSelector to make the provisioned node more predictable. |
| 195 | +      # |
| 196 | +      # The available node labels for selecting the target GPU device are listed below: |
| 197 | +      # karpenter.k8s.aws/instance-gpu-count |
| 198 | +      # karpenter.k8s.aws/instance-gpu-manufacturer |
| 199 | +      # karpenter.k8s.aws/instance-gpu-memory |
| 200 | +      # karpenter.k8s.aws/instance-gpu-name |
| 201 | +      nodeSelector: |
| 202 | +        karpenter.k8s.aws/instance-gpu-name: t4 |
| 203 | +--- |
| 204 | +# Currently, the Playground resource type does not support configuring tolerations |
| 205 | +# for the generated pods. Luckily, when a pod that requests the `nvidia.com/gpu` resource |
| 206 | +# is created on the EKS cluster, the generated pod is given the following |
| 207 | +# tolerations: |
| 208 | +# - effect: NoExecute |
| 209 | +#   key: node.kubernetes.io/not-ready |
| 210 | +#   operator: Exists |
| 211 | +#   tolerationSeconds: 300 |
| 212 | +# - effect: NoExecute |
| 213 | +#   key: node.kubernetes.io/unreachable |
| 214 | +#   operator: Exists |
| 215 | +#   tolerationSeconds: 300 |
| 216 | +# - effect: NoSchedule |
| 217 | +#   key: nvidia.com/gpu |
| 218 | +#   operator: Exists |
| 219 | +apiVersion: inference.llmaz.io/v1alpha1 |
| 220 | +kind: Playground |
| 221 | +metadata: |
| 222 | +  labels: |
| 223 | +    llmaz.io/model-name: qwen2-0--5b |
| 224 | +  name: qwen2-0--5b |
| 225 | +spec: |
| 226 | +  backendRuntimeConfig: |
| 227 | +    backendName: tgi |
| 228 | +    # Due to the limitations of our AWS account, we have to decrease the resources to match |
| 229 | +    # the available instance type, which is g4dn.xlarge. If your account has no such limitation, |
| 230 | +    # you can remove the custom resources settings below. |
| 231 | +    resources: |
| 232 | +      limits: |
| 233 | +        cpu: "2" |
| 234 | +        memory: 4Gi |
| 235 | +      requests: |
| 236 | +        cpu: "2" |
| 237 | +        memory: 4Gi |
| 238 | +  modelClaim: |
| 239 | +    modelName: qwen2-0--5b |
| 240 | +  replicas: 1 |
| 241 | +EOF |
| 242 | +``` |
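
Once the Playground is applied, Karpenter is expected to provision a GPU node for the pending pod. The sketch below groups a few standard `kubectl` queries for following that process. The function name `watch_llmaz_demo` is hypothetical, and it assumes the Playground was created in the current namespace.

```shell
# Hypothetical helper for following the example rollout; assumes the
# Playground pods live in the current kubectl namespace.
watch_llmaz_demo() {
  # NodeClaims that Karpenter created in response to pending pods.
  kubectl get nodeclaims

  # Nodes with the GPU name label that the flavors' nodeSelector matches on.
  kubectl get nodes -L karpenter.k8s.aws/instance-gpu-name

  # Pods and the nodes they were scheduled onto.
  kubectl get pods -o wide
}
```

Running `watch_llmaz_demo` a few times while the node boots shows the NodeClaim, then the labeled node, and finally the pod bound to it.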