-
Notifications
You must be signed in to change notification settings - Fork 121
Expand file tree
/
Copy pathrecommendations.yaml
More file actions
407 lines (384 loc) · 19.6 KB
/
recommendations.yaml
File metadata and controls
407 lines (384 loc) · 19.6 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
- description: Deploy AKS cluster across availability zones
aprlGuid: 4f63619f-5001-439c-bacb-8de891287727
recommendationTypeId: 9f3263db-b9c0-43bb-8523-6800f9f50793
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
Azure Availability Zones ensure high availability by offering independent locations within regions, equipped with their own power, cooling, and networking to ensure applications and data are protected from datacenter-level failures.
potentialBenefits: Enhanced fault tolerance for AKS
pgVerified: true
automationAvailable: true
tags: []
learnMoreLink:
- name: AKS Availability Zones
url: "https://learn.microsoft.com/azure/aks/availability-zones"
- description: Isolate system and application pods
aprlGuid: 5ee083cd-6ac3-4a83-8913-9549dd36cf56
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
AKS assigns the kubernetes.azure.com/mode: system label to nodes in system node pools signaling the preference for system pods should be scheduled there. The CriticalAddonsOnly=true:NoSchedule taint can be added to your system nodes to prohibit application pods from being scheduled on them.
potentialBenefits: Enhanced reliability via pod isolation
pgVerified: false
automationAvailable: true
tags: []
learnMoreLink:
- name: System and user node pools
url: "https://learn.microsoft.com/azure/aks/use-system-pools?tabs=azure-cli#system-and-user-node-pools"
- description: Disable local accounts
aprlGuid: ca324d71-54b0-4a3e-b9e4-10e767daa9fc
recommendationTypeId: null
recommendationControl: OtherBestPractices
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Disabled
longDescription: |
Local Kubernetes accounts in AKS, being non-auditable and legacy, are discouraged. Microsoft Entra's integration offers centralized management, multi-factor authentication, RBAC for detailed access, and a secure, scalable authentication system compatible with Azure and external identity providers.
potentialBenefits: Enhanced security and access control
pgVerified: false
automationAvailable: true
tags: []
learnMoreLink:
- name: Entra integration
url: "https://learn.microsoft.com/azure/aks/concepts-identity#azure-ad-integration"
- description: Configure Azure CNI networking for dynamic allocation of IPs or use CNI overlay
aprlGuid: c22db132-399b-4e7c-995d-577a60881be8
recommendationTypeId: null
recommendationControl: Scalability
recommendationImpact: Medium
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
Azure CNI enhances cluster IP and network management, allowing dynamic IP allocation, scalable subnets, direct pod-VNET connectivity, and supports diverse network policies for pods and nodes with Azure Network Policies and Calico, optimizing network efficiency and security
potentialBenefits: Dynamic IP allocation, scalable subnets, direct VNET access
pgVerified: false
automationAvailable: true
tags: []
learnMoreLink:
- name: Configure Azure CNI networking
url: "https://learn.microsoft.com/azure/aks/configure-azure-cni-dynamic-ip-allocation"
- description: Enable the cluster auto-scaler on an existing cluster
aprlGuid: 902c82ff-4910-4b61-942d-0d6ef7f39b67
recommendationTypeId: null
recommendationControl: Scalability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
The cluster auto-scaler in AKS adjusts node counts based on pod resource needs and available capacity, enabling scaling as per demand to prevent outages.
potentialBenefits: Optimizes scaling and prevents outages
pgVerified: true
automationAvailable: true
tags: []
learnMoreLink:
- name: Use the Cluster Autoscaler on AKS
url: "https://learn.microsoft.com/azure/aks/cluster-autoscaler?tabs=azure-cli"
- description: Back up Azure Kubernetes Service
aprlGuid: 269a9f1a-6675-460a-831e-b05a887a8c4b
recommendationTypeId: null
recommendationControl: DisasterRecovery
recommendationImpact: Low
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
AKS, popular for stateful apps needing backups, can now use Azure Backup to secure clusters and attached volumes through an installed Backup Extension, enabling backup and restore operations via a Backup Vault.
potentialBenefits: Ensures data safety for AKS
pgVerified: true
automationAvailable: true
tags: []
learnMoreLink:
- name: AKS Backups
url: "https://learn.microsoft.com/azure/backup/azure-kubernetes-service-cluster-backup"
- description: Use zone-redundant storage for persistent volumes when running multi-zone AKS
aprlGuid: d3111036-355d-431b-ab49-8ddad042800b
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: Medium
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
ZRS ensures data replication across three zones, protecting against zonal outages. It's available for Azure Disks, Container Storage, Files, and Blob by setting the SKU to ZRS in storage classes, enhancing multi-zone AKS clusters from v1.29.
potentialBenefits: Increases data durability and availability
pgVerified: true
automationAvailable: false
tags: []
learnMoreLink:
- name: Availability zones overview
url: "https://learn.microsoft.com/azure/reliability/availability-zones-overview?tabs=azure-cli"
- description: Upgrade Persistent Volumes using in-tree drivers to Azure CSI drivers
aprlGuid: b002c030-72e6-4a37-8217-1cb276c43169
recommendationTypeId: null
recommendationControl: OtherBestPractices
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
From Kubernetes 1.26, Azure Disk and Azure File in-tree drivers are deprecated in favor of CSI drivers. Existing deployments remain operational but untested; users should switch to CSI drivers for new features and SKUs.
potentialBenefits: Ensures future compatibility
pgVerified: true
automationAvailable: true
tags: []
learnMoreLink:
- name: CSI Storage Drivers
url: "https://learn.microsoft.com/azure/aks/csi-storage-drivers"
- description: Implement Resource Quota to ensure that Kubernetes resources do not exceed hard resource limits
aprlGuid: 9a1c17e5-c9a0-43db-b920-adaf54d1bcb7
recommendationTypeId: null
recommendationControl: Scalability
recommendationImpact: Low
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Disabled
longDescription: |
A ResourceQuota object sets limits on resource use per namespace, controlling the number and type of objects created, and the total compute resources available.
potentialBenefits: Limits AKS resource usage per namespace
pgVerified: false
automationAvailable: false
tags: []
learnMoreLink:
- name: Resource Quotas
url: "https://kubernetes.io/docs/concepts/policy/resource-quotas/"
- description: Attach Virtual Nodes (ACI) to the AKS cluster
aprlGuid: b4639ca7-6308-429a-8b98-92f0bf9bf813
recommendationTypeId: null
recommendationControl: Scalability
recommendationImpact: Low
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Disabled
longDescription: |
To rapidly scale AKS workloads, utilize virtual nodes for quick pod provisioning, unlike Kubernetes auto-scaler. For clusters with availability zones, ensure one nodepool per AZ due to persistent volumes not working across AZs, preventing auto-scaler pod creation failures if lacking access.
potentialBenefits: Faster scaling with virtual nodes
pgVerified: false
automationAvailable: false
tags: []
learnMoreLink:
- name: Virtual Nodes
url: "https://learn.microsoft.com/azure/aks/virtual-nodes"
- description: Update AKS tier to Standard or Premium
aprlGuid: 0611251f-e70f-4243-8ddd-cfe894bec2e7
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
Production AKS clusters require the Standard or Premium tier for a financially backed SLA and enhanced node scalability, as the free service lacks these features. Use the Premium tier for mission-critical workloads.
potentialBenefits: SLA guarantee and better scalability
pgVerified: true
automationAvailable: true
tags: []
learnMoreLink:
- name: Pricing Tiers
url: "https://learn.microsoft.com/azure/aks/free-standard-pricing-tiers"
- description: Enable AKS Monitoring
aprlGuid: dcaf8128-94bd-4d53-9235-3a0371df6b74
recommendationTypeId: null
recommendationControl: MonitoringAndAlerting
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
Azure Monitor enables real-time health and performance insights for AKS by collecting events, capturing container logs, and gathering CPU/Memory data from the Metrics API. It allows data visualization using Azure Monitor Container Insights, Prometheus, Grafana, or others.
potentialBenefits: Real-time AKS health/performance insights
pgVerified: true
automationAvailable: true
tags: []
learnMoreLink:
- name: Monitor AKS
url: "https://learn.microsoft.com/azure/aks/monitor-aks"
- description: Enable and remediate Azure Policies configured for AKS
aprlGuid: 26ebaf1f-c70d-4ebd-8641-4b60a0ce0094
recommendationTypeId: null
recommendationControl: OtherBestPractices
recommendationImpact: Low
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Disabled
longDescription: |
Azure Policies in AKS clusters help enforce governance best practices concerning security, authentication, provisioning, networking, and more, ensuring a robust and secure environment for operations.
potentialBenefits: Enhanced AKS governance and security
pgVerified: false
automationAvailable: true
tags: []
learnMoreLink:
- name: AKS Baseline - Policy Management
url: "https://learn.microsoft.com/azure/architecture/reference-architectures/containers/aks/baseline-aks?toc=https%3A%2F%2Flearn.microsoft.com%2Fazure%2Faks%2Ftoc.json&bc=https%3A%2F%2Flearn.microsoft.com%2Fazure%2Fbread%2Ftoc.json#policy-management"
- description: Enable GitOps when using DevOps frameworks
aprlGuid: 5f3cbd68-692a-4121-988c-9770914859a9
recommendationTypeId: null
recommendationControl: OtherBestPractices
recommendationImpact: Low
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Disabled
longDescription: |
GitOps, an operating model for cloud-native apps, uses Git for storing application and infrastructure code as a source of truth for continuous delivery.
potentialBenefits: Ensures AKS config consistency
pgVerified: false
automationAvailable: true
tags: []
learnMoreLink:
- name: GitOps with AKS
url: "https://learn.microsoft.com/azure/architecture/guide/aks/aks-cicd-github-actions-and-gitops"
- description: Use pod topology spread constraints to ensure that pods are spread across different nodes or zones
aprlGuid: 928fcc6f-5e9a-42d9-9bd4-260af42de2e5
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
Enhance availability and reliability by using pod topology spread constraints to control pod distribution based on node or zone topology, ensuring pods are spread across your cluster.
potentialBenefits: Ensures high availability and efficient use
pgVerified: true
automationAvailable: false
tags: []
learnMoreLink:
- name: Topology Spread Constraints
url: "https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/"
- description: Configures Pods Liveness, Readiness, and Startup Probes
aprlGuid: cd6791b1-c60e-4b37-ac98-9897b1e6f4b8
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
AKS kubelet controller uses liveness probes to validate containers and applications health, ensuring the system knows when to restart a container based on its health status.
potentialBenefits: Enhances container health monitoring
pgVerified: true
automationAvailable: false
tags: []
learnMoreLink:
- name: Configure probes
url: "https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/"
- description: Use deployments with multiple replicas in production applications to guarantee availability
aprlGuid: bcfe71f1-ebed-49e5-a84a-193b81ad5d27
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
Configuring multiple replicas in Pod or Deployment manifests stabilizes the number of replica Pods, ensuring that a specified number of identical Pods are always available, thereby guaranteeing their availability.
potentialBenefits: Ensures stable pod availability
pgVerified: true
automationAvailable: false
tags: []
learnMoreLink:
- name: Replica Sets
url: "https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/"
- description: Configure system nodepool count
aprlGuid: 7f7ae535-a5ba-4665-b7e0-c451dbdda01f
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
The system node pool should be configured with a minimum node count of three to ensure critical system pods are resilient to node outages.
potentialBenefits: Ensures pod resilience
pgVerified: true
automationAvailable: true
tags: []
learnMoreLink:
- name: System nodepools
url: "https://learn.microsoft.com/azure/aks/use-system-pools?tabs=azure-cli"
- description: Configure user nodepool count
aprlGuid: 005ccbbd-aeab-46ef-80bd-9bd4479412ec
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
Configuring the user node pool with at least two nodes is essential for applications needing high availability, ensuring they remain operational and accessible without interruption.
potentialBenefits: Ensures high app availability
pgVerified: true
automationAvailable: true
tags: []
learnMoreLink:
- name: Azure Well-Architected Framework review for Azure Kubernetes Service (AKS)
url: "https://learn.microsoft.com/azure/well-architected/service-guides/azure-kubernetes-service#design-checklist"
- description: Configure pod disruption budgets (PDBs)
aprlGuid: a08a06a0-e41a-4b99-83bb-69ce8bca54cb
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: Medium
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
A Pod Disruption Budget is a Kubernetes resource configuring the minimum number or percentage of pods that should remain available during disruptions like maintenance or scaling, ensuring a minimum number of pods are always available in the cluster.
potentialBenefits: Ensures cluster resiliency during disruptions
pgVerified: true
automationAvailable: false
tags: []
learnMoreLink:
- name: Configure PDBs
url: "https://kubernetes.io/docs/tasks/run-application/configure-pdb/"
- description: Nodepool subnet size needs to accommodate maximum auto-scale settings
aprlGuid: e620fa98-7a40-41a0-bfc9-b4407297fb58
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Disabled
longDescription: |
Nodepool subnets sized for max auto-scale settings enable AKS to efficiently scale out nodes, meeting increased demand while reducing resource constraints and potential service disruptions.
potentialBenefits: Efficient scaling, reduced disruptions
pgVerified: false
automationAvailable: true
tags: []
learnMoreLink:
- name: Azure CNI Dynamic IP Allocation
url: "https://learn.microsoft.com/azure/aks/configure-azure-cni-dynamic-ip-allocation"
- description: Subscription core quota should be increased if Node pool auto-scale settings exceed the quota
aprlGuid: a01afc4c-7439-4919-b2da-3565992ea2a7
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Active
longDescription: |
Node pool settings should not exceed the subscription core quota to ensure AKS can scale out nodes efficiently, meeting increased demand while reducing resource constraints and potential service disruptions.
potentialBenefits: Reduced disruptions
pgVerified: false
automationAvailable: false
tags: []
learnMoreLink:
- name: Azure Quotas
url: "https://learn.microsoft.com/azure/quotas/quotas-overview"
- description: Use Azure Linux for Linux nodepools
aprlGuid: f46b0d1d-56ef-4795-b98a-f6ee00cb341a
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Disabled
longDescription: |
Azure Linux on AKS boosts resiliency with a native image using validated, source-built components. It's lightweight, reducing the attack surface and maintenance. A Microsoft-hardened kernel, optimized for Azure, enhances stability and security for container workloads.
potentialBenefits: Reduced disruptions
pgVerified: false
automationAvailable: true
tags: []
learnMoreLink:
- name: Azure Linux
url: "https://learn.microsoft.com/azure/aks/use-azure-linux"
- description: Deploy at least two replicas of your application
aprlGuid: 9200aca6-0e83-4749-a5eb-e3939367bdc2
recommendationTypeId: null
recommendationControl: HighAvailability
recommendationImpact: High
recommendationResourceType: Microsoft.ContainerService/managedClusters
recommendationMetadataState: Disabled
longDescription: |
Deploying at least two replicas of your application ensures that your application is highly available and can tolerate node failures.
potentialBenefits: Ensures high app availability
pgVerified: false
automationAvailable: false
tags: []
learnMoreLink:
- name: Multi-replica apps
url: "https://learn.microsoft.com/azure/aks/best-practices-app-cluster-reliability#multi-replica-applications"