Commit 80bd7d4
Add GPU cost overview dashboards (DataDog#23455)
* Add GPU cost overview dashboards
Adds 5 dashboards to the GPU integration for monitoring GPU compute
spend and utilization across cloud providers and Kubernetes.
Dashboards:
- gpu_cost_overview: cross-cloud totals, spend by team/env, fleet utilization
- aws_gpu_cost_overview: AWS-specific with Capacity Block tracking
- azure_gpu_cost_overview: Azure GPU VM families (NC, NCv3, ND series)
- gcp_gpu_cost_overview: GCP GPU SKUs with On-Demand vs Committed coverage
- k8s_gpu_cost_overview: cluster/namespace allocation, idle cost attribution
Cost queries use cloud_cost amortized metrics, plus unblended cost for
AWS Capacity Blocks (which amortized cost does not capture). Utilization
widgets join GPU telemetry (gpu.sm_active, gpu.device.total) with
cost data for unit-economics views.
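As a rough illustration of the unit-economics arithmetic described above
(not the actual dashboard queries, which live in the dashboard JSON), the
sketch below adds Capacity Block spend to amortized spend and divides by
active GPU-hours. The function names are hypothetical, and it assumes
gpu.sm_active is averaged as a fraction between 0 and 1.

```python
from typing import Optional


def total_gpu_spend(amortized_cost: float, capacity_block_unblended: float) -> float:
    """Amortized spend plus Capacity Block spend.

    Capacity Blocks are added separately because, per the commit message,
    they are not captured in amortized cost.
    """
    return amortized_cost + capacity_block_unblended


def cost_per_active_gpu_hour(
    total_cost: float,
    gpu_count: int,
    avg_sm_active: float,  # assumption: fraction in [0, 1], not a percentage
    hours: float,
) -> Optional[float]:
    """Cost divided by GPU-hours during which the SMs were active.

    Returns None when there were no active GPU-hours, so callers do not
    divide by zero.
    """
    active_gpu_hours = gpu_count * avg_sm_active * hours
    if active_gpu_hours <= 0:
        return None
    return total_cost / active_gpu_hours


if __name__ == "__main__":
    spend = total_gpu_spend(amortized_cost=1000.0, capacity_block_unblended=200.0)
    print(cost_per_active_gpu_hour(spend, gpu_count=10, avg_sm_active=0.5, hours=24))
```

Returning None for an idle fleet keeps a "no data" state distinct from a
zero cost, which is one reasonable choice among several.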
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Refine GPU utilization metrics and dashboard polish
- Replace the unhealthy-GPU KPIs with GPU Idle % (gr_engine_active), since
gpu.device.unhealthy is non-functional
- Add Healthy GPU Rate KPI on k8s using kubernetes_state.node.gpu_allocatable
/ gpu_capacity
- Add Spend on Idle GPUs KPI to per-cloud and overview dashboards
(excludes AWS Capacity Blocks since their cost is upfront and unrelated
to engine activity)
- Standardize utilization terminology: "Average GPU Utilization %" and
"GPU Idle %" across dashboards
- Drop redundant cloud provider prefix from per-cloud widget titles and
remove the "GPU Spend" group wrapper to match k8s dashboard layout
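The KPIs listed above reduce to simple ratios. The following is a minimal
sketch of that arithmetic, not the widget formulas themselves. All function
names are hypothetical, and it assumes gr_engine_active is averaged as a
fraction between 0 and 1 and that the cost passed in already excludes AWS
Capacity Blocks.

```python
from typing import Optional


def gpu_idle_pct(avg_gr_engine_active: float) -> float:
    """GPU Idle % as the complement of average engine activity.

    Assumption: avg_gr_engine_active is a fraction in [0, 1].
    """
    return (1.0 - avg_gr_engine_active) * 100.0


def spend_on_idle_gpus(cost_excluding_capacity_blocks: float, idle_pct: float) -> float:
    """Share of spend attributed to idle time.

    Capacity Block cost is left out because it is paid upfront and is
    unrelated to engine activity.
    """
    return cost_excluding_capacity_blocks * idle_pct / 100.0


def healthy_gpu_rate(gpu_allocatable: float, gpu_capacity: float) -> Optional[float]:
    """Healthy GPU Rate as allocatable / capacity, in percent.

    Returns None when capacity is zero (no GPU nodes reported).
    """
    if gpu_capacity <= 0:
        return None
    return gpu_allocatable / gpu_capacity * 100.0


if __name__ == "__main__":
    idle = gpu_idle_pct(0.75)
    print(idle, spend_on_idle_gpus(1000.0, idle), healthy_gpu_rate(6, 8))
```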
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add GPU Monitoring setup CTA banner to per-cloud and k8s dashboards
Mirrors the existing banner from the overview dashboard so users on any
cloud-specific or Kubernetes view see the link to enable GPU Monitoring,
which populates utilization-driven widgets.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 2f31885 commit 80bd7d4
6 files changed
Lines changed: 4987 additions & 1 deletion