This document compiles published technical and environmental data, together with third-party estimates, for three leading AI models (OpenAI GPT-4, Anthropic Claude 3 Haiku, and Google Gemini Nano), focusing on energy, water, and sustainability context.
GPT-4 is developed by OpenAI and hosted via Microsoft Azure OpenAI Service, which runs on Azure's global data centers (specific regions are not publicly disclosed).
Source: Introducing GPT-4 in Azure OpenAI Service, Microsoft Azure Blog
GPT-4 is a frontier model that is widely reported, though not confirmed by OpenAI, to use a sparse Mixture-of-Experts (MoE) architecture.
An MoE model activates only a subset of its parameters for each token, which lowers compute per inference while keeping total capacity large.
Unofficial estimates put the total parameter count above 1 trillion.
Sources:
- Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks, arXiv:2508.18672v2
- Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models, arXiv:2501.12370v2
Per-query inference energy estimates vary between studies:
- Epoch AI (2024): ≈ 0.3 Wh (0.0003 kWh) per query under typical load.
Source: Epoch AI, How Much Energy Does ChatGPT Use?
- Other analysts: 0.3–1.8 Wh (0.0003–0.0018 kWh), depending on token count and hardware.
Note: OpenAI has not released official inference energy data.
Values depend on hardware (GPU vs TPU), data-center PUE, and carbon intensity.
Training energy, extrapolated from compute budgets and model size:
≈ 51,772,500–62,318,750 kWh (≈ 51.8–62.3 GWh) for full-scale training.
Source: The Carbon Footprint of ChatGPT, Sustainability by Numbers
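To put the training estimate in context, the sketch below converts the quoted kWh range to GWh and expresses it as an equivalent number of queries. It assumes the 0.3 Wh per-query figure from Epoch AI applies uniformly, which is a simplification; the function names are illustrative and every input is a third-party estimate, not official OpenAI data.

```python
# Figures quoted in the text (third-party estimates, not official data).
TRAINING_KWH_LOW = 51_772_500
TRAINING_KWH_HIGH = 62_318_750
QUERY_WH = 0.3  # Epoch AI per-query estimate, assumed constant here


def kwh_to_gwh(kwh: float) -> float:
    """Convert kilowatt-hours to gigawatt-hours (1 GWh = 1,000,000 kWh)."""
    return kwh / 1_000_000


def queries_equal_to_training(training_kwh: float, query_wh: float) -> float:
    """Number of queries whose combined energy equals the training energy."""
    return training_kwh * 1_000 / query_wh  # kWh -> Wh, then divide


for kwh in (TRAINING_KWH_LOW, TRAINING_KWH_HIGH):
    gwh = kwh_to_gwh(kwh)
    queries = queries_equal_to_training(kwh, QUERY_WH)
    print(f"{gwh:.1f} GWh is roughly {queries:.2e} queries at {QUERY_WH} Wh each")
```

Under these assumptions the training range corresponds to roughly 1.7 × 10^11 to 2.1 × 10^11 queries, so at sufficient usage volume cumulative inference energy can exceed the one-time training cost.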
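Water consumption derives from data-center cooling processes:
- A single ChatGPT query may indirectly consume ≈ 0.5 L of water.
- Generating a 100-word email: ≈ 0.14 kWh of energy and 0.52 L of water. At 140 Wh, this is far above the 0.3–1.8 Wh per-query estimates cited above, so the two sets of figures likely rest on different assumptions.
Source: The Verge, Sam Altman on ChatGPT Energy and Water Use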
Environmental analyses generally apply:
- PUE (Power Usage Effectiveness): ≈ 1.1–1.3 (Azure hyperscale data centers)
- CI (Carbon Intensity): ≈ 0.3–0.4 kg CO₂e/kWh (depending on regional grid mix)
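These two factors combine with per-query energy in a simple product: emissions = energy × PUE × CI. The sketch below applies that product to the GPT-4 ranges quoted above. It assumes the per-query energy figures exclude facility overhead; if they already include it, multiplying by PUE would double-count. The function name is illustrative.

```python
def emissions_g_per_query(energy_wh: float, pue: float, ci_kg_per_kwh: float) -> float:
    """Grams of CO2e for one query: energy (kWh) x PUE x carbon intensity."""
    energy_kwh = energy_wh / 1_000
    return energy_kwh * pue * ci_kg_per_kwh * 1_000  # kg -> g


# Best case: lowest energy, PUE, and CI from the ranges in the text.
low = emissions_g_per_query(energy_wh=0.3, pue=1.1, ci_kg_per_kwh=0.3)
# Worst case: highest energy, PUE, and CI from the ranges in the text.
high = emissions_g_per_query(energy_wh=1.8, pue=1.3, ci_kg_per_kwh=0.4)

print(f"{low:.3f} to {high:.3f} g CO2e per query")
```

With these inputs the range spans roughly 0.1 to 0.9 g CO₂e per query, nearly an order of magnitude, which shows how strongly the result depends on the assumptions chosen.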
Claude 3 Haiku is the smallest and fastest member of Anthropic's Claude 3 model family (Haiku, Sonnet, Opus), released March 2024.
It is optimized for low-latency, energy-efficient applications such as chatbots, summarization, and enterprise automation.
Claude 3 Haiku is available through Anthropic's API and AWS Bedrock.
AWS data centers maintain an average PUE of ≈ 1.2.
Unverified community estimates place Claude 3 Haiku at ≈ 20 B parameters.
The same estimates put the largest model in the family (Claude 3 Opus) at ≈ 2 T parameters.
Source: ClaudeAI Community Discussion (Reddit)
Anthropic does not publish per-query energy figures.
Estimate based on 10–30 B parameter transformers: ≈ 0.05–0.1 Wh (0.00005–0.0001 kWh) per query.
On that basis, Haiku is estimated to be ≈ 5× more energy-efficient per query than Claude 3 Sonnet or Opus.
Claude 3 models are trained on GPU clusters (NVIDIA A100/H100) via AWS.
Typical training energy for models of this scale: ≈ 3,000–10,000 MWh.
Sources:
- Epoch AI, Machine Learning Trends (for compute/power scaling)
- Anthropic Responsible Scaling Policy
Anthropic does not publish direct water-use figures; it relies on AWS cooling efficiency and water recycling policies.
- PUE: ≈ 1.2 (AWS average)
- CI: ≈ 0–0.2 kg CO₂e/kWh (based on regional renewable mix)
AWS targets 100% renewable energy by 2025.
Sources:
- AWS Global Infrastructure Efficiency Data
- Anthropic Responsible Scaling Policy
Gemini Nano, developed by Google DeepMind, is the smallest member of the Gemini family (Nano, Pro, Ultra).
It runs on-device through Android's AICore system (launched in Android 14) and is deployed on optimized hardware (e.g., Pixel 8 Pro, Pixel 9 series).
This local processing approach avoids cloud compute energy and network latency for inference; the energy is drawn from the device instead.
Additional coverage:
- Android Developers, Gemini Nano Overview
Deployed in quantized versions:
- Nano-1: ≈ 1.8 B parameters
- Nano-2: ≈ 3.25 B parameters
Reference: Exploding Topics, AI Model Parameters Database
- Median cloud Gemini inference: ≈ 0.24 Wh per text prompt.
- On-device Nano estimate: ≈ 0.01 Wh per query (based on benchmarks and design targets).
Note: Official Nano inference measurements are not yet public.
Source: Google Cloud Blog, Measuring the Environmental Impact of AI Inference
Gemini Nano was distilled from larger Gemini models trained on Google TPU v5e clusters.
Training energy is estimated at ≈ 200–1,200 MWh in total, a one-time cost amortized across billions of devices.
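The amortization can be made concrete with a short calculation. The sketch below assumes a hypothetical fleet of exactly one billion devices (the text says only "billions") and reuses the ≈ 0.01 Wh on-device query estimate to express each device's share of training energy as an equivalent number of queries. All names are illustrative.

```python
WH_PER_MWH = 1_000_000

# Hypothetical fleet size: the text says "billions of devices" without a number.
ASSUMED_DEVICES = 1_000_000_000
NANO_QUERY_WH = 0.01  # on-device estimate quoted in the text


def training_wh_per_device(training_mwh: float, devices: int) -> float:
    """Share of total training energy attributable to one device, in Wh."""
    return training_mwh * WH_PER_MWH / devices


for training_mwh in (200, 1_200):
    share_wh = training_wh_per_device(training_mwh, ASSUMED_DEVICES)
    equivalent_queries = share_wh / NANO_QUERY_WH
    print(
        f"{training_mwh} MWh -> {share_wh:.1f} Wh per device "
        f"(about {equivalent_queries:.0f} on-device queries)"
    )
```

Under these assumptions each device carries 0.2 to 1.2 Wh of training energy, comparable to a few dozen to roughly a hundred on-device queries. A larger fleet would lower the per-device share proportionally.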
- Inference: Zero data-center water use (on-device).
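- Training: Runs in Google data centers. For scale, Google reports ≈ 0.26 mL of water per median cloud Gemini text prompt; this is a per-prompt inference figure, not a WUE ratio or a training total.
Source: Google Cloud Blog, Measuring the Environmental Impact of AI Inference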
- Average PUE: 1.10–1.12 (Google data centers)
- Emissions: ≈ 0.03 g CO₂e per median cloud query (market-based). This is a per-query figure rather than a carbon intensity per kWh.
It reflects Google's near-total renewable energy purchasing.
Source: Google Cloud Blog, Measuring the Environmental Impact of AI Inference
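Because Google's emissions figure is stated per query while the other models use a carbon intensity per kWh, the two can be put on a common footing by dividing emissions by energy. The sketch below pairs the ≈ 0.03 g CO₂e figure with the ≈ 0.24 Wh median cloud prompt figure quoted earlier. It assumes both numbers describe the same median prompt and the same accounting boundary, which the text does not state explicitly; the function name is illustrative.

```python
def implied_carbon_intensity(g_co2e_per_query: float, wh_per_query: float) -> float:
    """Implied carbon intensity in kg CO2e/kWh.

    g/Wh and kg/kWh are numerically identical (both sides scale by 1,000),
    so no extra conversion factor is needed.
    """
    if wh_per_query <= 0:
        raise ValueError("wh_per_query must be positive")
    return g_co2e_per_query / wh_per_query


ci = implied_carbon_intensity(g_co2e_per_query=0.03, wh_per_query=0.24)
print(f"Implied market-based CI: {ci:.3f} kg CO2e/kWh")
```

The result is about 0.125 kg CO₂e/kWh. This describes cloud Gemini inference, not on-device Nano, whose emissions depend on the electricity grid where each phone is charged.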
| Model | Developer | Hosting Type | Est. Parameters | Inference Energy (Wh/query) | Training Energy (MWh) | PUE | CI (kg CO₂e/kWh) |
|---|---|---|---|---|---|---|---|
| GPT-4 | OpenAI | Cloud (Azure) | ≈ 1 T+ (MoE) | 0.3–1.8 | ≈ 51.8 K–62.3 K | 1.1–1.3 | 0.3–0.4 |
| Claude 3 Haiku | Anthropic | Cloud (AWS Bedrock) | ≈ 20 B | 0.05–0.1 | ≈ 3 K–10 K | ≈ 1.2 | ≈ 0–0.2 |
| Gemini Nano | Google DeepMind | On-Device (Android AICore) | 1.8–3.25 B | ≈ 0.01 (on-device) | 200–1,200 | 1.10–1.12 | ≈ 0.03 g CO₂e/query (per query, not per kWh) |
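As a rough reading aid for the table, the sketch below encodes the per-query inference ranges and expresses each model relative to the Gemini Nano on-device estimate. The class and function names are illustrative, and the inputs are the estimates from the table, so the ratios inherit all of their uncertainty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEstimate:
    name: str
    wh_low: float   # lower bound of per-query energy estimate, Wh
    wh_high: float  # upper bound of per-query energy estimate, Wh


# Values copied from the table; all are estimates, not measurements.
MODELS = [
    ModelEstimate("GPT-4", 0.3, 1.8),
    ModelEstimate("Claude 3 Haiku", 0.05, 0.1),
    ModelEstimate("Gemini Nano", 0.01, 0.01),
]


def ratio_to_baseline(model: ModelEstimate, baseline: ModelEstimate) -> tuple[float, float]:
    """Return (low, high) energy ratios of a model against a baseline model."""
    return (model.wh_low / baseline.wh_low, model.wh_high / baseline.wh_high)


baseline = MODELS[-1]  # Gemini Nano
for model in MODELS:
    low, high = ratio_to_baseline(model, baseline)
    print(f"{model.name}: {low:.0f}x to {high:.0f}x the Gemini Nano estimate")
```

On these figures GPT-4 comes out at roughly 30 to 180 times the Gemini Nano estimate per query, and Claude 3 Haiku at roughly 5 to 10 times. The comparison covers energy only and ignores differences in task capability between the models.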