<!-- markdownlint-disable MD013 MD024 MD025 MD026 MD041 MD001 -->
<!-- This disables line length, duplicate headings, multiple top-level headings, heading punctuation, first heading, and heading increment rules -->

# Comparative Environmental and Technical Overview of Modern AI Models

This document compiles publicly reported and estimated technical and environmental data for three leading AI models (**OpenAI GPT-4**, **Anthropic Claude 3 Haiku**, and **Google Gemini Nano**), focusing on energy, water, and sustainability context.

---
## Model 1: GPT-4 (OpenAI)

### Model Name & Provider

**GPT-4**, developed by **OpenAI**.

### Hosting / Deployment

Hosted via **Microsoft Azure OpenAI Service**, which operates on Azure's global data centers (specific regions not publicly disclosed).
Source: [Introducing GPT-4 in Azure OpenAI Service – Microsoft Azure Blog](https://azure.microsoft.com/en-us/blog/introducing-gpt-4-in-azure-openai-service/)

### Estimated Model Size / Architecture

GPT-4 is widely considered a **frontier model** employing a **Sparse Mixture-of-Experts (MoE)** architecture.
This structure activates only a subset of parameters per inference, optimizing efficiency while maintaining scale.
Estimated total parameters exceed **1 trillion**.

Sources:

- [*Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks* – arXiv (2508.18672v2)](https://arxiv.org/html/2508.18672v2)
- [*Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models* – arXiv (2501.12370v2)](https://arxiv.org/html/2501.12370v2)
### Estimated Energy (Inference)

Estimates vary between studies:

- **Epoch AI (2024):** ≈ 0.3 Wh (0.0003 kWh) per query under typical load.
  Source: [Epoch AI – *How Much Energy Does ChatGPT Use?*](https://epoch.ai/gradient-updates/how-much-energy-does-chatgpt-use)
- **Other analysts:** 0.3 – 1.8 Wh (0.0003 – 0.0018 kWh), depending on token count and hardware.

**Note:** OpenAI has not released official inference energy data.
Values depend on hardware (GPU vs TPU), data-center PUE, and carbon intensity.

### Training Energy Estimates

Extrapolated from compute budgets and model size:
≈ 51,772,500 – 62,318,750 kWh (≈ 51.8 – 62.3 GWh) for full-scale training.
Source: [*The Carbon Footprint of ChatGPT* – Sustainability by Numbers](https://sustainabilitybynumbers.com/how-much-energy-does-chatgpt-use/)
### Water Usage

Water consumption derives from data-center cooling processes:

- A single ChatGPT query may indirectly consume ≈ 0.5 L of water.
- Generating a 100-word email ≈ 0.14 kWh energy + 0.52 L water.

Source: [The Verge – *Sam Altman on ChatGPT Energy and Water Use*](https://www.theverge.com/news/685045/sam-altman-average-chatgpt-energy-water)
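Dividing the two email figures above gives an implied water intensity. This is a quick sketch of that derived ratio using only the cited numbers; it is not a published WUE metric:

```python
# Implied water intensity from the 100-word-email figures above
# (~0.14 kWh of energy and ~0.52 L of water for the same email).
energy_kwh = 0.14
water_litres = 0.52

litres_per_kwh = water_litres / energy_kwh
print(round(litres_per_kwh, 1))  # ~3.7 L of water per kWh
```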
### PUE / CI Context Used in Studies

Environmental analyses generally apply:

- **PUE (Power Usage Effectiveness):** ≈ 1.1 – 1.3 (Azure hyperscale data centers)
- **CI (Carbon Intensity):** ≈ 0.3 – 0.4 kg CO₂e / kWh (depending on regional grid mix)
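Combining the per-query energy range with these PUE and CI assumptions yields a per-query carbon estimate. A minimal sketch using illustrative midpoints (1.05 Wh/query, PUE 1.2, CI 0.35 kg CO₂e/kWh), not official OpenAI or Azure figures:

```python
def carbon_per_query_g(it_energy_wh: float, pue: float, ci_kg_per_kwh: float) -> float:
    """Grams of CO2e per query: IT energy scaled by data-center
    overhead (PUE), then by grid carbon intensity (CI)."""
    facility_kwh = it_energy_wh / 1000 * pue
    return facility_kwh * ci_kg_per_kwh * 1000  # kg -> g

# Midpoints of the ranges above: 1.05 Wh/query, PUE 1.2, CI 0.35 kg CO2e/kWh
print(round(carbon_per_query_g(1.05, 1.2, 0.35), 2))  # ~0.44 g CO2e per query
```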
---

## Model 2: Claude 3 Haiku (Anthropic)

### Model Description

**Claude 3 Haiku** is the smallest and fastest member of Anthropic's Claude 3 model family (Haiku, Sonnet, Opus), released March 2024.
It is optimized for low-latency, energy-efficient applications such as chatbots, summarization, and enterprise automation.

Sources:

- [Anthropic Blog – *Introducing the Claude 3 Model Family*](https://www.anthropic.com/news/claude-3-family)
- [Anthropic – Claude 3.7 Sonnet Announcement](https://www.anthropic.com/news/claude-3-7-sonnet)
### Hosting / Deployment

Claude 3 Haiku is available through Anthropic's API and **AWS Bedrock**.
AWS data centers maintain an average PUE of ≈ 1.2.
Sources:

- [AWS Bedrock – *Use Claude on Bedrock*](https://aws.amazon.com/bedrock/claude/)
- [AWS Sustainability Report 2024](https://sustainability.aboutamazon.com/reporting)

### Estimated Model Size / Architecture

Anthropic has not disclosed parameter counts. Unofficial community estimates place Claude 3 Haiku at ≈ 20 B parameters, with the largest model in the family (Claude 3 Opus) speculated at up to ≈ 2 T parameters.
Source: [ClaudeAI Community Discussion (Reddit)](https://www.reddit.com/r/ClaudeAI/)
### Estimated Energy (Inference)

Anthropic does not publish per-query energy figures.
Based on 10–30 B parameter transformers: ≈ 0.05 – 0.1 Wh (0.00005 – 0.0001 kWh) per query.
Haiku is roughly 5× more efficient than Claude 3 Sonnet or Opus.
Sources:

- [Epoch AI – Machine Learning Trends (for compute/power scaling)](https://epoch.ai/trends)
- [Anthropic – Claude 3.7 Sonnet Announcement](https://www.anthropic.com/news/claude-3-7-sonnet)

### Training Energy Estimates

Claude 3 models are trained on GPU clusters (NVIDIA A100/H100) via AWS.
Typical training energy for models of this scale: ≈ 3,000 – 10,000 MWh.
Sources:

- [Epoch AI – Machine Learning Trends (for compute/power scaling)](https://epoch.ai/trends)
- [Anthropic Responsible Scaling Policy](https://www-cdn.anthropic.com/872c653b2d0501d6ab44cf87f43e1dc4853e4d37.pdf)
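A range like 3,000 – 10,000 MWh can be reproduced from fleet-level assumptions. The GPU count, per-GPU power draw, and run duration below are hypothetical illustrations of how such a figure arises, not Anthropic-published numbers:

```python
def training_energy_mwh(num_gpus: int, gpu_power_kw: float,
                        days: float, pue: float) -> float:
    """Facility-level training energy in MWh: GPU power x runtime, scaled by PUE."""
    it_mwh = num_gpus * gpu_power_kw * days * 24 / 1000
    return it_mwh * pue

# e.g. 4,000 H100-class GPUs at ~0.7 kW each, running 60 days, PUE 1.2
print(round(training_energy_mwh(4000, 0.7, 60, 1.2)))  # ~4838 MWh, inside the cited range
```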
### Water Usage

Anthropic does not publish direct water-use figures; consumption depends on AWS cooling efficiency and water-recycling policies.
Sources:

- [AWS Water Stewardship Report](https://sustainability.aboutamazon.com/2024-amazon-sustainability-report-aws-summary.pdf)
- [Anthropic Responsible Scaling Policy](https://www-cdn.anthropic.com/872c653b2d0501d6ab44cf87f43e1dc4853e4d37.pdf)

### PUE / CI Context Used in Studies

- **PUE:** ≈ 1.2 (AWS average)
- **CI:** ≈ 0 – 0.2 kg CO₂e / kWh (based on regional renewable mix)

AWS targets **100 % renewable energy by 2025**.
Sources:

- [AWS Global Infrastructure Efficiency Data](https://sustainability.aboutamazon.com/2024-amazon-sustainability-report-aws-summary.pdf)
- [Anthropic Responsible Scaling Policy](https://www-cdn.anthropic.com/872c653b2d0501d6ab44cf87f43e1dc4853e4d37.pdf)
---

## Model 3: Gemini Nano (Google DeepMind)

### Model Name & Provider

**Gemini Nano**, developed by **Google DeepMind**, is the smallest member of the Gemini family (Nano, Pro, Ultra).
Source:

- [Google AI Blog – *Introducing Gemini*](https://blog.google/technology/ai/google-gemini-ai/)

### Hosting / Deployment

Runs **on-device** through Android's **AICore** system (launched in Android 14).
Deployed on optimized hardware (e.g., Pixel 8 Pro, Pixel 9 series).
This local processing approach avoids per-query cloud compute energy and network latency.
Additional coverage:

- [Android Developers – *Gemini Nano Overview*](https://developer.android.com/ai/gemini-nano)

### Estimated Model Size / Architecture

Deployed in quantized versions:

- **Nano-1:** ≈ 1.8 B parameters
- **Nano-2:** ≈ 3.25 B parameters

Reference: [Exploding Topics – AI Model Parameters Database](https://explodingtopics.com/blog/gpt-parameters)
### Estimated Energy (Inference)

- **Median cloud Gemini inference:** ≈ 0.24 Wh per text prompt.
- **On-device Nano estimate:** ≈ 0.01 Wh per query (benchmarks + design targets).

Note: Official Nano inference measurements are not yet public.
Source: [Google Cloud Blog – *Measuring the Environmental Impact of AI Inference*](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)
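Taking the two figures above at face value gives the scale of the on-device advantage; a one-line check (the on-device number is itself an estimate, so this ratio is indicative only):

```python
cloud_wh = 0.24      # median cloud Gemini text prompt (Google-reported)
on_device_wh = 0.01  # estimated Gemini Nano on-device query
print(round(cloud_wh / on_device_wh))  # roughly 24x lower energy per query on-device
```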
### Training Energy Estimates

Gemini Nano was distilled from larger Gemini models trained on **Google TPU v5e clusters**.
Training energy estimated at ≈ 200 – 1,200 MWh (total, amortized across billions of devices).
Sources:

- [Google Cloud TPU Documentation](https://cloud.google.com/tpu/docs/)
- [Google Cloud Blog – Environmental Impact of AI Inference](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)
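Because training is a one-time cost spread over a large installed base, the per-device share is tiny. A sketch using the midpoint of the range above and an assumed device count (the one-billion figure is an illustration, not a Google-reported number):

```python
def amortized_wh_per_device(training_mwh: float, devices: float) -> float:
    """One-time training energy amortized across deployed devices, in Wh each."""
    return training_mwh * 1_000_000 / devices  # MWh -> Wh

# 700 MWh (midpoint of 200-1,200) over an assumed 1 billion devices
print(amortized_wh_per_device(700, 1e9))  # 0.7 Wh per device, once
```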
### Water Usage

- **Inference:** Zero data-center water use (on-device).
- **Training:** Performed in Google data centers; for context, Google reports ≈ 0.26 mL of water consumed per median cloud query.

Source: [Google Cloud Blog – *Measuring the Environmental Impact of AI Inference*](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)
### PUE / CI Context Used in Studies

- **Average PUE:** 1.10 – 1.12 (Google data centers)
- **Carbon intensity (CI):** ≈ 0.03 g CO₂e per query (market-based)

This reflects Google's near-total renewable energy purchasing.
Source: [Google Cloud Blog – Environmental Impact of AI Inference](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)

---

## Summary
| Model | Developer | Hosting Type | Est. Parameters | Inference Energy (Wh/query) | Training Energy (MWh) | PUE | CI |
|:------|:----------|:-------------|:----------------|:----------------------------|:----------------------|:----|:---|
| GPT-4 | OpenAI | Cloud (Azure) | ≈ 1 T+ (MoE) | 0.3 – 1.8 | ≈ 51,800 – 62,300 | 1.1 – 1.3 | 0.3 – 0.4 kg CO₂e/kWh |
| Claude 3 Haiku | Anthropic | Cloud (AWS Bedrock) | ≈ 20 B | 0.05 – 0.1 | 3,000 – 10,000 | ≈ 1.2 | 0 – 0.2 kg CO₂e/kWh |
| Gemini Nano | Google DeepMind | On-Device (Android AICore) | 1.8 – 3.25 B | ≈ 0.01 (on-device) | 200 – 1,200 | 1.10 – 1.12 | ≈ 0.03 g CO₂e/query |
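The per-query columns in the table scale linearly with traffic, which is how fleet-level totals are usually projected. A small helper; the daily query volume is an arbitrary assumption for illustration:

```python
def annual_kwh(wh_per_query: float, queries_per_day: float) -> float:
    """Yearly energy in kWh for a given per-query cost and daily volume."""
    return wh_per_query * queries_per_day * 365 / 1000

# 1 Wh/query (upper-mid of the GPT-4 range) at an assumed 10 million queries/day
print(round(annual_kwh(1.0, 10_000_000)))  # 3,650,000 kWh/year (~3.65 GWh)
```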
---