<!-- markdownlint-disable MD025 -->
<!-- Multiple top-level headings needed for each model section -->

# Comparative Environmental and Technical Overview of Modern AI Models

This document compiles publicly reported and estimated technical and environmental data for three leading AI models (OpenAI GPT-4, Anthropic Claude 3 Haiku, and Google Gemini Nano), focusing on energy, water, and sustainability context.

---

## Model 1: GPT-4 (OpenAI)

### Model Name & Provider

**GPT-4**, developed by **OpenAI**.

### Hosting / Deployment

Hosted via **Microsoft Azure OpenAI Service**, which operates on Azure's global data centers (specific regions not publicly disclosed).
Source: [Introducing GPT-4 in Azure OpenAI Service – Microsoft Azure Blog](https://azure.microsoft.com/en-us/blog/introducing-gpt-4-in-azure-openai-service/)

### Estimated Model Size / Architecture

GPT-4 is widely considered a **frontier model** employing a **Sparse Mixture-of-Experts (MoE)** architecture.
This structure activates only a subset of parameters per inference, improving efficiency while maintaining scale.
Estimated total parameters exceed **1 trillion** (widely reported, not officially confirmed by OpenAI).

Sources:

- [*Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks* – arXiv (2508.18672v2)](https://arxiv.org/html/2508.18672v2)
- [*Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models* – arXiv (2501.12370v2)](https://arxiv.org/html/2501.12370v2)

### Estimated Energy (Inference)

Estimates vary between studies:

- **Epoch AI (2024):** ≈ 0.3 Wh (0.0003 kWh) per query under typical load.
  Source: [Epoch AI – *How Much Energy Does ChatGPT Use?*](https://epoch.ai/blog/how-much-energy-does-chatgpt-use)
- **Other analysts:** 0.3–1.8 Wh (0.0003–0.0018 kWh), depending on token count and hardware.

**Note:** OpenAI has not released official inference energy data.
Values depend on hardware (GPU vs TPU), data-center PUE, and carbon intensity; the sketch below shows how such per-query figures are typically derived.
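
To make these dependencies concrete, here is a minimal sketch of how analysts typically back out a per-query figure from accelerator power draw, serving throughput, and reply length. Every input below (GPU count per replica, power draw, tokens per second, tokens per reply, utilization) is an illustrative assumption, not an OpenAI disclosure.

```python
# Rough per-query inference energy estimate (all inputs are illustrative assumptions).

def per_query_wh(gpus_per_replica: int,
                 gpu_power_kw: float,
                 tokens_per_second: float,
                 tokens_per_reply: float,
                 utilization: float) -> float:
    """Energy in Wh attributed to one reply, at the IT-equipment level."""
    reply_seconds = tokens_per_reply / tokens_per_second
    replica_power_kw = gpus_per_replica * gpu_power_kw
    # kW * s -> kWh (divide by 3600), then -> Wh (multiply by 1000);
    # dividing by utilization charges idle capacity to the queries served.
    return replica_power_kw * reply_seconds / 3600 * 1000 / utilization


if __name__ == "__main__":
    # Assumed serving setup: 8 accelerators at 0.4 kW each, ~600 tokens/s of
    # batched throughput per replica, ~400 output tokens per reply, 70% utilization.
    estimate = per_query_wh(gpus_per_replica=8, gpu_power_kw=0.4,
                            tokens_per_second=600, tokens_per_reply=400,
                            utilization=0.7)
    print(f"~{estimate:.2f} Wh per query at the IT-equipment level")
```

With these assumptions the result lands near 0.85 Wh, inside the analyst range quoted above; changing any single input shifts it substantially, which is why published estimates disagree.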

### Training Energy Estimates

Extrapolated from compute budgets and model size (an illustrative reconstruction follows below):
≈ 51,772,500–62,318,750 kWh (≈ 51.8–62.3 GWh) for full-scale training.
These are indirect estimates, not official OpenAI disclosures.
Source: [*The Carbon Footprint of ChatGPT* – Sustainability by Numbers](https://sustainabilitybynumbers.com/how-much-energy-does-chatgpt-use/)
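
Estimates of this kind are usually produced by a short chain of approximations: total training FLOPs from parameter and token counts, accelerator-hours at an assumed sustained throughput, then energy via per-accelerator power and PUE. The sketch below shows that chain; every input is an assumption chosen only to illustrate the mechanics, not a figure reported by OpenAI or by the cited article.

```python
# Illustrative training-energy extrapolation (every input is an assumption).

def training_energy_gwh(active_params: float, tokens: float,
                        sustained_flops_per_gpu: float,
                        gpu_power_kw: float, pue: float) -> float:
    train_flops = 6 * active_params * tokens            # common 6*N*D approximation
    gpu_hours = train_flops / sustained_flops_per_gpu / 3600
    it_energy_kwh = gpu_hours * gpu_power_kw             # IT-equipment energy
    return it_energy_kwh * pue / 1e6                     # facility energy, in GWh


if __name__ == "__main__":
    gwh = training_energy_gwh(
        active_params=280e9,             # assumed active (not total) parameters
        tokens=15e12,                    # assumed training tokens
        sustained_flops_per_gpu=1.2e14,  # ~38% utilization of an A100-class GPU
        gpu_power_kw=0.75,               # assumed draw incl. host/interconnect share
        pue=1.2,                         # assumed data-center overhead
    )
    print(f"~{gwh:.0f} GWh of facility energy for training")
```

Under these assumptions the output lands in the same tens-of-GWh band as the cited range; the result is highly sensitive to the assumed token count and hardware utilization.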

### Water Usage

Water consumption stems mainly from data-center cooling (a quick consistency check follows the list):

- A single ChatGPT query may indirectly consume ≈ 0.5 L of water.
- Generating a 100-word email may use ≈ 0.14 kWh of energy and 0.52 L of water.

Source: [The Verge – *Sam Altman on ChatGPT Energy and Water Use*](https://www.theverge.com/2023/11/16/sam-altman-chatgpt-energy-water-use)
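
The two quoted figures imply a combined water intensity (on-site cooling plus upstream electricity) that can be checked against the per-query energy estimates given earlier. The only inputs below are numbers already cited in this document.

```python
# Sanity check of the quoted water figures (uses only numbers cited above).

email_energy_kwh = 0.14   # quoted energy for a 100-word email
email_water_l = 0.52      # quoted water for the same email

implied_l_per_kwh = email_water_l / email_energy_kwh
print(f"Implied water intensity: {implied_l_per_kwh:.1f} L/kWh")

# Applying that intensity to the 0.3-1.8 Wh per-query energy range quoted earlier:
for wh in (0.3, 1.8):
    litres = (wh / 1000) * implied_l_per_kwh   # Wh -> kWh, then kWh -> L
    print(f"{wh:.1f} Wh/query -> {litres * 1000:.1f} mL of water")
```

At that intensity a single query corresponds to a few millilitres, far below the ≈ 0.5 L headline figure; the gap is a reminder that the circulating water and energy numbers come from different studies with different scopes and should not be combined uncritically.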

### PUE / CI Context Used in Studies

Environmental analyses generally multiply compute energy by two factors (a worked example follows the list):

- **PUE (Power Usage Effectiveness):** ≈ 1.1–1.3 for Azure hyperscale data centers (total facility power divided by IT power)
- **CI (Carbon Intensity):** ≈ 0.3–0.4 kg CO₂e/kWh, depending on regional grid mix
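
A minimal worked example of how these two factors convert a per-query IT-energy figure into facility energy and emissions; the per-query value is a mid-range pick from the inference estimates above, and the PUE and CI values are taken from the ranges just listed.

```python
# Converting per-query IT energy into facility energy and CO2e emissions.

it_energy_kwh = 0.001    # ~1 Wh per query, mid-range of the estimates above
pue = 1.2                # within the 1.1-1.3 Azure assumption
ci_kg_per_kwh = 0.35     # midpoint of the 0.3-0.4 kg CO2e/kWh range

facility_kwh = it_energy_kwh * pue
emissions_g = facility_kwh * ci_kg_per_kwh * 1000

print(f"Facility energy per query:   {facility_kwh * 1000:.2f} Wh")
print(f"Emissions per query:         {emissions_g:.2f} g CO2e")
print(f"Emissions per 1,000 queries: {emissions_g * 1000:.0f} g CO2e")
```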

---

## Model 2: Claude 3 Haiku (Anthropic)

### Model Description

**Claude 3 Haiku** is the smallest and fastest member of Anthropic's Claude 3 model family (Haiku, Sonnet, Opus), released in March 2024.
It is optimized for low-latency, energy-efficient applications such as chatbots, summarization, and enterprise automation.

Sources:

- [Anthropic Blog – *Introducing the Claude 3 Model Family*](https://www.anthropic.com/news/claude-3-models)
- [Anthropic Responsible Scaling Policy](https://www.anthropic.com/policies/responsible-scaling-policy)

### Hosting / Deployment

Claude 3 Haiku is available through Anthropic's API and **AWS Bedrock**.
AWS data centers maintain an average PUE of ≈ 1.2.

Sources:

- [AWS Bedrock – *Use Claude on Bedrock*](https://aws.amazon.com/bedrock/claude/)
- [AWS Sustainability Report 2024](https://sustainability.aboutamazon.com/reporting)

### Estimated Model Size / Architecture

Community estimates place Claude 3 Haiku at ≈ 20 B parameters; Anthropic does not publicly confirm parameter counts.
The largest model in the family (Claude 3 Opus) is put at ≈ 2 T parameters by the same community sources.
Source: [ClaudeAI Community Discussion (Reddit)](https://www.reddit.com/r/ClaudeAI/)

### Estimated Energy (Inference)

Anthropic does not publish per-query energy figures.
Independent estimates for transformers of similar size (10–30 B parameters) suggest ≈ 0.05–0.1 Wh (0.00005–0.0001 kWh) per query, depending on hardware and token counts.
Haiku is reportedly ≈ 5× more efficient than Claude 3 Sonnet or Opus.
A simple workload conversion using these figures appears after the source list.

Sources:

- [Epoch AI – *Machine Learning Trends*](https://epoch.ai/blog/machine-learning-trends)
- [Anthropic Claude 3 Announcement](https://www.anthropic.com/news/claude-3-models)
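
To put that range in operational terms, here is a small conversion into daily energy for a hypothetical deployment; the request volume is an arbitrary assumption used only for illustration.

```python
# Daily IT energy for a hypothetical Claude 3 Haiku workload.

wh_per_query = (0.05, 0.1)     # estimate range quoted above
requests_per_day = 100_000     # assumed workload (illustrative)

low_kwh = wh_per_query[0] * requests_per_day / 1000
high_kwh = wh_per_query[1] * requests_per_day / 1000
print(f"IT energy: {low_kwh:.0f}-{high_kwh:.0f} kWh/day "
      f"for {requests_per_day:,} requests")
```

Facility-level consumption would be roughly 20% higher once the AWS PUE of ≈ 1.2 noted below is applied.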

### Training Energy Estimates

Claude 3 models are trained on GPU clusters (NVIDIA A100/H100) via AWS.
Typical training energy for models of this scale is estimated at ≈ 3,000–10,000 MWh, depending on the number of runs and the infrastructure used.

Sources:

- [Epoch AI – *Machine Learning Trends*](https://epoch.ai/blog/machine-learning-trends)
- [Anthropic Responsible Scaling Policy](https://www.anthropic.com/policies/responsible-scaling-policy)

### Water Usage

Anthropic has not published water-consumption data for Claude 3; deployments rely on AWS cooling efficiency and water-recycling programs, and some AWS sites use air cooling or recycle water on-site to reduce consumption.

Sources:

- [AWS Water Stewardship Report](https://sustainability.aboutamazon.com/environment/the-cloud/water-stewardship)
- [Anthropic Responsible Scaling Policy](https://www.anthropic.com/policies/responsible-scaling-policy)

### PUE / CI Context Used in Studies

The factors below feed the emissions sketch that follows this section:

- **PUE:** ≈ 1.2 (AWS average, accounting for cooling and power-delivery losses)
- **CI:** ≈ 0–0.2 kg CO₂e/kWh, based on regional renewable mix

AWS targets **100% renewable energy by 2025**.

Sources:

- [AWS Global Infrastructure Efficiency Data](https://aws.amazon.com/about-aws/sustainability/infrastructure/)
- [AWS Sustainability Report 2024](https://sustainability.aboutamazon.com/reporting)
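
A short sketch combining the per-query estimate from earlier in this section with the PUE and CI values above; all inputs are the document's quoted ranges, evaluated at illustrative midpoints.

```python
# Approximate emissions for one million Claude 3 Haiku queries.

wh_per_query = 0.075    # midpoint of the 0.05-0.1 Wh estimate above
queries = 1_000_000
pue = 1.2               # AWS average (above)
ci_kg_per_kwh = 0.1     # midpoint of the 0-0.2 kg CO2e/kWh range

it_kwh = wh_per_query * queries / 1000
facility_kwh = it_kwh * pue
emissions_kg = facility_kwh * ci_kg_per_kwh

print(f"IT energy:       {it_kwh:.0f} kWh")
print(f"Facility energy: {facility_kwh:.0f} kWh")
print(f"Emissions:       {emissions_kg:.1f} kg CO2e")
```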

---

## Model 3: Gemini Nano (Google DeepMind)

### Model Name & Provider

**Gemini Nano**, developed by **Google DeepMind**, is the smallest and most efficient member of the Gemini family (Nano, Pro, Ultra).

Sources:

- [Google AI Blog – *Introducing Gemini*](https://blog.google/technology/ai/google-gemini-ai/)
- [Android Developers – *Gemini Nano Overview*](https://developer.android.com/ai/gemini-nano)

### Hosting / Deployment

Runs **on-device** through Android's **AICore** system service (introduced with Android 14).
Deployed on optimized hardware (e.g., Pixel 8 Pro and Pixel 9 series).
This local processing approach eliminates cloud compute energy and network latency.
Additional coverage: [The Verge – *Gemini Nano Arrives on Pixel 8 Pro*](https://www.theverge.com/2023/12/6/23990823/google-gemini-ai-models-nano-pro-ultra)

### Estimated Model Size / Architecture

Gemini Nano ships in two quantized versions:

- **Nano-1:** ≈ 1.8 B parameters
- **Nano-2:** ≈ 3.25 B parameters

Reference: [Exploding Topics – AI Model Parameters Database](https://explodingtopics.com/blog/gpt-parameters)

### Estimated Energy (Inference)

- **Median cloud Gemini inference:** ≈ 0.24 Wh per text prompt.
- **On-device Nano estimate:** ≈ 0.01 Wh (0.00001 kWh) per query, based on benchmarks and design targets.

Note: Official Nano inference measurements are not yet public; the sketch below compares the two figures above.
Source: [Google Cloud Blog – *Measuring the Environmental Impact of AI Inference*](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)
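
A quick comparison of the two figures, plus what the on-device number means relative to a typical phone battery. The battery capacity is an assumption (roughly a 5,000 mAh cell at 3.85 V), not a published Gemini Nano measurement, and the two energy figures have different measurement scopes (data center vs handset).

```python
# On-device vs cloud per-prompt energy, and share of a phone battery.

cloud_wh = 0.24           # median cloud Gemini prompt (quoted above)
on_device_wh = 0.01       # on-device Nano estimate (quoted above)
battery_wh = 5.0 * 3.85   # assumed ~5,000 mAh battery at 3.85 V, about 19 Wh

print(f"Cloud / on-device ratio: {cloud_wh / on_device_wh:.0f}x")
print(f"Battery share per on-device query: "
      f"{on_device_wh / battery_wh * 100:.3f}% of a full charge")
```

Under this assumption it would take on the order of two thousand on-device queries to drain one full charge.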

### Training Energy Estimates

Gemini Nano was distilled from larger Gemini models trained on **Google TPU v5e clusters**.
Training energy is estimated at ≈ 200–1,200 MWh in total, amortized across the billions of devices that can run the model (an amortization sketch follows the source list).

Sources:

- [Google Cloud TPU Documentation](https://cloud.google.com/tpu/docs/)
- [Google Cloud Blog – Environmental Impact of AI Inference](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)
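
A sketch of what "amortized across billions of devices" means in practice: dividing the estimated one-time training energy by an assumed deployed fleet and per-device query volume. The fleet size and query count are placeholders for illustration, not Google figures.

```python
# Amortizing Gemini Nano's estimated training energy across deployed devices.

training_mwh_range = (200, 1_200)   # estimate range quoted above
devices = 1_000_000_000             # assumed deployed fleet (illustrative)
queries_per_device = 1_000          # assumed queries per device (illustrative)

for mwh in training_mwh_range:
    wh_total = mwh * 1_000_000      # 1 MWh = 1,000 kWh = 1,000,000 Wh
    per_device_wh = wh_total / devices
    per_query_milliwatt_hours = per_device_wh / queries_per_device * 1000
    print(f"{mwh:>5} MWh -> {per_device_wh:.2f} Wh per device, "
          f"{per_query_milliwatt_hours:.2f} mWh per query")
```

Even at the top of the range, the amortized training energy works out to about a tenth of the ≈ 0.01 Wh (10 mWh) per-query inference estimate above.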

### Water Usage

- **Inference:** Zero data-center water use (runs on-device).
- **Training:** Uses Google data centers, which report ≈ 0.26 mL of water consumed per median cloud query in recent analyses.

Source: [Google Cloud Blog – *Measuring the Environmental Impact of AI Inference*](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)

### PUE / CI Context Used in Studies

- **Average PUE:** ≈ 1.10–1.12 (Google data-center fleet)
- **Carbon intensity (CI):** ≈ 0.03 g CO₂e per median cloud query (market-based)

This reflects Google's near-total renewable energy purchasing.
Source: [Google Cloud Blog – Environmental Impact of AI Inference](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)

---

## Summary

The table below consolidates the figures above; a small script after the table recomputes approximate per-1,000-query footprints for the two cloud-hosted models.

| Model | Developer | Hosting Type | Est. Parameters | Inference Energy (Wh/query) | Training Energy (MWh) | PUE | CI |
|:------|:----------|:-------------|:----------------|:----------------------------|:----------------------|:----|:---|
| GPT-4 | OpenAI | Cloud (Azure) | ≈ 1 T+ (MoE) | 0.3–1.8 | ≈ 51,000–62,000 | 1.1–1.3 | 0.3–0.4 kg CO₂e/kWh |
| Claude 3 Haiku | Anthropic | Cloud (AWS Bedrock) | ≈ 20 B | 0.05–0.1 | 3,000–10,000 | ≈ 1.2 | 0–0.2 kg CO₂e/kWh |
| Gemini Nano | Google DeepMind | On-device (Android AICore) | 1.8–3.25 B | ≈ 0.01 (on-device) | 200–1,200 | 1.10–1.12 | ≈ 0.03 g CO₂e/query (market-based) |
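
As a cross-check of the table, a small script that recomputes approximate per-1,000-query facility energy and emissions for the two cloud-hosted models, using midpoints of the ranges listed above. Gemini Nano is omitted because its inference runs on-device and its carbon figure is already expressed per query.

```python
# Recompute per-1,000-query footprints for the cloud-hosted models in the table.

models = {
    # name:           (Wh/query, PUE, kg CO2e/kWh) -- midpoints of the table ranges
    "GPT-4":          (1.05, 1.2, 0.35),
    "Claude 3 Haiku": (0.075, 1.2, 0.10),
}

for name, (wh_per_query, pue, ci) in models.items():
    it_kwh = wh_per_query * 1_000 / 1_000   # Wh/query x 1,000 queries -> Wh -> kWh
    facility_kwh = it_kwh * pue
    emissions_g = facility_kwh * ci * 1_000
    print(f"{name:<15} {facility_kwh:5.2f} kWh, {emissions_g:6.1f} g CO2e per 1,000 queries")
```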

---