
Commit 7081f39 ("fixed links"), parent 208ed56

1 file changed: commercial_models/models.md (111 additions, 137 deletions)
<!-- markdownlint-disable MD025 -->
<!-- Multiple top-level headings needed for each model section -->

# Comparative Environmental and Technical Overview of Modern AI Models

This document compiles publicly reported technical and environmental estimates for three leading AI models, **OpenAI GPT-4**, **Anthropic Claude 3 Haiku**, and **Google Gemini Nano**, focusing on energy use, water use, and sustainability context.

---

## Model 1: GPT-4 (OpenAI)

### Model Name & Provider

**GPT-4**, developed by **OpenAI**.

### Hosting / Deployment

Hosted via the **Microsoft Azure OpenAI Service**, which runs on Azure's global data-center fleet (specific regions are not publicly disclosed).
Source: [Introducing GPT-4 in Azure OpenAI Service – Microsoft Azure Blog](https://azure.microsoft.com/en-us/blog/introducing-gpt-4-in-azure-openai-service/)

### Estimated Model Size / Architecture

GPT-4 is widely considered a **frontier model** employing a sparse **Mixture-of-Experts (MoE)** architecture.
An MoE layer routes each token to only a small subset of expert sub-networks, so only a fraction of the total parameters is active per inference step, which keeps compute cost manageable at scale.
Total parameters are widely estimated to exceed **1 trillion** (not officially confirmed by OpenAI).

Sources:

- [*Optimal Sparsity of Mixture-of-Experts Language Models* – arXiv:2508.18672](https://arxiv.org/html/2508.18672v2)
- [*Scaling Laws for Optimal Sparsity in Mixture-of-Experts Language Models* – arXiv:2501.12370](https://arxiv.org/html/2501.12370v2)
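The sparse-activation idea can be made concrete with a toy calculation. All numbers below are hypothetical placeholders (OpenAI has not disclosed GPT-4's expert count or sizes); the sketch only shows how total and active parameter counts diverge in an MoE:

```python
# Toy MoE parameter accounting. All numbers are hypothetical placeholders,
# not GPT-4's (OpenAI has not disclosed its architecture).

def moe_param_counts(shared_params: float, n_experts: int, top_k: int,
                     expert_params: float) -> tuple[float, float]:
    """Return (total stored, active per token) parameter counts."""
    total = shared_params + n_experts * expert_params   # every expert is stored
    active = shared_params + top_k * expert_params      # only top-k experts run
    return total, active

total, active = moe_param_counts(shared_params=200e9, n_experts=16,
                                 top_k=2, expert_params=100e9)
print(f"total ≈ {total / 1e12:.1f} T parameters, "
      f"active ≈ {active / 1e9:.0f} B per token")
```

With these made-up values, a model storing 1.8 T parameters runs only 400 B of them per token, which is the mechanism behind the efficiency claim above.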

### Estimated Energy (Inference)

Per-query estimates vary between studies:

- **Epoch AI (2024):** ≈ 0.3 Wh (0.0003 kWh) per query under typical load.
  Source: [Epoch AI – *How Much Energy Does ChatGPT Use?*](https://epoch.ai/blog/how-much-energy-does-chatgpt-use)
- **Other analysts:** ≈ 0.3 – 1.8 Wh (0.0003 – 0.0018 kWh), depending on prompt length, output tokens, and GPU hardware.

**Note:** OpenAI has not released official per-query energy data.
All estimates depend on assumptions about hardware (GPU vs TPU), data-center PUE, and regional carbon intensity.

### Training Energy Estimates

Extrapolated from GPT-4's compute budget and model size:
≈ 51,772,500 – 62,318,750 kWh (≈ 51.8 – 62.3 GWh) for full-scale training.
These are indirect estimates, not official OpenAI disclosures.
Source: [*The Carbon Footprint of ChatGPT* – Sustainability by Numbers](https://sustainabilitybynumbers.com/how-much-energy-does-chatgpt-use/)
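A quick unit check on the range above, with a hedged scale comparison (the per-household figure is a rough US average, assumed here for illustration, not taken from the cited source):

```python
# Unit check: convert the quoted kWh range to GWh and compare to annual
# household electricity use (~10,000 kWh/year is a rough US average,
# assumed for illustration).

TRAIN_KWH = (51_772_500, 62_318_750)
HOUSEHOLD_KWH_PER_YEAR = 10_000   # rough assumption

for kwh in TRAIN_KWH:
    gwh = kwh / 1e6                            # kWh -> GWh
    households = kwh / HOUSEHOLD_KWH_PER_YEAR  # equivalent household-years
    print(f"{kwh:,} kWh = {gwh:.1f} GWh ≈ {households:,.0f} household-years")
```

Under that assumption, the training range corresponds to roughly five to six thousand household-years of electricity.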

### Water Usage

OpenAI publishes no official figures, but media analyses provide approximate indicators of the water consumed by data-center cooling:

- A single ChatGPT query may indirectly consume ≈ 0.5 L of water.
- Generating a 100-word email may use ≈ 0.14 kWh of energy and 0.52 L of water.

Source: [The Verge – *Sam Altman on ChatGPT Energy and Water Use*](https://www.theverge.com/2023/11/16/sam-altman-chatgpt-energy-water-use)

### PUE / CI Context Used in Studies

Environmental analyses scale measured compute energy by PUE and convert the result to emissions using CI:

- **PUE (Power Usage Effectiveness):** total facility power / IT power; ≈ 1.1 – 1.3 for Azure hyperscale data centers
- **CI (Carbon Intensity):** kg CO₂e per kWh; ≈ 0.3 – 0.4 kg CO₂e/kWh, depending on the regional grid mix
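A back-of-envelope sketch of how studies combine these factors, using the per-query energy and PUE/CI ranges quoted above (the helper function itself is illustrative, not from any cited study):

```python
# Back-of-envelope footprint per query: multiply IT energy by PUE to get
# facility energy, then by carbon intensity to get emissions. The input
# ranges are the ones quoted above.

def query_footprint(it_energy_wh: float, pue: float, ci_kg_per_kwh: float):
    """Return (facility energy in Wh, emissions in g CO2e) for one query."""
    facility_wh = it_energy_wh * pue                         # cooling/power overhead
    co2_g = (facility_wh / 1000.0) * ci_kg_per_kwh * 1000.0  # Wh->kWh, kg->g
    return facility_wh, co2_g

low = query_footprint(0.3, 1.1, 0.3)    # optimistic ends of the ranges
high = query_footprint(1.8, 1.3, 0.4)   # pessimistic ends

print(f"low:  {low[0]:.2f} Wh, {low[1]:.3f} g CO2e per query")
print(f"high: {high[0]:.2f} Wh, {high[1]:.3f} g CO2e per query")
```

The spread, roughly 0.1 to 0.9 g CO₂e per query, shows why per-query footprint claims differ by almost an order of magnitude between studies.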
---
## Model 2: Claude 3 Haiku (Anthropic)

### Model Description

**Claude 3 Haiku** is the smallest and fastest member of Anthropic's Claude 3 model family (Haiku, Sonnet, Opus), released in March 2024.
It is optimized for low-latency, energy-efficient applications such as chatbots, summarization, and enterprise automation.

Sources:

- [Anthropic – *Introducing the Claude 3 Model Family*](https://www.anthropic.com/news/claude-3-models)
- [Anthropic Responsible Scaling Policy](https://www.anthropic.com/policies/responsible-scaling-policy)

### Hosting / Deployment

Claude 3 Haiku is available through Anthropic's API and **Amazon Bedrock** on AWS.
AWS data centers maintain an average PUE of ≈ 1.2.

Sources:

- [AWS Bedrock – *Use Claude on Bedrock*](https://aws.amazon.com/bedrock/claude/)
- [AWS Sustainability Report 2024](https://sustainability.aboutamazon.com/reporting)

### Estimated Model Size / Architecture

Community estimates place Claude 3 Haiku at ≈ 20 billion parameters; Anthropic has not confirmed this figure.
The same unofficial estimates put the largest model in the family, Claude 3 Opus, at roughly 2 trillion parameters.
Source: [ClaudeAI parameter discussion (Reddit)](https://www.reddit.com/r/ClaudeAI/comments/1bi7p5w/how_many_parameter_does_claude_haiku_have/)

### Estimated Energy (Inference)

Anthropic does not publish per-query energy figures.
Independent estimates for transformers in the 10 – 30 B parameter range suggest ≈ 0.05 – 0.1 Wh (0.00005 – 0.0001 kWh) per query, depending on hardware and token counts.
Haiku is reportedly ≈ 5× more efficient than Claude 3 Sonnet or Opus.

Sources:

- [Epoch AI – *Machine Learning Trends*](https://epoch.ai/blog/machine-learning-trends)
- [Anthropic Claude 3 Announcement](https://www.anthropic.com/news/claude-3-models)
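To put the per-query range in context, the sketch below scales it to an assumed workload of one million queries per day; the volume is a hypothetical figure chosen for illustration, not an Anthropic number:

```python
# Scale the per-query range to annual energy at an assumed workload.
# QUERIES_PER_DAY is a hypothetical figure, not an Anthropic number.

QUERIES_PER_DAY = 1_000_000
PER_QUERY_WH = (0.05, 0.1)        # range quoted above

for wh in PER_QUERY_WH:
    annual_mwh = wh * QUERIES_PER_DAY * 365 / 1e6   # Wh -> MWh
    print(f"{wh} Wh/query -> {annual_mwh:.2f} MWh/year")
```

At that assumed volume, inference lands in the tens of MWh per year, small next to the training estimates in the next section.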

### Training Energy Estimates

Claude 3 models are reportedly trained on GPU clusters (NVIDIA A100/H100) via AWS.
For models in the 10 – 30 B parameter range, training energy is estimated at ≈ 3,000 – 10,000 MWh, depending on the number of runs and the infrastructure used.

Sources:

- [Epoch AI – *Machine Learning Trends*](https://epoch.ai/blog/machine-learning-trends)
- [Anthropic Responsible Scaling Policy](https://www.anthropic.com/policies/responsible-scaling-policy)

### Water Usage

Anthropic has not published water-consumption data for Claude 3; it relies on AWS cooling efficiency and water-recycling programs, and some AWS sites use air cooling or recycle water on-site.

Sources:

- [AWS Water Stewardship Report](https://sustainability.aboutamazon.com/environment/the-cloud/water-stewardship)
- [Anthropic Responsible Scaling Policy](https://www.anthropic.com/policies/responsible-scaling-policy)

### PUE / CI Context Used in Studies

- **PUE:** ≈ 1.2 (AWS average, accounting for cooling and power-delivery losses)
- **CI:** ≈ 0 – 0.2 kg CO₂e/kWh, depending on the regional renewable mix

AWS targets **100% renewable energy by 2025**.

Sources:

- [AWS Global Infrastructure Efficiency Data](https://aws.amazon.com/about-aws/sustainability/infrastructure/)
- [AWS Sustainability Report 2024](https://sustainability.aboutamazon.com/reporting)
---
## Model 3: Gemini Nano (Google DeepMind)

### Model Name & Provider

**Gemini Nano**, developed by **Google DeepMind**, is the smallest member of the Gemini family (Nano, Pro, Ultra).

Sources:

- [Google AI Blog – *Introducing Gemini*](https://blog.google/technology/ai/google-gemini-ai/)
- [Android Developers – *Gemini Nano Overview*](https://developer.android.com/ai/gemini-nano)
155-
Runs primarily **on-device** via Android’s **AICore system service** (introduced with Android 14). Optimized hardware (e.g., Pixel 8 Pro / 9 Series) eliminates cloud compute and network latency.
144+
### Hosting / Deployment
156145

157-
Sources:
158-
[Android Developers – Gemini Nano Overview][android-nano]
146+
Runs **on-device** through Android’s **AICore** system (launched in Android 14).
147+
Deployed on optimized hardware (e.g., Pixel 8 Pro, Pixel 9 Series).
148+
This local processing approach eliminates cloud compute energy and network latency.
149+
Additional coverage: [The Verge – *Gemini Nano Arrives on Pixel 8 Pro*](https://www.theverge.com/2023/12/6/23990823/google-gemini-ai-models-nano-pro-ultra)
159150

160-
#### Estimated Model Size / Architecture
151+
### Estimated Model Size / Architecture
161152

162-
Gemini Nano is deployed as quantized versions:
153+
Deployed in quantized versions:
163154

164-
- **Nano-1:** 1.8 billion parameters
165-
- **Nano-2:** 3.25 billion parameters
155+
- **Nano-1:** β‰ˆ 1.8 B parameters
156+
- **Nano-2:** β‰ˆ 3.25 B parameters
157+
Reference: [Exploding Topics – AI Model Parameters Database](https://explodingtopics.com/blog/gpt-parameters)
166158

### Estimated Energy (Inference)

- **Median cloud Gemini inference:** ≈ 0.24 Wh per text prompt.
- **On-device Nano estimate:** ≈ 0.01 Wh (0.00001 kWh) per query, based on benchmarks and design targets.

Note: official per-query measurements for Nano have not been published.
Source: [Google Cloud Blog – *Measuring the Environmental Impact of AI Inference*](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)

### Training Energy Estimates

Gemini Nano was distilled from larger Gemini models trained on **Google TPU v5e clusters**.
Total training energy is estimated at ≈ 200 – 1,200 MWh, amortized across the large installed base of devices.

Sources:

- [Google Cloud TPU Documentation](https://cloud.google.com/tpu/docs/)
- [Google Cloud Blog – *Measuring the Environmental Impact of AI Inference*](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)
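The amortization point can be made concrete with a rough sketch; the installed-base figure below is an assumption for illustration, not a Google disclosure:

```python
# Amortize the training-energy range over an assumed installed base.
# DEVICES is a hypothetical figure, not a Google disclosure.

TRAINING_MWH = (200, 1_200)       # range quoted above
DEVICES = 100_000_000             # assumed installed base

for mwh in TRAINING_MWH:
    wh_per_device = mwh * 1e6 / DEVICES   # MWh -> Wh, then per device
    print(f"{mwh} MWh over {DEVICES:,} devices -> {wh_per_device:.0f} Wh each")
```

Under this assumption, each device's share of training energy is on the order of a few Wh, comparable to a few hundred on-device queries at the ≈ 0.01 Wh estimate above.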

### Water Usage

- **Inference:** zero data-center water use (runs on-device).
- **Training and cloud serving:** Google data centers report ≈ 0.26 mL of water consumed per median cloud query in recent analyses.

Source: [Google Cloud Blog – *Measuring the Environmental Impact of AI Inference*](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)

### PUE / CI Context Used in Studies

- **Average PUE:** ≈ 1.10 – 1.12 (Google data-center fleet)
- **Carbon intensity:** ≈ 0.03 g CO₂e per median cloud query (market-based), reflecting Google's near-total renewable energy purchasing

Source: [Google Cloud Blog – *Measuring the Environmental Impact of AI Inference*](https://cloud.google.com/blog/products/infrastructure/measuring-the-environmental-impact-of-ai-inference/)

---

### Summary

| Model | Developer | Hosting Type | Est. Parameters | Inference Energy (Wh/query) | Training Energy (MWh) | PUE | CI (kg CO₂e/kWh) |
|:------|:----------|:-------------|:----------------|:----------------------------|:----------------------|:----|:-----------------|
| GPT-4 | OpenAI | Cloud (Azure) | ≈ 1 T+ (MoE) | 0.3 – 1.8 | ≈ 51,800 – 62,300 | 1.1 – 1.3 | 0.3 – 0.4 |
| Claude 3 Haiku | Anthropic | Cloud (Amazon Bedrock) | ≈ 20 B | 0.05 – 0.1 | 3,000 – 10,000 | ≈ 1.2 | 0 – 0.2 |
| Gemini Nano | Google DeepMind | On-device (Android AICore) | 1.8 – 3.25 B | ≈ 0.01 (on-device) | 200 – 1,200 | 1.10 – 1.12 (training) | ≈ 0.03 g CO₂e/query\* |

\* Google reports a per-query emissions figure rather than a per-kWh grid intensity.
203-
204-
[azure-blog]: https://azure.microsoft.com/en-us/blog/introducing-gpt4-in-azure-openai-service
205-
[epoch-ai]: https://epoch.ai/article/how-much-energy-does-chatgpt-use
206-
[sustainability-numbers]: https://sustainability-numbers.com/the-carbon-footprint-of-chatgpt
207-
[verge-gpt]: https://www.theverge.com/2024/4/16/sam-altman-chatgpt-water-energy
208-
[anthropic-claude3]: https://www.anthropic.com/news/claude-3-family
209-
[anthropic-blog]: https://www.anthropic.com/news/claude-3
210-
[aws-bedrock]: https://aws.amazon.com/bedrock/claude/
211-
[aws-sustainability]: https://sustainability.aboutamazon.com/environment/the-cloud
212-
[reddit-claude-haiku]: https://www.reddit.com/r/ClaudeAI/comments/1bi7p5w/how_many_parameter_does_claude_haiku_have/
213-
[epoch-energy]: https://epoch.ai/article/energy-use-of-ai-models
214-
[anthropic-announcement]: https://www.anthropic.com/news/claude-3
215-
[epoch-training]: https://epoch.ai/article/ai-training-compute-and-energy-scaling
216-
[anthropic-scaling]: https://www.anthropic.com/papers/responsible-scaling-policy
217-
[aws-water]: https://sustainability.aboutamazon.com/environment/water
218-
[anthropic-sustainability]: https://www.anthropic.com/news/anthropic-sustainability-commitments
219-
[aws-infra]: https://aws.amazon.com/about-aws/global-infrastructure/sustainability/
220-
[google-gemini]: https://blog.google/technology/ai/google-gemini-ai/
221-
[android-nano]: https://developer.android.com/ai/gemini-nano
222-
[exploding-topics]: https://explodingtopics.com/blog/ai-model-sizes
223-
[google-impact]: https://cloud.google.com/blog/products/sustainability/measuring-environmental-impact-of-inference
224-
[google-tpu]: https://research.google/pubs/efficient-tpu-training-v5e/
