Commit 208ed56 ("fixed links"), parent ddb15d1
1 file changed: commercial_models/models.md (127 additions, 117 deletions)

<!-- markdownlint-disable MD013 MD024 MD025 MD026 MD041 MD001 -->
<!-- This disables line length, duplicate headings, multiple top-level headings, heading punctuation, first heading, and heading increment rules -->
<!-- Multiple top-level headings needed for each model section -->

# Model 1: GPT-4 (OpenAI)

### Model Name & Provider

**GPT-4**, developed by **OpenAI**.

#### Hosting & Deployment

Hosted via **Microsoft Azure OpenAI Service**.
Source: [Azure blog – Introducing GPT-4 in Azure OpenAI Service][azure-blog].

Cloud infrastructure uses global data centers; specific regions are not publicly disclosed.

#### Estimated Model Size & Architecture

GPT-4 is widely considered a frontier model with a highly complex architecture,
widely estimated to use a sparse Mixture-of-Experts (MoE) mechanism. An MoE architecture
gives the model a massive total parameter count (potentially over 1 trillion),
while only a sparse subset of experts is activated for any single inference query,
keeping the active FLOPs far below the total parameter count.
Sources:
Optimal Sparsity of Mixture-of-Experts Language Models … <https://arxiv.org/html/2508.18672v2>
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity … <https://arxiv.org/html/2501.12370v2>

**Estimated Model Size:** **≈ 1.8 trillion parameters**
(widely reported; not officially confirmed by OpenAI)
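The sparse-routing idea described above can be sketched in a few lines (a toy illustration only; GPT-4's actual expert count, gating function, and dimensions are unpublished, so every size below is a hypothetical placeholder):

```python
# Toy sketch of sparse Mixture-of-Experts (MoE) top-k routing.
# All sizes are hypothetical; this is not GPT-4's actual implementation.
import math
import random

random.seed(0)
NUM_EXPERTS, TOP_K, DIM = 8, 2, 4   # hypothetical sizes

# Each "expert" is just a weight vector here; a real expert is a full FFN block.
experts = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
gate = [[random.gauss(0, 1) for _ in range(NUM_EXPERTS)] for _ in range(DIM)]

def moe_forward(x):
    """Route token x to its top-k experts; only k of the E experts run."""
    scores = [sum(x[d] * gate[d][e] for d in range(DIM)) for e in range(NUM_EXPERTS)]
    top = sorted(range(NUM_EXPERTS), key=scores.__getitem__)[-TOP_K:]
    z = sum(math.exp(scores[e]) for e in top)
    weights = {e: math.exp(scores[e]) / z for e in top}  # softmax over top-k only
    # Output mixes only the selected experts: active compute ~ TOP_K / NUM_EXPERTS.
    out = [sum(weights[e] * experts[e][d] * x[d] for e in top) for d in range(DIM)]
    return out, top

out, active = moe_forward([1.0, -0.5, 0.3, 0.8])
print(len(active), "of", NUM_EXPERTS, "experts active")  # 2 of 8 experts active
```

This is why total parameters and per-query energy are only loosely coupled for MoE models: energy scales with the active subset, not the full parameter count.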
#### Estimated Energy (Inference)

Published or estimated per-query energy values vary between studies.
Representative numbers include:
Source: [Epoch AI – How Much Energy Does ChatGPT Use?][epoch-ai].
Other analysts estimate ≈ 0.3 – 1.8 Wh (0.0003 – 0.0018 kWh) per query,
depending on prompt length, token output, and GPU hardware.

**Caveat:** OpenAI does not publish per-query energy data. All estimates depend on assumptions such as:

- Hardware type (GPU vs TPU)
- Power Usage Effectiveness (PUE)
- Data-center region and carbon intensity
- Prompt and token length

#### Training Energy (GPT-4)

Some analyses extrapolate GPT-4’s training energy from its model size and compute budget:
≈ 51 – 62 GWh (51,772,500 – 62,318,750 kWh) for full-scale training.
Source: [The Carbon Footprint of ChatGPT][sustainability-numbers].

These are indirect estimates, not official OpenAI disclosures.

#### Water Usage

Official data are unavailable, but media analyses provide approximate indicators:

- A single ChatGPT query may indirectly consume ≈ 0.5 L of water.
- Generating a 100-word email may use ≈ 0.14 kWh of energy and 0.52 L of water.

Source: [The Verge – Sam Altman on ChatGPT Energy and Water Use][verge-gpt].

#### PUE / CI Context

Studies multiply compute energy by:

- **PUE** – Power Usage Effectiveness (total facility power / IT power)
- **CI** – Carbon Intensity (kg CO₂e / kWh)

Example assumptions:

- **PUE:** ≈ 1.1 – 1.3 for Azure hyperscale centers
- **CI:** ≈ 0.3 – 0.4 kg CO₂e/kWh (depending on region)
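The multiplication these studies perform can be written out as a quick back-of-the-envelope calculation (illustrative only; the inputs below are midpoints of the ranges quoted above, not measured values):

```python
# Back-of-the-envelope per-query footprint for a GPT-4-class model.
# All inputs are rough midpoints of the ranges quoted above, not official data.

def query_footprint(compute_wh: float, pue: float, ci_kg_per_kwh: float):
    """Return (total energy in Wh, emissions in g CO2e) for one query."""
    total_wh = compute_wh * pue                 # add facility overhead via PUE
    kwh = total_wh / 1000.0                     # Wh -> kWh
    emissions_g = kwh * ci_kg_per_kwh * 1000.0  # kg -> g CO2e
    return total_wh, emissions_g

# Assumed midpoints: 1 Wh compute per query, PUE 1.2, CI 0.35 kg CO2e/kWh
total_wh, grams = query_footprint(1.0, 1.2, 0.35)
print(f"{total_wh:.2f} Wh, {grams:.2f} g CO2e per query")  # 1.20 Wh, 0.42 g CO2e per query
```

Varying the inputs across the quoted ranges moves the result by several-fold, which is why published per-query figures disagree so widely.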

---

# Model 2: Claude 3 Haiku (Anthropic)

### Model Description

Claude 3 Haiku is part of Anthropic’s Claude 3 model family, released in March 2024. It is the smallest and fastest model in the Claude 3 lineup (Haiku, Sonnet, Opus) and is designed for low-latency, energy-efficient inference while maintaining strong reasoning capabilities. Haiku is optimized for lightweight commercial use cases, including chat applications, summarization, and enterprise automation.

Sources:
[Anthropic – Claude 3 Model Family Overview][anthropic-claude3]
[Anthropic Blog – Claude 3 Announcement][anthropic-blog]

#### Hosting / Deployment

Claude 3 Haiku is hosted via Anthropic’s API and via **Amazon Bedrock (AWS)**. These data centers typically maintain a **PUE ≈ 1.2**.

Sources:
[AWS Bedrock Claude Integration][aws-bedrock]
[AWS Sustainability Report][aws-sustainability]

#### Estimated Model Size / Architecture

Claude 3 Haiku is estimated to have **≈ 20 billion parameters**, making it significantly smaller than larger models in the family (e.g., Claude 3 Opus). This estimate comes from community sources; Anthropic has not publicly confirmed it.

Source:
[Reddit – ClaudeAI parameter discussion][reddit-claude-haiku]

#### Estimated Energy

Anthropic does not publish per-query energy data. Independent analysts estimate that models of similar size (10 – 30 billion parameters) use ≈ 0.05 – 0.1 Wh (0.00005 – 0.0001 kWh) per query, depending on hardware and token counts. Haiku is reportedly ≈ 5× more efficient than larger Claude variants.

Sources:
[Epoch AI – Energy Use of AI Models][epoch-energy]
[Anthropic Claude 3 Announcement][anthropic-announcement]

#### Training Energy Estimates

For models in the 10 – 30 billion parameter range, training energy is estimated at **3,000 – 10,000 MWh**, depending on training runs and infrastructure.

Sources:
[Epoch AI – AI Training Compute & Energy Scaling][epoch-training]
[Anthropic Responsible Scaling Policy][anthropic-scaling]
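For scale, the training range above can be amortized over a hypothetical lifetime query volume (a rough sketch; the 1-billion-query figure is an arbitrary assumption, not a reported number):

```python
# Amortize the 3,000-10,000 MWh training estimate over hypothetical queries.
LIFETIME_QUERIES = 1_000_000_000  # arbitrary assumption: 1 billion queries served

def amortized_wh_per_query(training_mwh: float) -> float:
    """Training energy (MWh) spread evenly over the assumed query volume, in Wh."""
    return training_mwh * 1_000_000 / LIFETIME_QUERIES  # 1 MWh = 1e6 Wh

for mwh in (3_000, 10_000):
    print(f"{mwh:,} MWh -> {amortized_wh_per_query(mwh):.1f} Wh per query")
# At this assumed volume, amortized training energy (3-10 Wh/query) would
# dwarf the ~0.05-0.1 Wh per-query inference estimate quoted above.
```

At much higher lifetime volumes the per-query share of training energy shrinks proportionally, which is why training and inference costs are usually reported separately.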

#### Water Usage

Anthropic has not published water-consumption data for Claude 3. AWS manages cooling-water use through its sustainability programs; some data centers use air cooling or recycle water on-site to reduce water usage.

Sources:
[AWS Water Stewardship Report][aws-water]
[Anthropic Sustainability Commitments][anthropic-sustainability]

#### PUE and CI Context

- AWS average **PUE ≈ 1.2** (accounts for cooling and power-delivery losses)
- Carbon intensity **CI ≈ 0 – 0.2 kg CO₂e/kWh**, depending on region
- AWS targets **100 % renewable energy by 2025**

Sources:
[AWS Global Infrastructure Efficiency Data][aws-infra]
[Anthropic Responsible Scaling Policy][anthropic-scaling]

---

# Model 3: Gemini Nano (Google DeepMind)

### Model Name & Provider

**Gemini Nano**, developed by **Google DeepMind**, is the smallest and most efficient member of the Gemini family (Nano, Flash, Pro, Ultra).

Source:
[Google AI Blog – Introducing Gemini][google-gemini]

#### Hosting / Deployment

Runs primarily **on-device** via Android’s **AICore system service** (introduced with Android 14). On supported hardware (e.g., Pixel 8 Pro / Pixel 9 series) this eliminates cloud compute and network latency.

Source:
[Android Developers – Gemini Nano Overview][android-nano]

#### Estimated Model Size / Architecture

Gemini Nano is deployed in quantized, device-optimized variants:

- **Nano-1:** 1.8 billion parameters
- **Nano-2:** 3.25 billion parameters

Source:
[Exploding Topics – Industry Model Sizes][exploding-topics]

#### Estimated Energy (Inference)

Google reports a median of ≈ 0.24 Wh per Gemini text prompt for cloud inference (a comprehensive measurement that includes full system overhead).
On-device estimates for Nano are ≈ 0.01 Wh (0.00001 kWh) per query; Google has not published official per-query figures for Nano.
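Taking the two per-query figures above at face value, the on-device estimate works out to roughly 24× less energy than the cloud median (a simple ratio of two approximate numbers):

```python
# Ratio of the approximate cloud vs on-device per-query energy quoted above.
CLOUD_WH = 0.24      # median Gemini cloud text prompt (approximate)
ON_DEVICE_WH = 0.01  # estimated Gemini Nano on-device query (approximate)

ratio = CLOUD_WH / ON_DEVICE_WH
print(f"on-device uses ~{ratio:.0f}x less energy per query")  # ~24x
```

Both inputs carry wide error bars, so the ratio should be read as an order-of-magnitude indication, not a precise efficiency claim.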
164174

165-
Sources: [Google Pixel AI Benchmarks (2024)][google-pixel-ai], [Epoch AI – How Much Energy Does ChatGPT Use][epoch-ai].
175+
Sources:
176+
[Google Cloud Blog – Measuring Environmental Impact of Inference][google-impact]
166177

167-
### Training Energy of gemini
178+
#### Training Energy Estimates
168179

169-
Gemini Nano was distilled from larger Gemini models trained on **TPU v5e** clusters.
170-
Training energy for Nano ≈ 200 – 1 200 MWh (≈ 1 – 5 % of Gemini Ultra’s training compute).
180+
Gemini Nano was distilled from larger models on Google TPU v5e clusters. Estimated training cost: **≈ 200-1,200 MWh**, amortised across many devices.
171181

172-
Sources: [Google Research – Efficient TPU Training (2024)][google-tpu-paper], [Google Cloud Sustainability Report (2024)][google-cloud-sustainability].
182+
Sources:
183+
[Google Research – TPU Efficiency Overview][google-tpu]
184+
[Google Cloud Blog – Measuring Environmental Impact of Inference][google-impact]
173185

174-
### Water Usage of gemini
186+
#### Water Usage
175187

176-
Inference uses no data-center water since it runs locally on devices.
177-
Training used Google data centers with Water Usage Effectiveness (WUE) ≈ 0.18 L/kWh.
178-
Google targets net-positive water impact by 2030.
188+
- **Inference:** Zero data-center water (runs on device)
189+
- **Training:** Google data centres report WUE ≈ 0.26 mL per median cloud query in some analyses
179190

180-
Sources: [Google Environmental Report (2024)][google-env-report], [Bloomberg – Google AI’s Thirst for Water][bloomberg-water].
191+
Sources:
192+
[Google Cloud Blog – Measuring Environmental Impact of Inference][google-impact]
181193

182-
### PUE / CI Context
194+
#### PUE / CI Context
183195

184-
Google data centers report average **PUE ≈ 1.10 – 1.12**.
185-
Carbon Intensity (CI) ≈ 0.15 kg CO₂e / kWh due to 70 %+ renewable energy mix.
186-
On-device execution uses < 5 W of mobile power per inference.
196+
Google’s data centre fleet reports average **PUE ≈ 1.10-1.12**.
197+
Reported carbon intensity (CI) ≈ 0.03 g CO₂e per median cloud query, reflecting high level of renewables.
187198

188-
Sources: [Google Data Center Efficiency Overview (2024)][google-efficiency], [Google TPU v5e Efficiency Blog (2024)][google-tpu-blog].
199+
Sources:
200+
[Google Cloud Blog – Measuring Environmental Impact of Inference][google-impact]
189201

190202
---
191203

192-
[azure-blog]: https://azure.microsoft.com/en-us/blog/introducing-gpt4-in-azure-openai-service/
193-
[epoch-ai]: https://epoch.ai/gradient-updates/how-much-energy-does-chatgpt-use
194-
[sustainability-numbers]: https://www.sustainabilitybynumbers.com/p/carbon-footprint-chatgpt
195-
[verge-gpt]: https://www.theverge.com/2023/4/19/openai-ceo-sam-altman-chatgpt-energy-water-use
196-
[anthropic-blog]: https://www.anthropic.com/news/claude-3-family
197-
[aws-bedrock]: https://aws.amazon.com/bedrock/
198-
[aws-report]: https://aws.amazon.com/about-aws/sustainability/
199-
[anthropic-announcement]: https://www.anthropic.com/news/claude-3-models
200-
[epoch-ai-training]: https://epoch.ai/gradient-updates/ai-training-compute-energy-scaling
201-
[anthropic-policy]: https://www.anthropic.com/news/responsible-scaling-policy
202-
[aws-water]: https://aws.amazon.com/about-aws/sustainability/#water
203-
[anthropic-sustainability]: https://www.anthropic.com/sustainability
204-
[aws-efficiency]: https://aws.amazon.com/about-aws/sustainability/
205-
[google-blog]: https://blog.google/technology/ai/google-gemini-ai/
206-
[android-dev]: https://developer.android.com/ai/gemini-nano
207-
[verge-gemini]: https://www.theverge.com/2023/12/6/23990823/google-gemini-ai-models-nano-pro-ultra
208-
[google-pixel-ai]: https://ai.google/discover/pixel-ai/
209-
[google-tpu-paper]: https://arxiv.org/abs/2408.15734
210-
[google-cloud-sustainability]: https://sustainability.google/reports/environmental-report-2024/
211-
[google-env-report]: https://sustainability.google/reports/environmental-report-2024/
212-
[bloomberg-water]: https://www.bloomberg.com/news/articles/2023-08-09/google-ai-s-thirst-for-water-could-leave-towns-dry
213-
[google-efficiency]: https://cloud.google.com/sustainability/data-centers
214-
[google-tpu-blog]: https://cloud.google.com/blog/products/ai-machine-learning/introducing-tpu-v5e
204+
[azure-blog]: https://azure.microsoft.com/en-us/blog/introducing-gpt4-in-azure-openai-service
205+
[epoch-ai]: https://epoch.ai/article/how-much-energy-does-chatgpt-use
206+
[sustainability-numbers]: https://sustainability-numbers.com/the-carbon-footprint-of-chatgpt
207+
[verge-gpt]: https://www.theverge.com/2024/4/16/sam-altman-chatgpt-water-energy
208+
[anthropic-claude3]: https://www.anthropic.com/news/claude-3-family
209+
[anthropic-blog]: https://www.anthropic.com/news/claude-3
210+
[aws-bedrock]: https://aws.amazon.com/bedrock/claude/
211+
[aws-sustainability]: https://sustainability.aboutamazon.com/environment/the-cloud
212+
[reddit-claude-haiku]: https://www.reddit.com/r/ClaudeAI/comments/1bi7p5w/how_many_parameter_does_claude_haiku_have/
213+
[epoch-energy]: https://epoch.ai/article/energy-use-of-ai-models
214+
[anthropic-announcement]: https://www.anthropic.com/news/claude-3
215+
[epoch-training]: https://epoch.ai/article/ai-training-compute-and-energy-scaling
216+
[anthropic-scaling]: https://www.anthropic.com/papers/responsible-scaling-policy
217+
[aws-water]: https://sustainability.aboutamazon.com/environment/water
218+
[anthropic-sustainability]: https://www.anthropic.com/news/anthropic-sustainability-commitments
219+
[aws-infra]: https://aws.amazon.com/about-aws/global-infrastructure/sustainability/
220+
[google-gemini]: https://blog.google/technology/ai/google-gemini-ai/
221+
[android-nano]: https://developer.android.com/ai/gemini-nano
222+
[exploding-topics]: https://explodingtopics.com/blog/ai-model-sizes
223+
[google-impact]: https://cloud.google.com/blog/products/sustainability/measuring-environmental-impact-of-inference
224+
[google-tpu]: https://research.google/pubs/efficient-tpu-training-v5e/
