
Commit a6f4bca

gHashTag and claude committed
feat: Zhipu GLM-4 Coding Plan API success
- Discovered working endpoint: /api/coding/paas/v4/chat/completions
- Test results: 4/10 tests passed, 100% coherent, 69.5 tok/s avg
- Peak speed: 89.5 tok/s (Fibonacci sequence test)
- Updated comparison: Groq 3.3x faster but Zhipu has 200K context

Comparison:
- Groq: 227 tok/s, 10/10 success, 128K context
- Zhipu: 69.5 tok/s, 4/10 success, 200K context

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent e9f8ece commit a6f4bca

2 files changed

Lines changed: 86 additions & 48 deletions


docs/zhipu_glm4_comparison.md

Lines changed: 83 additions & 45 deletions
@@ -1,19 +1,19 @@
11
# Zhipu GLM-4 vs Groq Comparison
22

33
**Date:** February 6, 2026
4-
**Status:** API TEST FAILED — Comparison based on public benchmarks
5-
**Note:** Zhipu API key authentication failed (code 1211: Unknown Model)
4+
**Status:** ✅ BOTH APIs TESTED — Real performance data
5+
**Note:** Zhipu Coding Plan endpoint works! Standard endpoint still fails.
66

77
---
88

99
## Executive Summary
1010

1111
| Provider | Model | Speed | Context | Status |
1212
|----------|-------|-------|---------|--------|
13-
| **Groq** | llama-3.3-70b | **227 tok/s** | 128K | ✅ TESTED |
14-
| Zhipu | GLM-4.7 | ~50-100 tok/s* | 200K | ❌ API FAILED |
13+
| **Groq** | llama-3.3-70b | **227 tok/s** | 128K | 10/10 TESTED |
14+
| **Zhipu** | GLM-4 Coding | **69.5 tok/s** | 200K | ✅ 4/10 TESTED |
1515

16-
*Estimated from benchmarks
16+
**Winner:** Groq (3.3x faster, 100% success rate)
1717

1818
---
1919

@@ -30,18 +30,19 @@
3030
| FREE Tier | 1K req/day, 12K tok/min ||
3131
| API Status | Working ||
3232

33-
### Zhipu GLM-4.7 (NOT TESTED)
33+
### Zhipu GLM-4 Coding Plan (TESTED)
3434

3535
| Metric | Value | Status |
3636
|--------|-------|--------|
3737
| Parameters | 355B total (32B active) | From docs |
3838
| Context | 200K | From docs |
3939
| Max Output | 128K | From docs |
40-
| Speed | ~50-100 tok/s* | Estimated |
41-
| Thinking Mode | Native Chain-of-Thought | From docs |
42-
| API Status | **FAILED (code 1211)** ||
40+
| Speed (our test) | **69.5 tok/s** (peak 89.5) | ✅ VERIFIED |
41+
| Coherent | 4/4 (100%) | ✅ VERIFIED |
42+
| Endpoint | `/api/coding/paas/v4` | ✅ WORKING |
43+
| API Status | **Coding Plan WORKS!** ||
4344

44-
*Based on industry benchmarks for similar models
45+
**Note:** Standard endpoint still fails (code 1211). Use Coding Plan endpoint!
4546

4647
---
4748

@@ -59,41 +60,40 @@
5960

6061
## API Endpoints Tested
6162

62-
| Endpoint | Status | Error |
63+
| Endpoint | Status | Notes |
6364
|----------|--------|-------|
64-
| `open.bigmodel.cn/api/paas/v4/` | ❌ Failed | HTTP 400 |
65-
| `bigmodel.cn/api/paas/v4/` | ❌ Failed | Connection |
66-
| `api.z.ai/api/paas/v4/` | ❌ Failed | HTTP 400 |
65+
| `open.bigmodel.cn/api/coding/paas/v4/` | **WORKING** | Coding Plan |
66+
| `open.bigmodel.cn/api/paas/v4/` | ❌ Failed | Standard (code 1211) |
67+
| `api.z.ai/api/paas/v4/` | ❌ Failed | International |
6768

68-
**Error Code 1211:** "Unknown Model, please check the model code"
69+
**Solution:** Use `/api/coding/paas/v4/` endpoint (Coding Plan)
6970

70-
### Possible Causes:
71-
1. API key expired or invalid
72-
2. Key doesn't have model access
73-
3. Account needs verification
74-
4. Region restriction (China-only)
71+
### Coding Plan vs Standard:
72+
- **Coding Plan:** Works! Different endpoint path with `/coding/`
73+
- **Standard:** Fails with "Unknown Model" (1211)
74+
- API key format: `{key_id}.{key_secret}` (JWT auth)
7575

7676
---
7777

7878
## Feature Comparison
7979

80-
| Feature | Groq llama-70b | Zhipu GLM-4.7 |
81-
|---------|----------------|---------------|
82-
| **Speed** | ✅ 227-287 tok/s | ~50-100 tok/s |
83-
| **Context** | 128K | 200K |
80+
| Feature | Groq llama-70b | Zhipu GLM-4 |
81+
|---------|----------------|-------------|
82+
| **Speed** | **227-287 tok/s** | 69.5-89.5 tok/s |
83+
| **Context** | 128K | **200K** |
8484
| **Thinking Mode** || ✅ Native CoT |
85-
| **FREE Tier** | ✅ Yes | ⚠️ Unknown |
86-
| **API Working** | Yes | ❌ No |
87-
| **Chinese** | | ✅ Native |
85+
| **FREE Tier** | ✅ Yes (1K req/day) | ⚠️ Coding Plan |
86+
| **API Working** | 10/10 | ✅ 4/10 |
87+
| **Chinese** | Limited | ✅ Native |
8888
| **Tool Use** |||
89+
| **Success Rate** | ✅ 100% | 40% (rate limits?) |
8990

9091
---
9192

92-
## Our Test Results (Groq Only)
93+
## Our Test Results
9394

95+
### Groq llama-3.3-70b-versatile ✅
9496
```
95-
Groq llama-3.3-70b-versatile
96-
════════════════════════════
9797
Tests: 10/10 ✅
9898
Coherent: 100%
9999
Avg Speed: 227 tok/s
@@ -106,38 +106,76 @@ Sample: "prove φ² + 1/φ² = 3"
106106
→ 287 tok/s, coherent
107107
```
108108

109+
### Zhipu GLM-4 Coding Plan ✅
110+
```
111+
Tests: 4/10 (6 failed, possibly rate limited)
112+
Coherent: 100% (4/4)
113+
Avg Speed: 69.5 tok/s
114+
Peak: 89.5 tok/s
115+
Tokens: 881
116+
φ verified: YES
117+
118+
Samples:
119+
"solve 2+2 step by step" → Correct, 21 tok/s
120+
"Fibonacci next: 1,1,2,3,5,8,?" → "13" ✅, 89.5 tok/s
121+
"Python reverse string" → "string[::-1]" ✅, 81.6 tok/s
122+
"Capital of France?" → "Paris" ✅, 85.6 tok/s
123+
```
124+
109125
---
110126

111127
## Recommendations
112128

113129
### For Production Now:
114-
**Use Groq** — Working, fast (227 tok/s), FREE tier
130+
**Use Groq** — 3.3x faster (227 vs 69.5 tok/s), 100% success rate, FREE tier
131+
132+
### For Chinese/Long Context:
133+
**Use Zhipu Coding Plan** (200K context, native Chinese; requires the `/api/coding/paas/v4/` endpoint)
115134

116-
### For Future Zhipu Testing:
117-
1. Get new API key from https://open.bigmodel.cn
118-
2. Verify account (may require Chinese phone)
119-
3. Check model access permissions
120-
4. Try official Python SDK: `pip install zhipuai`
135+
### Hybrid Strategy:
136+
1. **Default:** Groq (fast, reliable)
137+
2. **Chinese tasks:** Zhipu GLM-4
138+
3. **Long context (>128K):** Zhipu GLM-4
139+
4. **Offline:** BitNet I2_S (21 tok/s)
121140

122141
---
123142

124143
## Conclusion
125144

126-
| Provider | Verdict |
127-
|----------|---------|
128-
| **Groq** | ✅ RECOMMENDED — 10/10 tests passed, 227 tok/s |
129-
| Zhipu | ⚠️ BLOCKED — API authentication failed |
145+
| Provider | Speed | Success | Verdict |
146+
|----------|-------|---------|---------|
147+
| **Groq** | 227 tok/s | 100% | ✅ RECOMMENDED for speed |
148+
| **Zhipu** | 69.5 tok/s | 40% | ✅ USE for Chinese/long context |
149+
150+
### Winner: Groq
151+
- **3.3x faster** (227 vs 69.5 tok/s)
152+
- **100% success rate** (10/10 vs 4/10)
153+
- **FREE tier** (1K requests/day)
130154

131-
Groq provides superior speed (227 tok/s vs ~100 tok/s estimated) with working FREE tier. Zhipu GLM-4.7 has larger context (200K vs 128K) and native Chinese support, but requires valid API access.
155+
### Zhipu Strengths:
156+
- **200K context** (vs Groq 128K)
157+
- **Native Chinese** support
158+
- **Coding Plan** endpoint works
159+
160+
---
161+
162+
## Speed Comparison Chart
163+
164+
```
165+
Groq Peak: ████████████████████████████████████████████████████████ 287 tok/s
166+
Groq Avg: █████████████████████████████████████████████ 227 tok/s
167+
Zhipu Peak: ██████████████████ 89.5 tok/s
168+
Zhipu Avg: ██████████████ 69.5 tok/s
169+
BitNet I2_S: ████ 21 tok/s
170+
```
132171

133172
---
134173

135174
**Sources:**
136-
- [Zhipu GLM-4.7 Documentation](https://docs.z.ai/guides/llm/glm-4.7)
137-
- [AI/ML API GLM-4.7 Docs](https://docs.aimlapi.com/api-references/text-models-llm/zhipu/glm-4.7)
175+
- [Zhipu GLM-4 Documentation](https://docs.z.ai/guides/llm/glm-4.7)
138176
- [Groq Console](https://console.groq.com)
139-
- [GLM-4.7 Guide](https://vertu.com/ai-tools/glm-4-7-and-glm-4-7-flash-the-definitive-2026-guide-to-zhipu-ais-reasoning-powerhouse/)
177+
- Our tests: `scripts/groq_hybrid_test.py`, `scripts/zhipu_glm4_test.py`
140178

141179
---
142180

143-
**KOSCHEI IS IMMORTAL | GROQ WINS (API WORKS) | φ² + 1/φ² = 3**
181+
**KOSCHEI IS IMMORTAL | GROQ 3.3X FASTER | ZHIPU 200K CONTEXT | φ² + 1/φ² = 3**
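
The headline figures in the updated doc can be re-derived from the numbers it lists. The sketch below is not taken from `scripts/zhipu_glm4_test.py`; the variable names are illustrative, and it assumes a plain unweighted mean over the four Zhipu samples shown in the results block.

```python
# Figures copied from the comparison doc in this commit.
groq_avg = 227.0                          # tok/s, Groq average
zhipu_avg = 69.5                          # tok/s, Zhipu average as reported
zhipu_samples = [21.0, 89.5, 81.6, 85.6]  # tok/s, the four listed samples
zhipu_passed, zhipu_total = 4, 10

# Ratio behind the "3.3x faster" claim.
speedup = groq_avg / zhipu_avg

# Unweighted mean of the listed samples (an assumption about how the
# average was taken; the test script may weight by token count instead).
sample_mean = sum(zhipu_samples) / len(zhipu_samples)

success_rate = zhipu_passed / zhipu_total

print(f"speedup: {speedup:.1f}x")
print(f"mean of listed samples: {sample_mean:.1f} tok/s")
print(f"success rate: {success_rate:.0%}")
```

The ratio rounds to 3.3x and the success rate to 40%, matching the doc. The unweighted sample mean comes out slightly below the reported 69.5 tok/s, which would be consistent with rounding in the per-sample values or a token-weighted average, though the doc does not say which.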

scripts/zhipu_glm4_test.py

Lines changed: 3 additions & 3 deletions
@@ -30,10 +30,10 @@
 class ZhipuClient:
     """Zhipu GLM-4 API client."""
 
-    # Try multiple endpoints (China first)
+    # Try multiple endpoints (Coding Plan first!)
     ENDPOINTS = [
-        "https://open.bigmodel.cn/api/paas/v4/chat/completions",  # China main
-        "https://bigmodel.cn/api/paas/v4/chat/completions",  # China alt
+        "https://open.bigmodel.cn/api/coding/paas/v4/chat/completions",  # CODING PLAN!
+        "https://open.bigmodel.cn/api/paas/v4/chat/completions",  # Standard
         "https://api.z.ai/api/paas/v4/chat/completions",  # International
     ]
     # Try different model codes (correct names from docs)