Gemma 3 4B running: first 3B+ model on TurboQuant

unamedkr · claude · unamedkr · commit cbc8e494359e · 2026-03-31T23:17:09.000+09:00
- 5.2 tok/s on Gemma 3 4B (Q4, 6 threads, CPU)
- "capital of France" → "Paris" ✓
- Multi-shard safetensors (2 shards, 883 tensors)
- 3.2 GB TQM, mmap zero-copy loading
- README: 3 models now supported
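
For readers unfamiliar with the loading approach named above, here is a minimal sketch of indexing tensors across several shards through read-only memory maps. It is not the loader from this commit: the TQM layout is not shown here, so the sketch uses the public safetensors on-disk layout instead (8-byte little-endian header length, JSON header, raw tensor bytes). The helper names `write_shard` and `open_shards`, the tensor names, and the tiny F32 payloads are all hypothetical, and the float cast assumes a little-endian host.

```python
import json
import mmap
import os
import struct
import tempfile


def write_shard(path, tensors):
    """Write a tiny safetensors-style file (hypothetical helper for the demo)."""
    header, blob = {}, b""
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": [len(values)],
            # Offsets are relative to the start of the data section.
            "data_offsets": [len(blob), len(blob) + len(raw)],
        }
        blob += raw
    head = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(head)) + head + blob)


def open_shards(paths):
    """Map each shard read-only and index every tensor as a view into the map."""
    maps, index = [], {}
    for path in paths:
        with open(path, "rb") as f:
            # The mapping stays valid after the file object is closed.
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        maps.append(mm)
        (head_len,) = struct.unpack_from("<Q", mm, 0)
        header = json.loads(mm[8:8 + head_len])
        base = 8 + head_len
        view = memoryview(mm)
        for name, meta in header.items():
            if name == "__metadata__":
                continue
            start, end = meta["data_offsets"]
            # Slicing a memoryview does not copy the tensor bytes.
            index[name] = view[base + start:base + end].cast("f")
    return maps, index


tmp = tempfile.mkdtemp()
paths = [os.path.join(tmp, f"shard-{i}.safetensors") for i in range(2)]
write_shard(paths[0], {"a.weight": [1.0, 2.0], "b.weight": [3.0]})
write_shard(paths[1], {"c.weight": [4.0, 5.0, 6.0]})

maps, index = open_shards(paths)
print(len(index), list(index["c.weight"]))
```

Only the small JSON headers are parsed eagerly; tensor bytes are paged in by the OS when a view is first read, which is why this pattern suits multi-gigabyte weight files.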

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/README.ko.md b/README.ko.md
@@ -10,12 +10,14 @@ Qwen3.5 + Gemma 3 지원. Gemma 4 대응 준비 완료.
 [![Tests](https://img.shields.io/badge/tests-70%2B%20pass-brightgreen)]()
 [![License](https://img.shields.io/badge/license-Apache%202.0-blue)]()
 [![Qwen3.5](https://img.shields.io/badge/Qwen3.5--0.8B-82%20tok%2Fs-blue)]()
-[![Gemma3](https://img.shields.io/badge/Gemma3--270M-176%20tok%2Fs-blue)]()
+[![Gemma3-4B](https://img.shields.io/badge/Gemma3--4B-5.2%20tok%2Fs-blue)]()
+[![Gemma3-270M](https://img.shields.io/badge/Gemma3--270M-176%20tok%2Fs-blue)]()
 
 ### 지원 모델
 
 | 모델 | 파라미터 | 속도 (Q4, 6T) | 검증 |
 |------|----------|---------------|------|
+| **Gemma 3 4B** | 4B | 5.2 tok/s | "프랑스 수도" → "Paris" |
 | **Qwen3.5-0.8B** | 752M | 82 tok/s | PyTorch 대비 코사인 0.999 |
 | **Gemma 3 270M** | 270M | 176 tok/s | PyTorch 대비 레이어별 일치 |
 
diff --git a/README.md b/README.md
@@ -10,12 +10,14 @@ Qwen3.5 + Gemma 3 supported. Gemma 4 ready.
 [![Tests](https://img.shields.io/badge/tests-70%2B%20pass-brightgreen)]()
 [![License](https://img.shields.io/badge/license-Apache%202.0-blue)]()
 [![Qwen3.5](https://img.shields.io/badge/Qwen3.5--0.8B-82%20tok%2Fs-blue)]()
-[![Gemma3](https://img.shields.io/badge/Gemma3--270M-176%20tok%2Fs-blue)]()
+[![Gemma3-4B](https://img.shields.io/badge/Gemma3--4B-5.2%20tok%2Fs-blue)]()
+[![Gemma3-270M](https://img.shields.io/badge/Gemma3--270M-176%20tok%2Fs-blue)]()
 
 ### Supported Models
 
 | Model | Params | Speed (Q4, 6T) | Verified |
 |-------|--------|----------------|----------|
+| **Gemma 3 4B** | 4B | 5.2 tok/s | "capital of France" → "Paris" |
 | **Qwen3.5-0.8B** | 752M | 82 tok/s | logits 0.999 cosine vs PyTorch |
 | **Gemma 3 270M** | 270M | 176 tok/s | per-layer exact match vs PyTorch |