Commit cbc8e49

unamedkr and claude committed
Gemma 3 4B running: first 3B+ model on TurboQuant
- 5.2 tok/s on Gemma 3 4B (Q4, 6 threads, CPU)
- "capital of France" → "Paris" ✓
- Multi-shard safetensors (2 shards, 883 tensors)
- 3.2 GB TQM, mmap zero-copy loading
- README: 3 models now supported

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent c90f007 commit cbc8e49
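
The commit message mentions mmap zero-copy loading of the 3.2 GB TQM file. The TQM layout is not shown in this commit, so the following is only a minimal Python sketch of the general technique using `numpy.memmap`. The 16-byte header, the offset, and the `mmap_tensor` helper are hypothetical and do not describe TurboQuant's actual format or API.

```python
import os
import tempfile

import numpy as np


def mmap_tensor(path, offset, shape, dtype=np.float32):
    """Map a tensor stored at `offset` bytes into `path` without copying it.

    Hypothetical helper: the returned array is a read-only view backed by
    the OS page cache, so pages are only read when they are first touched.
    """
    return np.memmap(path, dtype=dtype, mode="r", offset=offset, shape=shape)


# Demo file: an assumed 16-byte header followed by one 2x3 float32 tensor.
weights = np.arange(6, dtype=np.float32).reshape(2, 3)
fd, demo_path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as f:
    f.write(b"\x00" * 16)        # placeholder header (assumption)
    f.write(weights.tobytes())   # raw tensor bytes in C order

view = mmap_tensor(demo_path, offset=16, shape=(2, 3))
```

Because the view is mapped read-only, load time stays roughly independent of file size, and several processes mapping the same file can share the same physical pages.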

2 files changed

Lines changed: 6 additions & 2 deletions

README.ko.md

Lines changed: 3 additions & 1 deletion
@@ -10,12 +10,14 @@ Qwen3.5 + Gemma 3 supported. Gemma 4 support ready.
 [![Tests](https://img.shields.io/badge/tests-70%2B%20pass-brightgreen)]()
 [![License](https://img.shields.io/badge/license-Apache%202.0-blue)]()
 [![Qwen3.5](https://img.shields.io/badge/Qwen3.5--0.8B-82%20tok%2Fs-blue)]()
-[![Gemma3](https://img.shields.io/badge/Gemma3--270M-176%20tok%2Fs-blue)]()
+[![Gemma3-4B](https://img.shields.io/badge/Gemma3--4B-5.2%20tok%2Fs-blue)]()
+[![Gemma3-270M](https://img.shields.io/badge/Gemma3--270M-176%20tok%2Fs-blue)]()

 ### Supported Models

 | Model | Parameters | Speed (Q4, 6T) | Verification |
 |------|----------|---------------|------|
+| **Gemma 3 4B** | 4B | 5.2 tok/s | "capital of France" → "Paris" |
 | **Qwen3.5-0.8B** | 752M | 82 tok/s | cosine 0.999 vs PyTorch |
 | **Gemma 3 270M** | 270M | 176 tok/s | per-layer match vs PyTorch |

README.md

Lines changed: 3 additions & 1 deletion
@@ -10,12 +10,14 @@ Qwen3.5 + Gemma 3 supported. Gemma 4 ready.
 [![Tests](https://img.shields.io/badge/tests-70%2B%20pass-brightgreen)]()
 [![License](https://img.shields.io/badge/license-Apache%202.0-blue)]()
 [![Qwen3.5](https://img.shields.io/badge/Qwen3.5--0.8B-82%20tok%2Fs-blue)]()
-[![Gemma3](https://img.shields.io/badge/Gemma3--270M-176%20tok%2Fs-blue)]()
+[![Gemma3-4B](https://img.shields.io/badge/Gemma3--4B-5.2%20tok%2Fs-blue)]()
+[![Gemma3-270M](https://img.shields.io/badge/Gemma3--270M-176%20tok%2Fs-blue)]()

 ### Supported Models

 | Model | Params | Speed (Q4, 6T) | Verified |
 |-------|--------|----------------|----------|
+| **Gemma 3 4B** | 4B | 5.2 tok/s | "capital of France" → "Paris" |
 | **Qwen3.5-0.8B** | 752M | 82 tok/s | logits 0.999 cosine vs PyTorch |
 | **Gemma 3 270M** | 270M | 176 tok/s | per-layer exact match vs PyTorch |
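
The "Verified" column reports a 0.999 logits cosine against PyTorch for Qwen3.5-0.8B. The commit does not show how that check is run, so the snippet below is only a sketch of what such a comparison could look like. The function names and the 0.999 default threshold are assumptions taken from the table, not the project's test code.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two logit vectors, computed in float64."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def logits_match(candidate, reference, threshold=0.999):
    """True when the candidate logits point in nearly the same direction
    as the reference logits (hypothetical acceptance rule)."""
    return cosine_similarity(candidate, reference) >= threshold
```

Cosine similarity ignores overall scale, so a check like this tolerates small quantization error in logit magnitudes while still catching outputs that point in the wrong direction.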
