1 parent 82f4241, commit 1932231
1 file changed
README.md
@@ -27,7 +27,7 @@
 
 ---
 
-## Try It Now (30 seconds)
+## Try It Now
 
 ```bash
 git clone https://github.com/quantumaikr/TurboQuant.cpp
@@ -36,13 +36,13 @@ cd TurboQuant.cpp
 cmake -B build -DCMAKE_BUILD_TYPE=Release -DTQ_BUILD_TESTS=ON -DTQ_BUILD_BENCH=ON
 cmake --build build -j$(sysctl -n hw.ncpu 2>/dev/null || nproc)
 
-# See the A/B comparison yourself
-./build/ab_test
+# Run Qwen3.5-0.8B (download model first — see Getting Started)
+./build/tq_run model.safetensors -t tokenizer.json -p "What is AI?" -j 4 -q
 
-# Memory savings for real LLM models
-./build/demo_real_model
+# A/B comparison: FP16 vs quantized
+./build/ab_test
 
-# Speed: Integer attention vs FP32
+# Benchmarks
 ./build/speed_int_vs_float
 ```
 