docs: refine Gemma4 perf stats, add pp/tg abbreviation key

pchalasani · pchalasani · commit 17d659a500ec · 2026-04-02T17:31:12.000-04:00
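
A rough sanity check on the figures in this change, as a sketch rather than a benchmark: at ~37K input tokens and pp 395 tok/s, cold prompt processing alone accounts for most of the ~96s total, which is why a cached follow-up (prompt ready in ~0.4s) drops to ~6s. The helper name `split_total` is hypothetical, and the inputs are the rounded values quoted in the docs, so the outputs are estimates only.

```python
# Rough consistency check of the quoted figures. All inputs are the
# approximate values from the docs, so the results are estimates only.
PROMPT_TOKENS = 37_000   # "~37K input tokens"
PP_RATE = 395.0          # prompt processing, tok/s (cold start)
TG_RATE = 40.0           # token generation, tok/s


def split_total(total_s: float, prompt_s: float) -> tuple[float, float]:
    """Return (generation seconds, implied generated tokens) for one run."""
    gen_s = total_s - prompt_s
    return gen_s, gen_s * TG_RATE


# Cold start: the whole prompt has to be processed before generation begins.
cold_prompt_s = PROMPT_TOKENS / PP_RATE
cold_gen_s, cold_tokens = split_total(96.0, cold_prompt_s)

# Cached follow-up: the prompt is reused from cache in ~0.4s.
warm_gen_s, warm_tokens = split_total(6.0, 0.4)

print(f"cold: prompt {cold_prompt_s:.1f}s, generation {cold_gen_s:.1f}s (~{cold_tokens:.0f} tokens)")
print(f"warm: prompt 0.4s, generation {warm_gen_s:.1f}s (~{warm_tokens:.0f} tokens)")
```

The implied response lengths differ between the two runs. That is expected, since the totals are rounded and the two requests need not produce the same amount of output.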
diff --git a/docs-site/src/content/docs/integrations/local-llms.mdx b/docs-site/src/content/docs/integrations/local-llms.mdx
@@ -437,8 +437,11 @@ llama-server \
 
 **Performance (M1 Max 64 GB, ~37K input tokens):**
 
-- Cold start: pp 395 tok/s, tg 40 tok/s (96s total)
-- Cached follow-up: pp 110 tok/s, tg 40 tok/s (6s total)
+pp = prompt processing, tg = token generation.
+
+- Cold start: pp 395 tok/s, tg 40 tok/s (~96s total)
+- Cached follow-up: tg 40 tok/s (~6s total, prompt
+  cached in ~0.4s)
 
 | Quant | Size | Notes |
 |-------|------|-------|
diff --git a/docs/local-llm-setup.md b/docs/local-llm-setup.md
@@ -278,8 +278,11 @@ llama-server -hf unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_XL \
 
 **Performance (M1 Max 64 GB, ~37K input tokens):**
 
-- Cold start: pp 395 tok/s, tg 40 tok/s (96s total)
-- Cached follow-up: pp 110 tok/s, tg 40 tok/s (6s total)
+pp = prompt processing, tg = token generation.
+
+- Cold start: pp 395 tok/s, tg 40 tok/s (~96s total)
+- Cached follow-up: tg 40 tok/s (~6s total, prompt
+  cached in ~0.4s)
 
 **Quantization options:**