docs-site/src/content/docs/integrations
@@ -437,8 +437,11 @@ llama-server \

 **Performance (M1 Max 64 GB, ~37K input tokens):**

-- Cold start: pp 395 tok/s, tg 40 tok/s (96s total)
-- Cached follow-up: pp 110 tok/s, tg 40 tok/s (6s total)
+pp = prompt processing, tg = token generation.
+
+- Cold start: pp 395 tok/s, tg 40 tok/s (~96s total)
+- Cached follow-up: tg 40 tok/s (~6s total, prompt
+  cached in ~0.4s)

 | Quant | Size | Notes |
 |-------|------|-------|
@@ -278,8 +278,11 @@ llama-server -hf unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_XL \

 **Performance (M1 Max 64 GB, ~37K input tokens):**

-- Cold start: pp 395 tok/s, tg 40 tok/s (96s total)
-- Cached follow-up: pp 110 tok/s, tg 40 tok/s (6s total)
+pp = prompt processing, tg = token generation.
+
+- Cold start: pp 395 tok/s, tg 40 tok/s (~96s total)
+- Cached follow-up: tg 40 tok/s (~6s total, prompt
+  cached in ~0.4s)

 **Quantization options:**

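As a sanity check on the figures in both hunks, the sketch below estimates prompt-processing time from the quoted rates. It assumes "~37K input tokens" means exactly 37,000 and that prompt time is simply tokens divided by the pp rate; the helper name `prefill_seconds` is hypothetical and not part of llama.cpp.

```python
def prefill_seconds(input_tokens: int, pp_tok_per_s: float) -> float:
    """Estimated time to process a prompt at a given pp (prompt processing) rate."""
    return input_tokens / pp_tok_per_s


# Assumption: "~37K input tokens" is taken as exactly 37,000.
INPUT_TOKENS = 37_000

# Cold start: pp 395 tok/s, as quoted in the docs.
cold_prefill = prefill_seconds(INPUT_TOKENS, 395.0)

# The removed "pp 110 tok/s" figure, if it applied to the whole prompt.
old_cached_prefill = prefill_seconds(INPUT_TOKENS, 110.0)

print(f"cold prefill: {cold_prefill:.1f}s")
print(f"full prompt at 110 tok/s: {old_cached_prefill:.1f}s")
```

Under these assumptions the cold-start prefill comes to roughly 94s, which fits the ~96s total once generation is included. Applying 110 tok/s to the full prompt would take far longer than the ~6s total reported for a cached follow-up, so the old figure presumably covered only the uncached part of the prompt. That is one plausible reason for dropping it in favor of the ~0.4s prompt time.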