Skip to content

Commit a1e392e

Browse files
committed
Update README.md
1 parent 1a3d640 commit a1e392e

1 file changed

Lines changed: 1 addition & 1 deletion

File tree

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
![BeeLlama.cpp logo](beellama.jpg)
44

5-
BeeLlama.cpp (or just Bee) is a performance-focused llama.cpp fork for squeezing more speed and context out of local GGUF inference. It keeps the familiar llama.cpp tools, server flow, and model compatibility, then adds DFlash speculative decoding, adaptive draft control, TurboQuant/TCQ KV-cache compression, reasoning-loop protection, full multimodal support, and experimental speculation modes.
5+
BeeLlama.cpp (or just Bee) is a performance-focused llama.cpp fork for squeezing more speed and context out of local GGUF inference. It keeps the familiar llama.cpp tools and server flow, then adds DFlash speculative decoding, adaptive draft control, TurboQuant/TCQ KV-cache compression, and reasoning-loop protection, with full multimodal support.
66

77
> Not quite a pegasus, but close enough.
88

0 commit comments

Comments
 (0)