Prerequisites
Feature Description
The new 1-bit Bonsai 8B model has just been released.
1-bit Bonsai 8B implements a proprietary 1-bit model design across the entire network: embeddings, attention layers, MLP layers, and the LM head are all 1-bit. There are no higher-precision escape hatches. It is a true 1-bit model, end to end, across 8.2 billion parameters.
Despite being 14x smaller than 16-bit full-precision models of the same 8B parameter class, it performs competitively on standard benchmarks while operating at radically higher efficiency.
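For context, here is a minimal sketch of what generic 1-bit (sign) weight quantization looks like. This is a hypothetical BitNet-style scheme for illustration only; the actual Bonsai design is proprietary and may differ substantially.

```python
import numpy as np

def quantize_1bit(W):
    """Quantize a weight matrix to {-1, +1} plus a per-tensor scale.

    Hypothetical BitNet-style sign quantization; the real Bonsai
    scheme is proprietary and is not described by this sketch.
    """
    scale = np.mean(np.abs(W))        # single fp scale per tensor
    Wq = np.where(W >= 0, 1.0, -1.0)  # 1-bit sign codes
    return Wq, scale

def dequantize_1bit(Wq, scale):
    """Reconstruct an approximation of the original weights."""
    return Wq * scale

# Example: a tiny weight matrix
W = np.array([[0.4, -0.2],
              [-0.7, 0.1]])
Wq, s = quantize_1bit(W)
# Storage cost: 1 bit per weight plus one scale factor,
# which is where the ~14-16x size reduction comes from.
```

Under such a scheme, matrix multiplies reduce to additions and subtractions scaled once at the end, which is the source of the efficiency claims.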
Unfortunately, a fork of llama.cpp is currently required to run it.
Related feature request #12598 was unfortunately closed. Could support for Bonsai models be added?
Motivation
This model's performance is impressive, and community reactions have been very positive.
Possible Implementation
No response