Releases: withcatai/node-llama-cpp
v3.0.0-beta.16
3.0.0-beta.16 (2024-04-13)
Bug Fixes
Features
- inspect gpu command: print device names (#198) (5ca33c7)
- inspect gpu command: print env info (#202) (d332b77)
- download models using the CLI (#191) (b542b53)
- interactively select a model from CLI commands (#191) (b542b53)
- change the default log level to warn (#191) (b542b53)
- token biases (#196) (3ad4494)
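As a rough illustration of what a token bias does in general, the sketch below adds a per-token offset to raw logits before they are converted to probabilities. The names applyTokenBias and softmax are hypothetical helpers written for this example; they are not node-llama-cpp's API, and the library's own implementation may differ.

```typescript
// Hypothetical helper, not part of node-llama-cpp: adds a bias to selected logits.
// A positive bias makes a token more likely, a negative one less likely,
// and -Infinity effectively bans the token.
function applyTokenBias(logits: number[], bias: Map<number, number>): number[] {
    const biased = logits.slice();
    bias.forEach((value, tokenId) => {
        // Ignore token ids outside the vocabulary
        if (tokenId >= 0 && tokenId < biased.length)
            biased[tokenId] += value;
    });
    return biased;
}

// Numerically stable softmax; assumes at least one logit is finite.
function softmax(logits: number[]): number[] {
    const max = Math.max(...logits);
    const exps = logits.map((l) => Math.exp(l - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    return exps.map((e) => e / sum);
}

// Boost token 0 and ban token 2 (token ids here are arbitrary examples)
const exampleBias = new Map<number, number>([[0, 5], [2, -Infinity]]);
const exampleProbs = softmax(applyTokenBias([1, 2, 3], exampleBias));
console.log(exampleProbs);
```

The banned token ends up with probability zero, so a sampler drawing from these probabilities can never pick it.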
Shipped with llama.cpp release b2665
To use the latest llama.cpp release available, run npx --no node-llama-cpp download --release latest. (learn more)
v3.0.0-beta.15
3.0.0-beta.15 (2024-04-04)
Bug Fixes
- create a context with no parameters (#188) (6267778)
- improve chat wrappers tokenization (#182) (35e6f50)
- use the new llama.cpp CUDA flag (#182) (35e6f50)
- adapt to breaking llama.cpp changes (#183) (6b012a6)
Features
- automatically adapt to current free VRAM state (#182) (35e6f50)
- inspect gguf command (#182) (35e6f50)
- inspect measure command (#182) (35e6f50)
- readGgufFileInfo function (#182) (35e6f50)
- GGUF file metadata info on LlamaModel (#182) (35e6f50)
- JinjaTemplateChatWrapper (#182) (35e6f50)
- use the tokenizer.chat_template header from the gguf file when available - use it to find a better specialized chat wrapper or use JinjaTemplateChatWrapper with it as a fallback (#182) (35e6f50)
- simplify generation CLI commands: chat, complete, infill (#182) (35e6f50)
- Windows on Arm prebuilt binary (#181) (f3b7f81)
Shipped with llama.cpp release b2608
To use the latest llama.cpp release available, run npx --no node-llama-cpp download --release latest. (learn more)
v2.8.9
v3.0.0-beta.14
3.0.0-beta.14 (2024-03-16)
Bug Fixes
- DisposedError was thrown when calling .dispose() (#178) (315a3eb)
- adapt to breaking llama.cpp changes (#178) (315a3eb)
Features
- async model and context loading (#178) (315a3eb)
- automatically try to resolve "Failed to detect a default CUDA architecture" CUDA compilation error (#178) (315a3eb)
- detect cmake binary issues and suggest fixes on detection (#178) (315a3eb)
Shipped with llama.cpp release b2440
To use the latest llama.cpp release available, run npx --no node-llama-cpp download --release latest. (learn more)
v3.0.0-beta.13
3.0.0-beta.13 (2024-03-03)
Bug Fixes
- adapt to llama.cpp breaking change (#175) (5a70576)
- return user-defined llama tokens (#175) (5a70576)
Features
- gguf parser (#168) (bcaab4f)
- use the best compute layer available by default (#175) (5a70576)
- more guardrails to prevent loading an incompatible prebuilt binary (#175) (5a70576)
- inspect command (#175) (5a70576)
- GemmaChatWrapper (#175) (5a70576)
- TemplateChatWrapper (#175) (5a70576)
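To give a sense of what a GGUF parser reads first, here is a minimal sketch that decodes the fixed-size header at the start of a GGUF file: the 4-byte "GGUF" magic, a little-endian uint32 version, and two little-endian uint64 counts (tensors and metadata key-value pairs). It assumes GGUF version 2 or later, where the counts are 64-bit. parseGgufHeader and readUint64LE are hypothetical names for this example and are not the library's parser.

```typescript
// Hypothetical sketch, not node-llama-cpp's parser: decodes only the GGUF header.
interface GgufHeader {
    version: number;
    tensorCount: number;
    metadataKvCount: number;
}

// Reads a little-endian uint64 as a number (exact only up to 2^53 - 1)
function readUint64LE(view: DataView, offset: number): number {
    const low = view.getUint32(offset, true);
    const high = view.getUint32(offset + 4, true);
    return high * 4294967296 + low;
}

function parseGgufHeader(bytes: Uint8Array): GgufHeader {
    if (bytes.length < 24)
        throw new Error("Buffer too small for a GGUF header");

    const magic = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
    if (magic !== "GGUF")
        throw new Error("Not a GGUF file");

    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    return {
        version: view.getUint32(4, true),
        tensorCount: readUint64LE(view, 8),
        metadataKvCount: readUint64LE(view, 16)
    };
}
```

In practice the first 24 bytes of a model file would be passed in; the metadata key-value pairs that follow the header need a type-aware reader, which this sketch leaves out.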
Shipped with llama.cpp release b2329
To use the latest llama.cpp release available, run npx --no node-llama-cpp download --release latest. (learn more)
v3.0.0-beta.12
3.0.0-beta.12 (2024-02-24)
Bug Fixes
Features
Shipped with llama.cpp release b2254
To use the latest llama.cpp release available, run npx --no node-llama-cpp download --release latest. (learn more)
v2.8.8
v3.0.0-beta.11
3.0.0-beta.11 (2024-02-18)
Features
- completion and infill (#164) (ede69c1)
- support configuring more options for getLlama when using "lastBuild" (#164) (ede69c1)
- export resolveChatWrapperBasedOnWrapperTypeName (#165) (624fa30)
Shipped with llama.cpp release b2174
To use the latest llama.cpp release available, run npx --no node-llama-cpp download --release latest. (learn more)
v2.8.7
v3.0.0-beta.10
3.0.0-beta.10 (2024-02-11)
Features
- get VRAM state (#161) (46235a2)
- chatWrapper getter on a LlamaChatSession (#161) (46235a2)
- minP support (#162) (47b476f)
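For readers unfamiliar with min-p sampling, the sketch below shows the general technique: tokens whose probability is below minP times the probability of the most likely token are discarded, and the rest are renormalized. minPFilter is a hypothetical helper written for this example, not a function exported by node-llama-cpp.

```typescript
// Hypothetical helper illustrating min-p filtering over a probability distribution.
// Keeps tokens with probability >= minP * (highest probability), then renormalizes.
function minPFilter(probs: number[], minP: number): number[] {
    const maxProb = Math.max(...probs);
    const threshold = minP * maxProb;

    // Zero out tokens below the threshold; the top token always survives
    const kept = probs.map((p) => (p >= threshold ? p : 0));
    const total = kept.reduce((a, b) => a + b, 0);

    return kept.map((p) => p / total);
}

// With minP = 0.2 the threshold is 0.2 * 0.5 = 0.1, so the last token is dropped
const filtered = minPFilter([0.5, 0.3, 0.15, 0.05], 0.2);
console.log(filtered);
```

Because the threshold scales with the top token's probability, the filter is strict when the model is confident and permissive when the distribution is flat.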
Shipped with llama.cpp release b2127
To use the latest llama.cpp release available, run npx --no node-llama-cpp download --release latest. (learn more)