kkrushnyakov/speculative_decoding

This tool runs llama.cpp in a remote Docker container under different runtime parameters and collects benchmark results from the logs. It is designed to measure how those parameters affect model performance, especially with speculative decoding.
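
As a rough illustration of the workflow described above (sweep a set of parameters, then pull throughput figures out of the logs), here is a minimal Python sketch. It is not this repository's code: the parameter names in `PARAM_GRID`, the helper functions, and the assumption that the logs contain figures of the form `<number> tokens per second` are all hypothetical and would need adjusting to the real llama.cpp options and log output.

```python
import itertools
import re

# Hypothetical parameter grid; the names are placeholders chosen for
# illustration, not the tool's (or llama.cpp's) actual option names.
PARAM_GRID = {
    "draft_max": [4, 8, 16],
    "draft_min": [0, 2],
}

# Assumed log shape: a throughput figure such as "81.00 tokens per second".
TPS_PATTERN = re.compile(r"([0-9]+(?:\.[0-9]+)?)\s+tokens per second")


def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]


def parse_tokens_per_second(log_text):
    """Collect every 'tokens per second' figure found in the log text."""
    return [float(match) for match in TPS_PATTERN.findall(log_text)]


def summarize(runs):
    """Pair each parameter combination with its mean throughput.

    `runs` is a list of (params, log_text) pairs; runs whose log has
    no throughput figure are skipped.
    """
    summary = []
    for params, log_text in runs:
        values = parse_tokens_per_second(log_text)
        if values:
            summary.append((params, sum(values) / len(values)))
    return summary
```

In use, each dict from `expand_grid(PARAM_GRID)` would drive one container run, and the captured log text for that run would be passed to `summarize` alongside the parameters that produced it.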

About

Speculative decoding benchmark tool for llama.cpp
