This is a tool to run llama.cpp in a remote Docker container with different parameters and to collect benchmarks from logs. It is designed to measure the influence of runtime parameters on model performance, especially for speculative decoding.
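As a rough illustration of the workflow described above, the sketch below expands a parameter grid into individual run configurations and pulls throughput figures out of log text. Everything here is an assumption made for illustration: the parameter names `draft_max` and `draft_min`, the helper names `expand_grid` and `extract_tokens_per_second`, and the log line shape (a number followed by `tokens per second`) are hypothetical and may not match what this tool or your llama.cpp build actually uses.

```python
import itertools
import re

# Hypothetical parameter grid; names are illustrative, not this tool's real options.
PARAM_GRID = {
    "draft_max": [4, 8, 16],
    "draft_min": [0, 2],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    keys = list(grid)
    combos = itertools.product(*(grid[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]


# Assumed log shape: a line containing "<number> tokens per second".
TPS_PATTERN = re.compile(r"([0-9]+(?:\.[0-9]+)?)\s+tokens per second")


def extract_tokens_per_second(log_text):
    """Collect every tokens-per-second figure found in the log text."""
    return [float(value) for value in TPS_PATTERN.findall(log_text)]


if __name__ == "__main__":
    # Each dict would correspond to one container run with those settings.
    for params in expand_grid(PARAM_GRID):
        print(params)

    # Made-up log line used only to exercise the parser.
    sample_log = (
        "eval time = 1000.00 ms / 50 tokens "
        "(20.00 ms per token, 50.00 tokens per second)"
    )
    print(extract_tokens_per_second(sample_log))
```

Keeping the grid expansion separate from the log parsing means each run's settings can be stored next to the figures extracted from its log, which makes it easier to compare the effect of one parameter at a time.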