Commit bb5d977

add info about quantization in research_and_selection_of_llm_candidates.md
1 parent aa9ab60 commit bb5d977

1 file changed

Lines changed: 4 additions & 0 deletions

docs/immune/research_and_selection_of_llm_candidates.md

@@ -52,8 +52,12 @@ After an initial screening, a subset of ten language models was selected for fur
5252

5353
As part of this effort, nine targeted unit tests were developed to evaluate the key capabilities required for integration into the SLIPS immune architecture. All evaluations were run on an x64 machine. Evaluation on Raspberry Pi (RPI) hardware is planned as future work, and those results will be compared against the x64 results.
5454

55+
All models were deployed on an [**Ollama**](https://ollama.com/) server, and all were quantized to Q4_K_M. This is a limitation: a 4-bit quantized model does not necessarily perform the same as its full-precision counterpart. The reported results should therefore be read as an initial indication of the models' capabilities rather than a definitive measure.
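
As a rough illustration of the setup described above, the following sketch builds a request for a Q4_K_M-quantized model served by Ollama. It assumes a local server on Ollama's default port (11434) and its `/api/generate` endpoint. The model tag and the helper names `build_generate_request` and `generate` are hypothetical and are not taken from the evaluation code.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical tag; the document does not list the exact tags evaluated.
MODEL_TAG = "llama3.2:3b-instruct-q4_K_M"


def build_generate_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generation request."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,                # one JSON object instead of a stream
        "options": {"temperature": 0},  # reduce run-to-run variation in tests
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Keeping request construction separate from sending makes it possible to inspect the payload, including the quantization suffix in the model tag, without a live server.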
56+
5557
The evaluation tests were grouped into three main categories, each assessing a distinct capability of the language models. **Information Extraction** tests focus on the model's ability to retrieve specific fields or data points from structured inputs such as JSON or logs. **Summarization & Decision Making** tests evaluate how well models can convert technical data into clear, human-readable insights and make a simple decision, such as a classification. **Data Generation** tests, including API-compatible formatting, assess whether models can produce well-structured outputs such as JSON objects, log entries, or function calls that conform to defined schemas or interfaces.
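
To make the categories more concrete, here is a minimal sketch of how an information-extraction check and a data-generation check might look. It is not the project's actual test code: the helper names, the prompt wording, and the sample record fields (`src_ip`, `dport`) are all hypothetical, and the model is passed in as a plain callable so it can be stubbed.

```python
import json
from typing import Callable


def extract_field(ask_model: Callable[[str], str], record: dict, field: str) -> str:
    """Information extraction: ask for one field and return the trimmed answer."""
    prompt = (
        f"Return only the value of '{field}' from this JSON, with no extra text:\n"
        f"{json.dumps(record)}"
    )
    # Strip whitespace and surrounding double quotes that models often add.
    return ask_model(prompt).strip().strip('"')


def is_valid_json_object(text: str, required_keys: set) -> bool:
    """Data generation: output must parse as a JSON object with the required keys."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and required_keys.issubset(obj)


def fake_model(prompt: str) -> str:
    """Stub standing in for a real model call (assumption for illustration)."""
    return ' "10.0.0.5"\n'
```

A test in this style would compare `extract_field(fake_model, record, "src_ip")` against the known value in `record`, and would pass generated output through `is_valid_json_object` before any schema-specific checks.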
5658

59+
60+
5761
### Evaluation Tests Overview
5862

5963
The implemented tests, each with a brief description, are listed below:
