- **Integrated visualizations** via `seaborn`/`matplotlib`.
## Installation
@@ -67,6 +67,9 @@ You need two input files:
> [!NOTE]
> Embedding models have a maximum input length (more on this in the next section). Documents that exceed this length should be split into smaller chunks before evaluation so that they remain compatible with the models. Complete all preprocessing (e.g., cleaning, tokenization) before evaluation, as the toolkit does not (yet) support it.
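The chunking step described in the note above could be sketched as follows. This is a minimal illustration, not part of the toolkit: `split_into_chunks` is a hypothetical helper, and it counts whitespace-separated words as a rough stand-in for model tokens. A real pipeline would count tokens with the tokenizer of the model being evaluated.

```python
def split_into_chunks(text: str, max_tokens: int) -> list[str]:
    """Split text into consecutive chunks of at most max_tokens words.

    Assumption: whitespace-separated words approximate model tokens.
    Model limits are defined in tokenizer tokens, so leave some headroom.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    # Step through the word list in windows of max_tokens words.
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


if __name__ == "__main__":
    for chunk in split_into_chunks("one two three four five", max_tokens=2):
        print(chunk)
```

Each chunk would then be written to the documents file as its own row before running the evaluation.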

+> [!NOTE]
+> The current implementation of this toolkit assumes **exactly one relevant document per query**. All metrics (Accuracy@k, NDCG@k, etc.) are designed for this single-answer evaluation scenario. If you have multiple relevant documents per query, the current metrics will not produce meaningful results.
+
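To illustrate why the single-answer assumption matters, here is a rough sketch of how Accuracy@k and NDCG@k simplify when each query has exactly one relevant document. The function names are hypothetical and the toolkit's own implementation may differ; the sketch assumes binary relevance and the standard `log2` rank discount, under which the ideal DCG is 1 and NDCG reduces to `1 / log2(rank + 1)`.

```python
import math
from typing import Optional, Sequence


def rank_of_relevant(ranked_ids: Sequence[str], relevant_id: str) -> Optional[int]:
    """Return the 1-based rank of the relevant document, or None if absent."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return position
    return None


def accuracy_at_k(ranked_ids: Sequence[str], relevant_id: str, k: int) -> float:
    """1.0 if the single relevant document is in the top k, else 0.0."""
    return 1.0 if rank_of_relevant(ranked_ids[:k], relevant_id) is not None else 0.0


def ndcg_at_k(ranked_ids: Sequence[str], relevant_id: str, k: int) -> float:
    """NDCG@k for one relevant document: 1 / log2(rank + 1), or 0.0 outside the top k."""
    rank = rank_of_relevant(ranked_ids[:k], relevant_id)
    return 0.0 if rank is None else 1.0 / math.log2(rank + 1)


if __name__ == "__main__":
    ranking = ["d3", "d1", "d2"]
    print(accuracy_at_k(ranking, "d1", k=2))
    print(ndcg_at_k(ranking, "d1", k=2))
```

With several relevant documents per query, neither shortcut holds: the ideal DCG is no longer 1, and a single hit in the top k says nothing about the other relevant documents.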
### Configuration
Create a YAML config to define datasets, models, and metrics. Use [`configs/example.yaml`](configs/example.yaml) as a template.
@@ -76,7 +79,10 @@ Key fields:
- `docs` and `queries`: paths to your documents and queries in CSV or Parquet format
- `is-public-data`: set to `true` to use the OpenAI query creator if your data is public
- `max-len`: set to the **shortest model limit** so that every model receives input text of the same length, which keeps the evaluation fair
-- `models`: define model backends and options
+- `models`: define model backends and options. Supported backends are `huggingface`, `lexical`, and `open-ai`. HuggingFace models support the following optional parameters (check the usage examples on HuggingFace to see whether a model uses any of these parameters):
+  - `set_builtin_query_prompt` / `set_builtin_passage_prompt`: use a model's built-in named prompt for queries/passages
+  - `set_query_task_prompt` / `set_passage_task_prompt`: pass a task string to the encoder (e.g. for Jina models)
+  - `set_custom_query_prefix` / `set_custom_passage_prefix`: prepend a custom string to each query/passage at inference time (mutually exclusive with the built-in prompt options)
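The custom-prefix behaviour and its mutual exclusivity with the built-in prompt options could look roughly like the sketch below. The option names are taken from the list above; `check_prompt_options` and `apply_custom_prefix` are hypothetical helpers written for illustration, and the toolkit's actual validation logic is not shown in this diff.

```python
def check_prompt_options(options: dict) -> None:
    """Reject configs that set both a built-in prompt and a custom prefix.

    Assumption: the check is applied separately to the query side and the
    passage side; the toolkit may enforce this differently.
    """
    for side in ("query", "passage"):
        builtin_key = f"set_builtin_{side}_prompt"
        custom_key = f"set_custom_{side}_prefix"
        if options.get(builtin_key) and options.get(custom_key):
            raise ValueError(
                f"{builtin_key} and {custom_key} are mutually exclusive"
            )


def apply_custom_prefix(texts: list[str], prefix: str) -> list[str]:
    """Prepend the same prefix string to every text before encoding."""
    return [f"{prefix}{text}" for text in texts]


if __name__ == "__main__":
    options = {"set_custom_query_prefix": "query: "}
    check_prompt_options(options)
    print(apply_custom_prefix(["what is ndcg?"], options["set_custom_query_prefix"]))
```

Note that the prefix is prepended verbatim, so any separator such as a trailing space has to be part of the configured string.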