* update readme
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Update README for MTEB evaluation process clarity
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
README.md (25 additions, 2 deletions)
@@ -19,7 +19,30 @@
---

## Evaluating single-model retrievers on ViDoRe v1–v3 with MTEB

We have shifted from in-house evaluations to the general-purpose [MTEB](https://github.com/embeddings-benchmark/mteb/tree/main) evaluation framework for retrieval models.

Here are the main steps to evaluate and submit your retriever to the ViDoRe v1–v3 leaderboards; see the [MTEB official documentation](https://embeddings-benchmark.github.io/mteb/contributing/adding_a_model/) for full details. This section covers the MTEB leaderboards only; for our in-house pipeline leaderboard, see the section below.

1. Create your model implementation file (if it does not already exist) [here](https://github.com/embeddings-benchmark/mteb/tree/main/mteb/models/model_implementations), then open a PR to the [MTEB repository](https://github.com/embeddings-benchmark/mteb) with your changes; examples for ColPali-like models can be found in [this file](https://github.com/embeddings-benchmark/mteb/blob/main/mteb/models/model_implementations/colpali_models.py). A minimal sketch of the expected shape follows.
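
For a plain text retriever, the classic MTEB interface is simply a model object exposing an `encode` method that maps a list of inputs to an array of embeddings. The sketch below is a minimal, hypothetical illustration of that shape (the class name, constructor argument, and random embeddings are placeholders; ColPali-like multimodal models follow the richer protocol in `colpali_models.py`, so check the MTEB contributing guide for the exact current interface):

```python
import numpy as np


class MyCustomModel:
    """Hypothetical minimal MTEB-style encoder; all logic is placeholder."""

    def __init__(self, embedding_dim: int = 128):
        self.embedding_dim = embedding_dim

    def encode(self, sentences: list[str], **kwargs) -> np.ndarray:
        # Stand-in for real inference: MTEB expects one embedding per input,
        # returned as an (n_inputs, embedding_dim) array.
        return np.random.rand(len(sentences), self.embedding_dim)
```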

2. Evaluate your model:
```python
import mteb
from mteb.models.model_implementations.my_custom_model import MyCustomModel

# Instantiate your implementation (my_args stands in for your own arguments)
my_model = MyCustomModel(my_args)
tasks = mteb.get_tasks(["ViDoRe (v3)"])

# Run the ViDoRe v3 evaluation and collect the results for submission
results = mteb.evaluate(my_model, tasks=tasks)
```

3. Open a PR on the [MTEB results repository](https://github.com/embeddings-benchmark/results/tree/main) with the generated results file to submit your results to the leaderboard.

4. Once all of the above is done, you can ask the MTEB team to evaluate your model on the private ViDoRe v3 sets by opening a dedicated issue on [their repo](https://github.com/embeddings-benchmark/mteb/issues).

## Evaluating a complex pipeline

Pipeline evaluation allows you to evaluate **complete end-to-end retrieval systems** on the ViDoRe v3 benchmark datasets. Unlike traditional retriever evaluation that focuses on individual model components, pipeline evaluation lets you test:
@@ -28,7 +51,7 @@
- **Arbitrary retrieval logic** that goes beyond standard dense/sparse retrievers (see the sketch below)
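
To make "arbitrary retrieval logic" concrete, here is a purely illustrative sketch (none of these names come from this repository or from MTEB) of a two-stage pipeline that composes a cheap candidate retriever with an expensive reranker:

```python
from typing import Callable

# Hypothetical signatures: a pipeline is just "query + corpus in, ranked ids
# out", so any composition of stages (retrievers, rerankers, filters, ...)
# can be evaluated end to end.
Retriever = Callable[[str, dict[str, str]], list[str]]
Reranker = Callable[[str, list[str]], list[str]]


def two_stage_pipeline(
    query: str,
    corpus: dict[str, str],
    retrieve: Retriever,
    rerank: Reranker,
    top_k: int = 100,
) -> list[str]:
    # Stage 1: cheap candidate generation (dense, sparse, or anything else).
    candidates = retrieve(query, corpus)[:top_k]
    # Stage 2: expensive reranking over the shortlist only.
    return rerank(query, candidates)
```

Because only the final ranking is scored, the internal stages can be swapped freely.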

### 📊 Results Repository & Submission Guidelines

**This repository serves as the primary community results repository for visual document retrieval benchmarks using complex pipelines.** We encourage researchers and practitioners to submit their pipeline evaluation results to create a centralized location where the community can compare different approaches and track progress on ViDoRe v3 datasets.