Add AVMeme-Exam zero shot classification task by deep9539 · Pull Request #4539 · embeddings-benchmark/mteb

deep9539 · 2026-04-28T07:50:30Z

- I have outlined why this dataset fills an existing gap in `mteb`.
- I have tested that the dataset runs with the `mteb` package.
- I have run the following models on the task (adding the results to the PR). These can be run using the `mteb run -m {model_name} -t {task_name}` command.
  - mteb/baseline-random encoder
  - facebook/pe-av-small-16-frame or another small model (takes too long, possibly a day)
- I have checked that the performance is neither trivial (close to perfect scores) nor random.
- I have considered the size of the dataset and reduced it if it is too big (e.g. 2048 examples for binary classification).
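
As an illustration of the size-reduction item in the checklist, here is a minimal sketch of a label-stratified subsample in plain Python. It is not the helper used by `mteb` or by this PR; the function name `stratified_subsample`, the `max_examples` default, and the seed are all hypothetical choices for this example.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, max_examples=2048, seed=42):
    """Return sorted indices of at most `max_examples` items.

    Each label keeps roughly its original share of the data.
    Hypothetical helper for illustration, not part of mteb.
    """
    total = len(labels)
    if total <= max_examples:
        return list(range(total))

    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    picked = []
    for label in sorted(by_label):
        indices = by_label[label]
        # Integer arithmetic avoids float rounding; every label keeps >= 1 item.
        quota = max(1, len(indices) * max_examples // total)
        picked.extend(rng.sample(indices, quota))

    # The ">= 1" rule can overshoot when there are many rare labels, so cap it.
    return sorted(picked[:max_examples])
```

Sampling per label rather than over the whole dataset keeps rare classes from disappearing, which matters for a zero-shot task where every candidate label should appear in the evaluation split.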

| Task | mteb/baseline-random encoder | facebook/pe-av-small-16-frame |
| --- | --- | --- |
| AVMemeAudioVideoZeroShotClassification | 0.241111 | |
| AVMemeVideoZeroShotClassification | 0.207778 | |
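
One way to sanity-check scores like these against the "neither trivial nor random" item is to compare them with chance level. The sketch below assumes the score is an accuracy and the classes are balanced, neither of which this thread confirms. The helper names, the `margin` and `ceiling` thresholds, and the class count in the usage lines are placeholders, not AVMeme-Exam facts.

```python
def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing over balanced classes."""
    if num_classes < 2:
        raise ValueError("need at least two classes")
    return 1.0 / num_classes


def is_informative(score: float, num_classes: int,
                   margin: float = 0.05, ceiling: float = 0.95) -> bool:
    """True when the score beats chance by `margin` and stays below `ceiling`.

    Both thresholds are arbitrary illustration values.
    """
    return chance_accuracy(num_classes) + margin < score < ceiling


# Placeholder class count of 4; substitute the task's real number of labels.
print(is_informative(0.24, num_classes=4))  # near chance for 4 classes
print(is_informative(0.60, num_classes=4))  # clearly above chance, not saturated
```

A random encoder is expected to land near chance, so a result of this kind mainly confirms that the task is not trivially solvable; the real check needs a trained model's score.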

AVMemeVideoZeroShotClassification x random encoder
AVMemeAudioVideoZeroShotClassification x random encoder

x-tabdeveloping

Looks good so far. Let's wait to see what we do about the facebook encoder.

AdnanElAssadi56 · 2026-04-30T03:30:05Z

+        is_beta=True,
+    )
+    input_column_name = "video"
+    label_column_name: str = "category"


Are you using the correct label here? Is the dataset intended for "emotion"?

Done.

Thanks for catching this. I must have been looking at the wrong Hugging Face tab. You are right, the dataset is not for emotion.

AdnanElAssadi56 · 2026-04-30T05:43:18Z

The category label for this dataset seems odd to me. I also saw it was chosen in the other merged task. Maybe emotion is the better choice here; we'll have to look at the original source to see what the authors intended.

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

deep9539 · 2026-04-30T16:43:56Z

Thanks for changing the label, @Samoed. I agree with you that emotion makes more sense than the previous label, which was sound category.
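
For readers following the label discussion, here is a minimal sketch of how zero-shot classification with candidate label prompts generally works: each class name becomes a text prompt, and an input is assigned to the prompt whose embedding is most similar. It is not the `mteb` implementation. The prefix `"a video of "`, the class names, and the toy embeddings are assumptions made for illustration.

```python
import math


def build_candidate_labels(class_names, prefix="a video of "):
    """Turn raw class names into text prompts (prefix is an assumed example)."""
    return [f"{prefix}{name}" for name in class_names]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def zero_shot_predict(input_embedding, label_embeddings):
    """Return the index of the label embedding closest to the input embedding."""
    scores = [cosine(input_embedding, emb) for emb in label_embeddings]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy 2-d embeddings standing in for real encoder outputs.
prompts = build_candidate_labels(["happy", "sad"])  # hypothetical class names
label_embeddings = [[0.0, 1.0], [1.0, 0.1]]
video_embedding = [1.0, 0.0]
print(prompts[zero_shot_predict(video_embedding, label_embeddings)])
```

Because predictions depend only on the prompt text, changing the label column or the prompt prefix changes which classes the model is asked to separate, which is why the choice between "category" and "emotion" matters for this task.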

deep9539 mentioned this pull request Apr 28, 2026

MVEB Overview #4130

Open

72 tasks

x-tabdeveloping reviewed Apr 28, 2026

Samoed reviewed Apr 28, 2026

Comment thread mteb/tasks/zeroshot_classification/eng/avmeme_exam_classification.py

Samoed added the "new dataset" (issues related to adding a new task or dataset) and "video" (video extension) labels Apr 28, 2026

Add AVMeme-Exam zero shot classification task

1cef8bf

deep9539 force-pushed the avmeme_exam branch from 757b66a to 1cef8bf April 30, 2026 01:56

Add a video of emotion text in candidate_labels

0125e37

AdnanElAssadi56 reviewed Apr 30, 2026

Comment thread mteb/tasks/zeroshot_classification/eng/avmeme_exam_classification.py Outdated

AdnanElAssadi56 reviewed Apr 30, 2026

deep9539 added 2 commits April 29, 2026 20:43

Remove emotion from the candidate label prefix for AVMEME

9c239fd

Ran make lint

eade654

Samoed reviewed Apr 30, 2026

Comment thread mteb/tasks/zeroshot_classification/eng/avmeme_exam_classification.py Outdated

Comment thread mteb/tasks/zeroshot_classification/eng/avmeme_exam_classification.py Outdated

Apply suggestions from code review

c4d1332

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

Samoed approved these changes Apr 30, 2026

Samoed reviewed Apr 30, 2026

Comment thread mteb/tasks/zeroshot_classification/eng/avmeme_exam_classification.py Outdated

Apply suggestions from code review

3c76882

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

isaac-chung merged commit 061edcd into embeddings-benchmark:main Apr 30, 2026
13 checks passed

isaac-chung merged 6 commits into embeddings-benchmark:main from deep9539:avmeme_exam

5 participants
