[MVEB] Added SomethingSomething zeroshot #4542

Draft
x-tabdeveloping wants to merge 6 commits into main from somethingsomething

x-tabdeveloping commented Apr 28, 2026

  • I have outlined why this dataset fills an existing gap in the MVEB benchmark.
  • I have tested that the dataset runs with the mteb package.
  • I have run the following models on the task and added the results to the PR. Each model can be run with the command mteb run -m {model_name} -t {task_name}.
    • facebook/pe-av-small-16-frame
    • mteb/baseline-random-encoder
  • I have checked that the performance is neither trivial (both models score close to perfect) nor random (both models score close to chance).
  • I have considered the size of the dataset and reduced it if it is too big (2048 examples is typically large enough for most tasks).
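
The two model runs and the trivial/random check from the list above can be sketched roughly as follows. This is only an illustration and not part of the PR: the task name ExampleTaskName is a hypothetical placeholder (the registered task name is not stated here), the 0.05 margin is an assumed threshold, and the scores in the usage lines are made-up values, not results.

```python
import shlex

# Models named in the checklist above.
MODELS = ["facebook/pe-av-small-16-frame", "mteb/baseline-random-encoder"]

# Hypothetical placeholder: substitute the task name registered by the PR.
TASK_NAME = "ExampleTaskName"


def build_run_command(model_name: str, task_name: str) -> str:
    """Build the `mteb run -m {model_name} -t {task_name}` command string."""
    return shlex.join(["mteb", "run", "-m", model_name, "-t", task_name])


def classify_scores(scores: list[float], num_classes: int, margin: float = 0.05) -> str:
    """Label a set of accuracy scores as 'trivial', 'random' or 'informative'.

    'trivial' means every model is within `margin` of a perfect score;
    'random' means every model is within `margin` of chance (1 / num_classes).
    The margin is an assumption made for this sketch.
    """
    if not scores:
        raise ValueError("at least one score is required")
    chance = 1.0 / num_classes
    if all(score >= 1.0 - margin for score in scores):
        return "trivial"
    if all(score <= chance + margin for score in scores):
        return "random"
    return "informative"


if __name__ == "__main__":
    for model in MODELS:
        print(build_run_command(model, TASK_NAME))
    # Made-up accuracies for a 10-class task, purely to exercise the check.
    print(classify_scores([0.45, 0.10], num_classes=10))
```

A task passes the check only when the result is "informative", meaning the real model separates itself from the random baseline without both saturating.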

x-tabdeveloping mentioned this pull request Apr 28, 2026
75 tasks
Comment thread mteb/tasks/zeroshot_classification/eng/something_something_v2_classification.py Outdated
Samoed added the new dataset and video labels Apr 28, 2026
…classification.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Comment thread mteb/tasks/zeroshot_classification/eng/something_something_v2_classification.py Outdated
Comment thread mteb/tasks/zeroshot_classification/eng/something_something_v2_classification.py Outdated
for name in self.dataset["test"].features[self.label_column_name].names
]

def dataset_transform(self, num_proc=None):

If you push the dataset using the command documented here:
https://embeddings-benchmark.github.io/mteb/contributing/adding_a_dataset/#pushing-the-dataset-to-the-hub

then we avoid downloading the whole thing during evaluation.

KennethEnevoldsen changed the title from "WIP: Added SomethingSomething zeroshot" to "[MVEB] Added SomethingSomething zeroshot" Apr 30, 2026
KennethEnevoldsen marked this pull request as draft April 30, 2026 10:30
@KennethEnevoldsen

Labelled as MVEB, and instead of marking it WIP I converted it to a draft PR.

AdnanElAssadi56 left a comment

Looks good once the comments above are addressed.

x-tabdeveloping and others added 2 commits May 7, 2026 08:29
…classification.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
…classification.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Labels

new dataset, video

4 participants