feat: add VBench video generation evaluation benchmark by Luodian · Pull Request #1271 · EvolvingLMMs-Lab/lmms-eval

Luodian · 2026-03-26T16:05:58Z

Summary

Add VBench and VBench 2.0 benchmark suites for comprehensive video generation evaluation
Covers 40+ task variants: subject/background consistency, motion quality, spatial relationships, aesthetic quality, human action/anatomy preservation, dynamic attributes, material/mechanics/thermotics, and more
Includes build_dataset.py for dataset preparation from VBench prompts

References

VBench: https://arxiv.org/abs/2311.17982
VBench 2.0: https://arxiv.org/abs/2503.21755

Task list

vbench (group): VBench 1.0 tasks (15 variants)
vbench2 (group): VBench 2.0 tasks (13 variants)
vbench_all: Combined VBench 1.0 + 2.0

Test plan

Verify task registration with lmms-eval --tasks list | grep vbench
Run a single VBench task with a video generation model
Confirm scoring metrics align with VBench reference implementation
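The first test-plan step can also be scripted. The following is a minimal sketch, not part of this PR: it runs the `lmms-eval --tasks list` command quoted above and checks that the three group names from the task list appear. The helper names (`filter_vbench_tasks`, `list_registered_vbench_tasks`, `missing_groups`) are hypothetical, and the sketch assumes the listing prints task names as plain tokens in its output, which may not match the actual output format.

```python
import re
import subprocess


def filter_vbench_tasks(listing: str) -> list[str]:
    """Keep the lines that contain 'vbench' (case-sensitive, like plain grep)."""
    return [line.strip() for line in listing.splitlines() if "vbench" in line]


def list_registered_vbench_tasks() -> list[str]:
    """Run the command from the test plan and filter its output.

    Requires lmms-eval to be installed and on PATH.
    """
    result = subprocess.run(
        ["lmms-eval", "--tasks", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return filter_vbench_tasks(result.stdout)


def missing_groups(lines, expected=("vbench", "vbench2", "vbench_all")):
    """Return the expected group names that never appear as a whole token.

    Assumption: names show up as identifier-like tokens in the listing,
    so 'vbench' is not satisfied by a longer name that merely starts with it.
    """
    tokens = set()
    for line in lines:
        tokens.update(re.findall(r"[A-Za-z0-9_]+", line))
    return [name for name in expected if name not in tokens]


# Usage (needs a working lmms-eval install):
#   lines = list_registered_vbench_tasks()
#   print(missing_groups(lines))  # an empty list means all three groups are registered
```

Separating the filtering from the subprocess call means the matching logic can be checked against canned text without installing lmms-eval.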



Luodian wants to merge 1 commit into main from feat/vbench-benchmark
