feat: add VBench video generation evaluation benchmark #1271

Open
Luodian wants to merge 1 commit into main from feat/vbench-benchmark

Luodian commented Mar 26, 2026

Summary

  • Add VBench and VBench 2.0 benchmark suites for comprehensive video generation evaluation
  • Cover 40+ task variants: subject/background consistency, motion quality, spatial relationships, aesthetic quality, human action/anatomy preservation, dynamic attributes, material/mechanics/thermotics, and more
  • Include build_dataset.py for preparing the dataset from VBench prompts
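
The description does not show build_dataset.py itself. As a rough sketch of what preparing a dataset from prompts could look like, assume the prompts arrive as plain text, one prompt per line, grouped by evaluation dimension. The record layout, the field names, and both function names below are hypothetical and are not taken from this PR.

```python
import json
from pathlib import Path


def build_records(prompt_lines, dimension):
    """Turn raw prompt lines into one record per non-empty prompt.

    Hypothetical record layout: {"id", "dimension", "prompt"}.
    """
    records = []
    for line in prompt_lines:
        prompt = line.strip()
        if not prompt:
            continue  # skip blank lines
        records.append(
            {
                # Sequential id within the dimension, e.g. "motion_0003"
                "id": f"{dimension}_{len(records):04d}",
                "dimension": dimension,
                "prompt": prompt,
            }
        )
    return records


def write_jsonl(records, out_path):
    """Write records as JSON Lines, creating parent directories if needed."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

JSON Lines is used here only because it is easy to stream and inspect; the actual script may use a different format.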

Task list

  • vbench (group): VBench 1.0 tasks (15 variants)
  • vbench2 (group): VBench 2.0 tasks (13 variants)
  • vbench_all: Combined VBench 1.0 + 2.0

Test plan

  • Verify task registration with lmms-eval --tasks list | grep vbench
  • Run a single VBench task with a video generation model
  • Confirm scoring metrics align with VBench reference implementation
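
The first and third checks in the test plan can be scripted. The sketch below is only an illustration: the lmms-eval --tasks list command comes from the test plan, but the shape of its output is an assumption (task names are simply searched for as words starting with "vbench"), and the helper names, the score dictionaries, and the tolerance are hypothetical.

```python
import math
import re
import subprocess


def filter_vbench_tasks(listing):
    """Return the sorted, de-duplicated words in `listing` that start with 'vbench'."""
    return sorted(set(re.findall(r"\bvbench\w*", listing)))


def list_registered_vbench_tasks():
    """Run `lmms-eval --tasks list` and keep only the VBench entries.

    Assumes lmms-eval is installed and on PATH. Both stdout and stderr are
    searched because the listing may be emitted through a logger.
    """
    result = subprocess.run(
        ["lmms-eval", "--tasks", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return filter_vbench_tasks(result.stdout + result.stderr)


def find_score_mismatches(ours, reference, abs_tol=1e-3):
    """Compare two {metric_name: score} dicts.

    Returns {metric_name: (our_score_or_None, reference_score)} for every
    reference metric that is missing from `ours` or differs by more than
    `abs_tol` (a hypothetical tolerance, not one specified by the PR).
    """
    mismatches = {}
    for name, ref_value in reference.items():
        value = ours.get(name)
        if value is None or not math.isclose(
            value, ref_value, rel_tol=0.0, abs_tol=abs_tol
        ):
            mismatches[name] = (value, ref_value)
    return mismatches
```

An empty result from find_score_mismatches would indicate that every reference metric was reproduced within the chosen tolerance.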

References:
- VBench: https://arxiv.org/abs/2311.17982
- VBench 2.0: https://arxiv.org/abs/2503.21755