Pull requests: AISBench/benchmark
feat(datasets): add HLE dataset
docs
feature
#301
opened May 15, 2026 by
ivanbao9783
3 of 15 tasks
[bugfix]: fix answer extraction regex and evaluator bugs in MMLU-Pro
bugfix
#269
opened May 2, 2026 by
lvhua6352
1 of 15 tasks
[Bugfix] Add pred and choices parsing to fix the issue of score=0 for…
bugfix
#258
opened Apr 20, 2026 by
Yanguan619
1 of 15 tasks
[feature] Add new answer extract function for minmax in gpqa benchmark
feature
#247
opened Apr 14, 2026 by
SJTUyh
Collaborator
1 of 15 tasks
Fix max_out_len handling for multi-turn ShareGPT conversations
bugfix
#243
opened Apr 12, 2026 by
Shadowless-ly
6 of 15 tasks
Fix the issue where TTFT and TPOT have no data when running Kimi2.5 i…
#210
opened Mar 21, 2026 by
GaoHuaZhang
Collaborator
15 tasks
update fix on textvqa, mmmu, mmstar, add patch for glm4.6v
bugfix
#188
opened Mar 13, 2026 by
Shane120283483
1 of 15 tasks
[UT] Add new UT for Gedit feature
test-cases
#163
opened Mar 5, 2026 by
SJTUyh
Collaborator
1 of 15 tasks
[feature] [sub feature 2] Dependency for qwen image edit run
feature
#151
opened Feb 13, 2026 by
SJTUyh
Collaborator
1 of 15 tasks
[feature] [sub feature 3] Support qwen Image edit infer with gedit dataset
feature
#150
opened Feb 13, 2026 by
SJTUyh
Collaborator
1 of 15 tasks
[TEST] Add smoke test cases for the math and agieval datasets
test-cases
#145
opened Feb 11, 2026 by
GaoHuaZhang
Collaborator
1 of 15 tasks