Add ResearchClawBench eval framework by black-yt · Pull Request #2174 · huggingface/huggingface.js

black-yt · 2026-05-15T05:28:58Z

Summary

Adds researchclawbench to the supported evaluation frameworks for benchmark dataset eval.yaml files.

ResearchClawBench is an end-to-end scientific research benchmark for AI agents and standalone LLMs, covering workflows from reading raw data and related work to producing code, figures, and publication-style reports.

Dataset prepared for the Hub Evaluation Results feature:
https://huggingface.co/datasets/InternScience/ResearchClawBench

The dataset repo already includes:

eval.yaml with evaluation_framework: researchclawbench
.eval_results/*.yaml entries following the benchmark result format

For reference, a similar benchmark setup:
https://huggingface.co/datasets/claw-eval/Claw-Eval

Change

Add researchclawbench to EVALUATION_FRAMEWORKS in packages/tasks/src/eval.ts.
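
The change can be pictured with a minimal sketch. The actual shape of EVALUATION_FRAMEWORKS in packages/tasks/src/eval.ts is not shown in this PR description, so the object layout, the "existing-framework" placeholder, and the isEvaluationFramework helper below are assumptions made for illustration only.

```typescript
// Hypothetical sketch: the real EVALUATION_FRAMEWORKS in packages/tasks/src/eval.ts
// may use a different shape (for example, richer metadata per entry).
const EVALUATION_FRAMEWORKS = {
	// Placeholder standing in for the entries that already exist in the file.
	"existing-framework": { name: "existing-framework" },
	// The entry this PR adds. The key matches the `evaluation_framework`
	// value used in the dataset's eval.yaml.
	researchclawbench: { name: "researchclawbench" },
} as const;

type EvaluationFramework = keyof typeof EVALUATION_FRAMEWORKS;

// Hypothetical helper: a type guard that is true only for registered keys.
function isEvaluationFramework(value: string): value is EvaluationFramework {
	return Object.prototype.hasOwnProperty.call(EVALUATION_FRAMEWORKS, value);
}
```

Under these assumptions, a consumer reading evaluation_framework from an eval.yaml file could call isEvaluationFramework on the value and treat the dataset as a benchmark only when the check passes.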

Notes

This change is intended to let the Hub recognize the ResearchClawBench dataset as a Benchmark dataset and display its benchmark leaderboard and tag.

Add ResearchClawBench eval framework

636b49e

black-yt requested review from SBrandeis, Wauplin, gary149, julien-c, ngxson and pcuenca as code owners May 15, 2026 05:29

black-yt wants to merge 1 commit into huggingface:main from black-yt:add-researchclawbench-eval-framework
