---
layout: event
title: "FAccT 2026 Tutorial on Every Eval Ever"
subtitle: Building Community-Governed AI Evaluation Infrastructure
team: Jan Batzner, Sree Harsha Nelaturu, Anastassia Kornilova, Avijit Ghosh, Angelie Kraft, Usman Gohar, Michelle Lin, Yanan Long, Jennifer Mickel, Wm. Matthew Kennedy, Leon Staufer, David Hartmann, Leshem Choshen*, Irene Solaiman*
status: active
order: 1
category: Organization
event_date: 2026-06-26
location: Montreal (Canada)
host: EvalEval
description: |
  A FAccT 2026 tutorial walking through Every Eval Ever — a community-governed open source infrastructure unifying evaluation results under a shared metadata schema — and Evaluation Cards, an interpretive layer for evaluation reporting.
---

## 🪧 About
Existing model evaluation results are scattered across leaderboards, papers, and technical reports in incompatible formats. This fragmentation obscures transparency, hinders progress, and disadvantages researchers, civil society, policymakers, and industry alike, especially those who cannot afford to run evaluations from scratch. Built once and shared, eval infrastructure serves us all.

In this tutorial, we walk through [**Every Eval Ever**](https://evalevalai.com/infrastructure/2026/02/17/everyevalever-launch/), a community-governed open source infrastructure that unifies all evaluation results under a shared metadata schema. We then present [**Evaluation Cards**](https://evalcards.evalevalai.com), an interface and interpretive layer for evaluation reporting designed around practitioner needs identified in stakeholder interviews, and show how participants can find, compare, and contribute evaluations themselves.

All technical experience levels are welcome. If you can, please bring a laptop or tablet! 💻

## 📅 In-Person FAccT Tutorial
- **When:** Friday, June 26, 2026 · 3:00 – 4:00 PM (local time)
- **Where:** ACM FAccT 2026, Montreal (in person)

## 🌐 Online FAccT Tutorial
- **When:** Friday, June 26, 2026 · TBD
- **Where:** Zoom Video Conference

## 🏛️ Tutorial Program Committee
- Jan Batzner, Weizenbaum Institute, Technical University Munich
- Sree Harsha Nelaturu, Zuse Institute
- Anastassia Kornilova, Trustible
- Avijit Ghosh, Hugging Face
- Angelie Kraft, Weizenbaum Institute
- Usman Gohar, Iowa State University
- Michelle Lin, Mila, Quebec AI Institute
- Yanan Long, StickFluxLabs
- Jennifer Mickel, EleutherAI
- Wm. Matthew Kennedy, University of Oxford
- Leon Staufer, University of Cambridge
- David Hartmann, Weizenbaum Institute
- Leshem Choshen*, MIT, IBM Research, MIT-IBM Watson AI Lab
- Irene Solaiman*, Hugging Face

## 📬 Contact
We look forward to meeting you! For any questions, please reach out to the [EvalEval Organizing Team](mailto:evalevalpc@googlegroups.com).