Standardize columns + Data refactor by bkorycki · Pull Request #1113 · mlcommons/modelbench

bkorycki · 2025-07-02T21:21:15Z

The overall goal of this PR is to standardize data formatting so that the output of one stage can be used as the input of another, and to avoid headaches.
Two new objects are instrumental in accomplishing this:

- Data Schema objects, which define column names in one place.
- Dataset objects, which handle both reading from input and writing to output.
  - The different types of datasets read CSV rows and output the relevant object for each row, e.g. TestItem for prompts, SUTInteraction for prompt-responses, and AnnotatedSUTInteraction for annotations.
  - The Annotation dataset serializes/deserializes annotations as JSON strings. It can serialize Pydantic objects or dictionaries. Annotations are always deserialized as dictionaries.
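To illustrate the two ideas above, here is a minimal sketch of a schema object that owns the column names and a reader that yields one object per CSV row. The names `PromptSchema`, `PromptItem`, `read_prompts`, and the column names are hypothetical stand-ins, not the classes added in this PR.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptSchema:
    """Hypothetical schema: column names are defined once and reused everywhere."""

    prompt_uid: str = "prompt_uid"
    prompt_text: str = "prompt_text"


@dataclass
class PromptItem:
    """Stand-in for the per-row object (the PR uses TestItem for prompts)."""

    uid: str
    text: str


def read_prompts(handle, schema=PromptSchema()):
    """Yield one PromptItem per CSV row, looking columns up through the schema."""
    for row in csv.DictReader(handle):
        yield PromptItem(uid=row[schema.prompt_uid], text=row[schema.prompt_text])


# Assumed example input; every field is quoted, as a QUOTE_ALL writer would produce.
csv_text = '"prompt_uid","prompt_text"\n"p1","Hello"\n"p2","How are you?"\n'
items = list(read_prompts(io.StringIO(csv_text)))
```

Because every reader and writer goes through the schema, renaming a column would only require a change in one place.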

Also in service of this goal, every output file (aside from metadata.json) is a CSV file in which one row corresponds to one prompt and one SUT response and/or one annotation. No more annotations.jsonl.
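A rough sketch of what such a row could look like when the annotation is stored as a JSON string inside a fully quoted CSV. The column names and the helpers `dump_annotation` and `load_annotation` are assumptions for illustration only; the `model_dump()` check assumes Pydantic v2 objects.

```python
import csv
import io
import json


def dump_annotation(annotation) -> str:
    """Serialize a dict, or an object exposing Pydantic v2's model_dump(), to JSON."""
    if hasattr(annotation, "model_dump"):
        annotation = annotation.model_dump()
    return json.dumps(annotation)


def load_annotation(raw: str) -> dict:
    """Annotations always come back as plain dictionaries."""
    return json.loads(raw)


# Write one row: one prompt, one SUT response, one annotation (hypothetical columns).
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
writer.writerow(["prompt_uid", "sut_response", "annotation_json"])
writer.writerow(["p1", "Hi there", dump_annotation({"is_safe": True, "score": 0.9})])

# Read it back; the csv module un-escapes the doubled quotes around the JSON.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
annotation = load_annotation(rows[0]["annotation_json"])
```

Quoting every field keeps the embedded JSON, which contains commas and double quotes, from breaking the row structure.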

github-actions · 2025-07-02T21:21:24Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

rogthefrog

This is great and very clear.

rogthefrog · 2025-07-03T17:29:33Z

+
+    quoting = csv.QUOTE_ALL
+
+    def __init__(self, path: Union[str, Path], mode: str):


Would it be useful to have a BaseDatasetReader and BaseDatasetWriter instead of handling both use cases in one class?

Hm that's interesting. But I don't like the idea of having 6 dataset classes instead of just 3. Do you think it would be worth it to split it up?
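For readers following the trade-off discussed here, this is a minimal sketch of the single-class approach, where one dataset object takes a `mode` and serves as either reader or writer. It reuses the `quoting` attribute and `__init__` signature from the snippet under review; everything else (`SketchDataset`, `write_rows`, `read_rows`, the column names) is hypothetical and not the PR's implementation.

```python
import csv
import tempfile
from pathlib import Path
from typing import Union


class SketchDataset:
    """Hypothetical dataset that reads or writes a CSV depending on its mode."""

    quoting = csv.QUOTE_ALL

    def __init__(self, path: Union[str, Path], mode: str):
        if mode not in ("r", "w"):
            raise ValueError(f"Unsupported mode: {mode}")
        self.path = Path(path)
        if self.path.suffix.lower() != ".csv":
            raise ValueError("Only CSV files are accepted")
        self.mode = mode
        self._file = None

    def __enter__(self):
        self._file = open(self.path, self.mode, newline="")
        return self

    def __exit__(self, *exc_info):
        self._file.close()

    def write_rows(self, header, rows):
        if self.mode != "w":
            raise RuntimeError("Dataset was not opened for writing")
        writer = csv.writer(self._file, quoting=self.quoting)
        writer.writerow(header)
        writer.writerows(rows)

    def read_rows(self):
        if self.mode != "r":
            raise RuntimeError("Dataset was not opened for reading")
        return list(csv.DictReader(self._file))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "out.csv"
    with SketchDataset(path, "w") as ds:
        ds.write_rows(["prompt_uid", "sut_response"], [["p1", "Hello, world"]])
    with SketchDataset(path, "r") as ds:
        rows = ds.read_rows()
```

The cost of this design is the mode checks inside each method; splitting into reader and writer classes would remove them but double the number of classes, which is the concern raised in the reply.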

bkorycki added 14 commits June 18, 2025 13:43

Data schema objects

dcac744

pipeline runners use data schema for input

e29a343

Prompt runner outputs each sut response in different row

f4f2dcc

annotation data schema

a0a5bfa

New dataset objects

faf6fa6

Use PromptDataset as input to PromptRunner. Delete CsvPromptInput

ea68e93

Use PromptResponseDataset instead of CSVPromptOutput in prompt runner

5f8bc47

Quote all + only accept csv files in datasets

afe604a

Replace annotator input objects with PromptResponseDataset

679efea

New AnnotatedSUTInteraction object.

ee4f4c9

Annotation column is a json dict + annotation dataset tests

8a36bde

Annotation dataset dumps annotations as json strings and reads them back as dictionaries

160253b

Use AnnotationDataset object in annotation runner

a20d5ec

mypy

d8e9a8d

bkorycki requested a review from a team as a code owner July 2, 2025 21:21

bkorycki temporarily deployed to Scheduled Testing July 2, 2025 21:21 — with GitHub Actions Inactive

bkorycki requested a review from rogthefrog July 2, 2025 21:24

rogthefrog approved these changes Jul 3, 2025

remove prints

9194c8b

bkorycki temporarily deployed to Scheduled Testing July 3, 2025 17:56 — with GitHub Actions Inactive

bkorycki merged commit afe1cbf into main Jul 3, 2025
4 checks passed

bkorycki deleted the standardize-columns branch July 3, 2025 21:17

github-actions Bot locked and limited conversation to collaborators Jul 3, 2025

bkorycki merged 15 commits into main from standardize-columns
