Add benchmark version to result json.#1407
Conversation
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
wpietri
left a comment
There was a problem hiding this comment.
This solves the stated need, but I'm wondering if we should just put in the full benchmark uid twice: once in the string form and once in the full JSON form. Seems like the real problem here is "want a piece of the benchmark UID without parsing it", so unless we're sure version is the only thing we need, I'd rather do it all at once unless there's some reason not to.
@rogthefrog, given that you're both the requester and the person who has spent the most time on structured benchmark UIDs, what are your thoughts?
Conclusion from standup was that we'd put everything in. Will update the PR! |
| elif isinstance(o, BenchmarkDefinition): | ||
| return {"uid": o.uid, "hazards": o.hazards()} | ||
| def_dict = {"uid": o.uid, "hazards": o.hazards()} | ||
| def_dict.update(o.uid_dict) |
There was a problem hiding this comment.
Why do we need both the uid and the uid dictionary? Do they contain different information?
There was a problem hiding this comment.
The dictionary contains all the elements of the uid, but it doesn't explain how to turn the elements into the full string. So if anybody needs to compare to a string uid, they'll need either import modelbench or duplicate the logic. Plus with file format it's generally better to be backwards compatible, so that we don't have to hunt down and rewrite anything that parses the results JSON.
There was a problem hiding this comment.
As an example of what uid_dict looks like:
{'class': 'general_purpose_ai_chat_benchmark', 'version': '1.1', 'locale': 'en_us', 'prompt_set': 'practice', 'evaluator': 'default'}
For #1406