
implementing converse benchmark #28

Merged
cemde merged 7 commits into parameterlab:main from agoel00:converse
Feb 12, 2026

@agoel00 agoel00 commented Feb 11, 2026

Description

This PR adds a new benchmark implementation in the MASEval framework based on https://arxiv.org/abs/2511.05359.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Code quality improvement (refactoring, formatting, etc.)

Checklist

Contribution

Documentation

  • Added/updated docstrings for new/modified functions as instructed in CONTRIBUTING.md
  • Updated relevant documentation in docs/ (if applicable)
  • Tag GitHub issue with this PR (if applicable)

Changelog

  • Added entry to CHANGELOG.md under [Unreleased] section
    • Use Added section for new features
    • Use Changed section for modifications to existing functionality
    • Use Fixed section for bug fixes
    • Use Removed section for deprecated/removed features
  • OR this is a documentation-only change (no changelog needed)

Example:
- Support for multi-agent tracing (PR:#123)

Architecture (if applicable)

  • Core/Interface separation: Changes in maseval/core/ do NOT import from maseval/interface/
  • Dependencies: New core dependencies added sparingly; framework integrations go to optional dependencies
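
The core/interface separation rule above could be checked mechanically. The following is a minimal sketch, not part of this PR or of MASEval's tooling: the helper name `find_forbidden_imports` is hypothetical, and it assumes the packages are importable as `maseval.core` and `maseval.interface`, as the directory names suggest.

```python
import ast


def find_forbidden_imports(source: str, forbidden: str = "maseval.interface") -> list[str]:
    """Return the sorted module names imported by `source` that fall under `forbidden`.

    Hypothetical helper for illustration only. Relative imports are ignored.
    """
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b.c` -> candidate "a.b.c"
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # `from a.b import c` -> candidates "a.b" and "a.b.c"
            candidates = [node.module] + [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in candidates:
            if name == forbidden or name.startswith(forbidden + "."):
                hits.add(name)
    return sorted(hits)
```

Running this over the text of each file under maseval/core/ and failing when the returned list is non-empty would be one way to enforce the rule in CI.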

Additional Notes

Comment thread docs/benchmark/converse.md
Comment thread CHANGELOG.md
agoel00 and others added 5 commits February 11, 2026 15:12
…onverse

Pulling latest updates from main
* reentered BENCHMARKS placeholder
* improved consistency in type hinting
* added tests for data loading
@cemde cemde merged commit 4227cba into parameterlab:main Feb 12, 2026
21 checks passed
