FAQ

Q: Who is this library for?

Anyone! We had three groups in mind when building MASEval.

  1. Benchmark Developers: Researchers proposing new benchmarks for multi-agent systems can use MASEval to handle all the boilerplate.
  2. Benchmark Consumers: Researchers studying multi-agent systems can use MASEval as a unified interface across different benchmarks.
  3. System Developers: Those who want to test different agentic systems against each other can do so with MASEval.

Q: I am looking for a specific feature, but I cannot find it.

  1. Check this documentation.
  2. If the feature does not exist, please open an issue on GitHub. Feature requests are welcome.
  3. Consider implementing it yourself. Check out the contributing guide for details.

Q: Can I only test multi-agent systems?

No. MASEval works well for single-agent systems too. We designed the library to handle the complexity of multi-agent systems, but single-agent evaluation is fully supported. You can even run model comparisons, for example GPT against Claude.