diff --git a/README.md b/README.md
index 83e6ef1d..57e000af 100644
--- a/README.md
+++ b/README.md
@@ -364,6 +364,7 @@ Please review our [CONTRIBUTING.md](https://github.com/EthicalML/awesome-product
 * [RagaAI Catalyst](https://github.com/raga-ai-hub/RagaAI-Catalyst) ![](https://img.shields.io/github/stars/raga-ai-hub/RagaAI-Catalyst.svg?cacheSeconds=86400) - Prometheus-Eval is a collection of tools for training, evaluating, and using language models specialized in evaluating other language models.
 * [Ragas](https://github.com/explodinggradients/ragas) ![](https://img.shields.io/github/stars/explodinggradients/ragas.svg?cacheSeconds=86400) - Ragas is a framework to evaluate RAG pipelines.
 * [RewardBench](https://github.com/allenai/reward-bench) ![](https://img.shields.io/github/stars/allenai/reward-bench.svg?cacheSeconds=86400) - RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models.
+* [Rhesis AI](https://github.com/rhesis-ai/rhesis) ![](https://img.shields.io/github/stars/rhesis-ai/rhesis.svg?cacheSeconds=86400) - Rhesis AI is a tool for collaborative agent testing: teams define expected behavior, generate and run test scenarios, and review failures together.
 * [RLBench](https://github.com/stepjam/RLBench) ![](https://img.shields.io/github/stars/stepjam/RLBench.svg?cacheSeconds=86400) - RLBench is an ambitious large-scale benchmark and learning environment designed to facilitate research in a number of vision-guided manipulation research areas, including: reinforcement learning, imitation learning, multi-task learning, geometric computer vision, and in particular, few-shot learning.
 * [SimplerEnv](https://github.com/simpler-env/SimplerEnv) ![](https://img.shields.io/github/stars/simpler-env/SimplerEnv.svg?cacheSeconds=86400) - SimplerEnv is a simulated manipulation policy evaluation environments for real robot setups.
 * [SwanLab](https://github.com/SwanHubX/SwanLab) ![](https://img.shields.io/github/stars/SwanHubX/SwanLab.svg?cacheSeconds=86400) - SwanLab is an AI training tracking and visualization tool.