1 change: 1 addition & 0 deletions README.md
@@ -347,6 +347,7 @@ Please review our [CONTRIBUTING.md](https://github.com/EthicalML/awesome-product
* [HumanEval](https://github.com/openai/human-eval)![](https://img.shields.io/github/stars/openai/human-eval.svg?cacheSeconds=86400) - HumanEval is a benchmark for evaluating the functional correctness of code generation models using Python programming problems with unit tests.
* [Helicone](https://github.com/Helicone/helicone) ![](https://img.shields.io/github/stars/Helicone/helicone.svg?cacheSeconds=86400) - Helicone is the all-in-one, open-source LLM developer platform.
* [HELM](https://github.com/stanford-crfm/helm) ![](https://img.shields.io/github/stars/stanford-crfm/helm.svg?cacheSeconds=86400) - HELM (Holistic Evaluation of Language Models) provides tools for the holistic evaluation of language models, including standardized datasets, a unified API for various models, diverse metrics, robustness and fairness perturbations, a prompt construction framework, and a proxy server for unified model access.
* [Ingero](https://github.com/ingero-io/ingero) ![](https://img.shields.io/github/stars/ingero-io/ingero.svg?cacheSeconds=86400) - Ingero is an open-source eBPF agent and MCP server for production ML, tracing the causal chain from Linux kernel events through CUDA API calls to Python source lines with <2% overhead, zero code changes, and one binary.
* [Inspect](https://github.com/UKGovernmentBEIS/inspect_ai) ![](https://img.shields.io/github/stars/UKGovernmentBEIS/inspect_ai.svg?cacheSeconds=86400) - Inspect is a framework for large language model evaluations.
* [JiWER](https://github.com/jitsi/jiwer) ![](https://img.shields.io/github/stars/jitsi/jiwer.svg?cacheSeconds=86400) - JiWER is a simple and fast Python package to evaluate an automatic speech recognition system.
* [Laminar](https://github.com/lmnr-ai/lmnr) ![](https://img.shields.io/github/stars/lmnr-ai/lmnr.svg?cacheSeconds=86400) - Laminar is an open-source platform to trace, evaluate, label, and analyze LLM data for AI products.