Ready-to-use EvalView test configurations for LangGraph, CrewAI, AutoGen, Dify, OpenAI Assistants, Anthropic Claude, HuggingFace, Ollama, and more. Each example includes test cases, adapter configuration, and step-by-step instructions.
Working examples for the most popular AI agent frameworks.

    # 1. Pick a framework below and follow its README
    # 2. Start your agent
    # 3. Run EvalView against it
    evalview run --pattern examples/<framework>/test-case.yaml

| Framework | Folder | What it tests |
|---|---|---|
| 🦜 LangGraph | langgraph/ | Multi-step research agent with tool calls |
| 🚢 CrewAI | crewai/ | Multi-agent team collaboration |
| 🤖 AutoGen | autogen/ | Multi-agent conversation patterns |
| 🎨 Dify | dify/ | Visual workflow builder |
| 💬 OpenAI Assistants | openai-assistants/ | Native OpenAI Assistants API |
| 🤖 Anthropic | anthropic/ | Claude direct API + Claude Agent SDK |
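
To run the examples for several frameworks in one pass, the command shown earlier can be built from the folder names in the table. The following is a minimal sketch, not part of EvalView itself: it assumes `evalview` is on your `PATH`, that each folder contains a `test-case.yaml` as the pattern suggests, that the corresponding agent is already running, and that a non-zero exit code signals a failed run. The helper names `build_command` and `run_all` are hypothetical.

```python
import subprocess
import sys

# Folder names copied from the table above.
FRAMEWORK_FOLDERS = [
    "langgraph",
    "crewai",
    "autogen",
    "dify",
    "openai-assistants",
    "anthropic",
]


def build_command(framework: str) -> list[str]:
    """Build the `evalview run` invocation for one framework folder."""
    return [
        "evalview",
        "run",
        "--pattern",
        f"examples/{framework}/test-case.yaml",
    ]


def run_all(folders: list[str]) -> dict[str, int]:
    """Run EvalView once per folder and collect the exit codes.

    Assumption: each agent is already running before its command starts.
    Raises FileNotFoundError if `evalview` is not installed.
    """
    results: dict[str, int] = {}
    for folder in folders:
        completed = subprocess.run(build_command(folder), check=False)
        results[folder] = completed.returncode
    return results


if __name__ == "__main__":
    exit_codes = run_all(FRAMEWORK_FOLDERS)
    # Assumption: exit code 0 means the run passed.
    sys.exit(0 if all(code == 0 for code in exit_codes.values()) else 1)
```

Passing a shorter list to `run_all` limits the run to the frameworks you actually have agents running for.
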
Start here with a complete working agent you can run in 2 minutes:

    # Download and run the demo agent
    curl -O https://raw.githubusercontent.com/hidai25/eval-view/main/demo-agent/agent.py
    pip install fastapi uvicorn
    python agent.py

    # Point EvalView at it
    evalview run

Or see EvalView catch a real regression without any setup:

    evalview demo

See backend-implementations.md for copy-paste examples in FastAPI, Flask, Express.js, and streaming JSONL, each with the exact request/response format EvalView expects.
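
For orientation, the general shape of such a backend (accept a JSON request over HTTP, return a JSON response) can be sketched with the Python standard library alone. This is an illustration only: the `query` and `output` field names, the port, and the `handle_request` helper are placeholders invented for this sketch, not EvalView's documented schema. Take the real field names from backend-implementations.md.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_request(payload: dict) -> dict:
    """Map a parsed JSON request to a JSON-serializable response.

    Assumption: "query" and "output" are placeholder keys, not the
    format EvalView actually expects.
    """
    query = str(payload.get("query", ""))
    return {"output": f"echo: {query}"}


class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length) or b"{}"
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            self.send_error(400, "invalid JSON")
            return
        if not isinstance(payload, dict):
            self.send_error(400, "expected a JSON object")
            return

        body = json.dumps(handle_request(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8000 is an arbitrary choice for this sketch.
    HTTPServer(("127.0.0.1", 8000), AgentHandler).serve_forever()
```

Keeping the agent logic in a plain function such as `handle_request` makes it easy to swap the transport for FastAPI, Flask, or another framework without touching the logic under test.
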