Name
agent-test
anthropic
autogen
code-based-evals
cohere
crewai
dify
goosebench
groq
huggingface
langgraph
log-import
mcp
ollama
openai-assistants
openclaw
pydantic-ai
simulation
skills/test-skill
README.md
backend-implementations.md
config.example.yaml
github-workflow-autopr.yml
github-workflow-example.yml
model_comparison_test.py
statistical-mode-example.yaml
stock-analysis.yaml
test_case_toml.toml

EvalView Examples — Working Test Configurations for AI Agent Frameworks

Ready-to-use EvalView test configurations for LangGraph, CrewAI, AutoGen, Dify, OpenAI Assistants, Anthropic Claude, HuggingFace, Ollama, and more. Each example includes test cases, adapter configuration, and step-by-step instructions.

Working examples for the most popular AI agent frameworks.

Quick Start

# 1. Pick a framework below and follow its README
# 2. Start your agent
# 3. Run EvalView against it
evalview run --pattern examples/<framework>/test-case.yaml
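
To run the Quick Start command against several frameworks in one go, a small helper along these lines may be convenient. It is only a sketch: the folder names come from the Examples table in this README, the `test-case.yaml` file name is taken from the pattern above and assumed to exist in every folder, and the function names `build_command`, `plan_commands`, and `run_all` are hypothetical helpers, not part of EvalView.

```python
import shlex
import subprocess

# Folder names taken from the Examples table in this README.
FRAMEWORKS = [
    "langgraph",
    "crewai",
    "autogen",
    "dify",
    "openai-assistants",
    "anthropic",
]


def build_command(framework: str) -> list[str]:
    """Return the Quick Start command for one framework folder."""
    # Assumption: each folder ships a file named test-case.yaml,
    # as the Quick Start pattern suggests.
    return [
        "evalview",
        "run",
        "--pattern",
        f"examples/{framework}/test-case.yaml",
    ]


def plan_commands(frameworks=FRAMEWORKS) -> list[str]:
    """Return the shell-quoted command line for each framework, without running it."""
    return [shlex.join(build_command(name)) for name in frameworks]


def run_all(frameworks=FRAMEWORKS) -> dict[str, int]:
    """Run each command in turn and map framework name to its exit code.

    Requires the evalview CLI on PATH and the matching agent to be running.
    """
    results = {}
    for name in frameworks:
        results[name] = subprocess.run(build_command(name)).returncode
    return results
```

Calling `plan_commands()` first prints nothing and runs nothing, so it is a safe way to check the generated command lines before calling `run_all()` with the agents started.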

Examples

Framework	Folder	What it tests
🦜 LangGraph	langgraph/	Multi-step research agent with tool calls
🚢 CrewAI	crewai/	Multi-agent team collaboration
🤖 AutoGen	autogen/	Multi-agent conversation patterns
🎨 Dify	dify/	Visual workflow builder
💬 OpenAI Assistants	openai-assistants/	Native OpenAI Assistants API
🤖 Anthropic	anthropic/	Claude direct API + Claude Agent SDK

New to EvalView?

Start here — a complete working agent you can run in 2 minutes:

# Download and run the demo agent
curl -O https://raw.githubusercontent.com/hidai25/eval-view/main/demo-agent/agent.py
pip install fastapi uvicorn
python agent.py

# Point EvalView at it (from a second terminal if the agent keeps running in the first)
evalview run

Or see EvalView catch a real regression without any setup:

evalview demo

Implementing Your Own Agent

See backend-implementations.md for copy-paste examples in FastAPI, Flask, Express.js, and streaming JSONL — with the exact request/response format EvalView expects.
