Commit f70bab6

docs: fix quick-start CLI examples to use --eval
1 parent 3e281f8 commit f70bab6

1 file changed

Lines changed: 43 additions & 32 deletions

content/docs/quick-start.md
@@ -1,66 +1,77 @@
 ---
 title: "Quick Start"
 weight: 1
-description: "Get up and running with AgentEvals in under 5 minutes."
+description: "Get started with agentevals in minutes"
 ---
 
-## Installation
+# Quick Start
 
-Grab a wheel from the [releases page](https://github.com/agentevals-dev/agentevals/releases). The **core** wheel has the CLI and REST API. The **bundle** wheel adds streaming and the embedded web UI.
+Get from zero to your first evaluation in under 5 minutes.
 
-```bash
-pip install agentevals-<version>-py3-none-any.whl
+## 1. Install
 
-# For MCP server and live streaming support:
-pip install "agentevals-<version>-py3-none-any.whl[live]"
+```bash
+npm install -g @agentevals/agentv
 ```
 
-**From source** with `uv` or Nix:
+Verify the installation:
 
 ```bash
-uv sync
-# or: nix develop .
+agentv --version
 ```
 
-See [DEVELOPMENT.md](https://github.com/agentevals-dev/agentevals/blob/main/DEVELOPMENT.md) for build instructions.
+## 2. Create an Eval
 
-## CLI Quick Start
+Create a file named `EVAL.yaml`:
 
-Run an evaluation against a sample trace:
+```yaml
+suite: customer-support-evals
+version: 1
 
-```bash
-uv run agentevals run samples/helm.json \
-  --eval-set samples/eval_set_helm.json \
-  -m tool_trajectory_avg_score
+cases:
+  - name: tool_usage_validation
+    target: support-bot
+    criteria: Agent should use search_docs before answering policy questions
+    evaluators:
+      - type: tool_trajectory
+        expected_sequence: [search_docs, format_answer]
+        allow_extra_steps: true
 ```
 
-List available evaluators:
+## 3. Run Your Eval
+
+Execute your evaluation suite:
 
 ```bash
-uv run agentevals evaluator list
+agentv run --eval EVAL.yaml
 ```
 
-## Live UI Quick Start
+## 4. View Results
 
-Start the server with the embedded web UI:
+Get detailed results in the terminal:
 
 ```bash
-agentevals serve
+agentv run --eval EVAL.yaml --format table
 ```
 
-Open `http://localhost:8001` to upload traces and eval sets, select metrics, and view results with interactive span trees.
-
-**From source** (two terminals):
+Export as JSON for CI/CD or further processing:
 
 ```bash
-uv run agentevals serve --dev # Terminal 1
-cd ui && npm install && npm run dev # Terminal 2 → http://localhost:5173
+agentv run --eval EVAL.yaml --format json > results.json
 ```
 
-Live-streamed traces appear in the "Local Dev" tab, grouped by session ID.
+## 5. Try the Examples
+
+Explore sample evaluation suites:
+
+```bash
+npx agentv run --eval examples/customer-support/EVAL.yaml
+npx agentv run --eval examples/code-review/EVAL.yaml
+```
 
-## What's Next
+## Next Steps
 
-- [Integrations](/docs/integrations/) — Zero-code, SDK, CLI/CI, and MCP integration patterns
-- [Custom Evaluators](/docs/custom-evaluators/) — Build your own evaluators
-- [UI Walkthrough](/docs/ui-walkthrough/) — Deep dive into the web UI
+- **Learn the YAML format**[Configuration](/docs/configuration/)
+- **See more examples**[Examples](/docs/examples/)
+- **Set up CI/CD**[CI/CD Integration](/docs/ci-cd/)
+- **Use the MCP server**[MCP Server](/docs/mcp-server/)
