docs: fix quick-start CLI examples to use --eval

sebbycorp · sebbycorp · commit f70bab610b04 · 2026-03-20T13:33:41.000-04:00
diff --git a/content/docs/quick-start.md b/content/docs/quick-start.md
@@ -1,66 +1,77 @@
 ---
 title: "Quick Start"
 weight: 1
-description: "Get up and running with AgentEvals in under 5 minutes."
+description: "Get started with agentevals in minutes"
 ---
 
-## Installation
+# Quick Start
 
-Grab a wheel from the [releases page](https://github.com/agentevals-dev/agentevals/releases). The **core** wheel has the CLI and REST API. The **bundle** wheel adds streaming and the embedded web UI.
+Get from zero to your first evaluation in under 5 minutes.
 
-```bash
-pip install agentevals-<version>-py3-none-any.whl
+## 1. Install
 
-# For MCP server and live streaming support:
-pip install "agentevals-<version>-py3-none-any.whl[live]"
+```bash
+npm install -g @agentevals/agentv
 ```
 
-**From source** with `uv` or Nix:
+Verify the installation:
 
 ```bash
-uv sync
-# or: nix develop .
+agentv --version
 ```
 
-See [DEVELOPMENT.md](https://github.com/agentevals-dev/agentevals/blob/main/DEVELOPMENT.md) for build instructions.
+## 2. Create an Eval
 
-## CLI Quick Start
+Create a file named `EVAL.yaml`:
 
-Run an evaluation against a sample trace:
+```yaml
+suite: customer-support-evals
+version: 1
 
-```bash
-uv run agentevals run samples/helm.json \
-  --eval-set samples/eval_set_helm.json \
-  -m tool_trajectory_avg_score
+cases:
+  - name: tool_usage_validation
+    target: support-bot
+    criteria: Agent should use search_docs before answering policy questions
+    evaluators:
+      - type: tool_trajectory
+        expected_sequence: [search_docs, format_answer]
+        allow_extra_steps: true
 ```
 
-List available evaluators:
+## 3. Run Your Eval
+
+Execute your evaluation suite:
 
 ```bash
-uv run agentevals evaluator list
+agentv run --eval EVAL.yaml
 ```
 
-## Live UI Quick Start
+## 4. View Results
 
-Start the server with the embedded web UI:
+Get detailed results in the terminal:
 
 ```bash
-agentevals serve
+agentv run --eval EVAL.yaml --format table
 ```
 
-Open `http://localhost:8001` to upload traces and eval sets, select metrics, and view results with interactive span trees.
-
-**From source** (two terminals):
+Export as JSON for CI/CD or further processing:
 
 ```bash
-uv run agentevals serve --dev     # Terminal 1
-cd ui && npm install && npm run dev  # Terminal 2 → http://localhost:5173
+agentv run --eval EVAL.yaml --format json > results.json
 ```
 
-Live-streamed traces appear in the "Local Dev" tab, grouped by session ID.
+## 5. Try the Examples
+
+Explore sample evaluation suites:
+
+```bash
+npx agentv run --eval examples/customer-support/EVAL.yaml
+npx agentv run --eval examples/code-review/EVAL.yaml
+```
 
-## What's Next
+## Next Steps
 
-- [Integrations](/docs/integrations/) — Zero-code, SDK, CLI/CI, and MCP integration patterns
-- [Custom Evaluators](/docs/custom-evaluators/) — Build your own evaluators
-- [UI Walkthrough](/docs/ui-walkthrough/) — Deep dive into the web UI
+- **Learn the YAML format** → [Configuration](/docs/configuration/)
+- **See more examples** → [Examples](/docs/examples/)
+- **Set up CI/CD** → [CI/CD Integration](/docs/ci-cd/)
+- **Use the MCP server** → [MCP Server](/docs/mcp-server/)
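
Reviewer sketch, not part of the commit: the run commands added above (`agentv run --eval <file> --format json`) could be batched for CI roughly as follows. This uses only the flags shown in the diff. The `run_suites` function, the `AGENTV_CMD` override, and the output directory layout are hypothetical names introduced for illustration. It also assumes `agentv` exits non-zero when a suite fails, which the diff does not confirm.

```shell
#!/usr/bin/env bash
# Sketch: run several eval suites and write one JSON result file per suite.
# AGENTV_CMD is a hypothetical override so the loop can be tried with a stub binary.

run_suites() {
  local out_dir="$1"; shift
  local cmd="${AGENTV_CMD:-agentv}"
  local failed=0
  local suite name
  mkdir -p "$out_dir"
  for suite in "$@"; do
    # Name the result after the suite's parent directory,
    # e.g. examples/code-review/EVAL.yaml -> code-review.json
    name="$(basename "$(dirname "$suite")")"
    # Assumption: a non-zero exit status signals a failed suite.
    if ! "$cmd" run --eval "$suite" --format json > "$out_dir/$name.json"; then
      echo "suite failed: $suite" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `run_suites results examples/customer-support/EVAL.yaml examples/code-review/EVAL.yaml` would leave `results/customer-support.json` and `results/code-review.json` behind and return 1 if any invocation failed.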