Commit 580dd3d

docs: standardize landing page interface wording
1 parent 2e13fb5 commit 580dd3d

1 file changed

Lines changed: 91 additions & 119 deletions

layouts/index.html
@@ -1,141 +1,113 @@
1-
<!DOCTYPE html>
2-
<html lang="en">
3-
<head>
4-
<meta charset="UTF-8">
5-
<meta name="viewport" content="width=device-width, initial-scale=1.0">
6-
<title>{{ .Site.Title }}</title>
7-
{{ partial "head.html" . }}
8-
</head>
9-
<body>
10-
<div class="bg-grid"></div>
11-
12-
<header class="hero">
13-
<div class="hero-content">
14-
<div class="hero-badge">
15-
<span class="badge-dot"></span>
16-
Open source • Python SDK • OpenTelemetry native
17-
</div>
18-
<h1>Score your AI agent behavior from traces.</h1>
19-
<p class="hero-subtitle">
20-
AgentEvals is the open-source Python framework for scoring AI agent performance and behavior
21-
from OpenTelemetry traces. Test prompts, tools, memory, and workflows without re-running your agents.
22-
</p>
23-
<div class="hero-cta">
24-
<a href="/docs/quick-start/" class="btn btn-primary">Quick Start</a>
25-
<a href="https://github.com/agentevals-dev/agentevals" class="btn btn-secondary" target="_blank" rel="noopener">GitHub</a>
26-
</div>
27-
<div class="hero-meta">
28-
<span>CLI</span>
29-
<span>Custom Evaluators</span>
30-
<span>Web UI</span>
31-
<span>CI/CD</span>
1+
{{ define "main" }}
2+
<section class="hero">
3+
<div class="container hero-grid">
4+
<div>
5+
<p class="eyebrow">OpenTelemetry-native agent evaluation</p>
6+
<h1>Score AI agents from traces — no reruns required</h1>
7+
<p class="lead">
8+
agentevals turns OpenTelemetry traces into repeatable, rubric-based scores for tool use,
9+
handoffs, planning, and other agent behaviors.
10+
</p>
11+
<div class="hero-actions">
12+
<a class="btn btn-primary" href="/docs/quick-start/">Start with the CLI</a>
13+
<a class="btn btn-secondary" href="/docs/ui-walkthrough/">Open Web</a>
14+
</div>
32 15
</div>
16+
<aside class="hero-card">
17+
<div class="terminal">
18+
<div class="terminal-bar">
19+
<span></span><span></span><span></span>
20+
</div>
21+
<pre><code>$ uv tool install agentevals
22+
$ agentevals score traces.jsonl \
23+
--config agentevals.yaml
24+
25+
✔ 184 traces scored
26+
✔ 91% tool-call success
27+
✔ Mean rubric score: 4.4 / 5.0</code></pre>
28+
</div>
29+
</aside>
33 30
</div>
34-
</header>
31+
</section>
35 32

36-
<main>
37-
<section class="features section">
33+
<section class="section">
34+
<div class="container">
38 35
<div class="section-header">
39-
<span class="section-label">Why AgentEvals</span>
40-
<h2>Evaluation that matches how agents actually run.</h2>
41-
<p>Traditional evals re-run entire workflows. AgentEvals scores the traces you already collect, so you can measure behavior in realistic conditions.</p>
36+
<p class="eyebrow">Why agentevals</p>
37+
<h2>Evaluate behavior from the telemetry you already collect</h2>
38+
<p>
39+
Score agents against consistent rubrics using OpenTelemetry traces rather than replaying runs.
40+
Keep evaluations close to your production workflows and compare changes over time.
41+
</p>
42 42
</div>
43-
44 43
<div class="feature-grid">
45 44
<article class="feature-card">
46-
<div class="feature-icon"></div>
47-
<h3>Trace-native evaluation</h3>
48-
<p>Built on OpenTelemetry traces so you can evaluate real production-like runs without replaying agent execution.</p>
45+
<h3>No reruns</h3>
46+
<p>Use recorded traces to evaluate real executions after the fact.</p>
49 47
</article>
50 48
<article class="feature-card">
51-
<div class="feature-icon"></div>
52-
<h3>Flexible scoring</h3>
53-
<p>Combine built-in evaluators with custom Python logic to measure correctness, tool usage, memory behavior, and more.</p>
49+
<h3>Behavior-first scoring</h3>
50+
<p>Measure task completion, tool use quality, handoffs, latency, and more.</p>
54 51
</article>
55 52
<article class="feature-card">
56-
<div class="feature-icon"></div>
57-
<h3>Works in your workflow</h3>
58-
<p>Run locally with the CLI, automate in CI/CD, or explore results visually in the web UI.</p>
53+
<h3>Built on OpenTelemetry</h3>
54+
<p>Plug into existing observability pipelines instead of inventing a parallel eval stack.</p>
59 55
</article>
60 56
</div>
61-
</section>
57+
</div>
58+
</section>
62 59

63-
<section class="workflow section">
60+
<section class="section alt">
61+
<div class="container">
64 62
<div class="section-header">
65-
<span class="section-label">How it works</span>
66-
<h2>From traces to scores in three steps.</h2>
63+
<p class="eyebrow">How it works</p>
64+
<h2>Two ways to evaluate</h2>
67 65
</div>
68-
69-
<div class="workflow-steps">
70-
<div class="workflow-step">
71-
<span class="step-number">01</span>
72-
<h3>Collect traces</h3>
73-
<p>Instrument your agent with OpenTelemetry and emit traces for prompts, tool calls, memory operations, and outputs.</p>
74-
</div>
75-
<div class="workflow-step">
76-
<span class="step-number">02</span>
77-
<h3>Define evaluators</h3>
78-
<p>Choose built-in evaluators or create your own to score the behaviors that matter for your agent.</p>
79-
</div>
80-
<div class="workflow-step">
81-
<span class="step-number">03</span>
82-
<h3>Run evaluations</h3>
83-
<p>Score trace datasets through the CLI or web UI and compare results across prompts, models, or tool strategies.</p>
84-
</div>
66+
<div class="steps-grid two-up">
67+
<article class="step-card">
68+
<span class="step-number">1</span>
69+
<h3>CLI workflow</h3>
70+
<p>
71+
Run evaluations locally or in CI with config files and reproducible commands.
72+
</p>
73+
<a href="/docs/quick-start/">Open the CLI guide →</a>
74+
</article>
75+
<article class="step-card">
76+
<span class="step-number">2</span>
77+
<h3>Web workflow</h3>
78+
<p>
79+
Explore traces, inspect scores, and review rubric results in the browser.
80+
</p>
81+
<a href="/docs/ui-walkthrough/">Open the Web guide →</a>
82+
</article>
85 83
</div>
86-
</section>
84+
</div>
85+
</section>
87 86

88-
<section class="docs-preview section">
87+
<section class="section">
88+
<div class="container">
89 89
<div class="section-header">
90-
<span class="section-label">Docs</span>
91-
<h2>Start with the path that fits your workflow.</h2>
90+
<p class="eyebrow">Docs</p>
91+
<h2>Start where you are</h2>
92 92
</div>
93-
94 93
<div class="docs-grid">
95-
{{ range where .Site.RegularPages "Section" "docs" }}
96-
<a class="doc-card" href="{{ .RelPermalink }}">
97-
<div>
98-
<h3>{{ .Title }}</h3>
99-
<p>{{ .Description }}</p>
100-
</div>
101-
<span class="doc-arrow"></span>
102-
</a>
103-
{{ end }}
94+
<a class="docs-card" href="/docs/quick-start/">
95+
<h3>Quick start</h3>
96+
<p>Install agentevals, run your first scoring pass, and inspect the output.</p>
97+
</a>
98+
<a class="docs-card" href="/docs/integrations/">
99+
<h3>Integrations</h3>
100+
<p>Connect agentevals with your existing tracing and observability stack.</p>
101+
</a>
102+
<a class="docs-card" href="/docs/custom-evaluators/">
103+
<h3>Custom evaluators</h3>
104+
<p>Define your own scoring logic and tailor rubrics to your agents.</p>
105+
</a>
106+
<a class="docs-card" href="/docs/ui-walkthrough/">
107+
<h3>Web walkthrough</h3>
108+
<p>See how to inspect traces and scores with the browser-based interface.</p>
109+
</a>
104 110
</div>
105-
</section>
106-
107-
<section class="usage section">
108-
<div class="section-header">
109-
<span class="section-label">Usage</span>
110-
<h2>Two ways to evaluate.</h2>
111-
<p>Use the CLI for fast, scriptable scoring or the Web UI for visual exploration of evaluation results.</p>
112-
</div>
113-
114-
<div class="usage-grid">
115-
<article class="usage-card">
116-
<h3>CLI</h3>
117-
<p>Run evaluations locally or in CI with straightforward commands and structured outputs.</p>
118-
<pre><code>agentevals eval run config.yaml</code></pre>
119-
</article>
120-
<article class="usage-card">
121-
<h3>Web UI</h3>
122-
<p>Inspect trace datasets, compare runs, and review evaluator outputs in a visual interface.</p>
123-
<pre><code>agentevals ui</code></pre>
124-
</article>
125-
</div>
126-
</section>
127-
128-
<section class="cta section">
129-
<div class="cta-card">
130-
<span class="section-label">Get started</span>
131-
<h2>Bring evaluation into your agent development loop.</h2>
132-
<p>Install AgentEvals, connect your traces, and start measuring how your agent behaves in the real world.</p>
133-
<div class="hero-cta">
134-
<a href="/docs/quick-start/" class="btn btn-primary">Read the docs</a>
135-
<a href="https://github.com/agentevals-dev/agentevals" class="btn btn-secondary" target="_blank" rel="noopener">View on GitHub</a>
136-
</div>
137-
</div>
138-
</section>
139-
</main>
140-
</body>
141-
</html>
111+
</div>
112+
</section>
113+
{{ end }}
