Commit c10de04

updates
1 parent 18313cb commit c10de04

1 file changed

Lines changed: 10 additions & 2 deletions

fern/observability/evals-quickstart.mdx

@@ -576,7 +576,7 @@ For complex validation criteria beyond pattern matching, use AI-powered judges t
 ```
 You are an LLM-Judge. Evaluate ONLY the last assistant message in the mock conversation: {{messages[-1]}}.
 
-Include the full conversation history for context: {{messages[0:-1]}}
+Include the full conversation history for context: {{messages}}
 
 Decision rule:
 - PASS if ALL "pass criteria" are satisfied AND NONE of the "fail criteria" are triggered.
@@ -596,6 +596,12 @@ Output format: respond with exactly one word: pass or fail
 - No additional text
 ```
 
+<Note>
+**Template variables:**
+- `{{messages}}` - The entire conversation history (all messages exchanged)
+- `{{messages[-1]}}` - The last assistant message only
+</Note>
+
 ### Example: Evaluate helpfulness and tone
 
 <Tabs>
@@ -630,7 +636,7 @@ curl -X POST "https://api.vapi.ai/eval" \
 "model": "gpt-4o",
 "messages": [{
 "role": "system",
-"content": "You are an LLM-Judge. Evaluate ONLY the last assistant message: {{messages[-1]}}.\n\nInclude context: {{messages[0:-1]}}\n\nDecision rule:\n- PASS if ALL pass criteria are met AND NO fail criteria are triggered.\n- Otherwise FAIL.\n\nPass criteria:\n- Response acknowledges the user request\n- Response offers specific help or next steps\n- Tone is professional and friendly\n\nFail criteria (any triggers FAIL):\n- Response is rude or dismissive\n- Response ignores the user request\n- Response provides no actionable information\n\nOutput format: respond with exactly one word: pass or fail"
+"content": "You are an LLM-Judge. Evaluate ONLY the last assistant message: {{messages[-1]}}.\n\nInclude context: {{messages}}\n\nDecision rule:\n- PASS if ALL pass criteria are met AND NO fail criteria are triggered.\n- Otherwise FAIL.\n\nPass criteria:\n- Response acknowledges the user request\n- Response offers specific help or next steps\n- Tone is professional and friendly\n\nFail criteria (any triggers FAIL):\n- Response is rude or dismissive\n- Response ignores the user request\n- Response provides no actionable information\n\nOutput format: respond with exactly one word: pass or fail"
 }]
 }
 }
@@ -1366,11 +1372,13 @@ Run multiple evals sequentially to validate all greeting scenarios.
 </Card>
 
 {" "}
+
 <Card title="Assistants guide" icon="robot" href="/assistants/quickstart">
 Create and configure assistants to test
 </Card>
 
 {" "}
+
 <Card title="Tools documentation" icon="wrench" href="/tools/custom-tools">
 Build custom tools and validate their behavior
 </Card>
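
Note on the template change: the commit replaces `{{messages[0:-1]}}` with `{{messages}}` in the judge prompt. The sketch below illustrates what that difference could look like, assuming the expressions behave like Python list indexing and slicing. That is an assumption; the diff does not say how the template engine resolves them. The `render` helper and the sample conversation are hypothetical and are not part of the Vapi API.

```python
import json

# Hypothetical sample conversation. The role/content shape mirrors the JSON
# in the diff; the content itself is made up for illustration.
messages = [
    {"role": "user", "content": "Hi, I need help with my order."},
    {"role": "assistant", "content": "Sure, what is the order number?"},
    {"role": "user", "content": "It is 12345."},
    {"role": "assistant", "content": "Thanks, I will look that up now."},
]


def render(template: str, messages: list) -> str:
    """Substitute the three expressions that appear in the diff.

    Assumes Python-style indexing; the real template engine may differ.
    """
    values = {
        "{{messages[0:-1]}}": messages[0:-1],  # old: history without the last message
        "{{messages[-1]}}": messages[-1],      # last message only
        "{{messages}}": messages,              # new: entire history
    }
    for expr, value in values.items():
        template = template.replace(expr, json.dumps(value))
    return template


old_prompt = render("Include context: {{messages[0:-1]}}", messages)
new_prompt = render("Include context: {{messages}}", messages)
last_only = render("{{messages[-1]}}", messages)
```

Under this assumption, `old_prompt` carries three messages of context and `new_prompt` carries all four, so after the change the last assistant message would appear both as the evaluation target and inside the context.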
