Use custom evaluators when built-in ones don't cover your domain-specific quality criteria (e.g., financial accuracy, medical safety, brand voice compliance).

### Placeholder Reference

Each evaluation level supports a fixed set of placeholders (single braces) that get replaced with actual trace data:

| Level | Placeholder | Description |
|---|---|---|
| SESSION | `{context}` | User prompts, assistant responses, and tool calls across all turns |
| SESSION | `{available_tools}` | Available tool calls including ID, parameters, and description |
| TRACE | `{context}` | Previous turns + current turn's user prompt and tool calls |
| TRACE | `{assistant_turn}` | The assistant response for the current turn |
| TOOL_CALL | `{context}` | Previous turns + current turn's user prompt and prior tool calls |
| TOOL_CALL | `{tool_turn}` | The tool call under evaluation |
| TOOL_CALL | `{available_tools}` | Available tool calls including ID, parameters, and description |

> **Important:** Use single braces `{placeholder}`, not double braces `{{placeholder}}`. The instruction must include at least one placeholder.
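
The placeholder rules can be checked locally before an evaluator is created. The following is a minimal sketch, not part of the service or the SDK: `ALLOWED_PLACEHOLDERS` and `check_instructions` are hypothetical names, the allowed sets are copied from the table in this section, and the service's own validation may differ.

```python
import re

# Placeholders allowed at each evaluation level, copied from the table in this section.
ALLOWED_PLACEHOLDERS = {
    "SESSION": {"context", "available_tools"},
    "TRACE": {"context", "assistant_turn"},
    "TOOL_CALL": {"context", "tool_turn", "available_tools"},
}

_DOUBLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")  # matches {{name}}
_SINGLE = re.compile(r"\{(\w+)\}")            # matches {name}


def check_instructions(level: str, instructions: str) -> list[str]:
    """Return a list of problems; an empty list means the template passes this local check."""
    allowed = ALLOWED_PLACEHOLDERS[level]
    problems = []

    # Flag double-braced placeholders, then remove them so they are not counted as single-braced.
    for name in _DOUBLE.findall(instructions):
        problems.append(f"double braces around '{name}'; use {{{name}}}")
    stripped = _DOUBLE.sub("", instructions)

    found = _SINGLE.findall(stripped)
    for name in found:
        if name not in allowed:
            problems.append(f"'{{{name}}}' is not available at level {level}")

    # The instruction must include at least one supported placeholder.
    if not any(name in allowed for name in found):
        problems.append("no supported placeholder found")

    return problems
```

For example, `check_instructions("TRACE", "Rate {assistant_turn} given {context}.")` returns an empty list, while a `SESSION` template that references `{tool_turn}` is reported as using a placeholder from the wrong level.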

### Create a Custom Evaluator (AWS SDK)

The `create_evaluator` API uses a nested `evaluatorConfig` structure with `llmAsAJudge` containing the instructions, rating scale, and model config:

```python
# ... (earlier lines omitted; `control_client` and `config` are defined there)
current_evaluators = [e["evaluatorId"] for e in config.get("evaluators", [])]

# Add the custom evaluator
current_evaluators.append("domain_accuracy-XXXXXXXXXX")  # Use the ID from create_evaluator

control_client.update_online_evaluation_config(
    onlineEvaluationConfigId="your-config-id",
    evaluators=[{"evaluatorId": eid} for eid in current_evaluators],
)
```
Custom evaluators can then be used in both online and on-demand evaluations just like built-in ones.
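
Because the update call above sends the full evaluator list, running the append step twice would list the same evaluator twice. One way to guard against that is sketched below. `merge_evaluators` is a hypothetical helper, not an SDK function, and whether the service rejects or tolerates duplicate IDs is not stated here.

```python
def merge_evaluators(current_ids, new_ids):
    """Combine evaluator IDs in first-seen order, drop duplicates,
    and return them in the {"evaluatorId": ...} shape used by the update call."""
    merged = list(dict.fromkeys([*current_ids, *new_ids]))
    return [{"evaluatorId": eid} for eid in merged]


# Example with placeholder IDs: the custom evaluator is already attached,
# so adding it again leaves the list unchanged.
evaluators = merge_evaluators(
    ["existing-evaluator-id", "domain_accuracy-XXXXXXXXXX"],
    ["domain_accuracy-XXXXXXXXXX"],
)
```

The resulting list can be passed as the `evaluators` argument of `update_online_evaluation_config`.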
> **Note:** The service automatically appends a standardization prompt to your instructions that enforces `reason` and `score` output fields. Do not include output formatting instructions in your evaluator instructions.