-Community-maintained graders for [agentevals](https://github.com/agentevals-dev/agentevals) -- the agent evaluation framework built on Google ADK.
+Community-maintained evaluators for [agentevals](https://github.com/agentevals-dev/agentevals) -- the agent evaluation framework built on Google ADK.

-Graders are standalone scoring programs that evaluate agent traces. They read `EvalInput` JSON from stdin and write `EvalResult` JSON to stdout. This repository is the official index of community-contributed graders.
+Evaluators are standalone scoring programs that evaluate agent traces. They read `EvalInput` JSON from stdin and write `EvalResult` JSON to stdout. This repository is the official index of community-contributed evaluators.
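
The stdin/stdout contract described above can be sketched without any SDK. This is a minimal illustration under stated assumptions, not the project's reference implementation: the `invocations` and `score` field names are guesses based on the wording of this README, the scoring rule is a placeholder, and the `evaluate` and `run` helpers are hypothetical names. The full protocol reference remains the authority on the real schema.

```python
import io
import json
from typing import TextIO


def evaluate(eval_input: dict) -> dict:
    """Return an EvalResult-like dict for an EvalInput-like dict.

    Assumption: the input holds a list under "invocations" and the
    result holds a float "score" between 0.0 and 1.0.
    """
    invocations = eval_input.get("invocations", [])
    # Placeholder rule: full marks if the trace contains any invocation.
    score = 1.0 if invocations else 0.0
    return {"score": score}


def run(stdin: TextIO, stdout: TextIO) -> None:
    """Read one JSON document from stdin and write one JSON result to stdout."""
    eval_input = json.load(stdin)
    json.dump(evaluate(eval_input), stdout)
    stdout.write("\n")


# Demonstration with in-memory streams. A real evaluator script would
# import sys and call run(sys.stdin, sys.stdout) instead.
demo_out = io.StringIO()
run(io.StringIO('{"invocations": [{"id": "example"}]}'), demo_out)
print(demo_out.getvalue(), end="")  # prints {"score": 1.0}
```

Keeping the scoring function separate from the I/O wrapper makes the logic easy to unit test without spawning a process.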

-## Using community graders
+## Using community evaluators

-### Browse available graders
+### Browse available evaluators

```bash
-agentevals grader list --source github
+agentevals evaluator list --source github
```

-### Reference a community grader in your eval config
+### Reference a community evaluator in your eval config

Add a `type: remote` entry to your `eval_config.yaml`:
@@ -45,34 +45,34 @@ agentevals run traces/my_trace.json \
--eval-set eval_set.json
```

-The grader is downloaded automatically and cached in `~/.cache/agentevals/graders/`.
+The evaluator is downloaded automatically and cached in `~/.cache/agentevals/evaluators/`.

-## Contributing a grader
+## Contributing an evaluator

-### 1. Scaffold a new grader
+### 1. Scaffold a new evaluator

```bash
pip install agentevals
-agentevals grader init my_grader
+agentevals evaluator init my_evaluator
```

This creates a directory ready to be added to this repo:

```
-my_grader/
-├── my_grader.py   # your scoring logic
-└── grader.yaml    # metadata manifest
+my_evaluator/
+├── my_evaluator.py   # your scoring logic
+└── evaluator.yaml    # metadata manifest
```

### 2. Implement your scoring logic

-Edit `my_grader.py`. Your function receives an `EvalInput` with the agent's invocations and returns an `EvalResult` with a score between 0.0 and 1.0.
+Edit `my_evaluator.py`. Your function receives an `EvalInput` with the agent's invocations and returns an `EvalResult` with a score between 0.0 and 1.0.

```python
from agentevals_grader_sdk import grader, EvalInput, EvalResult
2. Copy your evaluator directory into `evaluators/`:

```
-graders/
-├── my_grader/
-│   ├── grader.yaml
-│   └── my_grader.py
+evaluators/
+├── my_evaluator/
+│   ├── evaluator.yaml
+│   └── my_evaluator.py
├── response_quality/
│   └── ...
└── tool_coverage/
@@ -145,16 +148,16 @@ graders/

3. Open a PR against `main`

-CI will automatically validate your grader (manifest, syntax, and smoke run). Once merged, a separate workflow regenerates `index.yaml`, and your grader becomes available to everyone via `agentevals grader list`.
+CI will automatically validate your evaluator (manifest, syntax, and smoke run). Once merged, a separate workflow regenerates `index.yaml`, and your evaluator becomes available to everyone via `agentevals evaluator list`.
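
Before opening a PR, it can help to approximate the smoke run locally. The sketch below is not the repository's CI script. It assumes only the protocol described in this README: JSON in on stdin, JSON out on stdout, and a score between 0.0 and 1.0. The `smoke_run` helper, the sample input, the stub evaluator, and the `invocations`/`score` field names are all hypothetical.

```python
import json
import subprocess
import sys


def smoke_run(command: list[str], eval_input: dict, timeout: float = 30.0) -> dict:
    """Run an evaluator command once and sanity-check what it prints."""
    proc = subprocess.run(
        command,
        input=json.dumps(eval_input),  # EvalInput-like JSON on stdin
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # a non-zero exit code raises CalledProcessError
    )
    result = json.loads(proc.stdout)  # stdout must be a single JSON document
    score = result["score"]  # assumed field name
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be a number in [0.0, 1.0], got {score!r}")
    return result


# Stand-in evaluator so the sketch runs on its own. For a real check, pass
# something like [sys.executable, "evaluators/my_evaluator/my_evaluator.py"].
STUB = (
    "import json, sys; "
    "data = json.load(sys.stdin); "
    "print(json.dumps({'score': 1.0 if data.get('invocations') else 0.0}))"
)

result = smoke_run([sys.executable, "-c", STUB], {"invocations": [{"id": "example"}]})
print(result)  # prints {'score': 1.0}
```

Running the evaluator as a subprocess exercises the same process boundary that any language-agnostic runner has to cross, so crashes, stray output on stdout, and out-of-range scores surface before review.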

## Supported languages

-Graders can be written in any language that reads JSON from stdin and writes JSON to stdout.
+Evaluators can be written in any language that reads JSON from stdin and writes JSON to stdout.
| JavaScript | `.js` | No SDK yet -- just read stdin, write stdout |
| TypeScript | `.ts` | No SDK yet -- just read stdin, write stdout |

-See the [custom graders documentation](https://github.com/agentevals-dev/agentevals/blob/main/docs/custom-graders.md) for the full protocol reference.
+See the [custom evaluators documentation](https://github.com/agentevals-dev/agentevals/blob/main/docs/custom-evaluators.md) for the full protocol reference.