Commit b425394
feat: add judge evaluation support to agent graphs
Adds per-node judge evaluation to agent graph execution. Each AIAgentConfig
now carries a pre-built Evaluator (mirroring AICompletionConfig) that the
provider-specific AgentGraphRunner invokes after each node's model response.
Results are tracked via the same AIConfigTracker used for that node's LLM
metrics, so each evaluation stays correlated with the node that produced it.
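The correlation described above can be illustrated with a minimal sketch. It is not the SDK's code: `NodeTracker`, `NodeConfig`, `run_node`, and the metric names are hypothetical stand-ins (for AIConfigTracker, AIAgentConfig, and the runner's per-node step), and the evaluator is a plain synchronous callable to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class NodeTracker:
    """Hypothetical stand-in for AIConfigTracker: collects events for one node."""
    node_key: str
    events: List[Tuple[str, float]] = field(default_factory=list)

    def track(self, name: str, value: float) -> None:
        self.events.append((name, value))


@dataclass
class NodeConfig:
    """Hypothetical stand-in for AIAgentConfig: carries its own tracker and evaluator."""
    key: str
    tracker: NodeTracker
    evaluator: Callable[[str, str], Dict[str, float]]


def run_node(config: NodeConfig, prompt: str, model: Callable[[str], str]) -> str:
    """Run one node, then record LLM metrics and judge scores on the same tracker."""
    response = model(prompt)
    # LLM metric for this node (the metric name is an assumption).
    config.tracker.track("tokens", float(len(response.split())))
    # Judge scores go to the same tracker, so they share the node's identity.
    for metric, score in config.evaluator(prompt, response).items():
        config.tracker.track(f"judge:{metric}", score)
    return response
```

Because each node owns one tracker, a judge score can never be attributed to a different node's LLM call, which appears to be the point of reusing the tracker.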
Key changes:
- New Evaluator class coordinating multiple judges; evaluate() returns an
  asyncio Task so evaluation fires immediately and is awaited in flush()
- AIAgentConfig and AICompletionConfig carry an eager evaluator (kw_only field)
- LangGraph runner stores per-node eval tasks in _pending_eval_tasks and
  flushes them via the callback handler's async flush() method
- OpenAI runner fires judge evaluation at handoff and final-segment points
- client._build_evaluator() handles empty/None judge config via Evaluator.noop()
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent da0c9c6 commit b425394
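The fire-now, await-in-flush pattern named in the key changes could look roughly like the sketch below. Only the names `Evaluator`, `evaluate()`, `noop()`, `flush()`, and `_pending_eval_tasks` come from the commit message. The `Judge` type, `build_evaluator`, `Runner`, `on_node_response`, and all signatures are assumptions made for illustration, not the SDK's actual API.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

# Assumed judge shape: an async callable returning named scores.
Judge = Callable[[str, str], Awaitable[Dict[str, float]]]


class Evaluator:
    """Coordinates several judges; evaluate() schedules them right away."""

    def __init__(self, judges: List[Judge]) -> None:
        self._judges = list(judges)

    @classmethod
    def noop(cls) -> "Evaluator":
        # An evaluator with no judges: evaluate() still returns a Task.
        return cls([])

    def evaluate(self, input_text: str, output_text: str) -> "asyncio.Task":
        # Requires a running event loop; the task starts without being awaited.
        return asyncio.create_task(self._run(input_text, output_text))

    async def _run(self, input_text: str, output_text: str) -> List[Dict[str, float]]:
        return list(await asyncio.gather(*(j(input_text, output_text) for j in self._judges)))


def build_evaluator(judges: Optional[List[Judge]]) -> Evaluator:
    """Hypothetical analogue of client._build_evaluator(): empty/None becomes noop."""
    return Evaluator(judges) if judges else Evaluator.noop()


class Runner:
    """Hypothetical runner that stores one pending evaluation task per node."""

    def __init__(self, evaluator: Evaluator) -> None:
        self._evaluator = evaluator
        self._pending_eval_tasks: Dict[str, asyncio.Task] = {}

    def on_node_response(self, node_key: str, prompt: str, response: str) -> None:
        self._pending_eval_tasks[node_key] = self._evaluator.evaluate(prompt, response)

    async def flush(self) -> Dict[str, List[Dict[str, float]]]:
        # Swap out the pending map first so a second flush() does not re-await.
        pending, self._pending_eval_tasks = self._pending_eval_tasks, {}
        results = await asyncio.gather(*pending.values())
        return dict(zip(pending.keys(), results))
```

Returning a Task rather than a coroutine means the judges start running as soon as the node responds, while the graph continues; `flush()` only collects results that are usually already in flight.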
20 files changed
Lines changed: 311 additions & 115 deletions
File tree
- packages
  - ai-providers
    - server-ai-langchain
      - src/ldai_langchain
      - tests
    - server-ai-openai
      - src/ldai_openai
      - tests
  - sdk/server-ai
    - src/ldai
      - providers
    - tests
Lines changed: 5 additions & 1 deletion
Lines changed: 22 additions & 3 deletions
Lines changed: 15 additions & 1 deletion
Lines changed: 3 additions & 0 deletions
Lines changed: 2 additions & 0 deletions
Lines changed: 25 additions & 14 deletions