feat(evaluators): add native run_async support to LLMEvaluator #11581
@sjrl The CI is failing because … Is this approach fine and within scope? I plan to add matching … Let me know if this works or if you prefer a different pattern!
Yes, please do! Your approach sounds good, and I'll double-check it once you have it in.
@sjrl I've updated the implementation. Kindly take your time to leave a review.
…nerators to LLMEvaluator
Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
Thanks a lot for your guidance and merge @sjrl
Related Issues
Proposed Changes:
Added native asynchronous support (`run_async`) to `LLMEvaluator`, `FaithfulnessEvaluator`, and `ContextRelevanceEvaluator`. This allows evaluation pipelines to run concurrently inside asynchronous environments (like FastMCP or FastAPI) without stalling the main event loop during LLM network requests.
How it works:
- … `asyncio.to_thread` instead of blocking the event loop.
- `FaithfulnessEvaluator` and `ContextRelevanceEvaluator` explicitly mirror their synchronous counterparts, calling the parent async engine and running their unique metrics parsing loops over the concurrent batch results.
- … `async_tqdm` to keep tracking completely non-blocking.
- … (`raise_on_failure`), safely caching individual row exceptions into `NaN` targets when disabled.
How did you test it?
Added complete async test suites across all three evaluator test modules (`TestLLMEvaluatorAsync`, `TestFaithfulnessEvaluatorAsync`, and `TestContextRelevanceEvaluatorAsync`). The tests cover:
- … `asyncio.to_thread` without stalling the event loop.
- `FaithfulnessEvaluator` and `ContextRelevanceEvaluator` aggregate data identically to their synchronous counterparts.
Notes for the reviewer
The implementation structure mirrors the synchronous run method identically to keep the component maintenance straightforward. The async test fixtures utilize standard monkeypatch strategies to mirror existing testing conventions throughout the file.
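For readers who want a concrete picture of the pattern this PR describes, here is a minimal, self-contained sketch of a synchronous `run` and a mirrored `run_async` that offloads the blocking call with `asyncio.to_thread` and turns per-row failures into `NaN` when `raise_on_failure` is disabled. It is not the code from this PR: `ToyEvaluator` and `_score` are hypothetical names, the scoring logic is a placeholder for an LLM request, and the use of `asyncio.gather` for concurrency is an assumption.

```python
import asyncio
import math
from typing import Any, Dict, List


class ToyEvaluator:
    """Hypothetical stand-in for an evaluator; not the Haystack implementation."""

    def __init__(self, raise_on_failure: bool = True) -> None:
        self.raise_on_failure = raise_on_failure

    def _score(self, answer: str) -> float:
        # Placeholder for a blocking LLM request; fails on empty input.
        if not answer:
            raise ValueError("empty answer")
        return float(len(answer) % 2)

    def run(self, answers: List[str]) -> Dict[str, Any]:
        results: List[float] = []
        for answer in answers:
            try:
                results.append(self._score(answer))
            except Exception:
                if self.raise_on_failure:
                    raise
                results.append(math.nan)  # failed row is recorded as NaN
        return {"results": results}

    async def run_async(self, answers: List[str]) -> Dict[str, Any]:
        # Same per-row logic as run(); only the blocking call is moved to a
        # worker thread so the event loop stays free for other tasks.
        async def score_one(answer: str) -> float:
            try:
                return await asyncio.to_thread(self._score, answer)
            except Exception:
                if self.raise_on_failure:
                    raise
                return math.nan

        # gather() returns results in the same order as the inputs.
        results = await asyncio.gather(*(score_one(a) for a in answers))
        return {"results": list(results)}


if __name__ == "__main__":
    evaluator = ToyEvaluator(raise_on_failure=False)
    print(asyncio.run(evaluator.run_async(["ab", "", "abc"])))
```

Keeping the error handling inside the per-row coroutine means one failing row does not discard the rest of the batch when `raise_on_failure` is off, which is the behaviour the sync path in this sketch has as well.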
Checklist
feat: add run_async support to LLMEvaluator