braintrust.Eval() appears to be unsafe under gevent monkey patching because the sync API drives the async evaluator with asyncio.run(). While that loop is active, sibling gevent greenlets on the same OS thread can observe it via asyncio.get_running_loop().
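The per-thread nature of the running loop can be shown without gevent or Braintrust. The following is only a stdlib sketch of the mechanism (the helper names `sees_running_loop` and `probe` are made up for illustration): code on the thread that called asyncio.run() sees the loop, while code on a different OS thread does not. Greenlets fall into the first category because they share the thread.

```python
import asyncio


def sees_running_loop() -> bool:
    """Return True if the calling OS thread currently has a running asyncio loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


async def probe() -> tuple[bool, bool]:
    # Checked on the thread that is driving the loop via asyncio.run().
    same_thread = sees_running_loop()
    # Checked on a separate worker thread (asyncio.to_thread needs Python 3.9+).
    other_thread = await asyncio.to_thread(sees_running_loop)
    return same_thread, other_thread


same_thread, other_thread = asyncio.run(probe())
print(same_thread, other_thread)  # True False
```

Nothing in this sketch is specific to Braintrust; it only illustrates why isolation by greenlet alone does not hide the loop from siblings.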
Minimal repro:
from gevent import monkey

monkey.patch_all()

import asyncio

import gevent
from braintrust import Eval


def slow_task(_input):
    gevent.sleep(0.25)
    return "output"


def score(input, output, expected=None):
    return 1


def run_eval_greenlet():
    Eval(
        "gevent-asyncio-repro",
        data=[{"input": "x", "expected": "output"}],
        task=slow_task,
        scores=[score],
        no_send_logs=True,
        max_concurrency=1,
    )


def sibling_greenlet():
    gevent.sleep(0.05)
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        print("PASS: sibling greenlet sees no running asyncio loop")
    else:
        raise AssertionError(f"sibling greenlet unexpectedly sees {loop!r}")


jobs = [gevent.spawn(run_eval_greenlet), gevent.spawn(sibling_greenlet)]
gevent.joinall(jobs)
for job in jobs:
    job.get()
Actual result:
AssertionError: sibling greenlet unexpectedly sees <_UnixSelectorEventLoop running=True closed=False debug=False>
Expected result:
A sibling gevent greenlet that is not running Braintrust eval code should not observe Braintrust's internal asyncio loop as its own running loop.
Why this matters:
Frameworks and libraries commonly use asyncio.get_running_loop() as a guard to detect async context. Django is one example: sync ORM access is rejected when get_running_loop() succeeds. In a gevent WSGI worker, this means one request running braintrust.Eval() can make unrelated sibling requests in the same worker appear to be inside an async context.
This can effectively poison the entire WSGI worker for as long as the eval runs: unrelated requests that hit such a guard fail, even though they never touch Braintrust or asyncio themselves.
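For reference, the kind of guard described above looks roughly like this. It is a simplified sketch, not Django's actual code; `SyncOnlyError` and `sync_only_operation` are hypothetical names. Any function protected this way starts raising as soon as get_running_loop() succeeds on the current thread, regardless of who started the loop.

```python
import asyncio


class SyncOnlyError(RuntimeError):
    """Hypothetical error raised when sync-only code detects an async context."""


def sync_only_operation() -> str:
    # Guard: treat "a loop is running on this thread" as "we are in async code".
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return "ok"  # no running loop visible, proceed with the sync work
    raise SyncOnlyError("sync-only operation called while an event loop is running")


async def call_from_async() -> str:
    # Calling the guarded function while a loop is running trips the guard.
    try:
        return sync_only_operation()
    except SyncOnlyError:
        return "rejected"


print(sync_only_operation())           # ok
print(asyncio.run(call_from_async()))  # rejected
```

In the gevent scenario, the sibling request is in the position of the first call but gets the result of the second, because the loop started by Eval() is visible on its thread.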