- cpython/Objects/genobject.c
- cpython/Include/genobject.h
There's a common definition among generator, coroutine, and async generator
#define _PyGenObject_HEAD(prefix) \
PyObject_HEAD \
/* Note: gi_frame can be NULL if the generator is "finished" */ \
struct _frame *prefix##_frame; \
/* True if generator is being executed. */ \
char prefix##_running; \
/* The code object backing the generator */ \
PyObject *prefix##_code; \
/* List of weak reference. */ \
PyObject *prefix##_weakreflist; \
/* Name of the generator. */ \
PyObject *prefix##_name; \
/* Qualified name of the generator. */ \
PyObject *prefix##_qualname; \
_PyErr_StackItem prefix##_exc_state;The definition of the generator object is less than 4 lines
typedef struct {
/* The gi_ prefix is intended to remind of generator-iterator. */
_PyGenObject_HEAD(gi)
} PyGenObject;which can be expanded to
typedef struct {
struct _frame *gi_frame;
char gi_running;
PyObject *gi_code;
PyObject *gi_weakreflist;
PyObject *gi_name;
PyObject *gi_qualname;
_PyErr_StackItem gi_exc_state;
} PyGenObject;We can draw the layout according to the code now
Let's define and iterate through a generator
def fib(n):
t = 0
i = 1
j = 1
r = 0
result = None
while t <= n:
print("result", repr(result))
if t < 2:
result = yield i
else:
r = i + j
result = yield r
i = j
j = r
t += 1
try:
1 / 0
except ZeroDivisionError:
r = yield "ZeroDivisionError"
print(repr(r))
try:
import empty
except ModuleNotFoundError:
result = yield "ModuleNotFoundError"
print("result", repr(result))
finally:
result = yield "ModuleNotFoundError finally"
print("result", repr(result))
raise StopIteration
>>> f = fib(5)
>>> type(f)
<class 'generator'>
>>> type(fib)
<class 'function'>
>>> f.gi_frame.f_lasti
-1We initialize a new generator. The f_lasti in gi_frame acts as the program counter in the Python virtual machine; it indicates the next instruction offset from the code block inside gi_code
>>> fib.__code__
<code object fib at 0x1041069c0, file "<stdin>", line 1>
>>> f.gi_code
<code object fib at 0x1041069c0, file "<stdin>", line 1>The gi_code inside the f object is the code object that represents the function fib.
The gi_running is 0, indicating the generator is not executing right now.
gi_name and gi_qualname both point to the same unicode object; all fields in gi_exc_state have value 0x00
>>> r = f.send(None)
result None
>>> f.gi_frame.f_lasti
52
>>> repr(r)
'1'Looking into the object f, nothing changed.
But the f_lasti in gi_frame is now at position 52 (the first place the keyword yield appears)
Step one more time. Due to the while loop, the f_lasti still points to the same position
>>> r = f.send("handsome")
result 'handsome'
>>> f.gi_frame.f_lasti
52
>>> repr(r)
'1'Send again. The f_lasti indicates the code offset at the position of the second yield
>>> r = f.send("handsome2")
result 'handsome2'
>>> f.gi_frame.f_lasti
68
>>> repr(r)
'2'repeat
>>> r = f.send("handsome3")
result 'handsome3'
>>> f.gi_frame.f_lasti
68
>>> repr(r)
'3'
>>> r = f.send("handsome4")
result 'handsome4'
>>> f.gi_frame.f_lasti
68
>>> repr(r)
'5'
>>> r = f.send("handsome5")
result 'handsome5'
>>> f.gi_frame.f_lasti
68
>>> repr(r)
'8'Now the while loop is terminated.
The f_lasti is at the position of the first except statement. exc_type points to the type of the exception, exc_value points to the instance of the exception, and exc_traceback points to the traceback object
>>> r = f.send("handsome6")
>>> f.gi_frame.f_lasti
120
>>> repr(r)
"'ZeroDivisionError'"The f_lasti is at the position of the second except statement. exc_type, exc_value, and exc_traceback now relate to ModuleNotFoundError
>>> r = f.send("handsome7")
'handsome7'
>>> f.gi_frame.f_lasti
168
>>> repr(r)
"'ModuleNotFoundError'"The f_lasti is at the position of the first finally statement. The ModuleNotFoundError is handled properly; at the top of the exception stack is the ZeroDivisionError.
Actually, the information about exception handling is stored in frame object. gi_exc_state is used for representing whether the current generator object is handling an exception and the details of the most nested exception
>>> r = f.send("handsome8")
result 'handsome8'
>>> f.gi_frame.f_lasti
198
>>> repr(r)
"'ModuleNotFoundError finally'"Now the StopIteration is raised.
The frameObject in the gi_frame field is freed.
The field gi_frame points to a null pointer, indicating that the generator is terminated.
The state in gi_exc_state is restored
>>> r = f.send("handsome9")
result 'handsome9'
Traceback (most recent call last):
File "<stdin>", line 30, in fib
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: generator raised StopIteration
>>> f.gi_frame.f_lasti
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'f_lasti'Most parts of the definition of the coroutine type and generator are the same.
The coroutine-only field named cr_origin, which tracks the traceback of the coroutine object, is disabled by default. It can be enabled by sys.set_coroutine_origin_tracking_depth. For more detail, please refer to docs.python.org(set_coroutine_origin_tracking_depth)
Let's try to run an example with the coroutine type defined to understand each field's meaning.
As usual, I've altered the source code so that my repr function is able to print all the low-level details of the object
import sys
import time
import asyncio
sys.set_coroutine_origin_tracking_depth(100)
cor_list = list()
async def cor(recursive_depth=1):
t1 = time.time()
try:
await asyncio.sleep(3)
1 / 0
except ZeroDivisionError:
if recursive_depth > 0:
r = cor(recursive_depth-1)
cor_list.append(r)
await r
t2 = time.time()
print("recursive_depth: %d, cost %.2f seconds" % (recursive_depth, t2 - t1))
def pr_cor_list():
for index, each in enumerate(cor_list):
print("index: %d, id: %d, each.cr_frame.f_lasti: %s" % (index, id(each), "None object" if each.cr_frame is None else str(each.cr_frame.f_lasti)))
print(repr(each))
async def test():
c = cor()
cor_list.append(c)
ts = time.time()
pending = [c]
pr_cor_list()
while pending:
done, pending = await asyncio.wait(pending, timeout=2)
ts_now = time.time()
print("%.2f seconds elapse" % (ts_now - ts, ))
pr_cor_list()
if __name__ == "__main__":
asyncio.run(test())If you call a function defined with the async keyword, the result is an object of type coroutine
>>> c = cor()
>>> type(c)
<class 'coroutine'>in the test function, before the first await statement at the moment
>>> cor_list[0].cr_origin
(('<stdin>', 2, 'test'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/events.py', 81, '_run'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/base_events.py', 1765, '_run_once'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/base_events.py', 544, 'run_forever'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/base_events.py', 576, 'run_until_complete'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/runners.py', 43, 'run'), ('<stdin>', 2, '<module>'))The content in field cr_origin on my computer is the calling stack from bottom to top
in the 2.01 seconds, nothing changed, except the f_lasti in the coroutine.cr_frame now points to the first await statement in cor function
in the 4.01 seconds, f_lasti in cor_list[0] now points to the await r expression, which in position 86
the exc_type, exc_value and exc_traceback holds information about the ZeroDivisionError, same as the generator object
the coroutine in cor_list[1] now stuck in the await asyncio.sleep(3) expression, the value in f_lasti is 20
the cr_code is the same as cor_list[0], but the cr_frame is different
every function call will create a new frame, more information about frame object
>>> cor_list[1].cr_origin
(('<stdin>', 8, 'cor'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/events.py', 81, '_run'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/base_events.py', 1765, '_run_once'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/base_events.py', 544, 'run_forever'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/base_events.py', 576, 'run_until_complete'), ('/Users/zpoint/Desktop/cpython/Lib/asyncio/runners.py', 43, 'run'), ('<stdin>', 2, '<module>'))in the 6.01 seconds, both cor_list[0] and cor_list[1] returned, and their cr_frame field becomes null pointer, the handling process is similar to the generator type
The layout of async generator is the same as the generator type, except for ag_finalizer, ag_hooks_inited, and ag_closed
The set_asyncgen_hooks function is used for setting up a firstiter and a finalizer. firstiter will be called when an asynchronous generator is iterated for the first time; finalizer will be called when an asynchronous generator is about to be garbage collected.
The run_forever function in the asyncio base event loop has defined
def run_forever(self):
...
old_agen_hooks = sys.get_asyncgen_hooks()
sys.set_asyncgen_hooks(firstiter=self._asyncgen_firstiter_hook,
finalizer=self._asyncgen_finalizer_hook)
try:
...
finally:
sys.set_asyncgen_hooks(*old_agen_hooks)you can define your own event loop to override the default firstiter and finalizer, please refer to poython3-doc set_asyncgen_hooks for more detail
# example of set_asyncgen_hooks
import sys
async def async_fib(n):
yield 1
def firstiter(async_gen):
print("in firstiter: ", async_gen)
def finalizer(async_gen):
print("in finalizer: ", async_gen)
sys.set_asyncgen_hooks(firstiter, finalizer)
>>> f = async_fib(3)
>>> f.__anext__()
in firstiter: <async_generator object async_fib at 0x10a98f598>
<async_generator_asend at 0x10a7487c8>Let's define and iterate through an async iterator
import asyncio
async def async_fib(n):
t = 0
i = 1
j = 1
r = 0
result = None
while t <= n:
print("result", repr(result))
await asyncio.sleep(3)
if t < 2:
result = yield i
else:
r = i + j
result = yield r
i = j
j = r
t += 1
class AsendTest(object):
def __init__(self, n):
self.f = async_fib(n)
self.loop = asyncio.get_event_loop()
async def make_the_call(self, val):
r = await self.f.asend(val)
print("repr asend", repr(r))
def __call__(self, *args, **kwargs):
self.loop.run_until_complete(self.make_the_call(args[0]))
a = AsendTest(3)
>>> type(a.f)
<class 'async_generator'>iterate through it
if you need more detail of __aiter__, __anext__ and etc, please refer to pep-0525
>>> a(None)
result None
repr asend 1
>>> a.f.ag_frame.f_lasti
68the ag_weakreflist points to a weak reference created by BaseEventLoop(asyncio->base_events.py)
it's used for shutdown all active asynchronous generators, read the source code for more detail
ag_finalizer now points to the finalizer, set up by BaseEventLoop(calling the sys.set_asyncgen_hooks method)
ag_hooks_inited is 1, indicate that hooks are set up
in the second time of the while loop, nothing changed
>>> a("handsome")
result 'handsome'
repr asend 1
>>> a.f.ag_frame.f_lasti
68now, the f_lasti indicate the position of the second yield stateement in the function async_fib
>>> a("handsome2")
result 'handsome2'
repr asend 2
>>> a.f.ag_frame.f_lasti
84>>> a("handsome3")
result 'handsome3'
repr asend 3
>>> a.f.ag_frame.f_lasti
84
>>> a("handsome4")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 9, in __call__
File "/Users/zpoint/Desktop/cpython/Lib/asyncio/base_events.py", line 589, in run_until_complete
return future.result()
File "<stdin>", line 6, in make_the_call
StopAsyncIterationnow, the ag_closed is set to 1 because of the termination of the async generator(StopAsyncIteration raised or aclose() is called)
the ag_frame is deallocated
the free list mechanism is used for type async_generator_asend and async_generator_wrapped_value
#ifndef _PyAsyncGen_MAXFREELIST
#define _PyAsyncGen_MAXFREELIST 80
#endif
static _PyAsyncGenWrappedValue *ag_value_freelist[_PyAsyncGen_MAXFREELIST];
static int ag_value_freelist_free = 0;
static PyAsyncGenASend *ag_asend_freelist[_PyAsyncGen_MAXFREELIST];
static int ag_asend_freelist_free = 0;because they both are short-living objects and are instantiated for every _anext_ call, free list are able to
- boost performance 6-10%
- reduce memory fragmentation
the id is the same, the address of previous async_generator_asend is reused
>>> f = async_fib(3)
>>> r = f.asend(None)
>>> type(r)
<class 'async_generator_asend'>
>>> id(r)
4376804088
>>> del r
>>> r = f.asend(None)
>>> id(r)
4376804088
















