Generator#
Generators are a powerful feature in Python for creating iterators. They allow
you to iterate over data without storing the entire sequence in memory, making
them ideal for processing large datasets or infinite sequences. This cheat sheet
covers generator functions, generator expressions, yield, yield from,
sending values to generators, and async generators.
Generator Function vs Generator Expression#
A generator function is defined like a normal function but uses yield to
produce a sequence of values. When called, it returns a generator object that
can be iterated over. A generator expression is a compact syntax similar to
list comprehensions but produces values lazily on demand.
# generator function
>>> def gen_func():
...     yield 5566
...
>>> g = gen_func()
>>> g
<generator object gen_func at 0x...>
>>> next(g)
5566
# generator expression
>>> g = (x for x in range(3))
>>> next(g)
0
>>> next(g)
1
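The memory benefit is easy to observe: a generator expression stores only its paused execution state, while a list comprehension materializes every value up front. A minimal sketch (exact byte counts vary by Python version and platform):

```python
import sys

# a list comprehension materializes all 100,000 values up front;
# a generator expression holds only its execution state
squares_list = [x * x for x in range(100_000)]
squares_gen = (x * x for x in range(100_000))

print(sys.getsizeof(squares_list))  # roughly hundreds of kilobytes
print(sys.getsizeof(squares_gen))   # a couple hundred bytes, regardless of n
```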
Yield Values from Generator#
The yield statement produces a value and suspends the generator’s execution.
When next() is called again, execution resumes from where it left off. This
example generates prime numbers by checking divisibility for each candidate.
>>> def prime(n):
...     p = 2
...     while n > 0:
...         for x in range(2, p):
...             if p % x == 0:
...                 break
...         else:
...             yield p
...             n -= 1
...         p += 1
...
>>> list(prime(5))
[2, 3, 5, 7, 11]
Unpack Generators#
Python 3.5+ (PEP 448) allows unpacking generators directly into lists, sets,
function arguments, and variables using the * operator. This provides a
convenient way to consume generator values without explicit iteration.
# PEP 448 - unpacking inside a list
>>> g1 = (x for x in range(3))
>>> g2 = (x**2 for x in range(2))
>>> [1, *g1, 2, *g2]
[1, 0, 1, 2, 2, 0, 1]
# unpacking inside a set
>>> g = (x for x in [5, 5, 6, 6])
>>> {*g}
{5, 6}
# unpacking to variables
>>> g = (x for x in range(3))
>>> a, b, c = g
>>> a, b, c
(0, 1, 2)
# extended unpacking
>>> g = (x for x in range(6))
>>> a, b, *c, d = g
>>> a, b, d
(0, 1, 5)
>>> c
[2, 3, 4]
# unpacking inside a function
>>> print(*(x for x in range(3)))
0 1 2
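One caveat worth remembering: a generator is a single-use iterator, so unpacking consumes it. A second unpack of the same object yields nothing:

```python
g = (x for x in range(3))
first = [*g]    # unpacking iterates the generator to exhaustion
second = [*g]   # a second unpack finds nothing left
print(first)    # [0, 1, 2]
print(second)   # []
```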
Iterable Class via Generator#
You can make a class iterable by implementing __iter__ as a generator method.
This approach is cleaner than implementing a separate iterator class. The
__reversed__ method can also be implemented as a generator to support the
built-in reversed() function.
>>> class Count:
...     def __init__(self, n):
...         self._n = n
...     def __iter__(self):
...         n = self._n
...         while n > 0:
...             yield n
...             n -= 1
...     def __reversed__(self):
...         n = 1
...         while n <= self._n:
...             yield n
...             n += 1
...
>>> list(Count(5))
[5, 4, 3, 2, 1]
>>> list(reversed(Count(5)))
[1, 2, 3, 4, 5]
Send Values to Generator#
Generators can receive values through the send() method. The sent value
becomes the result of the yield expression inside the generator. Before
sending values, you must start the generator by calling next() or
send(None) to advance it to the first yield.
>>> def spam():
...     msg = yield
...     print("Message:", msg)
...
>>> g = spam()
>>> next(g)  # start generator
>>> try:
...     g.send("Hello World!")
... except StopIteration:
...     pass
Message: Hello World!
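The manual priming step is easy to forget, so a small decorator can advance new generators to their first yield automatically (the name `coroutine` here is just illustrative, not a standard-library API):

```python
from functools import wraps

def coroutine(func):
    """Advance a new generator to its first yield so that
    callers can use send() immediately."""
    @wraps(func)
    def primer(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # prime: run up to the first yield
        return gen
    return primer

@coroutine
def collector(store):
    while True:
        store.append((yield))

items = []
c = collector(items)
c.send(1)     # no explicit next(c) needed first
c.send(2)
print(items)  # [1, 2]
```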
yield from Expression#
The yield from expression delegates iteration to another generator or
iterable. It automatically handles forwarding send(), throw(), and
close() calls to the subgenerator, making it ideal for creating generator
pipelines and recursive generators.
>>> def subgen():
...     try:
...         yield 9527
...     except ValueError:
...         print("got ValueError")
...
>>> def delegating_gen():
...     yield from subgen()
...
>>> g = delegating_gen()
>>> next(g)
9527
>>> try:
...     g.throw(ValueError)
... except StopIteration:
...     pass
got ValueError
You can chain multiple yield from expressions together. The
inspect.getgeneratorstate() function helps track the generator’s lifecycle
through its states: GEN_CREATED, GEN_RUNNING, GEN_SUSPENDED, and GEN_CLOSED.
# yield from + yield from
>>> import inspect
>>> def subgen():
...     yield from range(3)
...
>>> def delegating_gen():
...     yield from subgen()
...
>>> g = delegating_gen()
>>> inspect.getgeneratorstate(g)
'GEN_CREATED'
>>> next(g)
0
>>> inspect.getgeneratorstate(g)
'GEN_SUSPENDED'
>>> g.close()
>>> inspect.getgeneratorstate(g)
'GEN_CLOSED'
yield from with Return#
Generators can return a value using the return statement. The returned value
is accessible through the value attribute of the StopIteration exception.
When using yield from, the return value of the subgenerator becomes the value
of the yield from expression.
>>> def average():
...     total = 0.0
...     count = 0
...     while True:
...         val = yield
...         if not val:  # any falsy value (e.g. None or 0) ends the loop
...             break
...         total += val
...         count += 1
...     return total / count
...
>>> g = average()
>>> next(g)
>>> g.send(3)
>>> g.send(5)
>>> try:
...     g.send(None)
... except StopIteration as e:
...     print(e.value)
4.0
>>> def subgen():
...     yield 9527
...
>>> def delegating_gen():
...     yield from subgen()
...     return 5566
...
>>> g = delegating_gen()
>>> next(g)
9527
>>> try:
...     next(g)
... except StopIteration as e:
...     print(e.value)
5566
Generate Sequences#
The yield from expression provides a concise way to yield all values from
an iterable. This is particularly useful for chaining multiple sequences together
or flattening nested structures.
>>> def chain():
...     yield from 'ab'
...     yield from range(3)
...
>>> list(chain())
['a', 'b', 0, 1, 2]
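As a sketch of the flattening use case mentioned above, a recursive generator can walk arbitrarily nested lists (`flatten` is an illustrative name, not a built-in):

```python
def flatten(items):
    # recursively yield leaf values from arbitrarily nested lists
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to the nested level
        else:
            yield item

result = list(flatten([1, [2, [3, 4]], 5]))
print(result)  # [1, 2, 3, 4, 5]
```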
What RES = yield from EXP Does#
This snippet shows the simplified equivalent of what yield from does
internally, as described in PEP 380. It handles iteration, value passing via
send(), and captures the return value from the subgenerator.
# Simplified version (ref: PEP 380)
>>> def subgen():
...     for x in range(3):
...         yield x
...
>>> def delegating_gen():
...     _i = iter(subgen())
...     try:
...         _y = next(_i)
...     except StopIteration as _e:
...         RES = _e.value
...     else:
...         while True:
...             _s = yield _y
...             try:
...                 _y = _i.send(_s)
...             except StopIteration as _e:
...                 RES = _e.value
...                 break
...
>>> list(delegating_gen())
[0, 1, 2]
Check Generator Type#
Use types.GeneratorType to check if an object is a generator. This is useful
for writing functions that need to handle generators differently from other
iterables.
>>> from types import GeneratorType
>>> def gen_func():
...     yield 5566
...
>>> isinstance(gen_func(), GeneratorType)
True
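The inspect module offers related checks: inspect.isgeneratorfunction tests the function object itself, while inspect.isgenerator tests the object a call returns.

```python
import inspect

def gen_func():
    yield 5566

# isgeneratorfunction inspects the function; isgenerator inspects
# the generator object produced by calling it
print(inspect.isgeneratorfunction(gen_func))  # True
print(inspect.isgenerator(gen_func))          # False
print(inspect.isgenerator(gen_func()))        # True
```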
Check Generator State#
The inspect.getgeneratorstate() function returns the current state of a
generator. This is helpful for debugging and understanding the generator lifecycle.
The four possible states are: GEN_CREATED (not started), GEN_RUNNING (currently
executing), GEN_SUSPENDED (paused at yield), and GEN_CLOSED (completed or closed).
>>> import inspect
>>> def gen_func():
...     yield 9527
...
>>> g = gen_func()
>>> inspect.getgeneratorstate(g)
'GEN_CREATED'
>>> next(g)
9527
>>> inspect.getgeneratorstate(g)
'GEN_SUSPENDED'
>>> g.close()
>>> inspect.getgeneratorstate(g)
'GEN_CLOSED'
Context Manager via Generator#
The @contextlib.contextmanager decorator transforms a generator function into
a context manager. Code before yield runs on entering the with block,
and code after yield (typically in finally) runs on exit. The yielded
value is bound to the variable after as.
>>> import contextlib
>>> @contextlib.contextmanager
... def mylist():
...     try:
...         l = [1, 2, 3, 4, 5]
...         yield l
...     finally:
...         print("exit scope")
...
>>> with mylist() as l:
...     print(l)
[1, 2, 3, 4, 5]
exit scope
What @contextmanager Does#
This snippet shows a simplified implementation of how @contextmanager works
internally. It wraps a generator in a class that implements the context manager
protocol (__enter__ and __exit__), handling both normal exit and
exception propagation.
class GeneratorCM:
    def __init__(self, gen):
        self._gen = gen

    def __enter__(self):
        return next(self._gen)

    def __exit__(self, *exc_info):
        try:
            if exc_info[0] is None:
                next(self._gen)  # resume; should raise StopIteration
            else:
                self._gen.throw(*exc_info)
        except StopIteration:
            return True
        # the generator yielded again instead of finishing
        raise RuntimeError("generator didn't stop")

def contextmanager(func):
    def run(*args, **kwargs):
        return GeneratorCM(func(*args, **kwargs))
    return run
Profile Code Block#
A practical example of using generator-based context managers to measure
execution time of code blocks. The yield statement marks the boundary
between setup (recording start time) and teardown (calculating elapsed time).
>>> import time
>>> from contextlib import contextmanager
>>> @contextmanager
... def profile(msg):
...     try:
...         s = time.time()
...         yield
...     finally:
...         print(f'{msg} cost: {time.time() - s:.2f}s')
...
>>> with profile('block'):
...     time.sleep(0.1)
block cost: 0.10s
yield from and __iter__#
When using yield from with a class instance, Python calls the object’s
__iter__ method to get an iterator. This allows custom classes to work
seamlessly with yield from delegation, enabling elegant composition of
iterables.
>>> class FakeGen:
...     def __iter__(self):
...         n = 0
...         while n < 3:
...             yield n
...             n += 1
...     def __reversed__(self):
...         n = 2
...         while n >= 0:
...             yield n
...             n -= 1
...
>>> def spam():
...     yield from FakeGen()
...
>>> list(spam())
[0, 1, 2]
>>> list(reversed(FakeGen()))
[2, 1, 0]
Closure Using Generator#
Generators provide an elegant way to implement closures that maintain state
between calls. Each call to next() resumes execution and can access and
modify the enclosed variables. This is often cleaner than using nonlocal
or class-based approaches.
# generator version
>>> def closure_gen():
...     x = 5566
...     while True:
...         x += 1
...         yield x
...
>>> g = closure_gen()
>>> next(g)
5567
>>> next(g)
5568
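For comparison, the same counter written as a conventional closure needs nonlocal to mutate the enclosed variable (`closure_fn` is an illustrative name):

```python
# closure version: the inner function mutates the enclosed variable
def closure_fn():
    x = 5566
    def counter():
        nonlocal x  # required to rebind x in the enclosing scope
        x += 1
        return x
    return counter

c = closure_fn()
print(c())  # 5567
print(c())  # 5568
```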
Simple Scheduler#
This example demonstrates how generators can be used to implement cooperative multitasking. Each generator represents a task that yields control back to the scheduler. The scheduler uses a deque to round-robin between tasks, advancing each one step at a time.
>>> from collections import deque
>>> def fib(n):
...     if n <= 2:
...         return 1
...     return fib(n - 1) + fib(n - 2)
...
>>> def g_fib(n):
...     for x in range(1, n + 1):
...         yield fib(x)
...
>>> q = deque([g_fib(3), g_fib(5)])
>>> def run():
...     while q:
...         try:
...             t = q.popleft()
...             print(next(t))
...             q.append(t)
...         except StopIteration:
...             print("Task done")
...
>>> run()
1
1
1
1
2
2
Task done
3
5
Task done
Simple Round-Robin with Blocking#
A more advanced scheduler that handles I/O blocking using select(). Tasks
yield tuples indicating what operation they are waiting for ('recv' or 'send')
and which socket. The scheduler parks blocked tasks in wait queues and resumes
them only when their I/O is ready. This is the foundation of async I/O frameworks.
from collections import deque
from select import select
import socket

tasks = deque()
w_read = {}
w_send = {}

def run():
    while any([tasks, w_read, w_send]):
        while not tasks:
            # no runnable task: block until some socket is ready
            can_r, can_s, _ = select(w_read, w_send, [])
            for _r in can_r:
                tasks.append(w_read.pop(_r))
            for _w in can_s:
                tasks.append(w_send.pop(_w))
        try:
            task = tasks.popleft()
            why, what = next(task)
            if why == 'recv':
                w_read[what] = task
            elif why == 'send':
                w_send[what] = task
        except StopIteration:
            pass

def server():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('localhost', 5566))
    sock.listen(5)
    while True:
        yield 'recv', sock
        conn, addr = sock.accept()
        tasks.append(client_handler(conn))

def client_handler(conn):
    while True:
        yield 'recv', conn
        msg = conn.recv(1024)
        if not msg:
            break
        yield 'send', conn
        conn.send(msg)
    conn.close()

tasks.append(server())
run()
Async Generator (Python 3.6+)#
Async generators combine async def with yield to create asynchronous
iterators. They can use await to pause for async operations between yields.
Use async for to iterate over async generators. This is essential for
streaming data from async sources like network connections or databases.
>>> import asyncio
>>> async def slow_gen(n, t):
...     for x in range(n):
...         await asyncio.sleep(t)
...         yield x
...
>>> async def task(n):
...     async for x in slow_gen(n, 0.1):
...         print(x)
...
>>> asyncio.run(task(3))
0
1
2
Async Generator with try..finally#
Async generators support try..finally blocks for cleanup, just like regular
generators. The finally block executes when the generator is closed or
garbage collected, ensuring resources are properly released even if an exception
occurs during iteration.
>>> import asyncio
>>> async def agen(t):
...     try:
...         await asyncio.sleep(t)
...         yield 1 / 0
...     finally:
...         print("finally")
...
>>> async def main():
...     try:
...         g = agen(0.1)
...         await g.__anext__()
...     except Exception as e:
...         print(repr(e))
...
>>> asyncio.run(main())
finally
ZeroDivisionError('division by zero')
Send and Throw to Async Generator#
Async generators support asend() to send values and athrow() to throw
exceptions, similar to regular generators. These methods are coroutines that
must be awaited. This enables two-way communication with async generators for
building complex async data pipelines.
>>> import asyncio
>>> async def agen(n):
...     try:
...         for x in range(n):
...             await asyncio.sleep(0.1)
...             val = yield x
...             print(f'got: {val}')
...     except RuntimeError as e:
...         yield repr(e)
...
>>> async def main():
...     g = agen(5)
...     ret = await g.asend(None) + await g.asend('foo')
...     print(ret)
...     ret = await g.athrow(RuntimeError('error'))
...     print(ret)
...
>>> asyncio.run(main())
got: foo
1
RuntimeError('error')
Async Comprehension (Python 3.6+)#
PEP 530 introduced async comprehensions, allowing async for in list, set,
and dict comprehensions. This provides a concise way to collect values from
async generators. You can also use if clauses to filter values and
conditional expressions for transformations.
>>> import asyncio
>>> async def agen(n):
...     for x in range(n):
...         await asyncio.sleep(0.01)
...         yield x
...
>>> async def main():
...     ret = [x async for x in agen(5)]
...     print(ret)
...     ret = [x async for x in agen(5) if x < 3]
...     print(ret)
...     ret = {f'{x}': x async for x in agen(3)}
...     print(ret)
...
>>> asyncio.run(main())
[0, 1, 2, 3, 4]
[0, 1, 2]
{'0': 0, '1': 1, '2': 2}
Simple Async Round-Robin#
This example shows cooperative multitasking with async generators. Multiple
async generators are scheduled in a deque, and the scheduler awaits each one
in turn using __anext__(). This pattern is useful for interleaving multiple
async data streams fairly.
>>> import asyncio
>>> from collections import deque
>>> async def agen(n):
...     for x in range(n):
...         await asyncio.sleep(0.1)
...         yield x
...
>>> async def main():
...     q = deque([agen(3), agen(5)])
...     while q:
...         try:
...             g = q.popleft()
...             print(await g.__anext__())
...             q.append(g)
...         except StopAsyncIteration:
...             pass
...
>>> asyncio.run(main())
0
0
1
1
2
2
3
4
Async Generator vs Async Iterator Performance#
Async generators tend to outperform manually implemented async iterators because their iteration machinery is implemented in C inside CPython, whereas a pure-Python __anext__ pays coroutine-call overhead on every step (PEP 525 reports roughly a 2x speedup). This benchmark illustrates the difference for an iteration-heavy workload; absolute times will vary by machine and Python version.
>>> import time
>>> import asyncio
>>> class AsyncIter:
...     def __init__(self, n):
...         self._n = n
...     def __aiter__(self):
...         return self
...     async def __anext__(self):
...         ret = self._n
...         if self._n == 0:
...             raise StopAsyncIteration
...         self._n -= 1
...         return ret
...
>>> async def agen(n):
...     for i in range(n):
...         yield i
...
>>> async def task_agen(n):
...     s = time.time()
...     async for _ in agen(n):
...         pass
...     cost = time.time() - s
...     print(f"agen cost time: {cost}")
...
>>> async def task_aiter(n):
...     s = time.time()
...     async for _ in AsyncIter(n):
...         pass
...     cost = time.time() - s
...     print(f"aiter cost time: {cost}")
...
>>> n = 10 ** 7
>>> asyncio.run(task_agen(n))
agen cost time: 1.2698817253112793
>>> asyncio.run(task_aiter(n))
aiter cost time: 4.168368101119995
yield from == await Expression#
Before Python 3.5 introduced async/await syntax, coroutines were written as
generators decorated with @asyncio.coroutine and driven by yield from. For
coroutines, await is essentially equivalent to yield from. This example shows
both the old and new syntax for an echo server; note that @asyncio.coroutine
was deprecated in Python 3.8 and removed in 3.11, so the old form is shown for
historical context only.
import asyncio
import socket

loop = asyncio.get_event_loop()

host = 'localhost'
port = 5566

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.setblocking(False)
sock.bind((host, port))
sock.listen(10)

# old syntax (Python 3.4)
@asyncio.coroutine
def echo_server():
    while True:
        conn, addr = yield from loop.sock_accept(sock)
        loop.create_task(handler(conn))

@asyncio.coroutine
def handler(conn):
    while True:
        msg = yield from loop.sock_recv(conn, 1024)
        if not msg:
            break
        yield from loop.sock_sendall(conn, msg)
    conn.close()

# new syntax (Python 3.5+)
async def echo_server():
    while True:
        conn, addr = await loop.sock_accept(sock)
        loop.create_task(handler(conn))

async def handler(conn):
    while True:
        msg = await loop.sock_recv(conn, 1024)
        if not msg:
            break
        await loop.sock_sendall(conn, msg)
    conn.close()

loop.create_task(echo_server())
loop.run_forever()
Simple Compiler Using Generators#
This advanced example from David Beazley demonstrates using generators to implement a simple expression compiler. It includes a tokenizer, parser, and evaluator using the visitor pattern with generators for stack-based evaluation.
import re
import types
from collections import namedtuple

tokens = [
    r'(?P<NUMBER>\d+)',
    r'(?P<PLUS>\+)',
    r'(?P<MINUS>-)',
    r'(?P<TIMES>\*)',
    r'(?P<DIVIDE>/)',
    r'(?P<WS>\s+)']

Token = namedtuple('Token', ['type', 'value'])
lex = re.compile('|'.join(tokens))

def tokenize(text):
    scan = lex.scanner(text)
    gen = (Token(m.lastgroup, m.group())
           for m in iter(scan.match, None) if m.lastgroup != 'WS')
    return gen

class Node:
    _fields = []

    def __init__(self, *args):
        for attr, value in zip(self._fields, args):
            setattr(self, attr, value)

class Number(Node):
    _fields = ['value']

class BinOp(Node):
    _fields = ['op', 'left', 'right']

def parse(toks):
    lookahead, current = next(toks, None), None

    def accept(*toktypes):
        nonlocal lookahead, current
        if lookahead and lookahead.type in toktypes:
            current, lookahead = lookahead, next(toks, None)
            return True

    def expr():
        left = term()
        while accept('PLUS', 'MINUS'):
            left = BinOp(current.value, left)
            left.right = term()
        return left

    def term():
        left = factor()
        while accept('TIMES', 'DIVIDE'):
            left = BinOp(current.value, left)
            left.right = factor()
        return left

    def factor():
        if accept('NUMBER'):
            return Number(int(current.value))
        else:
            raise SyntaxError()

    return expr()

class NodeVisitor:
    def visit(self, node):
        # drive the visitor generators with an explicit stack,
        # avoiding Python's recursion limit on deep trees
        stack = [self.genvisit(node)]
        ret = None
        while stack:
            try:
                node = stack[-1].send(ret)
                stack.append(self.genvisit(node))
                ret = None
            except StopIteration as e:
                stack.pop()
                ret = e.value
        return ret

    def genvisit(self, node):
        ret = getattr(self, 'visit_' + type(node).__name__)(node)
        if isinstance(ret, types.GeneratorType):
            ret = yield from ret
        return ret

class Evaluator(NodeVisitor):
    def visit_Number(self, node):
        return node.value

    def visit_BinOp(self, node):
        leftval = yield node.left
        rightval = yield node.right
        if node.op == '+':
            return leftval + rightval
        elif node.op == '-':
            return leftval - rightval
        elif node.op == '*':
            return leftval * rightval
        elif node.op == '/':
            return leftval / rightval

def evaluate(exp):
    toks = tokenize(exp)
    tree = parse(toks)
    return Evaluator().visit(tree)

print(evaluate('2 * 3 + 5 / 2'))  # 8.5
print(evaluate('+'.join(str(x) for x in range(10000))))  # 49995000