When the body of a function contains one or more occurrences of the keyword yield, the function is known as a generator. When you call a generator, the function body does not execute. Instead, calling the generator returns a special iterator object that wraps the function body, its local variables (including its parameters), and the current point of execution, which is initially the start of the function.
When the next method of this iterator object is called, the function body executes up to the next yield statement, which takes the form:
yield expression
When a yield statement executes, the function execution is "frozen," with its current point of execution and local variables intact, and the expression following yield is returned as the result of the next method. When next is called again, execution of the function body resumes where it left off, again up to the next yield statement. If the function body ends, or executes a return statement, the iterator raises a StopIteration exception to indicate that the iteration is finished. A return statement in a generator cannot contain an expression.
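The freeze-and-resume behavior just described can be observed by driving a generator by hand. The following is only a minimal sketch: countdown is a hypothetical name, not from the text, and the code assumes the built-in next function (available from Python 2.6 on); in earlier versions, call the iterator's next method directly.

```python
def countdown(n):
    # Hypothetical example generator: yields n, n-1, ..., 1.
    while n > 0:
        yield n
        n -= 1

it = countdown(2)      # the body has not started executing yet
first = next(it)       # runs up to the first yield
second = next(it)      # resumes right after that yield, up to the next one
try:
    next(it)           # the body ends, so the iterator raises StopIteration
    finished = False
except StopIteration:
    finished = True
```

After this runs, first is 2, second is 1, and finished is True. A for statement performs the same steps and handles StopIteration for you.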
A generator is a very handy way to build an iterator. Since the most common way to use an iterator is to loop on it with a for statement, you typically call a generator like this:
for avariable in somegenerator(arguments):
For example, say that you want a sequence of numbers counting up from 1 to N and then down to 1 again. A generator can help:
def updown(N):
    for x in xrange(1, N): yield x
    for x in xrange(N, 0, -1): yield x
for i in updown(3): print i          # prints: 1 2 3 2 1
Here is a generator that works somewhat like the built-in xrange function, but returns a sequence of floating-point values instead of a sequence of integers:
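One plausible form of such a generator is sketched here, consistent with the description that follows. The parameter names start, stop, and step come from that description; the default of 1.0 for step is an assumption.

```python
def frange(start, stop, step=1.0):
    # Sketch only: start and stop are mandatory, and step is
    # silently assumed to be positive (a non-positive step
    # with start < stop would never terminate).
    while start < stop:
        yield start
        start += step

halves = list(frange(0.0, 2.0, 0.5))
```

Here halves is [0.0, 0.5, 1.0, 1.5]; as with xrange, stop itself is excluded.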
This frange example is only somewhat like xrange because, for simplicity, it makes arguments start and stop mandatory, and silently assumes step is positive.
Generators are more flexible than functions that return lists. A generator may build an unbounded iterator, meaning one that returns an infinite stream of results (to use only in loops that terminate by other means, e.g., via a break statement). Further, a generator-built iterator performs lazy evaluation: the iterator computes each successive item only when and if needed, just in time, while the equivalent function does all computations in advance and may require large amounts of memory to hold the results list. Therefore, if all you need is the ability to iterate on a computed sequence, it is often best to compute the sequence in a generator rather than in a function that returns a list. If the caller needs a list of all the items produced by some bounded generator G(arguments), the caller can simply use the following code:
resulting_list = list(G(arguments))
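To illustrate the unbounded case mentioned above, here is a small sketch of a generator that never finishes on its own, consumed by a loop that ends with break. The name squares is hypothetical and chosen only for this example.

```python
def squares():
    # Hypothetical unbounded generator: yields 1, 4, 9, ... forever.
    n = 1
    while True:
        yield n * n
        n += 1

small = []
for sq in squares():
    if sq > 50:
        break          # the loop, not the generator, decides when to stop
    small.append(sq)
```

Thanks to lazy evaluation, only the eight items the loop actually requests are ever computed, and small ends up as [1, 4, 9, 16, 25, 36, 49]. Calling list(squares()) would never return, which is why list(G(arguments)) is suitable only for bounded generators.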
Generator expressions
Python 2.4 introduces an even simpler way to code particularly simple generators: generator expressions, commonly known as genexps. The syntax of a genexp is just like that of a list comprehension (as covered in "List comprehensions" on page 67) except that a genexp is enclosed in parentheses (()) instead of brackets ([]). The semantics of a genexp are the same as those of the corresponding list comprehension, except that a genexp produces an iterator yielding one item at a time, while a list comprehension produces a list of all results in memory; using a genexp, when appropriate, therefore saves memory. For example, to sum the squares of all single-digit integers, in any modern Python, you can code sum([x*x for x in xrange(10)]). In Python 2.4, you can express this functionality even better, coding it as sum(x*x for x in xrange(10)) (just the same, but omitting the brackets), and obtain exactly the same result while consuming less memory. Note that the parentheses that indicate the function call also "do double duty" and enclose the genexp (no need for extra parentheses).
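The difference between the two forms can be seen side by side in the following sketch. The variable names are illustrative only, and range is used in place of xrange purely so the snippet also runs unchanged on later Python versions; in Python 2, xrange remains the memory-friendly choice.

```python
squares_list = [x*x for x in range(10)]   # list comprehension: all results held in memory
squares_gen = (x*x for x in range(10))    # genexp: an iterator; no item is computed yet

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)         # items are produced one at a time

# A genexp is a one-shot iterator: once consumed, it yields nothing more.
leftover = list(squares_gen)
```

Both totals are 285, while leftover is an empty list. If the items are needed more than once, build a list instead.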