Python ‘yield’ Keyword—What Does It Do? [with Examples]

When you call a function that contains a yield keyword, the code in the function does not run. Instead, a generator object is created. You can store the generator object in a variable. This generator object has the ability to run the code inside the function on demand.

When you request a value from the generator object, Python runs the code inside the generator function until it reaches a yield keyword. There it pauses and delivers a value to the caller.

When you request another value, the execution continues from where it stopped. In other words, Python resumes the function (it does not start over), runs until the next yield keyword, and delivers a value once again.

This process continues until there are no more values to yield.
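To make the pause-and-resume behavior concrete, here is a minimal sketch. The count_up() function and the log list are made up for illustration; the log simply records which parts of the function body have run so far.

```python
log = []  # hypothetical helper: records which parts of the body have run


def count_up():
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("finished")


gen = count_up()      # only creates the generator object; the body has not run
print(log)            # []

first = next(gen)     # runs the body up to the first yield, then pauses
print(first, log)     # 1 ['started']

second = next(gen)    # resumes after the first yield, pauses at the second
print(second, log)    # 2 ['started', 'resumed after first yield']
```

Notice that "started" is logged only once. The second next() call picks up after the first yield instead of running the function from the top.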

A function that uses a yield keyword is a generator function. Generators are useful when you need to iterate over values without storing all of them in memory.

This guide teaches you what the yield keyword does on a high level without technical details about iterators and iterables.

Why Yield in Python?

The yield keyword is useful when you are looping through a big group of values. The reason why you might want to use yield instead of return boils down to memory efficiency.
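As a rough illustration of the memory difference, the sketch below compares a function that returns a list with one that yields. The function names squares_list() and squares_gen() are made up for this example. Keep in mind that sys.getsizeof() reports only the size of the container object itself (not the numbers it refers to), and the exact byte counts vary between Python versions, so treat the printed numbers as indicative only.

```python
import sys


def squares_list(n):
    # Builds and returns the whole list at once.
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_gen(n):
    # Yields one square at a time; nothing is stored.
    for i in range(n):
        yield i * i


n = 100_000
as_list = squares_list(n)
as_gen = squares_gen(n)

list_size = sys.getsizeof(as_list)  # grows with n
gen_size = sys.getsizeof(as_gen)    # stays small regardless of n
print(list_size, gen_size)
```

Both objects can be fed to sum() or a for loop, but only the list pays for holding every value at the same time.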

Let’s suppose you have a function that reads a text file to print each word in the console. To do this in a traditional way, you need to store the words in a list and loop through the list, right?

But what if there are 1 billion words in the file?

A list of 1 billion words may not fit in memory on a typical machine, so storing every word in a list is not practical. To overcome this, you need a mechanism that lets you loop through the words without storing them all in a list.

This is where you can use generators, that is, functions that yield values.

The idea of a generator is to not store all the words in memory at once. Instead, the generator goes through the collection one word at a time. It only stores the current word in memory. Besides, it knows how to get the next one. This way the generator can loop through all 1 billion words without having any trouble with memory consumption. Theoretically, you could use a generator like this to loop through an infinite number of words because practically no memory is needed.
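Here is one way the word example could look. This is a sketch, not the only approach: iter_words() is a hypothetical helper that accepts any iterable of text lines. An open text file is iterable line by line, so a file object could be passed in the same way. A short list of sample lines stands in for the file here so that the example runs on its own.

```python
def iter_words(lines):
    """Yield words one at a time from any iterable of text lines."""
    for line in lines:
        # Only the current line is split; earlier lines are not kept around.
        for word in line.split():
            yield word


# Stand-in for a file; with a real file you could pass the open file object.
sample_lines = ["the quick brown fox", "jumps over", "", "the lazy dog"]

word_count = 0
longest = ""
for word in iter_words(sample_lines):
    word_count += 1
    if len(word) > len(longest):
        longest = word

print(word_count, longest)  # 9 quick
```

Strictly speaking, this version holds one line in memory at a time rather than one word, which is still tiny compared to a list of every word in the file.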

The best part is that looping through a generator looks identical to using a for…in loop on a list. So even though the mechanism is entirely different, the syntax remains the same.

Example

When you call a function with a yield keyword, you create a generator object. When you invoke this generator object (for example by using a for loop), you are essentially asking it to give the next value in the group of values you are looping through. This process continues until there are no values left.

For example, here is a generator function that yields the square of each number in a list:

def square(numbers):
    for n in numbers:
        yield n ** 2

Now, let’s call this function and print the result:

squares = square([1, 2, 3, 4, 5])
print(squares)

Output:

<generator object square at 0x7fd8f4dff580>

You might think this object stores the squares of the numbers [1, 2, 3, 4, 5], right? But this is not the case! The squares object doesn’t store a single value nor does it have a single square calculated anywhere. Instead, the generator object gives you the ability to calculate the squares on demand.

To actually calculate the squares, you need to invoke the generator object. One way to do this is by using the built-in next() function. This function asks the generator to compute the next squared value.

Let’s call the next() function five times and print the results:

print(next(squares))
print(next(squares))
print(next(squares))
print(next(squares))
print(next(squares))

Output:

1
4
9
16
25

As you can see, each next() function call delivers the next squared value. In other words, each next() call resumes the square() function just long enough to compute and deliver the next squared value.

But how does it know which value to compute?

The generator object remembers where it left off the last time someone called next() on it. This is how it knows how to choose the next value correctly.
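So what happens when you call next() after the last value? The sketch below shows the standard behavior: the generator raises StopIteration. The variable names values and exhausted are just for illustration.

```python
def square(numbers):
    for n in numbers:
        yield n ** 2


squares = square([1, 2, 3])
values = [next(squares), next(squares), next(squares)]
print(values)        # [1, 4, 9]

# A fourth call has nothing left to resume, so StopIteration is raised.
try:
    next(squares)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)     # True

# next() also accepts a default that is returned instead of raising.
print(next(squares, "done"))  # done
```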

But this whole next() function thing is a bit confusing, isn’t it? Sure! In practice, you rarely need to call the next() function yourself to iterate a generator. Instead, you can use a good old for loop to invoke the generator.

def square(numbers):
    for n in numbers:
        yield n ** 2

squares = square([1, 2, 3, 4, 5])

for value in squares:
    print(value)

Output:

1
4
9
16
25

Notice that the for loop calls the next() function on the generator behind the scenes. It does this until there are no values for the generator to calculate.
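If you are curious, the for loop above behaves roughly like the hand-written loop below. This is a simplified sketch of what happens behind the scenes, and the collected list is only there to make the result easy to check.

```python
def square(numbers):
    for n in numbers:
        yield n ** 2


squares = square([1, 2, 3, 4, 5])
collected = []

iterator = iter(squares)  # for a generator, iter() returns the generator itself
while True:
    try:
        value = next(iterator)  # ask for the next value
    except StopIteration:
        break                   # no values left, so the loop ends quietly
    collected.append(value)
    print(value)
```

This prints 1, 4, 9, 16, and 25, just like the for loop. The for loop handles StopIteration for you, which is why you never see it there.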

I hope you now have a better understanding of the yield keyword.

Next, let’s take a quick look at how you can understand the intent of a yielding function better.

Shortcut for Understanding Yield in Action

If you are new to generators and yielding, here is a trick you can do to understand what the yielding code does.

Notice that this trick is not an equivalent replacement for the yield statement! But rather, it helps you understand what the code does if you are not comfortable reading code that uses the yield keyword.

Here is the trick. When you see a yield keyword:

  1. Add this line as the first line of the function: result = [].
  2. Replace all the yield val expressions with result.append(val).
  3. Add a return result to the bottom of the function.
  4. Now read the function and understand what it does.
  5. Compare the function to the original one.

Let’s see an example of applying this trick to a function with the yield keyword. Here is an example of a generator function:

def square(numbers):
    for n in numbers:
        yield n ** 2

To see what this function does, let’s apply the above steps to the function:

1. Add result = [] to the beginning of the function.

def square(numbers):
    result = []
    for n in numbers:
        yield n ** 2

2. Replace yield val with result.append(val).

def square(numbers):
    result = []
    for n in numbers:
        result.append(n ** 2)

3. Add return result to the bottom of the function.

def square(numbers):
    result = []
    for n in numbers:
        result.append(n ** 2)
    return result

4. Read the function and understand what it does.

So, now you can clearly see that this function takes a list of numbers, squares the numbers, and returns a list of squared numbers.

It’s important you understand that this modification is not a replacement for the yielding function! Instead, it helps you understand the original code better.
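To see why the two versions are not interchangeable, here is a small comparison. The names square_gen() and square_list() are made up to tell the two versions apart. Both produce the same values, but the generator can only be consumed once, while the list can be read as many times as you like.

```python
def square_gen(numbers):
    # Original version: yields values on demand.
    for n in numbers:
        yield n ** 2


def square_list(numbers):
    # Rewritten version from the trick: builds the full list up front.
    result = []
    for n in numbers:
        result.append(n ** 2)
    return result


numbers = [1, 2, 3]
as_list = square_list(numbers)
as_gen = square_gen(numbers)

first_pass = list(as_gen)     # consumes the generator
second_pass = list(as_gen)    # the generator is now exhausted

print(first_pass == as_list)  # True
print(second_pass)            # []
print(as_list)                # [1, 4, 9]
```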

To fully understand what the yield keyword does, you need to understand what iterables, iterators, and generators are. Here is a complete guide you should read.

Conclusion

The yield keyword is a memory-efficient way to loop through a big collection of values. A function with the yield keyword is called a generator function, and calling it gives you a generator object.

A generator function doesn’t store the iterated values in memory. Instead, it cares about the current value and knows how to get the next one. This makes it possible to loop through a huge number of values. You only need memory for a single value.

You can apply the traditional for loop syntax when iterating a generator object. When doing this, the generator generates the values on the go.

Further Reading