How Python Reduce() Function Works

The Python reduce() function works such that it:

  1. Applies a function to the first and the second element of an iterable.
  2. Stores the result.
  3. Applies the function to the third element and the result.
  4. Continues this process until no values are left.

In other words, reduce() folds an iterable into a single cumulative value.

For example, let’s calculate a sum of a list of numbers with the reduce() function:

from functools import reduce

numbers = [1, 2, 3, 4]
total = reduce(lambda x, y: x + y, numbers) # returns 10

At each step, the result of the previous function call is passed as an argument to the next call.
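To make that chain of calls visible, here is a small sketch that prints every call reduce() makes. The helper name traced_add is made up for this illustration.

```python
from functools import reduce

def traced_add(x, y):
    # Print each call so the order of accumulation is visible
    print(f"add({x}, {y}) -> {x + y}")
    return x + y

numbers = [1, 2, 3, 4]
total = reduce(traced_add, numbers)
# Prints:
# add(1, 2) -> 3
# add(3, 3) -> 6
# add(6, 4) -> 10

print(total)  # 10
```

Notice how the first argument of each call is the result of the call before it.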

But as it turns out, using reduce() is useless most of the time. Writing a for loop is a better option 99% of the time.

In this guide, you will learn how to:

  • Use the reduce() function.
  • Implement a custom reduce function to better understand the concept.
  • Call the reduce() function with lambda expressions.
  • Solve common tasks with reduce().

Introduction to reduce() in Python

Python’s reduce() function implements what is known as folding in mathematics. Folding means you reduce a list of numbers into a single value.

For example, you can fold a list of numbers to obtain their sum.

The reduce() function works for any iterable in Python—not only for lists!
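As a quick sketch of that claim, the same two-argument function can fold a tuple, a range, a generator, and even a string. The helper name add and the variable names are chosen just for this example.

```python
from functools import reduce

def add(a, b):
    return a + b

from_tuple = reduce(add, (1, 2, 3, 4))
from_range = reduce(add, range(1, 5))
# A generator expression that yields 1, 4, 9
from_generator = reduce(add, (n * n for n in range(1, 4)))
# Strings are iterated character by character
from_string = reduce(add, "abc")

print(from_tuple)      # 10
print(from_range)      # 10
print(from_generator)  # 14
print(from_string)     # abc
```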

Python’s reduce() function is part of the functools module in the standard library. Remember to import it before reducing!

The reduction generally follows this procedure:

  1. Call a function on the first two elements of an iterable to calculate a partial result.
  2. Call the function on this partial result and the third element of the iterable to create a new partial result.
  3. Repeat this process until there are no values left in the iterable.
  4. Return the result.

The whole point of the reduce() function is to replace explicit loops with a more concise notation.
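To illustrate the loop that reduce() stands in for, here is a minimal side-by-side sketch. The variable names are made up for the example, and the loop is only one of several ways to write the same fold.

```python
from functools import reduce

numbers = [2, 3, 4]

# Loop version: start from the first element and fold in the rest
product_loop = numbers[0]
for number in numbers[1:]:
    product_loop = product_loop * number

# reduce() version of the same fold
product_reduce = reduce(lambda acc, number: acc * number, numbers)

print(product_loop)    # 24
print(product_reduce)  # 24
```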

Why reduce() Is a Part of Functools

Initially, reduce() was one of the built-in functions of Python 2.x.

In Python 3.0, reduce() was moved to the functools module.

By then, built-in functions such as max(), min(), sum(), all(), and any() already covered the most common reductions.

These functions are more efficient, readable, and Pythonic than solving the same tasks with reduce().

Reduce() Function Syntax in Python

The reduce function follows this syntax:

functools.reduce(function, iterable[, initializer])

Where

  • Function is a mandatory argument.
  • Iterable is a mandatory argument.
  • Initializer is an optional argument.

Let’s take a closer look at what these arguments do.

Required Arguments: Function and Iterable

The first argument is called function.

This is the function that is cumulatively applied for each element in the iterable to fold the iterable into a single final value.

The second argument is called iterable.

This can be any Python object that can be iterated over, for example a list, tuple, set, or dictionary.

Optional Arguments: Initializer

The third argument, initializer, is optional.

This is the initial value from which the reduction starts. The final value is accumulated on top of it, and it is also the result when the iterable is empty.
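The following sketch shows the effect of the optional third argument, including what happens with an empty iterable. The helper name add and the variable names are made up for this example.

```python
from functools import reduce

def add(a, b):
    return a + b

numbers = [1, 2, 3, 4]

# Without an initializer: ((1 + 2) + 3) + 4
without_initializer = reduce(add, numbers)

# With an initializer: (((100 + 1) + 2) + 3) + 4
with_initializer = reduce(add, numbers, 100)

# With an empty iterable, the initializer is returned as-is
empty_with_initializer = reduce(add, [], 0)

print(without_initializer)     # 10
print(with_initializer)        # 110
print(empty_with_initializer)  # 0

# Without an initializer, an empty iterable raises a TypeError
try:
    reduce(add, [])
except TypeError as error:
    empty_error = error
    print("TypeError:", error)
```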

Example: How to Calculate the Sum of a List

Let’s see a simple example of how to use the reduce() function in Python.

Let’s sum a list of numbers using the reduce() function.

To do this we need to:

  1. Import the reduce() function from the functools library.
  2. Have a list of numbers.
  3. Define a function that calculates the sum of two numbers.
  4. Reduce the list of numbers by calling that function on each element cumulatively.
  5. Print the result.

Here is the code:

from functools import reduce

numbers = [1, 2, 3, 4]

# Named add() so that it does not shadow the built-in sum()
def add(a, b):
    return a + b

# Call add() on each element cumulatively
total = reduce(add, numbers)

print(total)

Output:

10

Now you have a basic understanding of how the reduce() function works in Python.

Next, let’s implement our own naive version of it to deepen that understanding.

How to Implement a Custom reduce() Function in Python

A great way to understand how the reduce() function works is by implementing one.

The custom reducer function needs to take three parameters:

  1. A function that is cumulatively applied for each element of the iterable.
  2. An iterable, such as a list.
  3. An optional initial starting value.

As described earlier, the reducer accumulates a result over the iterable from left to right, eventually producing a single value.

Here is the code:

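def my_reduce(function, iterable, start_value=None):
    # Note: indexing means this naive version only accepts sequences
    if start_value is None:
        # No start value: begin with the first element
        value = iterable[0]
        index = 1
    else:
        # Start value given: every element still has to be processed
        value = start_value
        index = 0

    while index < len(iterable):
        value = function(value, iterable[index])
        index += 1

    return value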

Here, the while loop takes care of accumulating the result value. It does this by calling the function on the previous value and the next element.

Now we can use this custom reducer.

For example, let’s calculate the sum of a list of numbers:

numbers = [1, 2, 3, 4]

# A function that calculates the sum of two values
def add(a, b):
    return a + b

# The reducer calls add() on each (result, element) pair cumulatively
total = my_reduce(add, numbers)

print(total)

This produces the following result:

10

The reduce() Function and Lambda Expressions

Now that you understand how reducing works in Python, you are almost ready to see it in action.

In this guide, we are going to use lambda expressions inside the reduce() function calls.

If you don’t know what lambda expressions are, here is a quick primer.

A lambda expression is a nameless (anonymous) function. It is used as a shorthand when we don’t want to define a separate function for a simple task.

To demonstrate, let’s use a reducer to get the sum of a list of numbers.

First, let’s define a function that calculates the sum of two numbers. Then let’s pass this function into the reduce() function call:

from functools import reduce

numbers = [1, 2, 3, 4]

def add(a, b):
    return a + b

# Call add() on each element cumulatively
total = reduce(add, numbers)

print(total)

Output:

10

This works.

But a separate definition for add() is unnecessary, because it is such a simple function and we only use it once.

Here is where lambda expressions come in handy.

Instead of defining a separate function add(), we can pass a nameless shorthand version of it into the reduce() call.

Here is how:

from functools import reduce

numbers = [1, 2, 3, 4]

total = reduce(lambda a, b: a + b, numbers)

print(total)

Here lambda a, b: a + b does exactly the same as the add() function earlier. It is a nameless function that takes two parameters and returns their sum.

Output:

10
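As a side note, if you would rather not write the lambda at all, the standard library's operator module offers add() as a ready-made two-argument function. The sketch below compares the three spellings; the variable names are made up for the example.

```python
from functools import reduce
import operator

numbers = [1, 2, 3, 4]

def add(a, b):
    return a + b

total_named = reduce(add, numbers)                  # named function
total_lambda = reduce(lambda a, b: a + b, numbers)  # lambda expression
total_operator = reduce(operator.add, numbers)      # operator.add

print(total_named, total_lambda, total_operator)  # 10 10 10
```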

Learn more about lambda expressions in Python by reading this article.

Python reduce() Function in Action

Let’s see some common tasks that can be solved using reducers in Python.

But as you know, reducing is usually not the best way to solve problems in Python. Thus, each example also includes a better alternative.

Sum

As you already saw, you can use the reduce() function to calculate the sum of a list.

Here is the code:

from functools import reduce

numbers = [1, 2, 3, 4]

total = reduce(lambda x, y: x + y, numbers)

print(total)

Output:

10

Keep in mind that calculating the sum this way is neither the most Pythonic nor the most readable option.

There is a built-in function called sum(). You can use it to calculate the sum of a list of numbers.

For example:

sum([1, 2, 3, 4]) # returns 10

Product

Just like calculating a sum with reduce, you can compute an accumulated product with it.

For instance:

from functools import reduce

numbers = [1, 2, 3, 4]

product = reduce(lambda x, y: x * y, numbers)

print(product)

Output:

24

While this approach works, it may not be the most Pythonic way to tackle the problem.

Since Python 3.8, the math module has included a prod() function for exactly this purpose:

from math import prod

numbers = [1, 2, 3, 4]

product = prod(numbers)

print(product)

Output:

24

Max Value of a List

So far you have seen examples of how to do some arithmetic operations with the reduce() function.

But the reduce() function accepts any function that takes two arguments and returns a result.
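As a rough illustration of that point, here are two reductions that have nothing to do with sums or products. The input values and variable names are made up for the example.

```python
from functools import reduce
from math import gcd

# Greatest common divisor of several numbers: gcd(gcd(12, 18), 24)
common_divisor = reduce(gcd, [12, 18, 24])

# Merging dictionaries from left to right; later keys win
merged = reduce(lambda a, b: {**a, **b}, [{"a": 1}, {"b": 2}, {"a": 3}])

print(common_divisor)  # 6
print(merged)          # {'a': 3, 'b': 2}
```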

In Python, there is a built-in function max(). You can use it to figure out the greatest value.

For example, let’s figure out which number is greater with the max() function:

max(10, 1000) # Returns 1000

Now, we can use the reduce() function in conjunction with the max() function to figure out the largest number in a list.

Here is the code:

from functools import reduce

numbers = [3, 99, 12, 3000, 2]

greatest = reduce(max, numbers)

print(greatest)

Just like the previous examples you’ve seen, it:

  • Calls the max() function on the first two elements.
  • Stores the greater number as a partial result.
  • Applies the max function on the next number in the list and the partial result.
  • Repeats this process until the list has no values left.
  • Returns the greatest value.

As a result, it returns the greatest number in the list, which in this case is:

3000

While this approach works, it is recommended to use the built-in max() function to solve this task.

For example:

numbers = [3, 99, 12, 3000, 2]

greatest = max(numbers)

print(greatest)

Output:

3000

Min Value of a List

To find the smallest value of a list using reduce(), follow the logic of finding the maximum in the previous section.

For example:

from functools import reduce

numbers = [3, 99, 12, 3000, 2]

smallest = reduce(min, numbers)

print(smallest)

Output:

2

Just like it is not recommended to use reduce() to find the maximum, you should not use it to find the minimum of a list either.

Instead, use the built-in min() function. This makes the code more readable and Pythonic.

For example:

numbers = [3, 99, 12, 3000, 2]

smallest = min(numbers)

print(smallest)

Output:

2

All Values True/False?

You can use reduce() to find out whether a list of booleans contains only True values or only False values.

To check that all values are True:

  • Create a function that takes two boolean values and checks if both are True. You can use a lambda function that looks like this: lambda a, b: a and b.
  • Then pass this function as the reducing function in the reduce() call.
  • The reduce() function then loops through the list and accumulates the result by applying the and operation between the booleans.

Here is the code:

from functools import reduce

bools = [True, True, True]

all_true = reduce(lambda a, b: a and b, bools)

print(all_true)

Output:

True

To check if all booleans in a list are False, combine the values with or instead and negate the final result. If not a single value is True, then every value is False.

Here is the code:

from functools import reduce

bools = [False, True, False]

# True only if no value in the list is True
all_false = not reduce(lambda a, b: a or b, bools)

print(all_false)

Output:

False

In reality, you should use the built-in function all() to check if all values are True, and not any() to check if all values are False.

For instance:

bools = [False, False, False]

# All Trues?
print(all(bools))      # False

# All Falses?
print(not any(bools))  # True
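Besides readability, there are two behavioral differences worth seeing. The sketch below records how much work each approach does and what happens with empty input; the helper names both and recording are made up for this illustration.

```python
from functools import reduce

bools = [False, True, True, True]

# Record every call reduce() makes to the combining function
calls = []

def both(a, b):
    calls.append((a, b))
    return a and b

reduce_result = reduce(both, bools)

# Record how many items all() actually pulls from the iterable
consumed = []

def recording(values):
    for value in values:
        consumed.append(value)
        yield value

all_result = all(recording(bools))

print(reduce_result, len(calls))  # False 3
print(all_result, len(consumed))  # False 1

# Empty input: the built-ins return True, reduce() needs an initializer
print(all([]), not any([]))       # True True
print(reduce(both, [], True))     # True
```

In this run, reduce() keeps calling the function for every remaining element, whereas all() stops at the first False.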

When to Use reduce() in Python

Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.

Source: What’s new in Python 3.0

Now that you understand how Python’s reduce() function works, it is a good time to discuss when you should or should not use it.

As suggested earlier, reduce() is something you usually want to avoid using.

For this exact reason, I showed you better alternatives in the examples of reducers in action.

But why is reducing bad?

Python’s reduce() function tends to perform worse than the dedicated built-in functions. This is because it makes a separate function call for every element of the iterable. Naturally, this can lead to slow and inefficient code.

Also, reduce() compromises code readability.

For example, which approach do you understand better?

from functools import reduce

numbers = [3, 1, 22]

# Non-Pythonic reduce() + lambda approach:
greatest_num = reduce(lambda x, y: x if x > y else y, numbers)

# Pythonic max() function approach:
greatest_num = max(numbers)

Reduce() Function Performance Comparison

Let’s go back to the earlier examples and compare the performance of using reduce() vs. using a dedicated built-in function.

More specifically, let’s use reduce in three ways to compute a sum of a range of numbers:

  1. Reduce with a custom sum function.
  2. Reduce with a lambda expression
  3. Reduce with the operator.add

And let’s compare these to the built-in sum() function.

Here is the code:

>>> from functools import reduce
>>> from timeit import timeit
>>> 
>>> def add(a, b):
...    return a + b
... 
>>> # 1. Reduce() with a user-defined function
>>> call_add = "functools.reduce(add, range(50))"
>>> timeit(call_add, "import functools", globals={"add": add})
5.3198932689999765
>>> 
>>> # 2. Reduce() with a Lambda expression
>>> call_lambda = "functools.reduce(lambda x, y: x + y, range(50))"
>>> timeit(call_lambda, "import functools")
5.042331120999961
>>> 
>>> # 3. Reduce() with Operator.add
>>> call_operator_add = "functools.reduce(operator.add, range(50))"
>>> timeit(call_operator_add, "import functools, operator")
2.154597568999975
>>> 
>>> # 4. Built-in sum() function
>>> timeit("sum(range(50))", globals={"sum": sum})
0.7353007320000415

As you can see, every reduce() variant takes considerably longer than the built-in sum() function: in this run, roughly three to seven times as long.

Here is a similar performance experiment with multiplication and reduce().

>>> from functools import reduce
>>> from timeit import timeit
>>> 
>>> def prod(a, b):
...    return a * b
... 
>>> # 1. Reduce() with a custom product function
>>> call_prod = "functools.reduce(prod, range(50))"
>>> timeit(call_prod, "import functools", globals={"prod": prod})
4.836152304000052
>>> 
>>> # 2. Reduce() with a Lambda expression
>>> call_lambda = "functools.reduce(lambda x, y: x + y, range(50))"
>>> timeit(call_lambda, "import functools")
4.990506494000101
>>> 
>>> # 3. Reduce() with operator.mul
>>> call_operator_mul = "functools.reduce(operator.mul, range(50))"
>>> timeit(call_operator_mul, "import functools, operator")
1.7283722960000887
>>> 
>>> # 4. Built-in prod() function
>>> call_math_prod = "math.prod(range(50))"
>>> timeit(call_math_prod, "import math")
0.7617921809999189

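Here you can also see how much faster the dedicated math.prod() function is than any reduce()-based approach. Keep in mind that range(50) starts at 0, so every product in this experiment is 0, and that the lambda in step 2 adds instead of multiplying; the numbers are best read as a rough comparison of call overhead.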

Long story short, do not use reduce.

Use the dedicated functionality for calculating results like this.

For example, use the built-in function sum() to get a sum of a list of numbers. And use math.prod() to get a total product of a list of numbers.

This is a more efficient and readable approach to tackle these problems.

Reduce() Function Readability

Let’s tackle the code readability aspect of reducing with one more coding example.

Say you want to figure out the product of all the odd numbers in a list.

The reduce() approach to this problem would look like this:

from functools import reduce

numbers = [1, 2, 3, 4, 5]

odd_prod = reduce(lambda x, y: x * y if y % 2 != 0 else x, numbers, 1) # Returns 15 (1 * 3 * 5)

But is this code readable?

I would say no.

It takes a while to wrap your head around the lambda expression as well as the whole reduce thing.

What would be a more suitable approach?

For example, a regular for loop:

numbers = [1, 2, 3, 4, 5]

odd_prod = 1

for number in numbers:
    if number % 2 != 0:
        odd_prod *= number

# odd_prod is 15 (1 * 3 * 5)

Or another option is to use a generator expression like this:

from math import prod

numbers = [1, 2, 3, 4, 5]

def odd_product(numbers):
    return prod(num for num in numbers if num % 2 != 0)

odd_product(numbers) # Returns 15 (1 * 3 * 5)

These approaches are more Pythonic. Also, they don’t sacrifice code readability.
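As a sanity check, the sketch below wraps the three approaches in functions and confirms that they agree on a few inputs. It assumes Python 3.8 or newer for math.prod(), and the function names are made up for this example. The reduce() variant passes 1 as the initializer so that the first element is filtered like every other one.

```python
from functools import reduce
from math import prod

def odd_prod_reduce(numbers):
    # The initializer 1 ensures an even first element is skipped too
    return reduce(lambda x, y: x * y if y % 2 != 0 else x, numbers, 1)

def odd_prod_loop(numbers):
    result = 1
    for number in numbers:
        if number % 2 != 0:
            result *= number
    return result

def odd_prod_generator(numbers):
    return prod(num for num in numbers if num % 2 != 0)

for sample in ([1, 2, 3, 4, 5], [2, 3, 5], []):
    results = (
        odd_prod_reduce(sample),
        odd_prod_loop(sample),
        odd_prod_generator(sample),
    )
    print(sample, results)
# [1, 2, 3, 4, 5] (15, 15, 15)
# [2, 3, 5] (15, 15, 15)
# [] (1, 1, 1)
```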

How Does Python Implement the Reduce() Function

Earlier in this guide, you implemented a version of the reduce() function yourself to better understand it. That implementation followed the same idea as Python’s own implementation of reduce().

But in case you are interested, the real implementation of reduce() uses iterables and iterators instead of while loops.

In case you are not interested in the implementation details, feel free to skip this part.

The implementation of reduce() can be found in Python’s open-source codebase on GitHub.

The pure-Python version in Lib/functools.py, which CPython uses as a fallback for its equivalent C implementation, looks like this:

# Unique sentinel object that marks a missing initial value
_initial_missing = object()

def reduce(function, sequence, initial=_initial_missing):
    """
    reduce(function, iterable[, initial]) -> value
    Apply a function of two arguments cumulatively to the items of a sequence
    or iterable, from left to right, so as to reduce the iterable to a single
    value.  For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
    ((((1+2)+3)+4)+5).  If initial is present, it is placed before the items
    of the iterable in the calculation, and serves as a default when the
    iterable is empty.
    """

    it = iter(sequence)

    if initial is _initial_missing:
        try:
            value = next(it)
        except StopIteration:
            raise TypeError(
                "reduce() of empty iterable with no initial value") from None
    else:
        value = initial

    for element in it:
        value = function(value, element)

    return value

For simplicity, we can take out the “unnecessary” parts of the implementation to inspect how the iterators and iterables relate to reduce().

This simplifies the implementation roughly to something like this:

def reduce(function, iterable, initializer=None):
    # 1. Grab the iterator
    it = iter(iterable)
    
    # 2. Notice the possible initial value
    if initializer is None:
        value = next(it)
    else:
        value = initializer
    
    # 3. Loop through the iterable and accumulate a result
    for element in it:
        value = function(value, element)
    
    # 4. Return the result
    return value
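One of the “unnecessary” parts is worth a second look: the sentinel object. The sketch below copies the simplified version under the made-up name simplified_reduce and compares it with functools.reduce(). The two agree on ordinary input, but they differ when None is passed explicitly as the initial value.

```python
import functools

def simplified_reduce(function, iterable, initializer=None):
    # Same logic as the simplified version, under a different name
    it = iter(iterable)
    if initializer is None:
        value = next(it)
    else:
        value = initializer
    for element in it:
        value = function(value, element)
    return value

def add(a, b):
    return a + b

# Ordinary input: both versions agree, also for a generator
print(simplified_reduce(add, (n for n in range(5))))  # 10
print(functools.reduce(add, range(5)))                # 10

# Edge case: an empty iterable with None as an explicit initial value
real_result = functools.reduce(add, [], None)
print(real_result)  # None

try:
    simplified_reduce(add, [], None)
    simplified_outcome = "returned a value"
except StopIteration:
    # None is mistaken for "no initial value", so next() fails
    simplified_outcome = "raised StopIteration"

print(simplified_outcome)  # raised StopIteration
```

A dedicated sentinel lets the real function tell "no initial value" apart from "the initial value is None".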

As you can see, instead of using while loops, the reduce() function uses iterators to loop through the iterables.

In short, every iterable object in Python can hand out an iterator, which is then used to loop through the iterable.

To understand iterators, consider this example.

When you do:

numbers = [1, 2, 3]

for number in numbers:
    print(number)

The for loop looks like this under the hood of Python:

# Mimicking a for...in loop in Python using iterators

numbers = [1, 2, 3]

# Grab the iterator from numbers that is used to loop through the numbers
it = iter(numbers)

while True:
    # If there are numbers left, get the next number
    try:
        next_num = next(it)
        print(next_num)
    # If no numbers are left, stop the iteration
    except StopIteration:
        break

This means the for loop:

  • Grabs the iterator object from the iterable.
  • Then enters an endless loop that terminates when there are no values left in the iterable.

Now if you look at the rough implementation of the reduce() function, you start to see more clearly how it works.

Iterators and iterables are big concepts in Python. To truly understand how they work, check out this article.

Conclusion

Today you learned how Python’s reduce() function works.

The reduce() function lets you perform actions on Python iterables that you would normally do with for loops.

The takeaway is that reduce() is rarely the most efficient or readable way to solve a problem. Most of the time, there is a built-in function that solves the problem more efficiently. These include max(), min(), sum(), all(), any() and more.

99% of the time, you should not use reduce().

Today you learned how to solve problems with reduce(). In each case, you also learned that you can replace reduce() with more suitable built-in functions.

You learned how to implement a custom version of reduce(). You also saw how it is really implemented behind the scenes using iterators.

Thanks for reading. I hope you find it useful.

Happy coding!
