NumPy: How to Compare Two Arrays

To check if two NumPy arrays A and B are equal:

  1. Use a comparison operator (==) to form a comparison array.
  2. Check if all the elements in the comparison array are True.

For example:

(A==B).all()

This is the easiest approach to comparing two arrays.

But this approach is not 100% reliable.

Instead, you should consider using the built-in np.array_equal() function for good measure.

np.array_equal(A, B)

This returns True only when the two arrays have the same shape and the same elements, and False otherwise.

In this guide, you learn how to compare arrays in NumPy and how it differs from comparing regular lists in Python.

You will also learn about the issues with the (A == B).all() approach, and more importantly, how to fix those.

Comparing Arrays in NumPy

The easiest way to compare two NumPy arrays is to:

  1. Create a comparison array by calling == between two arrays.
  2. Call the .all() method on the resulting array to check whether all of its elements are True.

Here is an example:

import numpy as np
  
A = np.array([[1, 1], [2, 2]])
B = np.array([[1, 1], [2, 2]])

equal_arrays = (A == B).all()
  
print(equal_arrays)

Output:

True

But how does it work? And why is a simple comparison operator not enough?

When you compare two Python lists, A == B is enough.

But when you want to compare NumPy arrays, this is not the case.

This is because NumPy arrays are compared entirely differently than Python lists.

In particular, the NumPy arrays are compared element-wise.
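To make the difference concrete, here is a small sketch contrasting the two. The variable names are made up for illustration. Comparing lists gives a single boolean, whereas using an array comparison directly as a condition fails, because the truth value of a multi-element array is ambiguous.

```python
import numpy as np

# Plain Python lists: == compares the whole structure and returns one bool.
list_a = [[1, 1], [2, 2]]
list_b = [[1, 1], [2, 2]]
lists_equal = list_a == list_b
print(lists_equal)  # True

# NumPy arrays: == compares element by element and returns an array,
# so using the result directly as a condition raises a ValueError.
arr_a = np.array(list_a)
arr_b = np.array(list_b)
try:
    if arr_a == arr_b:
        pass
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

This is why an extra reduction step such as .all() is needed before the result can be used in an if statement.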

Let’s try to compare two NumPy arrays like you would compare two lists:

import numpy as np
  
A = np.array([[1, 1], [2, 2]])
B = np.array([[1, 1], [2, 2]])
  
print(A == B)

As you can see, the result is a matrix, not a boolean:

[[ True  True]
 [ True  True]]

In this resulting matrix, each element is a result of a comparison of two corresponding elements in the two arrays.

To figure out if all the elements are equal, you have to check if all the elements in the comparison matrix evaluate True.

This is done using the .all() method.
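As a quick sketch of the opposite case (the array C is made up for illustration), a single differing element is enough to make .all() return False:

```python
import numpy as np

A = np.array([[1, 1], [2, 2]])
C = np.array([[1, 1], [2, 3]])  # differs from A only in the last element

comparison = (A == C)
print(comparison)        # element-wise result with one False at position [1, 1]
print(comparison.all())  # False
```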

Now that you understand how to compare two NumPy arrays, let’s discuss the problems that may arise.

Problems with (A==B).all()

Although the (A == B).all() approach looks simple, it has a few shortcomings you need to understand.

More importantly, you need to learn how to overcome these shortcomings.

Luckily, it is really easy.

Shortcoming 1: Empty Arrays Give the Wrong Result

If one of the compared NumPy arrays is empty, you can get the wrong result.

For example:

import numpy as np

A = np.array([1])
B = np.array([])

print((A==B).all())

Output:

True

Here it still claims the arrays are equal, even though it is clearly not the case.
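The following sketch shows where the True likely comes from. Broadcasting a one-element array against an empty one produces an empty comparison array, and .all() on an empty array is True because there is no element that could be False.

```python
import numpy as np

A = np.array([1])
B = np.array([])

comparison = (A == B)
print(comparison.shape)    # (0,) -> the comparison array is empty
print(comparison.all())    # True, since no element is False
print(A.shape == B.shape)  # False -> the shapes reveal the mismatch
```

Checking the shapes first is one way to catch this case by hand.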

Solution: Use the np.array_equal() Function

To overcome this issue, you should use the built-in np.array_equal() function for comparing arrays.

For instance:

import numpy as np

A = np.array([1])
B = np.array([])

print(np.array_equal(A,B))

Output:

False

Shortcoming 2: Small Numeric Errors

It is quite common for NumPy arrays to have values with small numeric errors.

# should be [1.0, 2.0]
# but is [1.000001, 2.0]

This happens because floating-point numbers cannot represent most decimal values exactly, so the results of calculations are slightly rounded. This is really common.

As a result, you have arrays that are meant to be equal, but due to the small errors, comparing those yields False.

To solve this problem you have to relax the meaning of equality. In other words, you need to accept a small error in the values.
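Here is a minimal sketch of the problem, using the classic 0.1 + 0.2 example (the arrays are made up for illustration):

```python
import numpy as np

# 0.1 + 0.2 is not exactly 0.3 in floating-point arithmetic.
A = np.array([0.1 + 0.2, 1.0])
B = np.array([0.3, 1.0])

exact_match = (A == B).all()
print(0.1 + 0.2)    # 0.30000000000000004
print(exact_match)  # False
```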

Solution: Use the np.allclose() Function

The np.allclose() function checks if two NumPy arrays are equal or very close to being equal.

For instance, let’s compare two arrays that are almost equal to one another:

import numpy as np

A = np.array([[1.00001, 1], [2, 2]])
B = np.array([[1, 1], [2, 2.000002]])

print(np.allclose(A,B))

Output:

True

This works!

But what does it mean to be “closely equal”?

Being “closely equal” is characterized by tolerance levels, described by two (optional) parameters passed into the np.allclose() function call:

  • rtol. The relative tolerance.
  • atol. The absolute tolerance.

If the elements x and y satisfy the following equation given the tolerances rtol and atol:

abs(x - y) <= atol + rtol * abs(y)

Then the elements are “closely equal” to one another.
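To see the formula in action, the sketch below evaluates it by hand and compares the result with np.isclose(), the element-wise counterpart of np.allclose(). The tolerance values are chosen only for illustration.

```python
import numpy as np

x = np.array([1.00001, 1.0, 2.0, 2.0])
y = np.array([1.0, 1.0, 2.0, 2.000002])
rtol, atol = 1e-5, 1e-8  # illustrative tolerance values

# The formula from above, applied element by element.
manual = np.abs(x - y) <= atol + rtol * np.abs(y)

# The same check done by NumPy.
builtin = np.isclose(x, y, rtol=rtol, atol=atol)

print(manual)
print(np.array_equal(manual, builtin))
```

Note that the formula scales rtol by abs(y) only, so swapping the two arguments can change the result in borderline cases.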

By default, these parameters are:

  • rtol = 1e-5
  • atol = 1e-8

To tweak these parameters, specify the new values in the allclose() function call as keyword arguments.

For instance:

import numpy as np

A = np.array([[1.00001, 1], [2, 2]])
B = np.array([[1, 1], [2, 2.000002]])

print(np.allclose(A, B, rtol=1e-5, atol=1e-6))

Output:

True

However, usually the default parameter values are enough!
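One case where the defaults may not be enough is data very close to zero, where the absolute tolerance dominates the check. The sketch below uses made-up values to show this. The numbers differ by a factor of two or more, yet they still count as close until atol is lowered.

```python
import numpy as np

A = np.array([1e-9, 2e-9])
B = np.array([3e-9, 4e-9])

# With the default atol, every difference here is below the tolerance.
default_result = np.allclose(A, B)

# With atol=0.0, only the relative tolerance applies.
strict_result = np.allclose(A, B, atol=0.0)

print(default_result)  # True
print(strict_result)   # False
```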

Shortcoming 3: Arrays of Different Size

When the arrays are not the same size and their shapes are incompatible, comparisons like (A == B).all() cause an error, and the program crashes if the error is not handled.

For example, let’s compare two 2D arrays with different numbers of array elements:

import numpy as np

A = np.array([[1, 1], [2, 2]])
B = np.array([[1, 1], [2, 2], [3, 3]])

print((A==B).all())

Output:

Traceback (most recent call last):
  File "example.py", line 6, in <module>
    print((A==B).all())
AttributeError: 'bool' object has no attribute 'all'

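As you can see, this causes an error.

In the NumPy version used for this example, comparing arrays of incompatible shapes returns a single boolean value, False in this case.

So you end up trying to call False.all(), which fails because a bool has no .all() method.

Newer NumPy releases (1.25 and later) raise a ValueError at the comparison itself instead, because the shapes cannot be broadcast together. Either way, the program stops with an error.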

Solution: Use the np.array_equal() Function

Once again, it is safer to use the np.array_equal() function to compare the two arrays. This function checks the shapes first, so it returns False for mismatched arrays instead of raising an error.

For instance, let’s compare two arrays of different sizes:

import numpy as np

A = np.array([[1, 1], [2, 2]])
B = np.array([[1, 1], [2, 2], [3, 3]])

print(np.array_equal(A, B))

Output:

False
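A related pitfall is worth a short sketch. When the shapes differ but are still compatible under NumPy's broadcasting rules, (A == B).all() raises no error at all and may even return True. The arrays below are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2], [1, 2]])  # shape (2, 2)
B = np.array([1, 2])            # shape (2,)

# B is broadcast against every row of A, so all comparisons are True.
broadcast_result = (A == B).all()

# np.array_equal() compares the shapes first and returns False.
strict_result = np.array_equal(A, B)

print(broadcast_result)  # True
print(strict_result)     # False
```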

Next, let’s discuss NumPy array comparisons other than being equal to.

Other Comparisons

So far I have assumed you are interested in the equality of the arrays.

However, there are four more comparisons you may commonly want to perform:

  • Greater than
  • Greater than or equal
  • Less than
  • Less than or equal

These comparisons are easy to do with the built-in functions:

# A > B
np.greater(A, B)

# A >= B
np.greater_equal(A, B)

# A < B
np.less(A, B)

# A <= B
np.less_equal(A, B)

The result of these comparisons is not a single boolean value. Instead, these comparisons are made element by element, so each comparison returns an array of Booleans.

Here is an example:

import numpy as np
  
A = np.array([1, 2, 3])
B = np.array([3, 2, 1])
  
print("Array A: ", A)
print("Array B: ", B)
  
print("A > B:")
print(np.greater(A, B))
  
print("A >= B:")
print(np.greater_equal(A, B))

print("A < B:")
print(np.less(A, B))
  
print("A <= B:")
print(np.less_equal(A, B))

Output:

Array A:  [1 2 3]
Array B:  [3 2 1]
A > B:
[False False  True]
A >= B:
[False  True  True]
A < B:
[ True False False]
A <= B:
[ True  True False]

To check whether a comparison holds for every element of A and B, call the .all() method on the comparison array.
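As a short sketch, here is how that looks with the arrays from the example above. It also shows that the operator form (A <= B) gives the same element-wise result as the function form, and that .any() answers the weaker question of whether at least one element satisfies the comparison.

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([3, 2, 1])

# The operator form gives the same element-wise result as the function form.
same_result = np.array_equal(A <= B, np.less_equal(A, B))

all_less_equal = (A <= B).all()  # is every element of A <= its partner in B?
any_less_equal = (A <= B).any()  # is at least one element of A <= its partner?

print(same_result)     # True
print(all_less_equal)  # False
print(any_less_equal)  # True
```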

Conclusion

Today you learned how to compare two NumPy arrays.

To recap, given arrays A and B, you can check if they are equal by:

(A == B).all()

However, there are some drawbacks to this method.

  1. Empty arrays give the wrong result.
  2. Arrays of different sizes cause an error instead of returning False.

Thus, you should use the dedicated np.array_equal() function to make the comparison reliable.

Also, if you want to treat arrays with tiny numeric errors as equal, use the np.allclose() function.

Other array comparisons are:

np.greater(A, B)
np.greater_equal(A, B)
np.less(A, B)
np.less_equal(A, B)

Thanks for reading.

Happy coding!
