How to Calculate Median Value in Python

To calculate the median value in Python:

  1. Import the statistics module.
  2. Call the statistics.median() function on a list of numbers.

For example, let’s calculate the median of a list of numbers:

import statistics

numbers = [1, 2, 3, 4, 5, 6, 7]
med = statistics.median(numbers)

print(med)

Output:

4

The median value is a common way to measure the “centrality” of a dataset.

If you are looking for a quick answer, the example above will do. But to learn what the median really is, why it is useful, and how to find it, read along.

What Is the Median Value in Maths

The median is the middle value of a dataset once its values are sorted.

If you have a sorted list of 3 numbers, the median is the second number because it is in the middle.

But if you have a list of 4 values, there is no single “middle value”. For an even-sized dataset, the median is the average of the two middle values.
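As a quick illustration of the odd and even cases, here is a minimal sketch using the statistics module from the standard library. The list names odd_sized and even_sized are made up for this example.

```python
import statistics

# Illustrative data; the variable names are assumptions for this sketch.
odd_sized = [3, 1, 2]       # sorted: [1, 2, 3], middle value is 2
even_sized = [4, 1, 3, 2]   # sorted: [1, 2, 3, 4], middle values are 2 and 3

odd_median = statistics.median(odd_sized)
even_median = statistics.median(even_sized)

print(odd_median)   # 2
print(even_median)  # 2.5
```

Note that the input does not need to be sorted beforehand, and that the even-sized result is a float because it is an average.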

[Figure: the median for an odd and an even number of items]

Why and When Is Median Value Useful

When dealing with statistics, you usually want to have a single number that describes the nature of a dataset.

Think about your school grades for example. Instead of seeing the dozens of grades, you want to know the average (the mean).

Usually, measuring the “centrality” of a dataset means calculating the mean value. But if you have a skewed distribution, the mean value can be unintuitive.

Let’s say you drive to your nearby shopping mall 7 times. Usually, the drive takes around 10 minutes. But one day the traffic jam makes it last 2 hours.

Here is a list of driving times to the mall:

[9, 120, 10, 9, 10, 10, 10]

Now if you take the average of this list, you get ~25 minutes. But how well does this number really describe your trip?

Pretty badly.

As you can see, most of the time the trip takes around 10 minutes.

To better describe the driving time, you should use a median value instead. To calculate the median value, you need to sort the driving times first:

[9, 9, 10, 10, 10, 10, 120]

Then you can choose the middle value, which in this case is 10 minutes. 10 minutes describes your typical trip length much better than 25 minutes, right?

The median is useful here because the unusually high value of 120 does not affect the result.

In short, use the median when outliers or a skewed distribution make the mean a misleading measure of centrality.
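To check the numbers from the driving example, the sketch below computes both the mean and the median of the same list with the statistics module. The variable names are chosen for this illustration only.

```python
import statistics

# Driving times to the mall in minutes, taken from the example above.
driving_times = [9, 120, 10, 9, 10, 10, 10]

mean_time = statistics.mean(driving_times)
median_time = statistics.median(driving_times)

print(round(mean_time, 1))  # 25.4
print(median_time)          # 10
```

The single 120-minute trip pulls the mean up to about 25 minutes, while the median stays at 10.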

How to Calculate the Median Value in Python

In Python, you can either write your own function that calculates the median or use existing functionality.

How to Implement Median Function in Python

If you want to implement the median function, you need to understand the procedure of finding the median.

The median function works as follows:

  1. It takes a dataset as an input.
  2. It sorts the dataset.
  3. It checks whether the length of the dataset is odd or even.
  4. If the length is odd, it picks the middle value and returns it.
  5. If the length is even, it picks the two middle values, calculates their average, and returns the result.

Here is how it looks in code:

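def median(data):
    # Sort a copy so the caller's data is left untouched.
    sorted_data = sorted(data)
    data_len = len(sorted_data)

    # Index of the middle item (odd length) or of the lower middle item (even length).
    middle = (data_len - 1) // 2

    if data_len % 2:
        # Odd length: there is a single middle value.
        return sorted_data[middle]
    else:
        # Even length: average the two middle values.
        return (sorted_data[middle] + sorted_data[middle + 1]) / 2.0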

Example usage:

numbers = [1, 2, 3, 4, 5, 6, 7]
med = median(numbers)

print(med)

Output:

4

Now, this is a valid approach if you need to write the median function yourself. But with common maths operations, you should use a built-in function to save time and headache.
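One way to gain confidence in a hand-written median is to compare it against statistics.median() on lists of both odd and even length. The sketch below does that with a self-contained helper; the name my_median and the random test data are assumptions made for this illustration.

```python
import random
import statistics


def my_median(data):
    """Hypothetical hand-written median used only for this comparison."""
    sorted_data = sorted(data)
    n = len(sorted_data)
    if n == 0:
        raise ValueError("cannot compute the median of empty data")
    mid = n // 2
    if n % 2:
        # Odd length: single middle value.
        return sorted_data[mid]
    # Even length: average of the two middle values.
    return (sorted_data[mid - 1] + sorted_data[mid]) / 2


random.seed(0)  # fixed seed so the run is repeatable
checked_sizes = []
for size in range(1, 20):
    sample = [random.randint(0, 100) for _ in range(size)]
    # Both odd and even sizes are covered by the loop.
    assert my_median(sample) == statistics.median(sample)
    checked_sizes.append(size)

print("hand-written and built-in medians agree for sizes 1 to 19")
```

Testing only a 7-item list would exercise just the odd-length branch, so covering several sizes is the point of the loop.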

Let’s next take a look at how to calculate the median with a built-in function in Python.

How to Use a Built-In Median Function in Python

In Python, there is a module called statistics. This module contains useful mathematical tools for data science and statistics.

One of the most useful functions in this module is median().

As the name suggests, this function calculates the median of a given dataset.

To use the median function from the statistics module, remember to import it into your project.

Here is an example of calculating the median for a bunch of numbers:

import statistics

numbers = [1, 2, 3, 4, 5, 6, 7]
med = statistics.median(numbers)

print(med)

Result:

4
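Two details are worth knowing when using statistics.median(): the input does not have to be sorted, and empty data raises statistics.StatisticsError. The following sketch shows both; the variable names are made up for this example.

```python
import statistics

# Unsorted input is fine; median() sorts a copy internally.
unsorted_numbers = [7, 1, 5, 3]
med = statistics.median(unsorted_numbers)
print(med)  # 4.0

# Empty data has no median, so the function raises StatisticsError.
raised = False
try:
    statistics.median([])
except statistics.StatisticsError:
    raised = True
    print("no median for empty data")
```

If your data may be empty, catching this exception (or checking the length first) avoids a crash.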

Conclusion

Today you learned how to calculate the median value in Python.

To recap, the median value is a way to measure the centrality of a dataset. It is useful when calculating the mean gives misleading results.

To calculate the median in Python, use the built-in median() function from the statistics module.

import statistics

numbers = [1, 2, 3, 4, 5, 6, 7]
med = statistics.median(numbers)

Thanks for reading.

Happy coding!

Further Reading

Python Tricks

How to Write to a File in Python

The with Statement in Python
