Scatter Plots in Python

You can create scatter plots in Python with the matplotlib library as follows:

import matplotlib.pyplot as plt

plt.scatter(x, y)
plt.show()

Here, x and y are equal-length lists of numbers that hold the x and y coordinates of the data points.

For example, let’s create a scatter plot where x and y are lists of random numbers between 1 and 100:

import matplotlib.pyplot as plt
import random

# 100 random integers between 1 and 100 (inclusive) for each axis
x = [random.randint(1, 100) for _ in range(100)]
y = [random.randint(1, 100) for _ in range(100)]

plt.scatter(x, y)
plt.show()

Because the x and y values are random and independent of each other, the points are scattered across the plot with no visible pattern.

Scatter Plots in Python

Scatter plots are used to demonstrate the relationship between two variables. These relationships can be linear, non-linear, positive, negative, strong, or weak.
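
To make terms like positive, negative, and weak more concrete, here is a minimal sketch that builds three synthetic data sets and measures each one with the Pearson correlation coefficient. The helper names (pearson_r, make_data, plot_relationships), the slopes, and the noise levels are assumptions made for illustration. The numbers need only the standard library; matplotlib is imported inside the plotting function.

```python
import math
import random


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_data(slope, noise, n=100, seed=0):
    """Build y = slope * x + Gaussian noise for x = 0, 1, ..., n - 1."""
    rng = random.Random(seed)
    xs = list(range(n))
    ys = [slope * x + rng.gauss(0, noise) for x in xs]
    return xs, ys


# Hypothetical example data sets: strong positive, strong negative, no trend.
datasets = {
    "strong positive": make_data(slope=2, noise=5),
    "strong negative": make_data(slope=-2, noise=5),
    "no relationship": make_data(slope=0, noise=5),
}


def plot_relationships():
    """Draw one scatter plot per data set, side by side."""
    import matplotlib.pyplot as plt  # imported here so the numbers above work without it

    fig, axes = plt.subplots(1, len(datasets), figsize=(12, 4))
    for ax, (label, (xs, ys)) in zip(axes, datasets.items()):
        ax.scatter(xs, ys)
        ax.set_title(f"{label} (r = {pearson_r(xs, ys):.2f})")
    plt.show()
```

Calling plot_relationships() displays the three plots. A coefficient close to 1 or -1 indicates a strong linear relationship, and a coefficient close to 0 indicates a weak or absent one.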

To create scatter plots for visualizing these relationships in Python, install the matplotlib library on your machine.

How to Install Matplotlib in Python

To create a scatter plot, you need to have the matplotlib module installed.

If you don’t have it, install it by running the following command in your command line:

pip install matplotlib
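
If you are not sure whether the installation worked, one way to check from Python is to ask the import system whether it can find the package. This is a small sketch: the helper name is_installed is made up for illustration, while importlib.util.find_spec is part of the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("matplotlib"):
    print("matplotlib is available")
else:
    print("matplotlib is missing; run: pip install matplotlib")
```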

How to Create a Scatter Plot in Python

To create a scatter plot, you need a set of data points. Then call matplotlib.pyplot.scatter() to plot them.

For example, let’s create a scatter plot with 100 random x and y values as the data points:

import matplotlib.pyplot as plt
import random

# 100 random integers between 1 and 100 (inclusive) for each axis
x = [random.randint(1, 100) for _ in range(100)]
y = [random.randint(1, 100) for _ in range(100)]

plt.scatter(x, y)
plt.show()

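The result is a cloud of 100 points spread across the plot with no visible pattern, and it changes every time you run the code.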
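
Random data is different on every run, which makes plots hard to compare between runs. The sketch below is one possible variant that seeds the generator and adds axis labels and a title. The function names random_points and plot_points and the seed value are assumptions for illustration; s and alpha are optional plt.scatter() arguments that control marker size and transparency.

```python
import random


def random_points(n=100, low=1, high=100, seed=42):
    """Return two lists of n random integers in [low, high], reproducible via seed."""
    rng = random.Random(seed)
    x = [rng.randint(low, high) for _ in range(n)]
    y = [rng.randint(low, high) for _ in range(n)]
    return x, y


def plot_points(x, y):
    """Draw a labeled scatter plot of the given points."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    plt.scatter(x, y, s=20, alpha=0.7)  # smaller, slightly transparent markers
    plt.xlabel("x")
    plt.ylabel("y")
    plt.title(f"{len(x)} random points")
    plt.show()


x, y = random_points()
# plot_points(x, y)  # uncomment to display the plot
```

With a fixed seed, random_points() returns the same lists on every run, so the plot stays the same until you change the seed.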

Example: Normally Distributed Data

This example uses numpy to generate random data from a normal distribution. Make sure to have numpy installed on your system:

pip install numpy

Let’s create two arrays, each filled with 1000 numbers drawn from a normal distribution. Then let’s create a scatter plot from the randomized data:

import numpy
import matplotlib.pyplot as plt

x = numpy.random.normal(2.0, 1.0, 1000)
y = numpy.random.normal(8.0, 3.0, 1000)

plt.scatter(x, y)
plt.show()

  • The x data comes from a normal distribution with mean 2.0 and standard deviation 1.0.
  • The y data comes from a normal distribution with mean 8.0 and standard deviation 3.0.

This means we expect the x values to be centered around 2.0 and the y values around 8.0. The y values are also spread more widely than the x values because their standard deviation is larger.

Output:

The x values are centered around 2.0, and the y values are around 8.0.
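
A quick way to confirm this without reading it off the plot is to compute the sample mean and standard deviation directly. The sketch below uses only the standard library, with random.gauss standing in for numpy.random.normal, so it illustrates the idea rather than reproducing the exact code above. The seed and the helper name summarize are assumptions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatable numbers

# Same parameters as the example: mean, standard deviation, 1000 samples each
x = [rng.gauss(2.0, 1.0) for _ in range(1000)]
y = [rng.gauss(8.0, 3.0) for _ in range(1000)]


def summarize(values):
    """Return the sample mean and sample standard deviation of a list."""
    return statistics.mean(values), statistics.stdev(values)


x_mean, x_std = summarize(x)
y_mean, y_std = summarize(y)
print(f"x: mean={x_mean:.2f}, std={x_std:.2f}")
print(f"y: mean={y_mean:.2f}, std={y_std:.2f}")
```

The printed means should land close to 2.0 and 8.0, and the standard deviation of y should be roughly three times that of x.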

Conclusion

Scatter plots are a useful tool for observing the relationship between two variables.

In Python, you can create a scatter plot with matplotlib:

import matplotlib.pyplot as plt

plt.scatter(x, y)
plt.show()

Here, x and y are equal-length lists of numbers that act as the data points.

Thanks for reading. I hope you enjoyed it.

Happy coding!
