Shallow Copy vs. Deep Copy in Python

In Python, a shallow copy is a “one-level-deep” copy. The copied object contains references to the child objects of the original object.

A deep copy is completely independent of the original object. It constructs a new collection object by recursively populating it with copies of the child objects.

Figure: Diagram explaining the differences between shallow and deep copy. Image by the author.

Copying or Not

In Python, you might use the = operator to try to copy an object. It is tempting to think this creates a new object, but it doesn’t. Instead, it creates a new variable that refers to the original object. This means a change made through the new variable is visible through the original variable as well.

Let’s use lists to demonstrate this:

numbers = [1, 2, 3]
new_numbers = numbers

new_numbers[0] = 100

print('numbers:', numbers)
print('new_numbers:', new_numbers)

Output:

numbers: [100, 2, 3]
new_numbers: [100, 2, 3]

As numbers and new_numbers now point to the same list, updating one always updates the other.

To verify that both names really refer to the same object, you can also use the built-in id() function. Every Python object has an ID that is unique for as long as the object exists. If you check the IDs of numbers and new_numbers, you can see they are exactly the same:

print(id(numbers))
print(id(new_numbers))

Output:

139804802851400
139804802851400

Here is a helpful illustration:

Figure: A new list that shares the reference of an original list. Assigning an object to a new variable creates an alias of the original object. Image by the author.
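The same check can be written with the is operator, which compares object identity directly. Here is a minimal sketch that reuses the names from the example above; the third list, other_numbers, is a hypothetical addition for contrast:

```python
numbers = [1, 2, 3]
new_numbers = numbers          # alias: a second name for the same list
other_numbers = [1, 2, 3]      # hypothetical: a separate list with equal contents

# "is" compares identity, "==" compares contents
print(new_numbers is numbers)      # True
print(other_numbers is numbers)    # False
print(other_numbers == numbers)    # True
```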

How To Copy in Python

Let’s go over a common example: You want to take a copy of a list in such a way that the original list remains unchanged when updating the copied list. There are two ways to copy in Python:

  1. Shallow copy
  2. Deep copy

Both of these methods are implemented in the copy module. To use these, you need to import the copy module into your project.

  • To take a shallow copy, call copy.copy(object).
  • To take a deep copy, call copy.deepcopy(object).
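As a quick sketch of the two calls side by side (the variable names are illustrative only):

```python
import copy

original = [[1, 2], [3, 4]]

shallow = copy.copy(original)      # new outer list, shared inner lists
deep = copy.deepcopy(original)     # new outer list and new inner lists

print(shallow is original)         # False
print(shallow[0] is original[0])   # True
print(deep[0] is original[0])      # False
```

For lists specifically, original.copy(), original[:], and list(original) also produce shallow copies.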

Let’s dig deeper into the details.

Shallow Copy vs. Deep Copy

Shallow copy

A shallow copy is shallow because it copies only the object itself, not its child objects. Instead, the copy holds references to the original object’s child objects.

This can be demonstrated by the following example:

import copy

# Lists inside a list
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Let's take a shallow copy of groups
new_groups = copy.copy(groups)

Here, you have groups, which is a list of lists of numbers. And you have a shallow copy called new_groups. Let’s examine these:

1. The ID of groups is not the same as the ID of new_groups, which is reasonable, as new_groups is a copy of groups.

print(id(groups), id(new_groups))

Output:

140315092110600 140315092266184

It seems as if new_groups was a completely independent copy of groups. But let’s take it a step further to see that it’s not.

2. Here is where the shallowness becomes evident. The IDs of the lists in the new_groups are equal to the IDs of the lists of the original groups:

print(id(groups[0]), id(new_groups[0]))

Output:

140315092110664 140315092110664

The shallowness means only the “outer” list gets copied. But the inner lists still refer to the lists of the original list. Due to this, changing a number in the copied list affects the original list:

new_groups[0][0] = 100000
print(new_groups[0])
print(groups[0])

Output:

[100000, 2, 3]
[100000, 2, 3]

3. The “outer” list of new_groups is a “real” copy of the original groups. Thus you can add new elements to it or even replace existing ones. These changes won’t affect the original groups list.

For example, let’s replace the first list in new_groups with a string. This should not affect groups.

new_groups[0] = "Something else"
print(new_groups)
print(groups)

Output:

['Something else', [4, 5, 6], [7, 8, 9]]
[[100000, 2, 3], [4, 5, 6], [7, 8, 9]]
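One way to see exactly which children a copy still shares with its source is to compare them with the is operator. The helper below, shared_indexes, is a hypothetical name introduced only for this sketch:

```python
import copy


def shared_indexes(original, duplicate):
    """Return the positions where both lists hold the very same object."""
    return [
        i for i, (a, b) in enumerate(zip(original, duplicate))
        if a is b
    ]


groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
new_groups = copy.copy(groups)

print(shared_indexes(groups, new_groups))   # [0, 1, 2]

# Replacing an element in the copy breaks the sharing at that position only
new_groups[0] = "Something else"
print(shared_indexes(groups, new_groups))   # [1, 2]
```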

Deep copy

A deep copy creates a completely independent copy of the original object.

This is pretty straightforward. But for the sake of completeness, let’s repeat the experiments above with a deep copy too:

import copy

# Lists inside a list
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Let's take a deep copy of groups
new_groups = copy.deepcopy(groups)

Here, you have groups, which is a list that contains lists of numbers. And you take a deep copy, new_groups. Let’s examine these:

1. The IDs of groups and new_groups do not match:

print(id(groups), id(new_groups))

Output:

140416138566984 140416138739016

2. The IDs of the lists in new_groups are not equal to the IDs of the lists in groups:

print(id(groups[0]), id(new_groups[0]))

Output:

140416138567048 140416138566728

Changing a number in new_groups doesn’t change that value in the original groups.

new_groups[0][0] = 100000
print(new_groups[0])
print(groups[0])

Output:

[100000, 2, 3]
[1, 2, 3]

3. new_groups is an independent copy of groups. Thus, changes made to new_groups are never visible in the original groups.

new_groups[0] = "Something else"
print(new_groups)
print(groups)

Output:

['Something else', [4, 5, 6], [7, 8, 9]]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
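A deep copy also preserves the internal shape of the original. copy.deepcopy keeps track of the objects it has already copied, so a child that appears twice in the original is copied once and appears twice in the result, and a structure that contains itself does not cause infinite recursion. A small sketch, with made-up variable names:

```python
import copy

inner = [1, 2, 3]
pair = [inner, inner]                  # the same list appears twice

pair_copy = copy.deepcopy(pair)
print(pair_copy[0] is inner)           # False
print(pair_copy[0] is pair_copy[1])    # True

loop = [1, 2]
loop.append(loop)                      # a list that contains itself
loop_copy = copy.deepcopy(loop)
print(loop_copy[2] is loop_copy)       # True
print(loop_copy is loop)               # False
```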

Conclusion

Today you learned the difference between a shallow copy and a deep copy in Python.

A shallow copy is a “one-level-deep” copy. It constructs a copied object. But the child objects refer to the children of the original object. Thus, it may seem a bit “strange” at first.

A deep copy is the “real copy.” It is an independent copy of the original object.

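For nested, mutable data, the deep copy is usually what you want. For a flat collection of immutable values such as numbers or strings, a shallow copy behaves the same way and does less work.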
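Whether the extra work of a deep copy is needed depends on the data. As a rough sketch (the names are illustrative), a flat list of immutable values is already safe with a shallow copy, while a nested list is not:

```python
import copy

prices = [10, 20, 30]                  # flat list of immutable ints
prices_copy = copy.copy(prices)
prices_copy[0] = 999
print(prices)                          # [10, 20, 30]

rows = [[10, 20], [30, 40]]            # nested, mutable children
rows_shallow = copy.copy(rows)
rows_deep = copy.deepcopy(rows)
rows_shallow[0][0] = 999               # also visible through rows
rows_deep[1][1] = -1                   # stays local to rows_deep
print(rows)                            # [[999, 20], [30, 40]]
```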

Thanks for reading. I hope you learned something new today.

Happy coding!

