Python How to Convert Bytes to String

To convert bytes into a string in Python, use the bytes.decode() method.

For instance:

name_byte = b'Alice'
 
name_str = name_byte.decode()

print(name_str)

Output:

Alice

This is the quick answer.

However, depending on the context and your needs, there are other ways to convert bytes to strings.

In this guide, you will learn how to convert bytes to a string in 5 different ways, each suited to a different situation.

Last but not least, you are going to learn what to do when decoding with UTF-8 produces an error.

Bytes vs Strings in Python

You may already know what bytes are, but there is also a chance you are looking to convert bytes to a string precisely because you are not sure what they are. Before jumping into the conversions, let’s take a quick look at what bytes are in the first place.

You can only store bytes on a computer.

A computer does not know what a string, image, or song is. A computer can only read bytes of data.

In Python, a byte string is a sequence of bytes. This is the only language computers understand. Bytes are not human-readable.

Everything needs to be converted to a byte string before storing it on a computer.

A string, in turn, is a sequence of characters. A string is something we humans can understand.

However, you cannot store a string on a computer as-is because a computer does not understand the notion of strings or words.

Thus, any string needs to be converted to a byte string before the computer can use it.

In Python, a bytes object is an immutable sequence of bytes, which is often the encoded form of a string. A bytes literal is prefixed with the letter ‘b’.

For example, take a look at these two variables:

name1 = 'Alice'
name2 = b'Alice'

In this piece of code:

  • name1 is a str object.
  • name2 is a bytes object.

You can verify this by printing out the data types of these variables:

name1 = 'Alice'
name2 = b'Alice'

print(type(name1))
print(type(name2)) 

Output:

<class 'str'>
<class 'bytes'>

But what about human readability?

Let’s print name1 character by character:

name1 = 'Alice'
name2 = b'Alice'

for c in name1:
    print(c)

Output:

A
l
i
c
e

Now, let’s print each byte in the name2 bytes object:

name1 = 'Alice'
name2 = b'Alice'

for c in name2:
    print(c)

Output:

65
108
105
99
101

As you can see, it is hard to tell at a glance what those numbers mean.

Those numbers are the byte values of the characters in the string (for these letters, their ASCII codes).

This is something the computer can understand.
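To see where those numbers come from, here is a small sketch (the variable names are illustrative only). It assumes UTF-8 as the encoding and uses the built-in ord() function, which returns the Unicode code point of a character. For plain ASCII letters the code point and the byte value happen to be the same, but that is not true for characters outside ASCII:

```python
name_str = 'Alice'
name_bytes = name_str.encode('utf-8')  # str -> bytes

# Iterating over a bytes object yields integers in the range 0..255
byte_values = list(name_bytes)
print(byte_values)

# For ASCII characters, the code point equals the UTF-8 byte value
code_points = [ord(c) for c in name_str]
print(code_points)

# A non-ASCII character can take more than one byte in UTF-8
accented = '\u00e9'  # the letter e with an acute accent
accented_bytes = accented.encode('utf-8')
print(len(accented), len(accented_bytes))
```

The last line prints two different lengths because one character was encoded into two bytes.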

To make one more thing clear, let’s see what happens if we print the bytes object name2 as-is:

name1 = 'Alice'
name2 = b'Alice'

print(name2)

Output:

b'Alice'

But wait a minute. You can clearly see it says “Alice”.

This is because what you see is actually a string representation of the bytes object.

Python does this for your convenience.

If there were no special string representation for a bytes object, printing bytes would be nonsense.

Anyway, now you understand what a bytes object is in Python, and how it differs from the str object.

Now, let’s see how to convert between bytes and string.

1. decode() Function

Given a bytes object, you can use the built-in decode() method to convert the bytes to a string.

You can also pass the encoding to this method as an argument.

For example, let’s use the UTF-8 encoding for converting bytes to a string:

byte_string = b"Do you want a slice of \xf0\x9f\x8d\x95?"

string = byte_string.decode('UTF-8')

print(string)

Output:

Do you want a slice of 🍕?

This is a clear and readable way to decode bytes into a string.
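As a brief sketch of the two arguments decode() accepts: in Python 3 the encoding defaults to UTF-8, and the optional errors argument controls what happens when a byte cannot be decoded. Decoding as ASCII below is only a deliberate way to trigger undecodable bytes, not a recommendation:

```python
byte_string = b"Do you want a slice of \xf0\x9f\x8d\x95?"

# With no arguments, decode() uses UTF-8 in Python 3
default_decoded = byte_string.decode()
explicit_decoded = byte_string.decode('utf-8')
print(default_decoded == explicit_decoded)

# errors='replace' swaps each undecodable byte for U+FFFD
# instead of raising UnicodeDecodeError
as_ascii = byte_string.decode('ascii', errors='replace')
print(as_ascii)
```

The second print shows the text with replacement characters where the four emoji bytes used to be, since none of them is valid ASCII.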

2. str() Function

Another approach to convert bytes to a string is to use the built-in str() function.

When you pass an encoding as the second argument, str() does the exact same thing as the decode() method in the previous example.

For instance:

byte_string = b"Do you want a slice of \xf0\x9f\x8d\x95?"

string = str(byte_string, 'UTF-8')

print(string)

Output:

Do you want a slice of 🍕?

Perhaps the only downside to this approach is in the code readability.

If you compare these two lines:

name_str = str(byte_string, 'UTF-8')
name_str = byte_string.decode('UTF-8')

You can see the latter is more explicit about decoding the bytes.
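One pitfall worth sketching here: if you leave out the encoding argument, str() does not decode anything. It returns the printable representation of the bytes object instead, including the b prefix and the quotes. The variable names below are illustrative:

```python
byte_string = b'Alice'

# No encoding given: you get the repr-style text, not decoded text
without_encoding = str(byte_string)

# Encoding given: the bytes are actually decoded
with_encoding = str(byte_string, 'utf-8')

print(without_encoding)  # b'Alice'
print(with_encoding)     # Alice
```

This is another reason the decode() spelling tends to be the safer choice, since it cannot silently fall back to the representation.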

3. Codecs decode() Function

Python also has a built-in codecs module for text decoding and encoding.

This module also has its own decode() function. You can use this function to convert bytes to strings, and the matching codecs.encode() function to go the other way.

For instance:

import codecs

byte_string = b"Do you want a slice of \xf0\x9f\x8d\x95?"
# codecs.decode() uses UTF-8 when no encoding is given
string = codecs.decode(byte_string)

print(string)

Output:

Do you want a slice of 🍕?
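The example above relies on the default encoding. As a small sketch, you can also name the encoding explicitly and use codecs.encode() to go back from the string to the original bytes:

```python
import codecs

byte_string = b"Do you want a slice of \xf0\x9f\x8d\x95?"

# Decode with an explicit encoding
text = codecs.decode(byte_string, 'utf-8')

# Encode the string back into bytes with the same encoding
round_trip = codecs.encode(text, 'utf-8')

print(round_trip == byte_string)
```

If the round trip gives back the original bytes, the encoding you picked is at least consistent with the data.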

4. Pandas decode() Function

If you are working with pandas and you have a DataFrame column that holds bytes objects, you can easily convert them to strings by calling the str.decode() method on that column.

For instance:

import pandas as pd

data_bytes = {'column' : [b'Alice', b'Bob', b'Charlie']}
df = pd.DataFrame(data=data_bytes)
 
data_strings = df['column'].str.decode("utf-8")

print(data_strings)

Output:

0      Alice
1        Bob
2    Charlie
Name: column, dtype: object

5. map() Function: Convert a Byte List to String

In Python, a string is a sequence of characters.

Each character is associated with a Unicode code point, which is an integer.

Thus, you can convert an integer to a character in Python.

To do this, you can call the built-in chr() function on an integer.

Given a list of integers, you can use the map() function to map each integer to a character.

Here is how it looks in code:

byte_data = [65, 108, 105, 99, 101]

strings = "".join(map(chr, byte_data))
print(strings)

Output:

Alice

This piece of code:

  1. Converts each integer to its corresponding character with chr().
  2. Yields the characters through the iterator that map() returns.
  3. Joins the characters into a single string.
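One caveat, sketched below under the assumption that the integers are UTF-8 encoded bytes: chr() treats every integer as its own code point, so it only matches the decoded text for ASCII values. For multi-byte characters, building a bytes object with bytes() and decoding it is an alternative that handles the whole sequence:

```python
byte_data = [65, 108, 105, 99, 101]

# For ASCII values, both approaches agree
via_chr = "".join(map(chr, byte_data))
via_bytes = bytes(byte_data).decode('utf-8')
print(via_chr == via_bytes)

# The four UTF-8 bytes of the pizza emoji
pizza = [0xF0, 0x9F, 0x8D, 0x95]

# chr() produces four unrelated characters, one per byte
pizza_via_chr = "".join(map(chr, pizza))
print(len(pizza_via_chr))

# Decoding the bytes together produces a single character
pizza_via_bytes = bytes(pizza).decode('utf-8')
print(len(pizza_via_bytes))
```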

Be Careful with the Encoding

There are dozens of character encodings for converting between bytes and strings.

In this guide, we only used the UTF-8 encoding, which is the most popular encoding type.

UTF-8 is also the default encoding for the decode() and encode() methods in Python 3.

However, UTF-8 encoding is not always the correct one.

For instance:

s = b"test \xe7\xf8\xe9"
s.decode('UTF-8')

Output:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 5: invalid continuation byte

This error means the bytes do not form a valid UTF-8 sequence: the byte 0xe7 announces a multi-byte character, but the bytes that follow it are not valid continuation bytes.

In other words, you should be using a different encoding.
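As a minimal sketch of how you might inspect such a failure, the UnicodeDecodeError exception carries the position and reason of the problem. The errors='replace' option shown at the end only hides the symptom and loses data, so treat it as a diagnostic aid rather than a fix:

```python
s = b"test \xe7\xf8\xe9"

error_position = None
try:
    s.decode('utf-8')
except UnicodeDecodeError as err:
    # start is the index of the first byte that could not be decoded
    error_position = err.start
    print(err.encoding, err.start, err.reason)

# Lossy decoding: undecodable bytes become U+FFFD replacement characters
lossy = s.decode('utf-8', errors='replace')
print(lossy)
```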

But how can you determine which encoding you should use then?

You can use a third-party module like chardet to detect the character encoding.

However, no approach is 100% foolproof. This module gives you its best guess about the encoding and the probability associated with it.

Anyway, let’s say the above byte string can be decoded using the latin1 encoding as well as the iso8859_5 encoding.

Now let’s make the conversion:

s = b"test \xe7\xf8\xe9"
print(s.decode('latin1'))
print(s.decode('iso8859_5'))

Output:

test çøé
test чјщ

This time there is no error.

Instead, both encodings work, but each produces a different result.

So be careful with the encodings!

If you see an error like the one above, the first thing you need to do is figure out which encoding was used to produce the bytes.

Then you should use that particular encoding to decode your values.
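When the encoding is genuinely unknown, one possible approach is to try a short list of candidates in order. The helper below, decode_with_fallback, is a hypothetical name introduced for this sketch, and the candidate list is an assumption you would adapt to your data. Note that latin1 accepts every possible byte, so it never fails, which also means a successful result is not proof that the text is correct:

```python
def decode_with_fallback(data, encodings=('utf-8', 'latin1')):
    """Try each encoding in order; return (text, encoding_used)."""
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            # This candidate does not fit the bytes, try the next one
            continue
    raise ValueError('none of the candidate encodings could decode the data')


text, used = decode_with_fallback(b"test \xe7\xf8\xe9")
print(text, used)

ascii_text, ascii_used = decode_with_fallback(b'Alice')
print(ascii_text, ascii_used)
```

The first call falls through to latin1 because the bytes are not valid UTF-8, while the second succeeds with UTF-8 right away.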

Conclusion

Today you learned how to convert bytes to string in Python.

To recap, there are several ways to convert bytes to strings in Python.

  • To convert a byte string to a string, use the bytes.decode() method.
  • If you have a list of bytes, call the chr() function on each byte using the map() function (or a for loop).
  • If you have a pandas dataframe with bytes, call the .str.decode() method on the column with bytes.

By default, Python 3 uses UTF-8 when you call decode() without an encoding.

However, this is not always applicable. Trying to decode bytes that are not valid UTF-8 with the UTF-8 codec produces an error. In this situation, you should determine the right character encoding before decoding. You can use a module like chardet to do this.

Further Reading

Python Interview Questions
