Python String Comparison: A Step-by-Step Guide (with Examples)

Python string comparison is possible using the comparison operators: ==, !=, <, >, <=, >=.

For example:

"Alice" == "Bob" # False
"Alice" != "Bob" # True

"Alice" < "Bob" # True
"Alice" > "Bob" # False

"Alice" <= "Bob" # True
"Alice" >= "Bob" # False

String Comparison in Python

Python comes with a set of built-in comparison operators: ==, !=, <, >, <=, >=.

You commonly see comparisons made between numeric types in Python. But you can compare strings just as well. As it turns out, comparing strings translates to comparing numbers under the hood.

Before jumping into the details, let’s briefly see how to compare strings in Python.

Comparing Strings with == and !=

Comparing strings with equal to and not equal to operators is easy to understand. You can check if a string is or is not equal to another string.

For example:

name = "Jack"

print(name == "John")
print(name != "John")

Output:

False
True

Comparing Strings with <, >, <=, and >=

To compare strings alphabetically, you can use the operators <, >, <=, >=.

For instance, let’s compare the names “Alice” and “Bob”. This comparison corresponds to checking if Alice is before Bob in alphabetical order.

print("Alice" < "Bob")

Output:

True
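Sorting relies on the same < comparison, so the ordering shown above is also what puts a list of names into alphabetical order. Here is a minimal sketch; the list of names is made up for illustration:

```python
# Hypothetical list of names, used only for illustration.
names = ["Bob", "Charlie", "Alice"]

# sorted() orders strings with the same < comparison shown above.
sorted_names = sorted(names)
print(sorted_names)  # ['Alice', 'Bob', 'Charlie']

# min() and max() rely on the same ordering.
first, last = min(names), max(names)
print(first, last)  # Alice Charlie
```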

Now you have the tools for comparing strings in Python. Next, let’s understand how the string comparison works behind the scenes.

String Unicodes in Python

In reality, comparing Python strings means comparing integers under the hood.

To understand how it works, you first need to understand the concept of Unicode.

Python strings use the Unicode Standard to represent characters. This means each character has a unique integer code (its code point) assigned to it. It is this integer value that is compared when comparing strings in Python.

Here is the Unicode table for the codes 64 to 126, which cover the English letters. (In this range, the Unicode values are the same as the ASCII values.)

Unicode  Character  Unicode  Character  Unicode  Character  Unicode  Character
64       @          80       P          96       `          112      p
65       A          81       Q          97       a          113      q
66       B          82       R          98       b          114      r
67       C          83       S          99       c          115      s
68       D          84       T          100      d          116      t
69       E          85       U          101      e          117      u
70       F          86       V          102      f          118      v
71       G          87       W          103      g          119      w
72       H          88       X          104      h          120      x
73       I          89       Y          105      i          121      y
74       J          90       Z          106      j          122      z
75       K          91       [          107      k          123      {
76       L          92       \          108      l          124      |
77       M          93       ]          109      m          125      }
78       N          94       ^          110      n          126      ~
79       O          95       _          111      o

When a Python program compares strings, it compares the Unicode values of the characters.

By the way, to check the Unicode of a character, you do not have to look it up from this table. Instead, you can use the built-in ord() function.

For instance:

>>> ord('a')
97
>>> ord('b')
98
>>> ord('c')
99
>>> ord('d')
100

Now, let’s check the Unicode values for the capitalized versions of the above four characters:

>>> ord('A')
65
>>> ord('B')
66
>>> ord('C')
67
>>> ord('D')
68

As you can see, the Unicode values for capital letters differ from those of their lowercase counterparts. This highlights an important point: Python is case-sensitive with characters and strings.

For example, the result of this comparison:

'A' < 'a'

Yields True.

This is because:

  • ord(‘A’) returns 65.
  • ord(‘a’) returns 97.
  • So the comparison boils down to 65 < 97, which is True.
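Case sensitivity also affects whole strings, not just single characters. The sketch below lists the code points of a string and shows one possible way to ignore case by normalizing both sides with str.casefold(). The example strings are made up for illustration, and casefold() is only one option, not something the comparison operators do for you:

```python
# Code points of every character in a string.
code_points = [ord(ch) for ch in "Bob"]
print(code_points)  # [66, 111, 98]

# Every uppercase English letter has a smaller code point than every
# lowercase one, so "Zoe" is "less than" "alice".
case_sensitive = "Zoe" < "alice"
print(case_sensitive)  # True

# One possible way to ignore case: normalize both sides first.
case_insensitive = "Zoe".casefold() < "alice".casefold()
print(case_insensitive)  # False
print("ALICE".casefold() == "alice".casefold())  # True
```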

How Python String Comparison Works Under the Hood

When you compare strings in Python, they are compared character by character using the Unicode values of the characters.

When you compare two characters, the process is rather simple. But what happens when you compare strings, that is, sequences of characters?

Let’s demonstrate the process with examples.

Example 1—Which String Comes First in Alphabetic Order

Let’s compare the two names “Alice” and “Bob” to see if “Alice” is less than “Bob”:

>>> print("Alice" < "Bob")
True

This states that “Alice” is less than “Bob”. In real life, this means that Alice comes before Bob in alphabetical order, which totally makes sense.

But how does Python know it?

Python starts by comparing the first characters of the strings. In the case of “Alice” and “Bob” it starts by checking if ‘A’ is less than ‘B’ in Unicode:

>>> ord('A') < ord('B') # Corresponds to 65 < 66
True

As ord(‘A’) returns the Unicode value of 65 and ord(‘B’) 66, the comparison evaluates to True.

This means Python does not need to continue any further. Based on the first letters it is already able to determine that “Alice” is less than “Bob” because ‘A’ is less than ‘B’ in Unicode.

This is the simplest way to understand how Python compares strings.

Next, let’s look at a slightly trickier example where the compared strings start with the same letter.

Example 2—How to Compare Strings with Equal First Letters

What if the first letters are equal when comparing two strings? No problem, Python then compares the second letters.

For instance, let’s check if “Axel” comes before “Alex” in alphabetical order.

print("Axel" < "Alex")

The result:

False

This suggests that Alex comes before Axel, which is indeed the case.

Let’s see how Python was able to determine this:

  1. The first letters are compared. Both are ‘A’, so there is a “tie”. The comparison continues to the next characters.
  2. The second characters are ‘x’ and ‘l’. The Unicode value of ‘x’ is 120 and the Unicode value of ‘l’ is 108. And 120 < 108 returns False. Thus the whole string comparison returns False.
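The two steps above can be sketched in code. The helper first_difference is a hypothetical name introduced here for illustration; it is not a built-in, it only mimics how the comparison walks through the strings:

```python
def first_difference(left, right):
    """Return (index, left_char, right_char) for the first position where
    the strings differ, or None if they agree over the shorter length."""
    for index, (a, b) in enumerate(zip(left, right)):
        if a != b:
            return index, a, b
    return None

index, a, b = first_difference("Axel", "Alex")
print(index, a, b)      # 1 x l
print(ord(a), ord(b))   # 120 108
print(ord(a) < ord(b))  # False, the same result as "Axel" < "Alex"
```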

Example 3—How to Compare Strings with Identical Beginning

What if the strings are otherwise equal, but one of them has additional characters at the end?

For instance, can you determine if “Alex” comes before “Alexis” in alphabetical order?

Let’s check this using Python:

print("Alex" < "Alexis")

Result:

True

In this case, all the characters of the shorter string match the beginning of the longer one, so Python treats the longer string as the greater one. In other words, “Alex” is before “Alexis” in alphabetical order.
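The three examples can be combined into one small function. This is only a sketch of the rule for illustration (less_than is a hypothetical helper, not how the interpreter is actually implemented), but it gives the same answers as the < operator for these pairs:

```python
def less_than(left, right):
    """Sketch of how left < right is decided for two strings."""
    for a, b in zip(left, right):
        if a != b:
            # The first differing pair of characters decides the result.
            return ord(a) < ord(b)
    # No difference within the shared length: the shorter string is smaller.
    return len(left) < len(right)

pairs = [("Alice", "Bob"), ("Axel", "Alex"), ("Alex", "Alexis"), ("Alex", "Alex")]
for left, right in pairs:
    # Print the sketch's answer next to the built-in operator's answer.
    print(left, right, less_than(left, right), left < right)
```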

Now you understand how the string comparison works under the hood in Python.

Finally, let’s take a look at an interesting application of string comparison by comparing timestamps.

Compare Timestamps in Python with String Comparison

In this guide, you have learned that each character in Python has a Unicode value, which is an integer. Numeric strings are no exception.

For example, a string “1” has a Unicode value of 49 and “2” has a Unicode value of 50 and so on:

>>> ord("1")
49
>>> ord("2")
50

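The Unicode value of a digit character grows as the digit grows.

This means comparing single digits as strings gives you the correct result (longer numeric strings only compare correctly when they have the same length, because the comparison still runs character by character):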

>>> "5" < "8"
True
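Numeric strings with several digits need some care, because they are still compared character by character. The sketch below shows the pitfall and two possible ways around it, converting to int or zero-padding to a fixed width (the width of 3 is an arbitrary choice for this example):

```python
# Single digits compare as expected.
single = "5" < "8"
print(single)  # True

# "1" < "9" decides the result before the "0" is even looked at.
pitfall = "10" < "9"
print(pitfall)  # True, although 10 > 9 as numbers

# Converting to int, or zero-padding to a fixed width, restores numeric order.
as_int = int("10") < int("9")
padded = "10".zfill(3) < "9".zfill(3)  # compares "010" with "009"
print(as_int, padded)  # False False
```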

But why would you ever compare numbers as strings?

Comparing numeric strings is useful when working with ISO 8601 timestamps in the format 2021-12-14T09:30:16+00:00.

For example, let’s check if “2021-12-14T09:30:16+00:00” comes before “2022-01-01T00:00:00+00:00”:

>>> "2021-12-14T09:30:16+00:00" < "2022-01-01T00:00:00+00:00"
True

But wait a minute! Does the comparison operator < have any idea about dates and their precedence?

It does not. It only knows how to perform string comparison character by character.

As you learned in the previous examples, the comparison starts with the 1st character. If they are the same, the comparison continues from the 2nd character and so on.

When comparing ISO 8601 timestamps in Python, the procedure is the same as comparing any other strings. (Notice that this works because the time components are ordered from the most significant to the least significant and are zero-padded to a fixed width: the year comes before the month, the month comes before the day, and so on. Thus, if the years of two timestamps differ, you can draw the conclusion without looking at the rest of the timestamps. It also assumes both timestamps use the same format and the same UTC offset.)

Here:

  • Both timestamps start with “2”. The characters are equal, so the comparison moves on.
  • The second character is “0” for both. Again a tie.
  • The third character is “2” for both. Again a tie.
  • But the fourth character is different. On the left it is “1” but on the right it is “2”. And “1” < “2” returns True.
  • The comparison terminates right here. The left-hand side is less than the right-hand side, which means the date on the left happens before the date on the right.

This is exactly how you would compare the timestamps in real life. You would start with the year and notice that 2021 comes before 2022, so no matter what the rest of the timestamps say, the 2021 one must precede 2022.
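As a sanity check, the string ordering can be compared with real datetime objects. The sketch below uses made-up timestamps and datetime.fromisoformat() from the standard library (available in Python 3.7 and later). It also shows that the shortcut stops being reliable when the UTC offsets differ:

```python
from datetime import datetime

# Hypothetical timestamps, all in the same format and the same UTC offset.
timestamps = [
    "2022-01-01T00:00:00+00:00",
    "2021-12-14T09:30:16+00:00",
    "2021-12-14T09:05:00+00:00",
]

# Plain string sorting gives chronological order here.
by_string = sorted(timestamps)
by_datetime = sorted(timestamps, key=datetime.fromisoformat)
print(by_string == by_datetime)  # True

# With different offsets, the text order can disagree with the real order.
a = "2021-12-14T09:30:16+02:00"  # 07:30:16 in UTC
b = "2021-12-14T08:30:16+00:00"  # 08:30:16 in UTC
print(a < b)                                                  # False
print(datetime.fromisoformat(a) < datetime.fromisoformat(b))  # True
```

If the timestamps may carry different offsets, parsing them first is the safer choice.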

Conclusion

Comparing strings is an important feature in Python.

Python’s built-in comparison operators can be used in string comparison. These built-in operators are:

  • equal to (==)
  • not equal to (!=)
  • greater than (>)
  • less than (<)
  • less than or equal to (<=)
  • greater than or equal to (>=)

Under the hood, string comparison boils down to comparing numbers: the Unicode values of the characters are compared with one another. When two strings have equal first letters, the second letters are compared. If they are equal too, the third ones are compared, and so on. If one string runs out of characters first, it is the smaller one.

Thanks for reading. I hope you find it useful.

Happy coding!
