Equality for Python


A few days ago in #chipy, the chat room for the Chicago Python Users Group, we had a chat about how Python determines equality. It's a pretty neat and extensible technique, so I'm going to walk through how I recently used it for playing cards.

Here's the basic Card class. Note that I'm going to totally skip things like error-checking and documentation to keep the example obvious.

values = ('2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A')
suits = ('h', 'c', 'd', 's')

class Card:
    def __init__(self, value, suit):
        self.value, self.suit = value, suit

    def __repr__(self):
        return "<Card: %s%s>" % (self.value, self.suit)

Man, does code get short when you don't bother checking for errors. The usage is pretty clear, but there's one odd issue:

>>> Card('3', 's')
<Card: 3s>
>>> Card('3', 's') == Card('3', 's')
False

Huh? An instance of Card doesn't equal another Card just like itself? Well, let's look at the Python docs. They talk a bit about comparing the builtin types (numbers, strings, lists...) and then say: "Most other types compare unequal unless they are the same object".

Python does this by comparing the ids of the objects. You can call id() on your objects and see that even identically-constructed objects have different ids because they're in different locations in memory. This is a decent default because you wouldn't want Python walking deeply through all of your objects, a potentially expensive operation. Python does do a little bit more for equality, as implied by that "most" in the documentation.
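
To make the default concrete, here's a minimal sketch using a stripped-down Card with no __eq__ defined. The variable names (a, b, same_object and so on) are just for illustration:

```python
class Card:
    def __init__(self, value, suit):
        self.value, self.suit = value, suit

# Two identically-constructed cards, both alive at the same time.
a = Card('3', 's')
b = Card('3', 's')

same_object = a is b          # identity check: are these the very same object?
ids_differ = id(a) != id(b)   # distinct live objects get distinct ids
default_equal = a == b        # with no __eq__, == falls back to identity
self_equal = a == a           # an object is always identical to itself
```

Here same_object and default_equal come out False, while ids_differ and self_equal come out True.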

Python does one more thing for us. It looks for a method named __eq__ on the left-hand object and calls it to determine equality. So let's add it to Card:

    def __eq__(self, card):
        return self.value == card.value and self.suit == card.suit

Easy enough. And usage:

>>> Card('3', 's') == Card('3', 's')
True
>>> Card('3', 's') == Card('K', 'h')
False
>>> Card('3', 's') != Card('3', 's')
True

Now that last one is a bit surprising. Back at the docs, we learn "There are no implied relationships among the comparison operators. The truth of x==y does not imply that x!=y is false. Accordingly, when defining __eq__(), one should also define __ne__() so that the operators will behave as expected." (That's the Python 2 behavior shown here; in Python 3, != falls back to inverting __eq__ unless you define __ne__ yourself.) That's:

    def __ne__(self, card):
        return not self.__eq__(card)

After that, the Cards equate properly and everything's happy. One of the best things about Python is that it regularly gives you a sensible default and then lets you customize your code to work seamlessly with the language. This is what we Python fans mean when we go on and on about code being "Pythonic".
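
Putting the pieces together, here's a sketch of the whole class. The isinstance check and the __hash__ method are my own additions, not something the example above strictly needs: returning NotImplemented lets Python fall back to its default when a Card is compared to a non-Card, and __hash__ keeps equal cards usable in sets and as dict keys (Python 3 makes a class unhashable if it defines __eq__ without __hash__).

```python
class Card:
    def __init__(self, value, suit):
        self.value, self.suit = value, suit

    def __repr__(self):
        return "<Card: %s%s>" % (self.value, self.suit)

    def __eq__(self, other):
        # Assumption: only other Cards can be equal to a Card.
        if not isinstance(other, Card):
            return NotImplemented
        return (self.value, self.suit) == (other.value, other.suit)

    def __ne__(self, other):
        # Invert __eq__, but pass NotImplemented through untouched.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Equal cards must hash alike so sets and dicts treat them as one.
        return hash((self.value, self.suit))
```

With this version, Card('3', 's') == Card('3', 's') is True, Card('3', 's') != Card('3', 's') is False, and a set built from two identical cards contains a single element.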

In a later article, I'll address using the Flyweight design pattern to solve this example in a different way.

But consider performance

The advantage of comparing ids of objects is that it's faster. If you can live with it, you might want to use a form of the singleton pattern so that only one instance of each card exists; then comparing with the default equality will work and be much faster.

There are cases where you need more than one instance of a card, and then you need to make sure your comparisons work. For those cases it's good to know about these issues.
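
A rough sketch of the "only one card of each" idea, assuming cards are never modified after creation. The _cache dictionary and the use of __new__ are just one hypothetical way to do it, not necessarily what the post's author has in mind:

```python
class Card:
    # Hypothetical class-level cache: one shared instance per (value, suit).
    _cache = {}

    def __new__(cls, value, suit):
        key = (value, suit)
        card = cls._cache.get(key)
        if card is None:
            # First request for this card: create it and remember it.
            card = super().__new__(cls)
            card.value, card.suit = value, suit
            cls._cache[key] = card
        return card

    def __repr__(self):
        return "<Card: %s%s>" % (self.value, self.suit)

first = Card('3', 's')
second = Card('3', 's')
other = Card('K', 'h')
```

Since first and second are the very same object, the default identity-based == already reports them as equal, with no __eq__ needed.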

I'm aware comparing ids is

I'm aware comparing ids is faster, but it's only useful for examples with trivial objects, which is why I presented the entire technique. As I noted at the bottom of the post, I'm going to post about the Flyweight (not Singleton) pattern to solve exactly that issue.