Tuesday, January 22, 2019

A subtle error when initializing multiple variables on one line in Python

From a code-style perspective, I don't often like to put multiple variable initializations on a single line, but once in a while it makes sense to do so:

foo = goo = None

Today, I ran into a subtle bug that can result from initializing multiple variables on one line, when the variables are of mutable types such as lists.

I was writing code to populate two lists, make some changes to the contents of each list, and then compare the two lists. Something like this:

list1 = list2 = []
list1.append(1)
list2.append(2)
print('Are they equal?', list1 == list2)  # True !!!

The output, to my surprise, showed that the two lists are equal. Upon changing the code to initialize each list on its own line:

list1 = []
list2 = []
list1.append(1)
list2.append(2)
print('Are they equal?', list1 == list2)  # False

... the lists are now shown to be unequal, as I expected. Here's why...

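In Python, an assignment never copies an object; it only binds a name to it. When the right-hand side is an already-existing object, nothing new is created in memory: after the assignment, both names refer to that same object. This is true of every type, mutable or not, but it only causes surprises with mutable types such as lists, because a change made through one name is visible through the other. (Thanks to https://medium.com/broken-window/many-names-one-memory-address-122f78734cb6 for this explanation.)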

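In the assignment statement list1 = list2 = [], the expression [] is evaluated only once, creating a single list, and that one object is then bound to each target in turn, list1 first and then list2. (The list2 = [] part is not a nested expression that produces a value; in Python, assignment is a statement.) Both names therefore refer to the same list, which makes list1 an "alias" for list2 and leads to the unexpected result.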
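One way to confirm the aliasing is to compare object identity with the is operator. The following is a minimal sketch in Python 3 syntax; list3, list4 and the rebinding of foo are names and steps I've added purely for illustration.

```python
# Chained assignment: [] is evaluated once, so both names point at one list.
list1 = list2 = []
list1.append(1)
list2.append(2)
print(list1 is list2)  # True: same object
print(list1)           # [1, 2]: both appends went to the one list

# Separate assignments: each [] creates its own list object.
list3 = []
list4 = []
print(list3 is list4)  # False: two distinct objects

# Chaining with an immutable value such as None is harmless:
# rebinding foo to a new value does not touch goo.
foo = goo = None
foo = 5
print(goo)             # None
```

The last few lines show why foo = goo = None never causes trouble: None can't be modified in place, and rebinding one name leaves the other alone.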

If you'd rather never have to think about this, then it's reasonable to just avoid initializing multiple variables on the same line.
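If you still want a one-liner, tuple unpacking is one option, since each [] on the right-hand side is evaluated separately. A small sketch in Python 3 syntax, reusing the list names from above:

```python
# Each [] on the right is its own expression, so two lists are created.
list1, list2 = [], []
list1.append(1)
list2.append(2)
print(list1 is list2)  # False: distinct objects
print(list1 == list2)  # False: [1] vs. [2]
```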