Python list vs set examples and performance

Python has two similar-looking data structures: the list and the set.

Which one you choose can make a big difference to the code's logic and to its performance. This post covers when to use a list and when to use a set, with several examples and a performance test.

Some key differences between lists and sets in Python, with examples:

Lists allow duplicate values. Sets can't contain duplicates

The major difference for me is that a list can contain duplicates, while a set holds only unique values. The example below shows this:

mylist = [1,2,3,4,5, 1,2,3,4,5]
myset = {1,2,3,4,5}
list_to_set = set(mylist)
print(type(mylist))
print(type(myset))
print('{} - {}'.format(type(list_to_set), list_to_set))

result:

<class 'list'>
<class 'set'>
<class 'set'> - {1, 2, 3, 4, 5}
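As a small follow-up sketch, the same property can be used to check whether a list contains duplicates, or to drop them. The names `values`, `unique_values`, `has_duplicates` and `deduped_in_order` are illustrative and not part of the example above. Keep in mind that `set()` does not preserve the original order; `dict.fromkeys` is one common way to drop duplicates while keeping first-seen order, assuming Python 3.7 or later, where dicts keep insertion order.

```python
values = [3, 1, 3, 2, 1]

# Converting to a set drops the repeated items.
unique_values = set(values)

# A list has duplicates exactly when its set is smaller than the list.
has_duplicates = len(unique_values) != len(values)

# dict.fromkeys keeps the first occurrence of each item, in order.
deduped_in_order = list(dict.fromkeys(values))

print(unique_values)
print(has_duplicates)      # True
print(deduped_in_order)    # [3, 1, 2]
```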

Sets are unordered

Another key difference is that a list keeps its elements in order, while a set does not. In other words, if you try to get the first element of a set you get an error such as TypeError: 'set' object does not support indexing (newer Python 3 releases word it as 'set' object is not subscriptable), as in the example below:

mylist = [1,2,3,4,5, 1,2,3,4,5]
myset = {1,2,3,4,5}
print(mylist[0])
print(myset[0])

result:

1
TypeError: 'set' object does not support indexing
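If you do need an element from a set, one possible approach is to convert it to an ordered sequence first, or to take an arbitrary element through an iterator. This is only a sketch; the names `indexing_failed`, `as_sorted_list` and `any_element` are made up for illustration.

```python
myset = {1, 2, 3, 4, 5}

# Indexing a set raises TypeError; the exact message depends on the Python version.
try:
    myset[0]
    indexing_failed = False
except TypeError as err:
    indexing_failed = True
    print('TypeError:', err)

# If you need positions, build an ordered sequence first.
as_sorted_list = sorted(myset)
print(as_sorted_list[0])      # 1, the smallest element

# next(iter(...)) returns some element; which one is not guaranteed.
any_element = next(iter(myset))
print(any_element in myset)   # True
```

Sorting costs extra time on every call, so if you index often it is probably simpler to keep the data in a list from the start.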

Sets are more efficient than lists

Sets use a hash lookup for searching, which makes membership tests considerably faster than in a list, where the elements have to be checked one by one. The next example demonstrates how much faster sets are. Searching a list and a set of 99,999 elements 99,999 times each gives the following times:

  • list - 49.663 seconds
  • set - 0.007 seconds

import cProfile

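def before():
    # Membership test against the list: each lookup scans the elements.
    for i in range(1, 100000):
        i in mylist


def after():
    # Same membership test against the set: each lookup is a hash lookup.
    for i in range(1, 100000):
        i in myset


# Build a list of 99,999 integers and a set with the same items.
mylist = list(range(1, 100000))
myset = set(mylist)

# Profile both versions; the total time is printed in the report header.
cProfile.run('before()')
cProfile.run('after()')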

result:

4 function calls in 49.663 seconds
4 function calls in 0.007 seconds

As you can see, searching in a list is much slower than searching in a set. So if you want to improve the performance of your Python applications, consider using sets for membership tests where possible.
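For a quicker check on your own machine, the standard `timeit` module can run the same kind of comparison. This is a minimal sketch rather than a rigorous benchmark: the size `n`, the repeat count of 200 and the choice of `target` are arbitrary assumptions, and the absolute numbers will vary between machines.

```python
import timeit

n = 10_000
mylist = list(range(n))
myset = set(mylist)
target = n - 1  # last element: the worst case for a list scan

# Run the same membership test 200 times against each container.
list_time = timeit.timeit(lambda: target in mylist, number=200)
set_time = timeit.timeit(lambda: target in myset, number=200)

print('list: {:.6f} seconds'.format(list_time))
print('set:  {:.6f} seconds'.format(set_time))
```

The gap should widen as `n` grows, because the list has to be scanned element by element while the set lookup takes roughly the same time regardless of size.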

Lists can store anything, while sets store only hashable items

This code example demonstrates this problem:

mylist = [[5], ['b']]       # a list of lists is fine
myset = {((5,), ('b',))}    # a tuple of tuples is hashable
myset = {([5], ['b'])}      # a tuple containing lists is not

The third line raises an error:

TypeError: unhashable type: 'list'

This is because a set works only with hashable items. In other words, you can add tuples to a set (as long as their elements are hashable too) but not lists. So if you need a collection of lists, use a list.
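When you still want set behaviour for nested data, one possible workaround is to convert the inner lists to tuples, or to use `frozenset` for sets inside sets. The data in `rows` and `groups` below is invented for illustration.

```python
rows = [[5, 'a'], [5, 'a'], [6, 'b']]

# Lists are unhashable, so a set cannot be built from them directly.
try:
    unique_rows = set(rows)
except TypeError as err:
    print('TypeError:', err)   # unhashable type: 'list'

# Converting each inner list to a tuple makes the items hashable.
unique_rows = {tuple(row) for row in rows}
print(unique_rows)

# frozenset is the hashable counterpart of set, so sets of sets are possible.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))   # 2
```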
