Have you tried to work with Pandas, but got errors like:

TypeError: unhashable type: 'list'

or

TypeError: unhashable type: 'dict'

The problem is that a list or dict can't be used as a dict key, since dict keys need to be hashable - and lists and dicts are mutable, unhashable types.
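To see the same error outside of Pandas, here is a minimal plain-Python illustration of why lists and dicts can't be used as keys:

try:
    {[0.5, 0.1]: 'a'}          # a list used as a dict key
except TypeError as e:
    print(e)                   # unhashable type: 'list'

try:
    {{0: 'a'}: 1}              # a dict used as a dict key
except TypeError as e:
    print(e)                   # unhashable type: 'dict'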


If so, I'll show you the steps - how to investigate the errors and the possible solutions depending on the cause. In this article we will work with a simple Pandas DataFrame like:

   col1          col2              col3
0     1    [0.5, 0.1]  {0: 'a', 1: 'b'}
1     2  [0.75, 0.25]  {0: 'c', 1: 'd'}

created by:

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [[0.5, 0.1], [0.75, 0.25]], 'col3': [{0: 'a', 1: 'b'}, {0: 'c', 1: 'd'}]})

Step #1: TypeError: unhashable type: 'list'/'dict'

This error is common for operations like:

  • value_counts
  • groupby
  • transform

when these operations are applied to a column whose values are of type dict or list.

Examples for the above DataFrame:

df.col2.value_counts()

will result in:

TypeError: unhashable type: 'list'
df.col3.value_counts()

will result in:

TypeError: unhashable type: 'dict'  
df.groupby('col3').transform({'col1': [min], 'col2': max})

will result in:

TypeError: unhashable type: 'dict'  

For the last examples there are also related issues reported in Pandas.

Step #2: How to detect if column contains list or dict

The first step when the error appears is to identify the affected columns and what is stored inside them. The basic df.dtypes won't tell us whether an object column holds lists, dicts or strings:

col1    int64 
col2    object
col3    object

So in order to identify the actual types stored in those columns we need something else. For example, to check which columns of the DataFrame have list values inside, we can do:

# detect list columns
df.applymap(lambda x: isinstance(x, list)).all()

which results in:

col1    False
col2    True 
col3    False

And for dicts we can do:

# detect dict columns
df.applymap(lambda x: isinstance(x, dict)).all()

which results in:

col1    False
col2    False
col3    True 

What if we need to test whether a column contains either lists or dicts? In this case we can combine both checks:

# detect dict or list columns
df.applymap(lambda x: isinstance(x, (dict, list))).all()

The result will be:

col1    False
col2    True 
col3    True 
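If you need this check often, a small helper along these lines can wrap the detection (the name unhashable_columns is just for illustration; it assumes the df created above):

def unhashable_columns(df):
    """Return the names of columns whose values are lists or dicts."""
    mask = df.applymap(lambda x: isinstance(x, (list, dict))).any()
    return mask[mask].index.tolist()

print(unhashable_columns(df))   # ['col2', 'col3']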

Once the problematic columns/data are identified we can continue by applying one of the next solutions.

Step #3: Convert the column to string and apply value_counts

The first and most basic solution is to convert the column to string and then apply the operation - for example value_counts or groupby:

df['col2'].astype('str').value_counts()
[0.5, 0.1]      1
[0.75, 0.25]    1
df['col3'].astype('str').value_counts()
{0: 'a', 1: 'b'}    1
{0: 'c', 1: 'd'}    1

This will give counts for the column values as single entities. If you want to get the counts for the elements inside the lists then you can check this video: Pandas count values in a column of type list.
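If element-level counts are all you need, here is a minimal sketch using explode (available in Pandas 0.25+), assuming the DataFrame from above:

# count the individual list elements instead of the whole lists
df['col2'].explode().value_counts()
# 0.5, 0.1, 0.75 and 0.25 each appear once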

Note: It's really important to be able to distinguish between:

  • a column which contains lists stored as strings
  • a column which contains lists stored as actual lists

and to know which operations can be applied to each - a quick check is shown below. In the bonus step we will see this difference when we try to expand the column!
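One quick way to tell the two cases apart is to inspect the type of a single cell:

# how are the values actually stored?
type(df['col2'].iloc[0])                  # <class 'list'>
type(df['col2'].astype('str').iloc[0])    # <class 'str'>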

The same error and solution apply to groupby:

# TypeError: unhashable type: 'dict'
df[df.col3.notna()].groupby(['col3']).count()

while this will work:

df[df.col2.notna()].astype('str').groupby(['col2']).count()

output:

              col1  col3
col2
[0.5, 0.1]       1     1
[0.75, 0.25]     1     1
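A variation which leaves the other columns untouched is to convert only the grouping column to string - a sketch, assuming the df from above:

# convert only the grouping column; col1 and col3 keep their original types
df.groupby(df['col2'].astype('str')).count()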

Step #4: Convert list/dict column to tuple

Another possible solution is to first convert the list/dict column to tuple and then apply the operations on it. For this solution it's important to note that the results differ for lists and dicts, as shown below:

# for list
df['col2'].apply(tuple).value_counts()

result:

(0.5, 0.1)      1
(0.75, 0.25)    1

while for a dict only the keys will be part of the final result:

# for dict
df['col3'].apply(tuple).value_counts()

result:

(0, 1)    2
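If you want the dict values to be part of the result as well, a possible tweak is to convert each dict to a tuple of its items instead of only its keys - a sketch:

# keep both keys and values by converting each dict to a tuple of its items
df['col3'].apply(lambda d: tuple(sorted(d.items()))).value_counts()
# ((0, 'a'), (1, 'b'))    1
# ((0, 'c'), (1, 'd'))    1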

Step #5: Expand the list column

Another possible solution is to expand the list column. The column should contain lists stored as actual lists and not as strings - otherwise the output will be unexpected:

df.col2.apply(pd.Series)[0].value_counts()

result:

0.75    1
0.50    1
df.col2.apply(pd.Series)[1].value_counts()

result:

0.10    1
0.25    1
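To keep the expanded values next to the original data you can join them back with prefixed column names - a minimal sketch, assuming the df from above:

# expand col2 into separate columns and attach them to the original DataFrame
expanded = df['col2'].apply(pd.Series).add_prefix('col2_')
df.join(expanded)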

Step #6: List column mixed: strings and list items

Sometimes a column will have mixed values - for example numbers, strings and lists. An alternative solution for this case is apply/applymap - check each value and convert it, for example by extracting the item stored as a list. This time we will work with this DataFrame:

df = pd.DataFrame({'col1': [1, 2], 'col2': [[0.5], 3], 'col3': [{0: 'a', 1: 'b'}, {0: 'c', 1: 'd'}]})

and we will get the first element of each list, or keep the values as they are:

df.applymap(lambda x: x[0] if isinstance(x, list) else x)['col2'].value_counts()

this will result in:

3.0    1
0.5    1
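If you prefer to keep all list elements instead of only the first one, converting the lists to tuples makes them hashable while leaving the scalar values as they are - a sketch with the same mixed DataFrame:

# make the list values hashable instead of dropping elements
df['col2'].apply(lambda x: tuple(x) if isinstance(x, list) else x).value_counts()
# (0.5,) and 3 each appear once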


Bonus Step #1: Correct way to expand list column

One common mistake for Pandas newbies is applying an operation to the wrong data type. Let's check an example of using str.split on a DataFrame column containing lists. Some will expect the column to be expanded into several columns based on the split:

df.col2.str.split(',', expand=True)

but actually this will produce:

0   NaN
1   NaN

The problem is that we are trying to apply a string operation to list values. We might think that simply converting the list column to string will solve the problem:

df.col2.astype('str').str.split(',', expand=True)

but this will leave brackets in the first and the last cell (which is not the best option):

  • [0.5
  • 0.1]

So the correct way to expand list or dict columns while preserving the correct values and format is to use apply(pd.Series):

df.col2.apply(pd.Series)

This operation is the optimal way to expand a list/dict column when the values are stored as actual lists/dicts. If the values are stored as strings, then str.split(',', expand=True) might be used instead.
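If the lists ended up stored as strings (for example after reading the data back from a CSV file), one possible approach is to parse them back into real lists first and then expand - a sketch using ast.literal_eval, assuming the string values are valid Python literals:

import ast

# parse the string representation back into a real list, then expand it
df['col2'].astype('str').apply(ast.literal_eval).apply(pd.Series)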