PyCharm DataFrame URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 0 in: " E"

Have you tried to view a Pandas DataFrame but got a URLDecoder error like:

URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 0 in: " E"

The error message may differ slightly between DataFrames, and it does not depend on the size of the DataFrame or on the values stored in it.

In this short post you can find:

  • Step 1. Detect the problem
  • Step 2. Keep original data (optional)
  • Step 3. Solve the problem - df2.columns.str.replace('%', '')

The reason for the PyCharm DataFrame URLDecoder error

The error message names the problem but not the place that causes it. Below is a simple Python example that reproduces the error:

import pandas as pd

# data for our DataFrames
data = [['Python', 50], ['Java', 30], ['Javascript', 5]]

# Create two pandas DataFrames with the same data
df = pd.DataFrame(data, columns=['Language', '% Percent'])
df2 = pd.DataFrame(data, columns=['Language', '% Percent'])

# Encode the % sign in the column names of df2 as %25
df2.columns = df2.columns.str.replace('%', '%25')

Now, if you try to view the first DataFrame (df) in PyCharm, you will get an error like the one described above, while the second DataFrame (df2) can be viewed without any problems:

pycharm_dataframe_error_urldecoder_illegal_hex

The solution for the PyCharm DataFrame error

The simplest solution is to remove the problematic escape characters, such as the percent sign (%), from the column names:

df2.columns = df2.columns.str.replace('%', '')

There is another solution if you need to keep the meaning of the column names. Let's work with the percent sign (%). Suppose you have found that the percent sign breaks the DataFrame view, but you still need to represent it in your column names. In that case you can look up the percent-encoding (URL encoding) of the character and replace the character with it. For the percent sign this is %25. I used this table for reference: Percent-encoding

df2.columns = df2.columns.str.replace('%', '%25')
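If the original labels must stay untouched (the optional "keep original data" step), one option is to build a separate copy just for viewing. The snippet below is a minimal sketch of that idea, not the only way to do it. The helper name encode_percent, the variable df_view and the sample index values are made up for illustration. It encodes the index labels as well, because the error can come from either axis.

```python
import pandas as pd


def encode_percent(label):
    """Return the label as a string with every '%' written as '%25'."""
    return str(label).replace('%', '%25')


# Sample data; the index values are hypothetical and only show the index case
df = pd.DataFrame(
    [['Python', 50], ['Java', 30]],
    columns=['Language', '% Percent'],
    index=['50%', '30%'],
)

# rename() returns a new DataFrame, so df keeps its original labels
df_view = df.rename(columns=encode_percent, index=encode_percent)

print(list(df.columns))       # original names, unchanged
print(list(df_view.columns))  # encoded names for the viewer
```

Open df_view in the PyCharm viewer and keep working with df everywhere else.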

Detection of the error for big DataFrames

This error is raised for bad escape characters in both index and column names. If the problematic escape sequence is only in the values, no error is raised when you view the DataFrame. The simplest way to detect the problem is to strip all letters and digits from the labels and look at what remains:

# regex=True is required in pandas 2.0+, where str.replace is literal by default
columns = df2.columns.str.replace(r'[A-Za-z0-9]+', '', regex=True)
index = df2.index.astype(str).str.replace(r'[A-Za-z0-9]+', '', regex=True)
print(columns)

This shows everything in the column names except letters and numbers:

Index(['', '% '], dtype='object')

From this output you can see that the percent sign is causing the problem.
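For a big DataFrame it can be easier to list the offending labels directly. The sketch below assumes, based on the error message, that the viewer fails on a % that is not followed by two hexadecimal digits. That assumption has not been confirmed against PyCharm itself. The names BAD_ESCAPE and find_bad_labels are hypothetical.

```python
import re

import pandas as pd

# A '%' that is NOT followed by two hex digits is not a valid percent escape
BAD_ESCAPE = re.compile(r'%(?![0-9A-Fa-f]{2})')


def find_bad_labels(frame):
    """Return the column and index labels that contain an invalid '%' escape."""
    labels = list(frame.columns) + list(frame.index)
    return [label for label in labels if BAD_ESCAPE.search(str(label))]


data = [['Python', 50], ['Java', 30], ['Javascript', 5]]
df = pd.DataFrame(data, columns=['Language', '% Percent'])
df2 = pd.DataFrame(data, columns=['Language', '%25 Percent'])

print(find_bad_labels(df))   # labels to fix in the first DataFrame
print(find_bad_labels(df2))  # empty list when every '%' is a valid escape
```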

Note: Another possible display problem with PyCharm and DataFrames is related to quotes. If a DataFrame contains quoted values read from a CSV file, PyCharm may show "nothing to show", as below. Once the quotes are removed from the values of the CSV file, the display works fine:

pycharm_dataframe_nothing_to_show
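If editing the CSV file is not practical, the quotes can also be removed after loading. This is only a sketch and rests on an assumption: that the problem comes from literal double quotes at the start and end of string values. The function name strip_quotes and the sample data are made up for illustration.

```python
import pandas as pd


def strip_quotes(frame):
    """Return a copy with leading and trailing double quotes removed from string values."""
    def clean(value):
        # Leave numbers, None and other non-string values untouched
        return value.strip('"') if isinstance(value, str) else value

    return frame.apply(lambda column: column.map(clean))


# Hypothetical data that kept its quotes after being read from a CSV file
df_quoted = pd.DataFrame({'Language': ['"Python"', '"Java"'], 'Percent': [50, 30]})
df_clean = strip_quotes(df_quoted)

print(df_clean)
```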
