Pandas Display All Columns, Rows and Values

This guide shows how to display all (or more) columns and rows of a Pandas DataFrame. By default, Pandas truncates the display of rows, columns and column width. This behavior might seem odd, but it prevents problems in Jupyter Notebook when displaying huge datasets.

Pandas Display All Rows, Values and Columns

This code forces Pandas to display all rows and columns:

import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

Intro

Let's demonstrate the problem. We are working with the well-known IMDB dataset: IMDB 5000 Movie Dataset

This is the size of this DataFrame:

df.shape

(5043, 28)  # 5043 rows × 28 columns

Displaying this DataFrame in Jupyter Notebook with df or df.head() shows at most:

  • 20 columns
  • 60 rows (5 rows for df.head())
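The limits above come from Pandas display options, which can be read back with pd.get_option. The snippet below is a minimal sketch that collects three of them into a dictionary (the name defaults is only illustrative). In a fresh session the values are typically 60, 10 and 50. display.max_columns is left out on purpose: its default is 20 in a notebook but 0 (auto-detect) in a terminal.

```python
import pandas as pd

# Read the current values of the options that control truncation.
# In a fresh session these are the pandas defaults.
defaults = {
    name: pd.get_option(name)
    for name in ("display.max_rows", "display.min_rows", "display.max_colwidth")
}
print(defaults)
```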

Another problem is the truncation of longer values, for example in the genres column:

  • Action|Adventure|Romance - fully displayed
  • Adventure|Animation|Comedy|Family|Fantasy|Musi... - truncated

The default limit is 50 characters per value. Pandas uses an ellipsis to mark truncated columns, rows or values.
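The truncation of values can be reproduced without the IMDB file. The sketch below assumes a tiny made-up DataFrame (df_demo) that holds the two genres strings from the list above, and compares its text representation with the 50 character limit and with no limit.

```python
import pandas as pd

# One short and one long value, taken from the genres examples above.
df_demo = pd.DataFrame({
    "genres": [
        "Action|Adventure|Romance",
        "Adventure|Animation|Comedy|Family|Fantasy|Musical|Romance",
    ]
})

# Limit of 50 characters: the long value is cut and ends with "...".
with pd.option_context("display.max_colwidth", 50):
    truncated = repr(df_demo)

# No limit: the long value is shown in full.
with pd.option_context("display.max_colwidth", None):
    full = repr(df_demo)

print(truncated)
print(full)
```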

Step #1: Display all columns and rows with Pandas options

For small to medium datasets you can show the full DataFrame by setting the following options before displaying your data:

import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

Now the same DataFrame is displayed with all columns and rows, without limits. For this dataset it takes more time to load and about 0.5 GB of memory to display everything. Keep in mind that bigger datasets might break your execution, and any unsaved changes might be lost.
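One way to reduce that risk is to check the size of the DataFrame before rendering all of it. The helper below is only a sketch: to_full_text and its max_cells budget are hypothetical names and values, not part of Pandas. It relies on DataFrame.to_string(), which renders every row and column when max_rows and max_cols are left unset.

```python
import pandas as pd

def to_full_text(df: pd.DataFrame, max_cells: int = 100_000) -> str:
    """Return the complete text rendering of df, or a short preview if it is too big.

    max_cells is an arbitrary, hypothetical budget; tune it for your machine.
    """
    if df.size > max_cells:
        # Too large to render safely: fall back to the first rows plus the shape.
        return f"{df.head().to_string()}\n[{df.shape[0]} rows x {df.shape[1]} columns]"
    # to_string() renders every row and column here;
    # cell values still follow display.max_colwidth.
    return df.to_string()

small = pd.DataFrame({"a": range(100), "b": range(100)})
print(to_full_text(small))                 # all 100 rows
print(to_full_text(small, max_cells=10))   # preview only
```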

You can find more information and options in the documentation of pandas.set_option

This is the description of display.max_colwidth : int or None

The maximum width in characters of a column in the repr of a pandas data structure. When the column overflows, a “…” placeholder is embedded in the output. A ‘None’ value means unlimited. [default: 50] [currently: 50]
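The same description can be printed from a Python session with pd.describe_option, which also reports the default and the current value. A minimal sketch:

```python
import pandas as pd

# Prints the documentation, default and current value of the option.
pd.describe_option("display.max_colwidth")
```

The argument is treated as a pattern, so a shorter string such as "display.max" would describe several options at once.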

Older versions of Pandas accept negative numbers, for example:

pd.set_option('display.max_colwidth', -1)

But newer versions (1.0 and later) raise a warning like this:

FutureWarning: Passing a negative integer is deprecated in version 1.0 and will not be supported in future version. Instead, use None to not limit the column width.

Step #2: Display more or all rows/categories

If you need to show more than 60 rows, this is the only option you need to set. Using None will display all rows:

import pandas as pd
pd.set_option('display.max_rows', None)

This option also helps to show all results of value_counts, which are otherwise truncated to 10 rows when there are more than display.max_rows of them.


Note: value_counts skips missing values by default. If you want NaN to be counted as well, pass the parameter dropna=False:

df.genres.value_counts(dropna=False).to_frame()

Bonus: You can convert the results of value_counts to a DataFrame with .to_frame()
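The effect of dropna=False is easy to check on a small Series. The data below is made up for illustration (it is not taken from the IMDB dataset); the sketch counts the values with and without NaN and prints the full result as a DataFrame.

```python
import numpy as np
import pandas as pd

# Illustrative data: two genres and three missing values.
genres = pd.Series(["Drama", "Comedy", "Drama", np.nan, np.nan, np.nan], name="genres")

without_nan = genres.value_counts()            # NaN rows are skipped
with_nan = genres.value_counts(dropna=False)   # NaN gets its own row

# Show every row of the result, however long it is.
with pd.option_context("display.max_rows", None):
    print(with_nan.to_frame())
```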

Step #3: Show all columns and column width

Displaying all columns depends on several parameters and on where Pandas runs: in Jupyter Notebook or in a terminal (for example PyCharm):

pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

Let's check their documentation:

display.width - Width of the display in characters. In case python/IPython is running in a terminal this can be set to None and pandas will correctly auto-detect the width. Note that the IPython notebook, IPython qtconsole, or IDLE do not run in a terminal and hence it is not possible to correctly detect the width. [default: 80] [currently: 80]

display.max_columns - If max_cols is exceeded, switch to truncate view. Depending on large_repr, objects are either centrally truncated or printed as a summary view. ‘None’ value means unlimited.

What is the difference? Why do you need all of them in order to display more columns?

display.width is important when Pandas is used in a terminal. If you increase only display.max_columns, the output for the DataFrame is split into several blocks (the split is marked by a backslash):

    Company        Date       Date3       Date2  Country Country2 Country1  \
0   Samsung   10/9/2015   10/9/2015   10/9/2015    India    India    India  

    Sells  
0      15  

If you also increase display.width, the whole row fits on a single line:

    Company        Date       Date3       Date2  Country Country2 Country1  Sells
0   Samsung   10/9/2015   10/9/2015   10/9/2015    India    India    India     15
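The difference can be reproduced with the one-row DataFrame from the output above. In the sketch below the widths 40 and 1000 are arbitrary values chosen to force and to avoid the split; a fixed large width is used instead of None so that the result does not depend on the terminal.

```python
import pandas as pd

# The single row shown in the output above.
row = pd.DataFrame({
    "Company": ["Samsung"],
    "Date": ["10/9/2015"],
    "Date3": ["10/9/2015"],
    "Date2": ["10/9/2015"],
    "Country": ["India"],
    "Country2": ["India"],
    "Country1": ["India"],
    "Sells": [15],
})

# All columns allowed, but a narrow display: the output is split with a backslash.
with pd.option_context("display.max_columns", None, "display.width", 40):
    narrow = repr(row)

# All columns allowed and a wide display: header and row each fit on one line.
with pd.option_context("display.max_columns", None, "display.width", 1000):
    wide = repr(row)

print(narrow)
print(wide)
```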

display.max_colwidth - controls the truncation of values in a cell; setting it to None prevents truncation:

Before:

  • Action|Adventure|Romance
  • Adventure|Animation|Comedy|Family|Fantasy|Musi...

After:

  • Action|Adventure|Romance
  • Adventure|Animation|Comedy|Family|Fantasy|Musical|Romance

Step #4: Reset pandas display options

If you want to restore the default value of a display option after a given cell or piece of code, you can use the method reset_option:

pd.reset_option('display.max_rows')
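If the change is needed only for one piece of code, pd.option_context is an alternative worth considering: it sets the options inside a with block and restores the previous values on exit, so no explicit reset is needed. A minimal sketch with a made-up 100-row DataFrame:

```python
import pandas as pd

df_demo = pd.DataFrame({"value": range(100)})

before = pd.get_option("display.max_rows")

# The option is changed only inside the with block.
with pd.option_context("display.max_rows", None):
    inside = pd.get_option("display.max_rows")
    full = repr(df_demo)  # all 100 rows are rendered here

# On exit the previous value is restored automatically.
after = pd.get_option("display.max_rows")
print(before, inside, after)
```

Several options can be passed at once as alternating name and value arguments, for example pd.option_context('display.max_rows', None, 'display.max_columns', None).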

Step #5: Increase Jupyter Notebook cell width

If you have a big monitor you may want to increase the cell width in order to use maximum visual space. This can be done by:

# Import from IPython.display; importing display from IPython.core.display is deprecated
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

or

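from IPython.display import display, HTML
# Use the full page width for the notebook container
display(HTML("<style>.container { width:100% !important; }</style>"))
# Let output cells use the full width as well
display(HTML("<style>.output_result { max-width:100% !important; }</style>"))
# Hide the In[ ]/Out[ ] prompts to gain more horizontal space
display(HTML("<style>.prompt { display:none !important; }</style>"))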

Pandas output will use the extra space and show more values at once in your output cells.

The features described in this post increase my productivity and efficiency when using Pandas. If you have tips like these, please share them in the comment section below.

Cheers.
