Jupyter/iPython Notebook download files as CSV, zip

If you need to keep the Jupyter Notebook output data and download it to your local machine as a file (CSV, zip for larger files, or .ipynb with all cells), you have several options depending on the server and its configuration. I tried to investigate all possible ways, and the result is in this post:

Download the whole notebook

This is useful when you want to download the whole notebook with all cells, outputs and states. Sometimes the file is too big for this method to be convenient, or you may need only a specific output file. In that case you can check the next section.

In Jupyter Notebook you can download the notebook as .ipynb or in another format:

  • Open your notebook in Jupyter
  • Click File
  • Download as
  • Choose format:
    • Notebook (.ipynb)
    • Python (.py)
    • HTML (.html)
    • Markdown (.md)
    • LaTeX (.tex)
    • PDF via LaTeX (.pdf)
  • After you select your preferred format (.ipynb, .py, .html, ...), the file is downloaded to your local computer.

You can test it on the online demonstration here: Welcome to Jupyter!

Create and download CSV/zip file

In this section you will see how to create a single output file in Jupyter and download it as CSV or zip, depending on its size. Note that some browsers handle only small dataframes this way. For example, Chrome's max data URI size is 2MB, and that limit applies after Base64 encoding, which makes the data about one third larger.

For a simple and small dataframe which can be downloaded as a CSV file you can use:
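To judge in advance whether a file is likely to fit, you can estimate the encoded size before building the link. This is only a minimal sketch: the helper names and the `MAX_DATA_URI_BYTES` constant are assumptions made for illustration (the constant simply reuses the 2MB figure quoted above), and real browser limits may differ.

```python
import base64
import math

# Assumed limit, taken from the 2MB figure quoted above; adjust for your browser.
MAX_DATA_URI_BYTES = 2 * 1024 * 1024


def base64_length(raw_size):
    """Return the number of Base64 characters produced for raw_size bytes."""
    # Base64 emits 4 output characters for every 3 input bytes, rounded up.
    return 4 * math.ceil(raw_size / 3)


def fits_in_data_uri(data, mime="text/csv"):
    """Check whether data, once encoded, stays within the assumed limit."""
    prefix = "data:{};base64,".format(mime)
    return len(prefix) + base64_length(len(data)) <= MAX_DATA_URI_BYTES


sample = b"a,b\n1,2\n"
# The estimate matches what the encoder actually produces.
assert base64_length(len(sample)) == len(base64.b64encode(sample))
print(fits_in_data_uri(sample))
```

With a dataframe you could pass `df.to_csv().encode()` as `data` and fall back to the zip approach when the check fails.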

import base64

from IPython.display import HTML

def create_download_link(df, title="Download CSV file", filename="data.csv"):
    # Serialize the dataframe to CSV text, then Base64-encode it for the data URI
    csv = df.to_csv()
    b64 = base64.b64encode(csv.encode())
    payload = b64.decode()
    html = '<a download="{filename}" href="data:text/csv;base64,{payload}" target="_blank">{title}</a>'
    html = html.format(payload=payload, title=title, filename=filename)
    return HTML(html)

create_download_link(df)

source of the code: How to Upload/Download Files to/from Notebook in my Local machine

This will not work for larger dataframes. In that case you can zip your file and download it as a zip, which lets you work with larger dataframes without server configuration or additional setup. The following example lets you create a CSV or zip file with Jupyter/IPython Notebook:

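import base64
import io
import zipfile

from IPython.display import HTML

def create_download_link(df, title="Download CSV file", filename="data.csv", type="zip"):
    data = df.to_csv().encode()
    mime = "text/csv"
    if type == "zip":
        # Compress the CSV in memory so the Base64 payload stays small
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            archive.writestr("data.csv", data)
        data, mime = buffer.getvalue(), "application/zip"
    payload = base64.b64encode(data).decode()
    html = '<a download="{filename}" href="data:{mime};base64,{payload}" target="_blank">{title}</a>'
    return HTML(html.format(payload=payload, title=title, filename=filename, mime=mime))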

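import os
import zipfile

def create_download_files(mydata):
    out_dir = '/home/user/'
    # Create the output folder if it does not exist yet
    os.makedirs(out_dir, exist_ok=True)

    txt_path = os.path.join(out_dir, 'myfile.txt')
    with open(txt_path, 'w') as f:
        for line in mydata:
            f.write(line + '\n')
    # Store the text file in a compressed archive under its bare file name
    with zipfile.ZipFile(os.path.join(out_dir, 'myfile.zip'), 'w', zipfile.ZIP_DEFLATED) as archive:
        archive.write(txt_path, arcname='myfile.txt')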

The usage (in a notebook cell, one call per cell) for CSV and zip is:

create_download_link(df, 'Download CSV file', 'mycsv.csv', 'csv')
create_download_link(df, 'Download ZIP file', 'mycsv.zip', 'zip')
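If the archive already exists on disk (for example one written to /home/user/myfile.zip), the same data URI technique can be applied to the file itself. The sketch below is an illustration only: `file_to_download_link` is a hypothetical helper name, it uses just the standard library, and it returns the link as a plain string.

```python
import base64
import html
import mimetypes
import os


def file_to_download_link(path, title="Download file"):
    """Return an <a> tag that embeds the file at path as a Base64 data URI."""
    with open(path, "rb") as f:
        payload = base64.b64encode(f.read()).decode()
    # Guess the MIME type from the file extension, with a generic fallback.
    mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
    filename = os.path.basename(path)
    return '<a download="{}" href="data:{};base64,{}">{}</a>'.format(
        html.escape(filename, quote=True), mime, payload, html.escape(title)
    )
```

In a notebook cell the returned string can be wrapped in `IPython.display.HTML(...)` to render it as a clickable link. The browser size limit mentioned earlier still applies to the encoded payload.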

Direct download with FileLinks('/path/to/')

Assuming that you have the files stored on the server and a proper configuration for downloading files, you can use this Python code to get any file from the server. If the configuration is not correct, you will be redirected to the wrong address and the file will be unavailable for download:

from IPython.display import FileLink, FileLinks

p_df.to_csv('/path/to/data.csv', index=False)

FileLinks('/path/to/')

In order to ensure that the files can be downloaded from the server, you may need to set up a small web server with Python. You can see more information here: Ubuntu 18 Start simple web host with python on localhost - SoftHints. In short:

cd /home/folder/site
python -m http.server 8080
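The same static file server can also be started from Python code, for instance from a notebook cell, without blocking the cell. This is a minimal sketch using only the standard library; `serve_directory` is a hypothetical helper name, and binding to 127.0.0.1 is an assumption you may need to change on a remote server.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=8080):
    """Serve directory over HTTP in a background thread and return the server."""
    # Bind the request handler to the folder that should be exposed.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # A daemon thread keeps serving while the caller continues working.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

For example, `server = serve_directory('/home/folder/site')` starts serving that folder on port 8080, and `server.shutdown()` followed by `server.server_close()` stops it again.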

Using extension to download folders

This extension, nbzip, provides a button to zip and download a Jupyter server folder. You can check the latest version here:

Zips and downloads all the contents of a jupyter notebook

In order to enable it:

  • install it by:
pip install nbzip
  • enable it by:
jupyter serverextension enable --py nbzip --sys-prefix
jupyter nbextension install --py nbzip
jupyter nbextension enable --py nbzip
  • You can verify that a new button appears next to the apply button; it downloads the folder content as an archive.

Download files with Linux command

Another option, if the files are stored on the Jupyter server, is to archive them with Linux commands such as:

zip -r <Name_of_your_new_zip_file> <Path/of/folder/to/zip>
tar -czf archive.tar.gz foldername
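If shell access is not available, a similar archive can be produced from Python with the standard library. This is a sketch under that assumption; `archive_folder` is a hypothetical helper name, not part of Jupyter.

```python
import os
import shutil


def archive_folder(folder, archive_base, fmt="zip"):
    """Pack folder into archive_base plus extension and return the archive path."""
    folder = os.path.abspath(folder)
    parent, name = os.path.split(folder)
    # fmt="zip" roughly mirrors `zip -r`, fmt="gztar" mirrors `tar -czf`.
    # Using base_dir keeps the folder name as the top-level entry in the archive.
    return shutil.make_archive(archive_base, fmt, root_dir=parent, base_dir=name)
```

For example, `archive_folder('foldername', 'archive', 'gztar')` writes archive.tar.gz, and `archive_folder('foldername', 'archive')` writes archive.zip.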

Then create a downloadable link with Python code:

from IPython.display import FileLink, FileLinks

p_df.to_csv('/path/to/data.csv', index=False)
p_df.to_excel('/path/to/data.xlsx', index=False)

FileLinks('/path/to/')

You can also use any other way to download data from your server, such as scp:

scp username@remotehost.edu:data.zip /local/directory 