Python convert normal JSON to JSON separated lines 3 examples

If you want to convert a .json file to .jl (that is, a normal JSON file to JSON Lines, with one object per line), you can do it in several ways:

  • using pandas
  • using package jsonlines
  • using pure Python

What is JSON vs JSON lines

A normal JSON file usually holds a single JSON value, such as an array of objects, which may span many lines. A JSON Lines file holds one complete JSON object per line. Example:

normal JSON

[
  {
    "id": 1,
    "label": "A",
    "size": "S"
  },..

JSON separated lines

{"id":1,"label":"A","size":"S"}
{"id":2,"label":"B","size":"XL"}
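
Because every line is a complete JSON document, a JSON Lines file can be read back one line at a time with the standard library. The sketch below assumes the two sample records above are held in an in-memory string; the variable names are illustrative only.

```python
import json

# Two records in JSON Lines form, matching the sample above.
jsonl_text = '{"id":1,"label":"A","size":"S"}\n{"id":2,"label":"B","size":"XL"}\n'

# Each non-empty line is parsed independently of the others.
records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

for record in records:
    print(record["id"], record["label"], record["size"])
```

With a normal JSON file, by contrast, the whole array has to be parsed before any single record is available.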

Pandas convert normal JSON to JSONL

The first thing that comes to mind when working with JSON files in Python is pandas. It offers many options for reading and writing data. Let's see how pandas can read a normal JSON file and save the data to another file with one object per line. This example shows the conversion with pandas:

import pandas as pd

df = pd.read_json('/home/user/data/normal_json.json')

df.to_json("/home/user/data/json_lines.jl", orient="records", lines=True)

The output will be:

{"id":1,"label":"A","size":"S"}
{"id":2,"label":"B","size":"XL"}
{"id":3,"label":"C","size":"XXl"}

Convert normal JSON to JSONL with jsonlines

There is a Python package whose purpose is to make working with JSON Lines easier. It can be installed with:

pip install jsonlines

The package documentation describes it as follows: jsonlines is a Python library to simplify working with jsonlines and ndjson data.

Example of converting a normal JSON file to a line-separated one:

import jsonlines
import json


with open('/home/user/data/normal_json.json', 'r') as f:
    json_data = json.load(f)

with jsonlines.open('/home/user/data/json_lines2.jl', 'w') as writer:
    writer.write_all(json_data)

Result:

{"id": 1, "label": "A", "size": "S"}
{"id": 2, "label": "B", "size": "XL"}
{"id": 3, "label": "C", "size": "XXl"}

Note: the output of this package differs from the pandas output. The pandas output is shorter (no spaces after the separators); for this example the files are 114 bytes (jsonlines) vs 98 bytes (pandas).
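
The size gap comes from the spaces that the default JSON separators add after each comma and colon. As a rough sketch using only the standard json module (not the jsonlines package itself), the `separators` argument of `json.dumps` can produce the compact form; the `records` list below simply repeats the sample data.

```python
import json

records = [
    {"id": 1, "label": "A", "size": "S"},
    {"id": 2, "label": "B", "size": "XL"},
    {"id": 3, "label": "C", "size": "XXl"},
]

# Default separators are ", " and ": ", so each comma and colon is followed by a space.
default_lines = "".join(json.dumps(r) + "\n" for r in records)

# Compact separators drop those spaces.
compact_lines = "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)

# Compare the encoded sizes of the two variants.
print(len(default_lines.encode("utf-8")), len(compact_lines.encode("utf-8")))
```

Each record here has three colons and two commas, so the compact form saves five bytes per line. The exact file size can also depend on whether the writer ends the file with a newline.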

Convert normal JSON to JSONL only with Python

Finally, pure Python can be used to convert from .json to .jl. The code snippet below shows how to do it:

import json

with open('/home/user/data/normal_json.json', 'r') as f:
    json_data = json.load(f)
    
with open('/home/user/data/json_lines.jl', 'w') as outfile:
    for entry in json_data:
        json.dump(entry, outfile)
        outfile.write('\n')

Result:

{"id": 1, "label": "A", "size": "S"}
{"id": 2, "label": "B", "size": "XL"}
{"id": 3, "label": "C", "size": "XXl"}
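
The loop above assumes the top level of the JSON file is an array. One possible way to make that assumption explicit is to wrap the conversion in a small function that checks the type first. The function name `json_to_jsonl` and the temporary file paths below are hypothetical, chosen only so the sketch runs anywhere.

```python
import json
import os
import tempfile


def json_to_jsonl(src_path, dst_path):
    """Write each element of a top-level JSON array as one line; return the count."""
    with open(src_path, "r", encoding="utf-8") as src:
        data = json.load(src)

    # The one-record-per-line conversion only applies to a top-level array.
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")

    with open(dst_path, "w", encoding="utf-8") as dst:
        for entry in data:
            dst.write(json.dumps(entry) + "\n")
    return len(data)


# Usage with temporary files (hypothetical paths, created just for this demo).
with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "normal_json.json")
    dst_file = os.path.join(tmp, "json_lines.jl")
    sample = [{"id": 1, "label": "A", "size": "S"}, {"id": 2, "label": "B", "size": "XL"}]
    with open(src_file, "w", encoding="utf-8") as f:
        json.dump(sample, f)

    written = json_to_jsonl(src_file, dst_file)
    with open(dst_file, encoding="utf-8") as f:
        output_lines = f.read().splitlines()

print(written, output_lines)
```

Raising an error for a non-array input is only one design choice; writing a single object as a one-line file would be another.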

Data

Full JSON file:

[
  {
    "id": 1,
    "label": "A",
    "size": "S"
  },
  {
    "id": 2,
    "label": "B",
    "size": "XL"
  },
  {
    "id": 3,
    "label": "C",
    "size": "XXl"
  }
]