Python normalize JSON data with pandas

Very frequently JSON data needs to be normalized in order to presented in different way. Pandas offers easy way to normalize JSON data. There are two option:

  • default - without providing parameters
  • explicit - giving explicit parameters for the normalization

In this post:

Default JSON normalization with Pandas and Python

The basic usage of pandas normalization method is simple as:

data = [{'key': 1, 'values': {'first': 'Java', 'second': 'Python'}},
        {'values': {'first': 'Ruby', 'second': 'PHP'}},
        {'key': 2, 'values': 'Go'}]
print(json_normalize(data))

result:

   key values values.first values.second
0  1.0    NaN         Java        Python
1  NaN    NaN         Ruby           PHP
2  2.0     Go          NaN           NaN

as you can see the result is normalized and the missing information is completed with NaN

Default JSON normalization with Pandas and Python

The basic usage of pandas normalization method is simple as:

data = [{'key': 1, 'values': {'first': 'Java', 'second': 'Python'}},
        {'values': {'first': 'Ruby', 'second': 'PHP'}},
        {'key': 2, 'values': 'Go'}]
print(json_normalize(data))

result:

   key values values.first values.second
0  1.0    NaN         Java        Python
1  NaN    NaN         Ruby           PHP
2  2.0     Go          NaN           NaN

as you can see the result is normalized and the missing information is completed with NaN

Explicit JSON normalization with Pandas and Python

You can define the normalization process by providing parameters. This is very helpful for huge JSON files from which you need only a few fields. The example below is showing the normalization for such files

data = [{'key': 'X','id': 1, 'langs': [{'first': 'Java', 'second': 'Python'},{'first': 'Groovy', 'second': 'C'}]},
        {'key': 'Y', 'id': 2, 'langs': [{'first': 'Ruby', 'second': 'PHP'},{'first': 'Java', 'second': 'Python'}]},
        {'key': "Z",'id': 3, 'langs': [{'first': 'Go', 'second': 'PHP'},{'first': 'JavaScript', 'second': 'Haskel'}]}]
result = json_normalize(data, 'langs',['key', 'id'])
print(result)

result:

        first  second key  id
0        Java  Python   X   1
1      Groovy       C   X   1
2        Ruby     PHP   Y   2
3        Java  Python   Y   2
4          Go     PHP   Z   3
5  JavaScript  Haskel   Z   3

Errors

If you have errors like:

KeyError: "Try running with errors='ignore' as key 'second' is not always present"

you can solve them by adding: errors='ignore'

result = json_normalize(data, 'values',['key', 'second'],errors='ignore')

Real world example with pandas normalization

JSON is very popular format which can be found on many different places. Lets check the Youtube API and request for playlist. The result is in JSON format as you may expect. This is the API request:

https://www.googleapis.com/youtube/v3/playlistItems?part=snippet&playlistId=UUtwKon9qMt5YLVgQt1tvJKg&key={YOURKEY}

Note you need to have Youtube key in order to get result from this query.

Below you can find the return result and the normalized data:

data = [{
 "kind": "youtube#playlistItemListResponse",
 "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/mdajA0Vg9eMgulyv2Fj_kQdPITw\"",
 "nextPageToken": "CAUQAA",
 "pageInfo": {
  "totalResults": 177,
  "resultsPerPage": 5
 },
 "items": [
  {
   "kind": "youtube#playlistItem",
   "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/-732mLgEsS0C4OF4tJkU7seYw0g\"",
   "id": "VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLlNmeWQtMHRYQXJj",
   "snippet": {
    "publishedAt": "2018-07-24T10:06:44.000Z",
    "channelId": "UCtwKon9qMt5YLVgQt1tvJKg",
    "title": "Climbing the Royal Albert Hall - Objectivity #175",
    "description": "Check out The Great Courses Plus (free trial): http://ow.ly/VDOZ30k3p8T \nMore links and info about Royal Albert Hall below ↓↓↓\n\nThis video features archivist Liz Harper from the Royal Albert Hall speaking with Brady.\n\nRoyal Albert Hall: https://www.royalalberthall.com/\n\nMore videos from the world-famous hall coming soon.\n\nObjectivity on Patreon: https://www.patreon.com/objectivity\nThanks Patrons: http://www.bradyharanblog.com/objectivity-patrons/\n\nSubscribe to Objectivity: http://bit.ly/Objectivity_Sub\nFilms by James Hennessy and Brady Haran\n\nRoyal Society website: http://bit.ly/Royal_Society\n\nThe Royal Society's own YouTube channel: https://www.youtube.com/user/RoyalSociety\n\nFacebook: https://www.facebook.com/ObjectivityVideos\nTwitter: https://twitter.com/objectivity_vid\n\nPatron thank you page: http://www.bradyharanblog.com/objectivity-patrons\n\nObjectivity T-Shirts: https://teespring.com/en-GB/stores/objectivity\n\nThe Great Courses Plus is currently available to watch through a web browser to almost anyone in the world and optimized for the US, UK, and Australian markets. The Great Courses Plus is currently working to both optimize the product globally and accept credit card payments globally.\n\nPlease note guests and institutions appearing in our videos are not affiliated with episode sponsors.",
    "thumbnails": {
     "default": {
      "url": "https://i.ytimg.com/vi/Sfyd-0tXArc/default.jpg",
      "width": 120,
      "height": 90
     },
     "medium": {
      "url": "https://i.ytimg.com/vi/Sfyd-0tXArc/mqdefault.jpg",
      "width": 320,
      "height": 180
     },
     "high": {
      "url": "https://i.ytimg.com/vi/Sfyd-0tXArc/hqdefault.jpg",
      "width": 480,
      "height": 360
     },
     "standard": {
      "url": "https://i.ytimg.com/vi/Sfyd-0tXArc/sddefault.jpg",
      "width": 640,
      "height": 480
     },
     "maxres": {
      "url": "https://i.ytimg.com/vi/Sfyd-0tXArc/maxresdefault.jpg",
      "width": 1280,
      "height": 720
     }
    },
    "channelTitle": "Objectivity",
    "playlistId": "UUtwKon9qMt5YLVgQt1tvJKg",
    "position": 0,
    "resourceId": {
     "kind": "youtube#video",
     "videoId": "Sfyd-0tXArc"
    }
   }
  },
  {
   "kind": "youtube#playlistItem",
   "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/0uFgwY3htzKsLWY3V7TPOaYXcIg\"",
   "id": "VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLlE0YkQ5dzNKbmFV",
   "snippet": {
    "publishedAt": "2018-07-17T11:13:19.000Z",
    "channelId": "UCtwKon9qMt5YLVgQt1tvJKg",
    "title": "The Cayley Expansion - Objectivity #174",
    "description": "David Eisenbud joins us at The Royal Society to look at the work of one of his all time favourite mathematicians. More links below ↓↓↓\n\nFeaturing David Eisenbud speaking with Brady and Keith Moore.\n\nSubscribe to Objectivity: http://bit.ly/Objectivity_Sub\n\nCheck out David on Numberphile: \nhttps://youtu.be/87uo2TPrsl8\nhttps://youtu.be/MxiTG96QOxw\n\nPrecious Rocks: https://youtu.be/-YOfn0MnM4s\n\nForgotten Plants: https://youtu.be/7rdXsQgfs_g\n\nObjectivity on Patreon: https://www.patreon.com/objectivity\n\nFilms by James Hennessy and Brady Haran\n\nRoyal Society website: http://bit.ly/Royal_Society\n\nThe Royal Society's own YouTube channel: https://www.youtube.com/user/RoyalSociety\n\nFacebook: https://www.facebook.com/ObjectivityVideos\nTwitter: https://twitter.com/objectivity_vid\n\nPatron thank you page: http://www.bradyharanblog.com/objectivity-patrons\n\nObjectivity T-Shirts: https://teespring.com/en-GB/stores/objectivity",
    "thumbnails": {
     "default": {
      "url": "https://i.ytimg.com/vi/Q4bD9w3JnaU/default.jpg",
      "width": 120,
      "height": 90
     },
     "medium": {
      "url": "https://i.ytimg.com/vi/Q4bD9w3JnaU/mqdefault.jpg",
      "width": 320,
      "height": 180
     },
     "high": {
      "url": "https://i.ytimg.com/vi/Q4bD9w3JnaU/hqdefault.jpg",
      "width": 480,
      "height": 360
     },
     "standard": {
      "url": "https://i.ytimg.com/vi/Q4bD9w3JnaU/sddefault.jpg",
      "width": 640,
      "height": 480
     },
     "maxres": {
      "url": "https://i.ytimg.com/vi/Q4bD9w3JnaU/maxresdefault.jpg",
      "width": 1280,
      "height": 720
     }
    },
    "channelTitle": "Objectivity",
    "playlistId": "UUtwKon9qMt5YLVgQt1tvJKg",
    "position": 1,
    "resourceId": {
     "kind": "youtube#video",
     "videoId": "Q4bD9w3JnaU"
    }
   }
  },
  {
   "kind": "youtube#playlistItem",
   "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/wmFacX2RYcyl6hVOsuhURZPk3oQ\"",
   "id": "VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLmt5RU1iTTJFZ1pr",
   "snippet": {
    "publishedAt": "2018-07-03T15:26:57.000Z",
    "channelId": "UCtwKon9qMt5YLVgQt1tvJKg",
    "title": "Keith Wears the White Gloves of Destiny - Objectivity #173",
    "description": "Royal Society Head Librarian Keith Moore dons the white gloves and takes a lucky dip into the card catalogue. More links below ↓↓↓\n\nFeaturing Keith Moore from The Royal Society speaking with Brady.\n\nObjectivity on Patreon: https://www.patreon.com/objectivity\n\nCheck out our White Gloves of Destiny playlist: http://bit.ly/White_Gloves\n\nSubscribe to Objectivity: http://bit.ly/Objectivity_Sub\nFilms by James Hennessy and Brady Haran\n\nRoyal Society website: http://bit.ly/Royal_Society\n\nThe Royal Society's own YouTube channel: https://www.youtube.com/user/RoyalSociety\n\nFacebook: https://www.facebook.com/ObjectivityVideos\nTwitter: https://twitter.com/objectivity_vid\n\nPatron thank you page: http://www.bradyharanblog.com/objectivity-patrons\n\nObjectivity T-Shirts: https://teespring.com/en-GB/stores/objectivity",
    "thumbnails": {
     "default": {
      "url": "https://i.ytimg.com/vi/kyEMbM2EgZk/default.jpg",
      "width": 120,
      "height": 90
     },
     "medium": {
      "url": "https://i.ytimg.com/vi/kyEMbM2EgZk/mqdefault.jpg",
      "width": 320,
      "height": 180
     },
     "high": {
      "url": "https://i.ytimg.com/vi/kyEMbM2EgZk/hqdefault.jpg",
      "width": 480,
      "height": 360
     },
     "standard": {
      "url": "https://i.ytimg.com/vi/kyEMbM2EgZk/sddefault.jpg",
      "width": 640,
      "height": 480
     },
     "maxres": {
      "url": "https://i.ytimg.com/vi/kyEMbM2EgZk/maxresdefault.jpg",
      "width": 1280,
      "height": 720
     }
    },
    "channelTitle": "Objectivity",
    "playlistId": "UUtwKon9qMt5YLVgQt1tvJKg",
    "position": 2,
    "resourceId": {
     "kind": "youtube#video",
     "videoId": "kyEMbM2EgZk"
    }
   }
  },
  {
   "kind": "youtube#playlistItem",
   "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/psXgQ2Mq8ArcWt-7EKYiSEX9ZzY\"",
   "id": "VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLjZMWDMzRnNyX1VF",
   "snippet": {
    "publishedAt": "2018-06-26T13:34:13.000Z",
    "channelId": "UCtwKon9qMt5YLVgQt1tvJKg",
    "title": "The Death of Emily Davison - Objectivity #172",
    "description": "We return to the AP Archive to look at the oldest and newest films from their Movietone collection. More on 23andMe: https://www.23andMe.com/Objectivity\n\nFeaturing Jenny Hammerton from AP Archive speaking with Brady.\n\nAP Archive website: http://www.aparchive.com\nFollow them on twitter: @AP_Archive\nAnd YouTube: https://www.youtube.com/c/BritishMovietone\nMore YouTube: https://www.youtube.com/c/aparchive\n\nObjectivity on Patreon: https://www.patreon.com/objectivity\n\nPS: Further information on the bricks video...  If a film was found in the vaults that didn’t have an identifying issue number, Movietone would assign it the next number in the queue, hence this one being assigned 86/101.  Upon further investigation, it seems that this film was shot in 1974 rather than 1986.\n\nSubscribe to Objectivity: http://bit.ly/Objectivity_Sub\nFilms by James Hennessy and Brady Haran\n\nFacebook: https://www.facebook.com/ObjectivityVideos\nTwitter: https://twitter.com/objectivity_vid\n\nPatron thank you page: http://www.bradyharanblog.com/objectivity-patrons\n\nObjectivity T-Shirts: https://teespring.com/en-GB/stores/objectivity\n\nThanks to our Patreon supporters and sponsors for helping cover the cost of production - we couldn't make videos without them. However our special guests and organisations featured in the videos do not endorse or benefit from any sponsorship.",
    "thumbnails": {
     "default": {
      "url": "https://i.ytimg.com/vi/6LX33Fsr_UE/default.jpg",
      "width": 120,
      "height": 90
     },
     "medium": {
      "url": "https://i.ytimg.com/vi/6LX33Fsr_UE/mqdefault.jpg",
      "width": 320,
      "height": 180
     },
     "high": {
      "url": "https://i.ytimg.com/vi/6LX33Fsr_UE/hqdefault.jpg",
      "width": 480,
      "height": 360
     },
     "standard": {
      "url": "https://i.ytimg.com/vi/6LX33Fsr_UE/sddefault.jpg",
      "width": 640,
      "height": 480
     },
     "maxres": {
      "url": "https://i.ytimg.com/vi/6LX33Fsr_UE/maxresdefault.jpg",
      "width": 1280,
      "height": 720
     }
    },
    "channelTitle": "Objectivity",
    "playlistId": "UUtwKon9qMt5YLVgQt1tvJKg",
    "position": 3,
    "resourceId": {
     "kind": "youtube#video",
     "videoId": "6LX33Fsr_UE"
    }
   }
  },
  {
   "kind": "youtube#playlistItem",
   "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/7VfG6kYO6lMC6TAweKLU6DD3QS4\"",
   "id": "VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLkI0Z3p0OHY0MDFR",
   "snippet": {
    "publishedAt": "2018-06-19T12:59:06.000Z",
    "channelId": "UCtwKon9qMt5YLVgQt1tvJKg",
    "title": "Squaring The Circle (feat. James Grime) - Objectivity #171",
    "description": "Dr James Grime joins us at The Royal Society to look at all manner of things from secret codes to fancy objects... More links below ↓↓↓\n\nFeaturing Dr James Grime speaking with Brady and Keith Moore.\n\nAccompanying documents: http://www.bradyharanblog.com/blog/squaring-the-circle-the-documents\n\nSquaring the circle: https://youtu.be/CMP9a2J4Bqw\n\nWhite Gloves of Destiny: http://bit.ly/White_Gloves\n\nCheck out Numberphile: https://www.youtube.com/user/numberphile\n\nObjectivity on Patreon: https://www.patreon.com/objectivity\n\nSubscribe to Objectivity: http://bit.ly/Objectivity_Sub\nFilms by James Hennessy and Brady Haran\n\nRoyal Society website: http://bit.ly/Royal_Society\n\nThe Royal Society's own YouTube channel: https://www.youtube.com/user/RoyalSociety\n\nFacebook: https://www.facebook.com/ObjectivityVideos\nTwitter: https://twitter.com/objectivity_vid\n\nPatron thank you page: http://www.bradyharanblog.com/objectivity-patrons\n\nObjectivity T-Shirts: https://teespring.com/en-GB/stores/objectivity",
    "thumbnails": {
     "default": {
      "url": "https://i.ytimg.com/vi/B4gzt8v401Q/default.jpg",
      "width": 120,
      "height": 90
     },
     "medium": {
      "url": "https://i.ytimg.com/vi/B4gzt8v401Q/mqdefault.jpg",
      "width": 320,
      "height": 180
     },
     "high": {
      "url": "https://i.ytimg.com/vi/B4gzt8v401Q/hqdefault.jpg",
      "width": 480,
      "height": 360
     },
     "standard": {
      "url": "https://i.ytimg.com/vi/B4gzt8v401Q/sddefault.jpg",
      "width": 640,
      "height": 480
     },
     "maxres": {
      "url": "https://i.ytimg.com/vi/B4gzt8v401Q/maxresdefault.jpg",
      "width": 1280,
      "height": 720
     }
    },
    "channelTitle": "Objectivity",
    "playlistId": "UUtwKon9qMt5YLVgQt1tvJKg",
    "position": 4,
    "resourceId": {
     "kind": "youtube#video",
     "videoId": "B4gzt8v401Q"
    }
   }
  }
 ]
}]

resulty = json_normalize(data[0]['items'])
print(resulty)

result(the result is truncated):

etag                                                id                  kind         snippet.channelId snippet.channelTitle                                snippet.de
0  "XI7nbFXulYBIpL0ayR_gDh3eu1k/-732mLgEsS0C4OF4t...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLlNmeWQtMHRYQXJj  youtube#playlistItem  UCtwKon9qMt5YLVgQt1tvJKg          Objectivity  Check out The Great Courses Plus (free t
1  "XI7nbFXulYBIpL0ayR_gDh3eu1k/0uFgwY3htzKsLWY3V...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLlE0YkQ5dzNKbmFV  youtube#playlistItem  UCtwKon9qMt5YLVgQt1tvJKg          Objectivity  David Eisenbud joins us at The Royal Soc
2  "XI7nbFXulYBIpL0ayR_gDh3eu1k/wmFacX2RYcyl6hVOs...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLmt5RU1iTTJFZ1pr  youtube#playlistItem  UCtwKon9qMt5YLVgQt1tvJKg          Objectivity  Royal Society Head Librarian Keith Moore
3  "XI7nbFXulYBIpL0ayR_gDh3eu1k/psXgQ2Mq8ArcWt-7E...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLjZMWDMzRnNyX1VF  youtube#playlistItem  UCtwKon9qMt5YLVgQt1tvJKg          Objectivity  We return to the AP Archive to look at t
4  "XI7nbFXulYBIpL0ayR_gDh3eu1k/7VfG6kYO6lMC6TAwe...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLkI0Z3p0OHY0MDFR  youtube#playlistItem  UCtwKon9qMt5YLVgQt1tvJKg          Objectivity  Dr James Grime joins us at The Royal Soc

This can help you to flatten JSON files with Python and pandas.

If you want you can filter the final columns in the output by:

from pandas.io.json import json_normalize
resultx = json_normalize((data[0]['items']))
print(resultx)
resultx = resultx.reindex(columns=['etag', 'snippet.description', 'id' ])
print(resultx)

result:

etag                                snippet.description                                                id
0  "XI7nbFXulYBIpL0ayR_gDh3eu1k/-732mLgEsS0C4OF4t...  Check out The Great Courses Plus (free trial):...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLlNmeWQtMHRYQXJj
1  "XI7nbFXulYBIpL0ayR_gDh3eu1k/0uFgwY3htzKsLWY3V...  David Eisenbud joins us at The Royal Society t...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLlE0YkQ5dzNKbmFV
2  "XI7nbFXulYBIpL0ayR_gDh3eu1k/wmFacX2RYcyl6hVOs...  Royal Society Head Librarian Keith Moore dons ...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLmt5RU1iTTJFZ1pr
3  "XI7nbFXulYBIpL0ayR_gDh3eu1k/psXgQ2Mq8ArcWt-7E...  We return to the AP Archive to look at the old...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLjZMWDMzRnNyX1VF
4  "XI7nbFXulYBIpL0ayR_gDh3eu1k/7VfG6kYO6lMC6TAwe...  Dr James Grime joins us at The Royal Society t...  VVV0d0tvbjlxTXQ1WUxWZ1F0MXR2SktnLkI0Z3p0OHY0MDFR

Resources

Related Article