Nowadays, it is a quite popular to store semi-structured information using JSON format. Indeed, JSON files have quite simple structure and can be easily read by human beings. JSON syntax allows one to represent complex dependencies in data and avoid data duplication. Moreover, all modern programming languages have libraries that facilitate JSON parsing and storing data into this format. Not surprisingly, JSON is extensively used to return data in Application Programming Interfaces (APIs) .
At the same time, data analysts prefer to deal with structured data represented in the form of series and dataframes. Unfortunately, transforming JSON data into structured format is not that straightforward. Previously, I preferred to develop code to parse manually complex JSON files and create a pandas dataframe from the parsed data. However, recently I have discovered a pandas function called json_normalize
that saved me some time in my projects. In this article, I explain how you can start using it in your projects.