Data Formatting
In this topic, we will format a given dataset. The main motivations behind formatting data properly are as follows:
It gives all downstream systems a single, pre-agreed form for each data point, which avoids surprises that could otherwise break those systems.
To produce a human-readable report from lower-level data that is, most of the time, created for machine consumption.
To find errors in data.
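There are a few ways to format data in Python. We will begin with the % operator, which performs printf-style formatting when its left operand is a string (the same symbol is the modulus operator for numbers).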
The % operator
Python gives us the % operator to apply basic formatting to data. To demonstrate this, we will first load the data by reading the CSV file, and then apply some basic formatting to it.
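Before loading the file, here is a minimal sketch of the two common forms of the % operator. The record and its field names (name, age, height) are hypothetical and used only for illustration; they are not taken from the CSV file.

```python
# Hypothetical record; these field names are assumptions for illustration only.
record = {"name": "Alice", "age": 30, "height": 1.6789}

# Positional form: values are supplied as a tuple, in order.
# %s formats as a string, %d as an integer.
line = "%s is %d years old" % (record["name"], record["age"])

# Named form: %(key)s looks each value up in a mapping.
# %.2f rounds the float to two decimal places.
detail = "%(name)s is %(height).2f m tall" % record

print(line)    # Alice is 30 years old
print(detail)  # Alice is 1.68 m tall
```

The named form tends to fit dictionary-shaped rows well, because the same mapping can be passed directly to the format string.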
Load the data from the CSV file by using the following command:
from csv import DictReader

raw_data = []
with open("combinded_data.csv", "rt") as fd:
    data_rows = DictReader(fd)
    for data in data_rows:
        raw_data.append(dict(data))
Now, we have a list called raw_data that...