Data Formatting
In this section, we will format a given dataset. The main motivations behind formatting data properly are as follows:
- It gives all downstream systems a single, pre-agreed form for each data point, which avoids surprises and reduces the risk of breaking those systems.
- It produces a human-readable report from lower-level data that is, most of the time, created for machine consumption.
- It helps find errors in the data.
There are a few ways to perform data formatting in Python. We will begin with the modulus % operator.
The % operator
Python provides the modulus % operator, which, when applied to a string, performs basic formatting of data. To demonstrate this, we will load the data by reading the combined_data.csv file, and then we will apply some basic formatting to it.
Note
The combined_data.csv file contains some sample medical data for four individuals. The file can be found here: https://packt.live/310179U.
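Before working with the file itself, it may help to see the % operator on its own. The following is a minimal sketch that uses a hypothetical record; the field names and values are assumptions made for illustration and are not taken from combined_data.csv.

```python
# Hypothetical record; the field names and values are illustrative only
# and are not taken from combined_data.csv.
record = {"name": "Alice", "age": 30, "height": 165.238}

# %s inserts a string, %d an integer, and %.1f a float rounded to one decimal place.
line = "%s is %d years old and %.1f cm tall" % (
    record["name"],
    record["age"],
    record["height"],
)
print(line)  # Alice is 30 years old and 165.2 cm tall

# Named placeholders such as %(name)s pull values directly from a dictionary.
named = "%(name)s is %(age)d years old" % record
print(named)  # Alice is 30 years old
```

The tuple form depends on the order of the values, whereas the dictionary form matches values by key, which tends to be easier to read when a record has many fields.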
We can load the data from the...