What is tidy data
To tidy something means to arrange it, to put it in order. Consequently, tidy data is data arranged in a specific order, following a set of rules that make it ready to work with.
A dataset can be arranged in different ways. For those who, like me, worked for many years with Microsoft Excel, a tidy dataset may seem odd at first sight, as there will be plenty of repeated cells. Many datasets I worked with in MS Excel had the same measurement split among many columns. A classic example is the monthly report in which the first columns hold the descriptive part of the data (for example, product, profit, and loss) and the values referring to them are shown in one column per month.
Figure 8.2 – Example of a dataset not in tidy format
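To make the contrast concrete, here is a minimal sketch in plain Python of a report with one column per month, reshaped so that each row holds a single observation. The product names, month columns, sales figures, and the helper `to_long` are all hypothetical and chosen only for illustration; they are not the values shown in the figure.

```python
# Hypothetical monthly report in the wide layout: one column per month.
wide_rows = [
    {"product": "A", "jan": 100, "feb": 120, "mar": 90},
    {"product": "B", "jan": 80, "feb": 60, "mar": 110},
]

ID_COLUMNS = ("product",)  # descriptive columns that identify each row


def to_long(rows, id_columns, name_col="month", value_col="sales"):
    """Turn every non-descriptive column into its own row (wide -> long)."""
    long_rows = []
    for row in rows:
        ids = {col: row[col] for col in id_columns}
        for col, value in row.items():
            if col not in id_columns:
                long_rows.append({**ids, name_col: col, value_col: value})
    return long_rows


tidy_rows = to_long(wide_rows, ID_COLUMNS)

# With one observation per row, a question about months becomes a simple
# aggregation over the "month" column.
totals = {}
for row in tidy_rows:
    totals[row["month"]] = totals.get(row["month"], 0) + row["sales"]
best_month = max(totals, key=totals.get)

for row in tidy_rows:
    print(row)
print("Best month:", best_month)
```

Note how the product name is now repeated on several rows: these are the repeated cells that look odd to a spreadsheet user. Data libraries usually ship a ready-made function for this reshaping, for example `pivot_longer()` in R's tidyr or `DataFrame.melt` in pandas.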
The table in Figure 8.2 is comfortable to look at, but it is not convenient for an algorithm or a programming language. If you try to determine which month is best for sales, it will require...