Data cleaning
First of all, we need to actually import the data into our R environment (oh yeah, I was taking for granted that we are going to use R for this; I hope you do not mind).
We can leverage our old friend, the rio package, running it on all three of the files we were provided once we have unzipped them. Take a minute to see whether you can remember the function needed to perform the task.
Done? OK, here is the solution:
library(rio)  # provides import()
cash_flow_report <- import("cash_flow.csv")
customer_list <- import("customer_list.txt")
stored_data <- import("stored_data.rds")
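If you do not have the three files at hand, the following is a minimal sketch of the same idea on throwaway data. The toy data frame and the temporary file paths are made up for illustration and are not part of the files we were provided; the only thing being shown is that import() can be called the same way on files with different extensions.

```r
# Sketch only: toy data and temporary files, not the files used in this chapter.
library(rio)

# A small, hypothetical data frame to round-trip through two file formats
toy <- data.frame(id = 1:3, amount = c(10.5, 20, 7.25))

csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

write.csv(toy, csv_path, row.names = FALSE)  # write with base R
saveRDS(toy, rds_path)                       # write with base R

# The same import() call reads both files; the reader is chosen from the extension
from_csv <- import(csv_path)
from_rds <- import(rds_path)

str(from_csv)
str(from_rds)
```

Running str() on the results is a quick way to check that both objects came back as data frames with the columns we expected before moving on.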
Tidy data
Before actually looking at our data, we should define how we want it to be arranged in order to allow for future manipulation and analyses. Currently, one of the most widely adopted frameworks for data arrangement and handling is the so-called tidy data framework. The concepts behind this framework were originally defined by Hadley Wickham, and nowadays come paired with a couple of R packages that help to apply it...