Data analysis in three steps
There are three steps involved:
- Get the data.
- Clean it up.
- Analyze it (and then act on it).
Obviously, analysis (and action) is the crucial step.
Unfortunately, most of our time goes into step 2, cleaning up the data.
Why does the data need cleaning? Because we do not understand exactly how to get the data in step 1! It is a vicious cycle.
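To make the cycle concrete, here is a minimal Python sketch of the three steps. Everything in it is an assumption made for illustration: the sample rows, the field names `customer` and `amount`, and the helper functions `clean` and `total_by_customer` are hypothetical. It only shows how inconsistently captured data forces a cleanup step before any analysis can happen.

```python
from collections import defaultdict

# Step 1: get the data. Hypothetical hand-typed purchase records; the
# inconsistent spacing, casing, and blank amount are made up for illustration.
raw_rows = [
    {"customer": " Asha ", "amount": "250"},
    {"customer": "asha", "amount": "100"},
    {"customer": "Ravi", "amount": ""},
    {"customer": "RAVI", "amount": "75"},
]


def clean(rows):
    """Step 2: normalize names and drop rows with no usable amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # a blank amount cannot be analyzed
        cleaned.append(
            {"customer": row["customer"].strip().title(), "amount": int(amount)}
        )
    return cleaned


def total_by_customer(rows):
    """Step 3: analyze; here, a simple total per customer."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


totals = total_by_customer(clean(raw_rows))
print(totals)  # {'Asha': 350, 'Ravi': 75}
```

Had the names and amounts been captured consistently in step 1, the `clean` function would have nothing to do.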
Let's solve the problem in a simple, logical manner.
When you buy a product, log in to an app and use it, get a medical test done, or just travel somewhere, each of these activities generates data. Sometimes a person types it into an app. In other cases, the data is generated automatically, such as the history of videos you have watched. All of this is input data.
Input data is usually long because its number of rows keeps growing. By looking at and scrolling the input data...