Working with missing data
There are several problems you need to fix to ensure good-quality data, and missing data is one of them. Missing data gets in the way of having a representative dataset and may leave us with an incomplete view of reality. For many different reasons, we may end up with some empty rows; this can happen while collecting, storing, or migrating the data. First and foremost, it is good to find out what caused the data to go missing and whether it can be fixed at the source. Sometimes, however, we have to accept that some data can no longer be retrieved, and the questions that remain are how to find the gaps and what to do about them.
Before we get into how to work with missing data, it is good to understand why we need to fix it. The problem with missing data is that it can lead to misleading results. For example, if we look at simple summary statistics, missing data can fool us. It may seem...
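To make the point about summary statistics concrete, here is a minimal sketch using only the Python standard library. The `daily_sales` figures are invented for illustration, and marking a missing value with `None` is an assumption about how the gaps are stored. The sketch shows that the same data can produce very different averages depending on how the gaps are handled, and neither figure is necessarily the true one.

```python
from statistics import mean

# Hypothetical daily sales figures; None marks a day with no recorded value.
daily_sales = [120, 135, None, 128, None, None, 131]

# Keep only the days that actually have a value.
observed = [value for value in daily_sales if value is not None]
missing_count = len(daily_sales) - len(observed)
missing_share = missing_count / len(daily_sales)

# Average over the recorded days only: the gaps are silently ignored.
mean_observed = mean(observed)

# Average if every gap is treated as a zero: the gaps drag the result down.
mean_gaps_as_zero = sum(observed) / len(daily_sales)

print(f"Missing values: {missing_count} of {len(daily_sales)}")
print(f"Mean of recorded days: {mean_observed}")
print(f"Mean with gaps counted as 0: {mean_gaps_as_zero:.1f}")
```

Reporting the number of missing values next to any summary statistic is a simple way to show how much of the data the figure is actually based on.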