Treating missing data
Usually, those observations are left out of statistical calculations, as there is no value present. Therefore, although we calculated the descriptive statistics before handling the missing values, our results and insights are not affected. To continue the data analysis, however, we must examine the NA values to understand whether they carry meaning, and then decide how to proceed with them.
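As a minimal sketch of that first inspection step, the Python snippet below counts the missing entries in a column and computes a mean from the values that are present. The records, the column names, and the helper functions are hypothetical and exist only for illustration; None stands in for an NA value.

```python
from statistics import mean

# Hypothetical sample data: None stands in for an NA value.
records = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29, "income": None},
    {"age": 41, "income": 48000},
]


def count_missing(rows, column):
    """Count the rows where the given column has no value."""
    return sum(1 for row in rows if row[column] is None)


def mean_skipping_missing(rows, column):
    """Average the column using only the rows that have a value."""
    present = [row[column] for row in rows if row[column] is not None]
    return mean(present)


for column in ("age", "income"):
    print(column, count_missing(records, column), mean_skipping_missing(records, column))
```

Counting the gaps per column before deciding anything else gives a first hint of whether the missing values are isolated or widespread.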
A missing data point is also information. It can mean that the value was lost through human or system error, or that a person did not respond to a question, for example. So, if we were examining a system log and saw many NA values, we would need to check whether the measurements were being registered correctly or whether those missing data points should be expected. Another example to consider: in poll data, a lot of missing answers can mean either that nobody is answering the...