Analyzing the data
As we have seen throughout this book, data preprocessing is not an island; the best preprocessing is informed by the analytics goals. So, we will continue preprocessing the data as we answer the four questions in this case study. Let's work through this subsection one analysis question (AQ) at a time.
Analysis question one – is there a significant difference between the mental health of employees across the attribute of gender?
To answer this question, we need to visualize the interaction between three attributes: Gender, Mental Illness, and Treatment. We are aware that the Mental Illness attribute has 536 missing values that are missing at random (MAR), and that those missing values have a relationship with the Treatment attribute. However, as the goal of the analysis is to compare mental health across Gender, we can set aside the interaction between Treatment and Mental Illness and focus our analysis on the interaction of the Gender attribute with both of...
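The Gender-focused comparison described above could be sketched with two row-normalized contingency tables, one for each of the other attributes. This is only a minimal illustration: the toy_df DataFrame, its values, and the share_by_gender helper are hypothetical stand-ins, not the case study's data or the book's own code. The column names are assumed to match the attribute names used in the text.

```python
import pandas as pd

# Toy stand-in for the survey data; these values are invented for
# illustration only and are NOT the case study's data.
toy_df = pd.DataFrame({
    "Gender": ["Female", "Male", "Male", "Female", "Male", "Female"],
    "Mental Illness": ["Yes", "No", None, "Yes", "No", "No"],
    "Treatment": ["Yes", "No", "No", "Yes", "Yes", "No"],
})


def share_by_gender(df, column):
    """Return the share of each value of `column` within each Gender group."""
    # Drop rows where either attribute is missing, so each table only
    # reflects the records that are actually observed for this pair.
    pair = df[["Gender", column]].dropna()
    # normalize="index" makes every row (each gender) sum to 1.
    return pd.crosstab(pair["Gender"], pair[column], normalize="index")


# Gender is compared with each attribute separately, so the missing
# Mental Illness values never have to be reconciled with Treatment.
illness_by_gender = share_by_gender(toy_df, "Mental Illness")
treatment_by_gender = share_by_gender(toy_df, "Treatment")

print(illness_by_gender)
print(treatment_by_gender)
```

Because each table pairs Gender with only one other attribute, the rows with a missing Mental Illness value are excluded from the first table but still contribute to the second. Calling illness_by_gender.plot.bar(stacked=True) would turn either table into a stacked bar chart, provided matplotlib is installed.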