The mission
Imagine you are an analyst for a national health ministry, and there is an epidemic of cardiovascular diseases (CVDs). The minister has made it a priority to reverse the growth in cases and bring the caseload down to a 20-year low. To this end, a task force has been created to look for clues in the data and ascertain the following:
- What risk factors can be addressed.
- Whether future cases can be predicted and, if so, how to interpret the predictions on a case-by-case basis.
You are part of this task force!
Details about CVD
Before we dive into the data, we must gather some important details about CVD in order to do the following:
- Understand the problem’s context and relevance.
- Extract domain knowledge that can inform our data analysis and model interpretation.
- Relate an expert-informed background to a dataset’s features.
CVDs are a group of disorders, the most common of which is coronary heart disease (also known as ischaemic heart disease). According to the World Health Organization, CVD is the leading cause of death globally, killing close to 18 million people annually. Coronary heart disease and strokes (which are, for the most part, a byproduct of CVD) are the most significant contributors to that toll. It is estimated that 80% of CVD is attributable to modifiable risk factors. In other words, much of it is preventable, and the factors that can be addressed include the following:
- Poor diet
- Smoking and alcohol consumption habits
- Obesity
- Lack of physical activity
- Poor sleep
However, some risk factors are non-modifiable and therefore unavoidable, including the following:
- Genetic predisposition
- Old age
- Male sex (the effect varies with age)
We won’t go into more domain-specific details about CVD because they are not required to make sense of the example. However, it can’t be stressed enough how central domain knowledge is to model interpretation. So, if this example were your job and many lives depended on your analysis, it would be advisable to read the latest scientific research on the subject and consult with domain experts to inform your interpretations.